This project aims to enhance the current native SDN controller to support Xen/XCP. Implementing more network services for the OVS Guest Network is also a goal. Furthermore, I will integrate an open-source SDN controller (e.g., Floodlight).
During two weeks of the community bonding period, I looked into the Ovs Guest Network plug-in, deployed it with the latest CloudStack code from the master branch plus XCP 1.6, and tried to understand what happens. Here is a short report on the design architecture and flow sequence.
Thanks to Sebgoa, Hugo (my mentor), and Chiradeep Vittal. Discussing with all of you and reading your documents saved me a lot of time.
Firstly, every extended networking solution has to contain the components below:
Network Gurus are responsible for:
I give a short description of the Network Guru's functions and steps in the XMind diagram below. (Source from Chiradeep: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Extending+CloudStack+Networking)
Network Gurus manage the Virtual Network lifecycle: design - implement - reserve - release - shutdown - trash.
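That lifecycle can be pictured with a small sketch. This is hypothetical Python, not the real CloudStack interface; the method name and stage strings only mirror the lifecycle stages listed above.

```python
# Hypothetical sketch of the Network Guru lifecycle, NOT real CloudStack
# code: a guru walks a virtual network through six ordered stages.
class NetworkGuruSketch:
    LIFECYCLE = ["design", "implement", "reserve", "release",
                 "shutdown", "trash"]

    def __init__(self):
        self.history = []  # stages this guru has executed, in order

    def run_stage(self, stage, network):
        if stage not in self.LIFECYCLE:
            raise ValueError("unknown lifecycle stage: %s" % stage)
        self.history.append(stage)
        # a real guru would touch DB state here; we just annotate
        network.setdefault("stages", []).append(stage)
        return network


guru = NetworkGuruSketch()
net = {"name": "guest-net-1"}
for stage in NetworkGuruSketch.LIFECYCLE:
    guru.run_stage(stage, net)
print(net["stages"])  # the six stages in lifecycle order
```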
From the master branch source code, I give an incomplete description of the Network Guru hierarchy, also as an XMind diagram below. I call it "incomplete" because I only focus on SDN plug-ins for XenServer; MidoNet targets KVM, so I haven't read it yet.
Network Elements represent components in a CloudStack network. They provide networking services, including:
Many elements implement the networking services above. They are:
Other elements are:
I made a mapping table between Networking Service and Network Element below:
| Networking Service | Network Element |
|---|---|
| Connectivity | BigSwitchVns |
| DhcpService | BareMetal |
| FirewallService | CiscoVnmc |
| LoadBalancingService | ElasticLB |
| NetworkACLService | VpcVirtualRouter |
| PortForwardingService | CiscoVnmc |
| SourceNatService | CiscoVnmc |
| StaticNatService | CiscoVnmc |
| UserDataService | BareMetal |
| VpcService | VpcVirtualRouter |
Ovs still supports only L2 networking, so we don't see it in this table.
Some important abstract methods that Network elements should override are:
Network Managers handle the resources managed by the network elements. They are implemented like many other "resource" managers in CloudStack.
For instance, the manager for setting up L2-in-L3 networks with Open vSwitch is com.cloud.network.ovs.OvsTunnelManagerImpl, whereas the Virtual Router lifecycle is managed by com.cloud.network.router.VirtualNetworkApplianceManagerImpl.
Chiradeep has already covered the networking flow and how to code it in detail, so I won't repeat it. See his doc: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Extending+CloudStack+Networking
Ovs Guest Network, implemented as a network plug-in, currently supports only the GRE isolation method for the Guest network and does not yet support L3 networking services. Being a networking solution, it also has a Network Guru, a Network Element and a Network Manager. A user who wants to use GRE isolation has to set sdn.ovs.controller to true in Global Config, create a zone with Advanced Networking, and create a physical network using the GRE isolation method. Remember to set the VLAN range, because Ovs uses the VnetID to create the GRE key.
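The setup steps above can be summarized as a list of CloudStack API calls. This is only a sketch: the call names are from the CloudStack API, but the exact parameter names and values are my reading of it and may differ by version.

```python
# Sketch of the admin-side setup for GRE isolation, expressed as
# CloudStack API calls. Parameter names are illustrative assumptions,
# not verified against a specific CloudStack release.
setup_calls = [
    ("updateConfiguration", {"name": "sdn.ovs.controller",
                             "value": "true"}),
    ("createZone", {"name": "zone1", "networktype": "Advanced",
                    "dns1": "8.8.8.8", "internaldns1": "10.0.0.1"}),
    ("createPhysicalNetwork", {"zoneid": "<zone1-id>",
                               "name": "phys-gre",
                               "isolationmethods": "GRE",
                               # the VLAN range doubles as the VnetID
                               # pool the GRE key is taken from
                               "vlan": "100-200"}),
]

for api, params in setup_calls:
    print(api, params)
```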
Ovs Guest Network uses Open vSwitch to manage L2 networking services.
Once the user has completed the actions above, he will want to run his VMs on this network. First he has to create a guest network for his VMs. At that moment, the design() method is called on all configured network gurus. Every guru has an internal canHandle() method to check whether it should act. Because of the configuration steps above, the Ovs Guest Network Guru is selected. The guru returns a network object containing some initial config params, such as:
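The selection loop and the kind of initial parameters the guru returns can be sketched as follows. This is hypothetical Python, not CloudStack code; the canHandle() condition is my simplification.

```python
# Minimal sketch (not CloudStack code) of the design phase: each
# registered guru's canHandle() gates its design().
class OvsGuruSketch:
    name = "OvsGuestNetworkGuru"

    def can_handle(self, offering, physical_network):
        # simplified condition: GRE isolation on the physical network
        # plus the Connectivity service in the offering
        return ("GRE" in physical_network["isolation_methods"]
                and "Connectivity" in offering["services"])

    def design(self, offering, physical_network):
        # return a network object with the first config params set
        return {"guru": self.name, "state": "Allocated",
                "broadcast_domain_type": "Vswitch"}


def design_network(gurus, offering, physical_network):
    for guru in gurus:
        if guru.can_handle(offering, physical_network):
            return guru.design(offering, physical_network)
    raise RuntimeError("no guru can design this network")


net = design_network([OvsGuruSketch()],
                     {"services": ["Connectivity"]},
                     {"isolation_methods": ["GRE"]})
print(net)
```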
After the network is designed, the user needs to start the first VM on it. During VM start, the implementNetwork() method in NetworkManager is called. It then calls the OvsGuestNetworkGuru implement() method. At that point, the guru allocates a VnetID for the network; this number is taken from the DataCenterVnet data object. The network object then sets its BroadcastUri attribute to BroadcastDomainType.Vswitch.toUri(vnet).
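A tiny sketch of that implement() step, with a plain list standing in for the DataCenterVnet table; the "vswitch" URI scheme mirrors BroadcastDomainType.Vswitch.toUri(vnet), but treat the exact string as my assumption.

```python
# Sketch of implement(): allocate a vnet id from a free pool and record
# it in the broadcast URI, as BroadcastDomainType.Vswitch.toUri(vnet)
# does. The free_vnets list stands in for the DataCenterVnet table.
def implement_network(network, free_vnets):
    vnet = free_vnets.pop(0)          # allocate a VnetID for the network
    network["broadcast_uri"] = "vswitch://%s" % vnet
    network["state"] = "Implemented"
    return network

net = implement_network({"name": "guest-net-1"}, [170, 171, 172])
print(net["broadcast_uri"])  # vswitch://170
```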
This phase also implements the network elements and applies all the network rules. That is, NetworkManager implements the network services defined by the network offering by calling the corresponding implement() methods of the configured network elements. For service implementation, the Ovs Guest Network Element always does nothing, because it does not support any L3 services yet:
```java
public void implementNetworkElementsAndResources() {
    ...
    element.implement(network, offering, dest, context);
    ...
}
```
Implement phase done.
When the VM is started in the network, the Network Element is asked to prepare the NIC. Other SDN solutions (e.g., Nicira) have already created a logical switch earlier, at the implement stage. Ovs Guest Network, however, creates the switch at this time. The element calls the VmCheckAndCreateTunnel() method of the Ovs Network Manager. This stage does some checks and creates the GRE tunnel. It needs to get the GRE key from the BroadcastUri attribute and also the GRE endpoint IP. For the first VM, the manager sends the Xapi plugin call OvsSetupBridgeCommand to the XenServer host. From the second VM onward, the manager checks the existing tunnels and creates more if there are not enough, using the Xapi plugin call OvsCreateTunnelCommand. The tunnel records are stored in the TunnelNetwork table.
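My simplified reading of that logic, as a sketch: the bridge is set up the first time the network touches a host, and tunnels are created toward hosts already participating. This is hypothetical Python, not CloudStack code; only the command names echo the real Xapi plugin calls.

```python
# Sketch of VmCheckAndCreateTunnel(): when a VM lands on a host that
# the network has not seen yet, set up the bridge there, then create
# the GRE tunnels still missing toward every host already in the
# network. Not CloudStack code; command names echo the Xapi calls.
def check_and_create_tunnels(network, host, existing_tunnels):
    commands = []
    if host not in network["hosts"]:
        # first VM of this network on this host: the bridge is missing
        commands.append(("OvsSetupBridgeCommand", host))
        for peer in network["hosts"]:
            if (host, peer) not in existing_tunnels:
                # tunnels are per host pair; record both directions
                commands.append(("OvsCreateTunnelCommand", host, peer))
                existing_tunnels.add((host, peer))
                existing_tunnels.add((peer, host))
        network["hosts"].append(host)
    return commands

net = {"hosts": []}
tunnels = set()
first = check_and_create_tunnels(net, "xen-host-1", tunnels)
second = check_and_create_tunnels(net, "xen-host-2", tunnels)
print(first)   # bridge setup only
print(second)  # bridge setup plus one tunnel toward xen-host-1
```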
In the bridge setup phase, the manager defines the network name as "OvsTunnel" + key, and sets the otherConfig parameter "ovs-host-setup".
Another note here: there is a trick to creating a network in XenServer. If you create a network and then create the bridge yourself with brctl or Open vSwitch, you will get a "REQUIRED_NETWORK" exception when you start a VM on this network. The solution is to create a vif of dom0 and plug it into the network; XenServer will then create the bridge on your behalf.
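The order of operations behind that trick can be sketched as a sequence of xe commands. The exact xe flags are from memory and should be checked against your XenServer version; the command runner is injected so the sequence can be inspected without a real host.

```python
# Sketch of the workaround: let xapi create the bridge by plugging a
# dom0 VIF, instead of hand-making the bridge (which would trigger the
# REQUIRED_NETWORK error). run_cmd is injected for testability.
def create_network_with_bridge(run_cmd, name, dom0_uuid):
    net_uuid = run_cmd(["xe", "network-create", "name-label=%s" % name])
    vif_uuid = run_cmd(["xe", "vif-create",
                        "network-uuid=%s" % net_uuid,
                        "vm-uuid=%s" % dom0_uuid, "device=0"])
    # plugging the dom0 VIF makes xapi create the bridge itself
    run_cmd(["xe", "vif-plug", "uuid=%s" % vif_uuid])
    return net_uuid

issued = []
def fake_run(cmd):
    issued.append(cmd[1])          # record the xe sub-command
    return "fake-uuid-%d" % len(issued)

create_network_with_bridge(fake_run, "OvsTunnel170", "dom0-uuid")
print(issued)  # ['network-create', 'vif-create', 'vif-plug']
```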
After creating the switch, the manager configures it for use as an L2-in-L3 tunneled network via the configureTunnelNetwork() method. This method calls the Xapi plug-in ovstunnel, method setup_ovs_bridge(). The bridge is created with:
```python
# create a bridge with the same name as the xapi network
# also associate gre key in other config attribute
res = lib.do_cmd([lib.VSCTL_PATH, "--", "--may-exist", "add-br", bridge,
                  "--", "set", "bridge", bridge,
                  "other_config:gre_key=%s" % key])
logging.debug("Bridge has been manually created:%s" % res)
# TODO: Make sure xs-network-uuid is set into external_ids
lib.do_cmd([lib.VSCTL_PATH, "set", "Bridge", bridge,
            "external_ids:xs-network-uuid=%s" % xs_nw_uuid])
# Non empty result means something went wrong
if res:
    result = "FAILURE:%s" % res
else:
    # Verify the bridge actually exists, with the gre_key properly set
    res = lib.do_cmd([lib.VSCTL_PATH, "get", "bridge", bridge,
                      "other_config:gre_key"])
    if key in res:
        result = "SUCCESS:%s" % bridge
    else:
        result = "FAILURE:%s" % res
    # Finally note in the xenapi network object that the network has
    # been configured
    xs_nw_uuid = lib.do_cmd([lib.XE_PATH, "network-list",
                             "bridge=%s" % bridge, "--minimal"])
    lib.do_cmd([lib.XE_PATH, "network-param-set", "uuid=%s" % xs_nw_uuid,
                "other-config:is-ovs-tun-network=True"])
    conf_hosts = lib.do_cmd([lib.XE_PATH, "network-param-get",
                             "uuid=%s" % xs_nw_uuid,
                             "param-name=other-config",
                             "param-key=ovs-host-setup", "--minimal"])
    conf_hosts = cs_host_id + (conf_hosts and ',%s' % conf_hosts or '')
    lib.do_cmd([lib.XE_PATH, "network-param-set", "uuid=%s" % xs_nw_uuid,
                "other-config:ovs-host-setup=%s"
```
Prepare phase done. The VM is ready to use.
NiciraNVP is a more complete solution than Ovs. It has a Nicira NVP Controller to create and manage logical switches, although at this time NiciraNVP does not support many networking services either. Like other networking solutions, it has three main components: Network Guru, Network Element and Network Manager.
NiciraNVP Guest Network also uses Open vSwitch to manage L2 networking services.
I will describe the flow sequence; each step has a corresponding design.
To use a Nicira network, the user has to create a physical network with the STT isolation method, then create a new network offering with services. The user also has to declare the NVP controller. When the guest network is created, a design call is sent to the guru. First, the Nicira guru checks for the NVP controller; if it doesn't exist, an exception is thrown and the design process is terminated. Nicira NVP has to make sure there is at least one active controller. After this check, the guru creates a Network data object with some default initial params.
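That guard can be sketched as follows; hypothetical Python, not the NiciraNvpGuestNetworkGuru source, with the error message invented for illustration.

```python
# Sketch of the Nicira design() guard: no NVP controller registered for
# the physical network means the design is aborted with an exception.
def nicira_design(controllers, physical_network_id):
    active = [c for c in controllers
              if c["physical_network"] == physical_network_id]
    if not active:
        raise RuntimeError("no NiciraNvp controller on physical "
                           "network %s" % physical_network_id)
    # default initial params for the Network data object
    return {"state": "Allocated",
            "broadcast_domain_type": "Lswitch"}

net = nicira_design([{"physical_network": 200}], 200)
print(net["broadcast_domain_type"])  # Lswitch

try:
    nicira_design([], 200)
except RuntimeError as e:
    print("design refused:", e)
```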
One thing to note at this step is that the guru overrides the broadcast domain type to Lswitch:
```java
// Override the broadcast domain type
networkObject.setBroadcastDomainType(BroadcastDomainType.Lswitch);
```
At this step, the guru creates a logical switch inside the NVP Controller. Nicira tags each logical switch with the name of the account that executed the implement:
```java
// Tags set to scope cs_account and account name
List<NiciraNvpTag> tags = new ArrayList<NiciraNvpTag>();
tags.add(new NiciraNvpTag("cs_account", cmd.getOwnerName()));
logicalSwitch.setTags(tags);
```
The create-logical-switch command is executed via an HTTP POST to the URI "/ws.v1/lswitch".
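What goes over the wire can be sketched like this. The URI matches the code; the JSON body fields are my guess at the NVP schema and should be checked against the NVP API documentation.

```python
import json

# Sketch of the logical-switch create request. The URI is the one the
# CloudStack code uses; the body fields (display_name, tags) are
# assumed, not verified against the NVP API spec.
def build_lswitch_request(display_name, account):
    body = json.dumps({
        "display_name": display_name,
        "tags": [{"scope": "cs_account", "tag": account}],
    })
    return ("POST", "/ws.v1/lswitch", body)

method, uri, body = build_lswitch_request("guest-net-1", "alice")
print(method, uri)
print(body)
```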
public LogicalSwitch createLogicalSwitch(LogicalSwitch logicalSwitch) throws NiciraNvpApiException { String uri = "/ws.v1/lswitch"; LogicalSwitch createdLogicalSwitch = executeCreateObject(logicalSwitch, new TypeToken<LogicalSwitch>(){}.getType(), uri, Collections.<String,String>emptyMap()); return createdLogicalSwitch; }
This phase also implements the network services and applies the rules. NiciraNVP creates a logical router at this time and tags it with the name of the account. It is also an HTTP POST request, with URI "/ws.v1/lrouter". There are some parts of this method that I haven't read carefully yet; they implement routing services completely with their own router in com.cloud.network.nicira. I'll read it soon and ask my mentor about some specific details. (incomplete)
When the VM is started in the network, all elements are asked to prepare the NIC. The NVP element receives a prepare() call with the NIC details. The create-logical-switch-port command is executed via an HTTP POST request to the URI "/ws.v1/lswitch/" + logicalSwitchUuid + "/lport". The port also gets a tag correlated with the account name.
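The port URI is just the switch URI with the logical switch UUID embedded, which a one-liner makes obvious:

```python
# Sketch of the prepare() step's URI construction: the port create call
# is another POST, addressed under the parent logical switch.
def build_lport_uri(logical_switch_uuid):
    return "/ws.v1/lswitch/" + logical_switch_uuid + "/lport"

uri = build_lport_uri("a1b2c3")
print(uri)  # /ws.v1/lswitch/a1b2c3/lport
```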
Prepare phase done. The VM is ready to use.