Bug Reference

CLOUDSTACK-241

Branch

regions

https://git-wip-us.apache.org/repos/asf?p=incubator-cloudstack.git;a=shortlog;h=refs/heads/regions

Introduction

Purpose

The objective of this feature is to add an implementation of Regions, similar to AWS EC2 Regions, to CloudStack. Regions are dispersed and located in separate geographic areas. Availability Zones (or Zones in CloudStack) are distinct locations within a Region that are engineered to be isolated from failures in other Zones and provide inexpensive, low-latency network connectivity to other Zones in the same Region.

Regions would provide the following benefits:

  • Higher availability of the services: users can deploy services across AZs, and even if one AZ goes down the services are still available to the end user through VMs deployed in other zones.
  • Higher availability of the MS: since each MS Cluster manages only a single Region, if that MS Cluster goes down, only that particular Region is impacted. Admins should still be able to access all the other Regions.
  • Scalability: the scalability of CloudStack improves, because the scalability limit of an MS Cluster applies only to a single Region rather than to the whole deployment.
  • Object Store: with the Regions construct, CloudStack would also allow users to define an Object Store (Secondary Storage) across AZs. This helps users easily deploy VMs in different AZs using the same templates and offerings.
  • Geographical Grouping: Regions allow admins to group AZs (that have low latency and are geographically located nearby) into a broader Region construct.

References

Requirements Doc: Regions requirements

Feature Specifications

Introduce the notion of Regions, each of which will be managed independently by separate management servers. With Regions, infrastructure will be organized as follows:

Region -> Zone -> Pod -> Cluster

Accounts

User Accounts will be available across Regions. A user should be able to use the same account in all Regions. Switching between Regions should not require the user to sign in again.

Templates

CS Templates are currently zone specific. Templates should be available to all zones within a Region. Users should be able to migrate templates from one Region to another. This requires an S3-like object store.

Snapshots

Snapshots should also be available to all zones within a Region. CS currently stores snapshots in secondary storage. Snapshots should instead be stored in the object store.

S3 like Object-store/Integration

Current secondary storage is at the zone level. An S3-like object store, or integration with other object stores (like Hadoop), is required to move data across Regions. Each Region should have an object store, and the data within it should be available to all zones within the Region. A utility to move data across Regions should be supported. Templates and Snapshots will be stored in this object store. Requirements for S3-based secondary storage are tracked separately in ticket CLOUDSTACK-714.

ELB

ELB will be elevated to Region level. This is tracked in JIRA ticket CLOUDSTACK-653.

EIP

EIP enhancements are tracked in JIRA ticket CLOUDSTACK-652.

Authentication Service

An authentication service will be available for components like the object store to authenticate and get account information using the getUser API.

Assumptions

Object Store: This feature assumes that an S3-like object store is available, either implemented in CloudStack or through integration.

Templates, Snapshots, EIP and ELB work is not part of this spec. These are being tracked separately as per the JIRA tickets mentioned above.

Account Provisioning: Typically account provisioning is external to CloudStack. Adding accounts in all regions and keeping the data in sync is not handled by CloudStack. Only events are generated when changes are made.

Usage

  1. Usage records will be generated by each Region separately. Portal layer above CS should combine usage across Regions into a consolidated invoice.
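
The consolidation step described above could look roughly like the following. This is only a sketch of what a portal layer might do: the record shape (pairs of account UUID and usage amount) and the function name consolidate_usage are assumptions made for illustration, not part of CloudStack or of this spec. It relies only on the fact that the account UUID is the same in every Region.

```python
from collections import defaultdict


def consolidate_usage(per_region_records):
    """Sum usage per account UUID across all Regions.

    per_region_records maps a Region name to a list of
    (account_uuid, usage_amount) pairs. This shape is hypothetical.
    """
    totals = defaultdict(float)
    for records in per_region_records.values():
        for account_uuid, amount in records:
            # Account UUIDs are identical across Regions, so they can be
            # used as the key for the consolidated invoice.
            totals[account_uuid] += amount
    return dict(totals)


# Hypothetical usage records pulled from two Regions.
usage = {
    "region1": [("acct-a", 1.0), ("acct-b", 2.0)],
    "region2": [("acct-a", 3.0)],
}
print(consolidate_usage(usage))
```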

Use cases

(Using external provisioning system)

  • Create Account
    1. Create Account in the external provisioning system with unique Id
    2. Using the CS API, create the account in all regions using the unique Id created in step 1 as the UUID. The createAccount API takes the UUID as an optional parameter. When a UUID is provided, CloudStack does not generate a UUID and instead uses the external Id.
    3. The external Id (UUID) for the Account should be the same in all Regions. Account DB Ids could be different for each Region.
  • Update Account (update/disable/enable Account)
    1. Update account in external provisioning system
    2. Update account in all regions via API.
  • Delete Account
    1. Delete account in external provisioning system
    2. Delete account in all regions via API

Similarly for users and domains.
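
The Create Account use case above can be sketched as follows. This is a minimal illustration, assuming a hypothetical external provisioning script: the helper name build_create_account_requests and the Region end points are made up, and only the createAccount command name and the accountid parameter come from this spec. The other parameters that createAccount needs, and the authentication parameters, are omitted. The code only builds the request parameters and does not call any API.

```python
import uuid


def build_create_account_requests(region_endpoints, account_uuid):
    """Return one createAccount parameter set per Region, all sharing one UUID."""
    requests = []
    for endpoint in region_endpoints:
        params = {
            "command": "createAccount",
            # Same external Id in every Region; CloudStack then does not
            # generate its own UUID for the account.
            "accountid": account_uuid,
        }
        requests.append((endpoint, params))
    return requests


# Step 1: the external provisioning system generates the unique Id.
external_id = str(uuid.uuid4())

# Hypothetical Region end points, for illustration only.
regions = [
    "http://region1.example.com:8080/client",
    "http://region2.example.com:8080/client",
]

# Step 2: one createAccount request per Region, reusing the same UUID.
for endpoint, params in build_create_account_requests(regions, external_id):
    print(endpoint, params)
```

Update and delete would follow the same pattern: one request per Region, keyed by the shared UUID.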

Architecture and Design description

Database

Each Region will have a separate database. The following data will be common to all databases across Regions:

  1. Account
  2. User
  3. Domain

The remaining data is expected to be per-Region (including projects, global config, resource limits) and is not shared with databases in other Regions.

New Tables

Table: region

Columns:

  • id: Integer. Unique Id of the Region. The number of Regions is expected to be small, hence an integer is used instead of a long.
  • name: Name of the Region. Should be unique.
  • end_point: Region end point, e.g. http://10.147.30.11:8080/client

Event framework

Events are published using the event framework.

Authentication

Once logged in, a User should be able to switch from one Region to another without providing credentials again.

Web services APIs

New APIs

addRegion

Registers a region in another region. The call is made against the region that should learn about the region being added.

Request Parameters:

  1. id: (int) Region Id 
  2. name: (String) Name of the Region
  3. endpoint: (String) Region end point. e.g. http://10.147.30.11:8080/client 

Response Parameters:

  1. id: Region Id 
  2. name: Name of the Region
  3. endpoint: Region end point
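
As an illustration of the request parameters above, the following sketch builds the query string for an addRegion call. It makes several assumptions: that the API is reachable at "/api" under the Region end point, that 10.147.30.12 is the (hypothetical) end point of the region being added, and that the helper name add_region_url is free to choose. Authentication parameters (API key and signature) are deliberately left out, so the URL is not a complete, callable request.

```python
from urllib.parse import urlencode


def add_region_url(api_base, region_id, name, endpoint):
    """Build an unsigned addRegion request URL from the documented parameters."""
    params = {
        "command": "addRegion",
        "id": region_id,       # (int) Region Id
        "name": name,          # (String) Name of the Region
        "endpoint": endpoint,  # (String) Region end point
    }
    # urlencode percent-encodes the end point so it can be passed as a value.
    return api_base + "?" + urlencode(params)


# Register a hypothetical region 2 in region 1.
url = add_region_url(
    "http://10.147.30.11:8080/client/api",  # assumed API path of region 1
    2,
    "Region2",
    "http://10.147.30.12:8080/client",      # hypothetical end point of region 2
)
print(url)
```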

updateRegion

Updates region details

Request Parameters:

  1. id: (int) Region Id
  2. name: (String) Name of the Region
  3. endpoint: (String) Region end point. e.g. https://10.147.30.11:8080/client

Response Parameters:

  1. id: Region Id 
  2. name: Name of the Region
  3. endpoint: Region end point

removeRegion

Removes region from current region.

Request Parameters:

  1. id: (int) Id of the Region to be removed

Response Parameters:

  1. Boolean success

listRegions

Lists all Regions. The list can be filtered by id or name.

Request Parameters:

  1. id: list by Region Id
  2. name: (String)list by Region Name

Response Parameters (List):

  1. id: Region Id 
  2. name: Name of the Region
  3. endpoint: Region end point

getUser

Admin only API. Get user details by api_key. 

Request Parameters:

  1. userapikey: (String)Get user details by API key 

Changes to existing APIs

  • createAccount
    • New parameters
      • accountid: UUID of the account
      • userid: UUID for the User
  • createUser
    • New parameter
      • UUID for the User. 
  • createDomain
    • New parameter
      • domainid: UUID for the Domain

Limitations

1. Account/User/Domain data propagation/sync has to be handled outside CloudStack.

2. Only events will be generated by CloudStack.

Upgrade

During upgrade, flexibility will be provided to move existing zones into any Region.

Upgrade scenarios:

  • Each zone becomes part of a separate Region: Copy existing DB to all Regions and disable all zones except one in each Region.
    • Upgrade existing MS to 4.1. This MS will become region 1
    • Disable all the zones in region 1
    • Dump region 1 DB
      • mysqldump -u cloud -p -h <region1_db_host> cloud  > region1.sql
    • Install 4.1 MS in the remaining Regions
      • Set the region_id while installing the DB
      • cloud-setup-databases cloud:<dbpassword>@localhost --deploy-as=root:<password> -e <encryption_type> -m <management_server_key> -k <database_key> -r <region_id>
    • Copy region 1 DB to other regions
      • mysql -u cloud -p -h <region2_db_host> cloud < region1.sql
    • Start all management servers. At this point all the zones are disabled
    • Selectively enable zones in the required regions
  • All zones remain in the same Region: Nothing to do here. 
  • Zones are divided among Regions: Copy the existing DB to all Regions and, in each Region, disable the zones that do not belong to that Region.

Scripts

Offline scripts will be provided to move data across regions to ease

  • upgrade 
  • add new region
  • remove region

UI flow

  • User/Admin should be able to view all Regions by logging into a MS of any of the Regions. User then should be able to select a specific Region to view details of that Region.
  • Users should be able to switch between various regions for UI using Single Sign-On.

Sample Workflow

Single Region

If an environment has only 1 region, functionality will be the same as the current CS installation. The Id of this region will be 1. All accounts/users/domains will have region_id as 1.

A default local region will be added with name "Local" and end_point "http://localhost:8080/client". Use the updateRegion API to set a different name or end_point for this region.

Adding 2nd Region

1. Install a 2nd CS instance.

2. While installing database set region_id using -r option in cloud-setup-databases script (Make sure database_key is same across all regions).

cloud-setup-databases cloud:<dbpassword>@localhost --deploy-as=root:<password> -e <encryption_type> -m <management_server_key> -k <database_key> -r <region_id>

3. Start mgmt server

4. Using addRegion API, add region 1 to region 2 and also region 2 to region 1.

5. Copy account/user/domain tables from Region1 DB to Region2 DB:

  • mysqldump -u cloud -p -h <region1_db_host> cloud account user domain > region1.sql
  • mysql -u cloud -p -h <region2_db_host> cloud < region1.sql      

6. Remove project accounts after copying: 

  • mysql> delete from account where type = 5; 

7. Set default zone as null 

  • mysql> update account set default_zone_id = null; 

8. Restart mgmt servers in region 2

Adding 3rd and subsequent Regions

1. Install CS in all new regions

2. While installing database set region_id using -r option in cloud-setup-databases script (Make sure database_key is same across all regions).

cloud-setup-databases cloud:<dbpassword>@localhost --deploy-as=root:<password> -e <encryption_type> -m <management_server_key> -k <database_key> -r <region_id>

3. Start mgmt server

4. Using addRegion API, add existing regions to region n and also region n to all existing regions

5. Copy account/user/domain tables from any existing Region DB to RegionN DB:

  • mysqldump -u cloud -p -h <region1_db_host> cloud account user domain > region1.sql
  • mysql -u cloud -p -h <regionN_db_host> cloud < region1.sql  

6. Remove project accounts after copying:

  • mysql> delete from account where type = 5;

7. Set default zone as null

  • mysql> update account set default_zone_id = null;

8. Restart mgmt servers in region N
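
Step 4 above implies two addRegion calls per existing region: one to make region N aware of the existing region, and one to make the existing region aware of region N. The sketch below enumerates those calls for a given set of region Ids. The function name add_region_calls and the (target, region to register) tuple shape are illustrative assumptions; the code only lists the calls and does not perform them.

```python
def add_region_calls(existing_region_ids, new_region_id):
    """List the addRegion calls needed when a new region joins.

    Each entry is (target_region_id, region_id_to_register): the addRegion
    API is called on the target region to register the other region in it.
    """
    calls = []
    for region_id in existing_region_ids:
        # Register the existing region in the new region ...
        calls.append((new_region_id, region_id))
        # ... and the new region in the existing region.
        calls.append((region_id, new_region_id))
    return calls


# Adding region 3 to a setup that already has regions 1 and 2.
for target, registered in add_region_calls([1, 2], 3):
    print(f"on region {target}: addRegion id={registered}")
```

With k existing regions this yields 2k calls, which is why region addition was described in the discussion below as a one-time manual operation.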

Remove Region

1. Remove region from all other regions using removeRegion API

Appendix

Appendix A:

Appendix B:


6 Comments

  1. Also please update the limitations section.

  2. Questions related to Regions Feature are posted in the comment below. Please notice the open questions present in the comment and kindly update the FS with the Information asked.

    Kishan,

    Wanted  to confirm the following again since FS does not state these assumptions explicitly but the Requirements docs mentions about them:

    1. There is going to be no sync for Global configurations across regions since all global configurations are region specific and applied to only that region.

    2. Projects will not be synced across region.

    3. NFS as secondary storage will continue to be supported at Zone level and no support for NFS at Region level.

    Could you please state them explicitly in the FS ?

    Also can you please update FS with the exact work flow involved when adding a region and deleting a region?

    Since we will continue to extend support for EC2 soap calls , could you please include details regarding soap call authentication changes to support regions?

    -Thanks

    Sangeetha

    ----Original Message----

    From: Manan Shah [mailto:manan.shah@citrix.com]

    Sent: Friday, January 18, 2013 10:27 AM

    To: cloudstack-dev@incubator.apache.org

    Subject: Re: Questions related to Regions Feature

    Hi Kishan,

    First of all thanks for responding to my questions. I have some additional questions / comments below:

    Comments:

    1. I am assuming you will update FS based on all these answers

    2. I am assuming that you will document all the workflows, manual processes and other items you mentioned below in FS so that the tech pubs person can update the manuals

    Questions:

    1. If on a region re-start, CCP can go to other regions and grab the latest information, why can't we do the same thing on region creation? In fact, it might be easier as the new region would just have to go to one of the existing regions and do a bulk copy. Am I missing something here?

    Regards,

    Manan Shah



    On 1/17/13 4:58 AM, "Kishan Kavala" <Kishan.Kavala@citrix.com> wrote:

    >Manan,

    > Please find my answers inline.

    >> ----Original Message----

    >> From: Manan Shah [mailto:manan.shah@citrix.com]

    >> Sent: Wednesday, 16 January 2013 1:57 AM

    >> To: cloudstack-dev@incubator.apache.org

    >> Subject: FW: Questions related to Regions Feature

    >>

    >> Kishan,

    >>

    >> I reviewed the FS and I have quite a few questions. I have also

    >>reviewed  questions posted by Sangeetha and tried to cover all of her

    >>questions as well.

    >> Please see the questions below and let us know your thoughts.

    >>

    >> We should try and capture all of these items in the Regions FS /

    >>Design spec if

    >> possible:

    >>

    >>

    >> 1. Assumption is that we will support both NFS as well as ObjectStore

    >>as a  secondary storage. This also means that all templates stored in

    >>NFS storage

    >> (Region-wide) should be available for all zones within a region.

    >[KK] Object store is Region-wide. Secondary storage will remain at the

    >zone level the way it is now. Migration will be required only when

    >someone using NFS secondary storage in 4.0 moves to object store in 4.1.

    >This migration will be manual process which has to be documented along

    >with some scripts to migrate.

    >>

    >> 2. Assumption is that we will continue to support NFS as a secondary

    >>storage  at the zone level as well as add support for NFS as secondary

    >>storage at the  region level

    >[KK] NFS secondary storage will be supported at the zone level only.

    >There will be no NFS secondary storage at Region level. Support for

    >object store at region level will be added. Using object store is

    >optional. During upgrade if someone wants to use object store, data in

    >NFS secondary storage has to be migrated to object store as mentioned

    >above.

    >>

    >> 3. Addition of a new Region to a existing Cloud:

    >> A. New Region Addition:

    >>           * Current functionality is to add a new Region to every existing

    >>region.

    >> This is undesirable. We should replicate the regions DB table just

    >>like  Domain/Accounts, etc so that end users have to add it only in 1

    >>place

    >[KK] It is a good to have functionality. Add Region is a one-time

    >operation and we can live with this limitation for 4.1 release.

    >>           * Please update the FS with the expected admin workflow B. Sync of

    >> Domain / Account / etc:

    >[KK]  I'll add these to FS

    >>           * You had mentioned that this would be done only on a as-needed

    >>basis.

    >> This seems to be confusing. We need to clearly indicate when would

    >>the DB  tables be synced. Our expectation was that when a new Region

    >>is added, all  necessary DB tables will get populated  from sync'd DB

    >>Table list C.

    >>Sync of

    >[KK] When a new Region is added, existing Account/User/Domain details

    >have to be copied to new Region manually. This will be documented in FS

    >with steps to copy the data. Any changes after adding Region will be

    >propagated immediately.

    >> Projects:

    >>           * This is in requirements but seems to be missing in FS

    >>

    >[KK]  Projects won't be available across regions.

    >> 4. Sync of Domain / Account when a Region goes down and comes back up:

    >> * You seem to indicate that this would be done on a on-demand basis.

    >> Not clear of the use cases. FS needs to document the details.

    >[KK] It is the responsibility of the source region to ensure that

    >changes are propagated to all regions. I'm still exploring on how to ensure this.

    >>

    >> 5. Removal of Region:

    >> * On Region deletion, what happens to all of the objects that are

    >>owned by  that Region (Domains/Accounts/Projects)

    >

    >[KK] Ownership of the deleted Region objects has to be manually changed

    >to another Region. This again will be documented along with scripts to

    >make this change.

    >> 6. Steps to add / remove Regions:

    >> * Please document the procedure to add/remove regions.

    >[KK] Add/Remove will be through addRegion and removeRegion APIs. I'll

    >add workflows to FS explaining the same.

    >

    >> 7. Sync of Global Params:

    >> * Assuming that account/domain/etc related global configs will be

    >>propagated. Please list all of the global params that will be

    >>propagated.

    >> Global Param changes require a re-start of Mgmt servers. So, if a

    >>domain  related global config is changed, would we  display a message

    >>for all regions  to re-start mgmt servers?

    >>

    >[KK] Global configs will be per Region. Configs need not be synced

    >across regions.

    >>

    >> 8. Resource Limits at the Global level: For example, if a user is

    >>authorized to  spin 5 VMs,  that should be 5 VMs for the entire cloud

    >>and not 5 VMs for a  Region

    >[KK]  Limits again will be per Region.

    >>

    >> 9. API Related changes:

    >> * Please indicate in FS all API changes (new APIs as well as changes

    >>made to  existing APIs)

    >> * What about createTemplate(), registerTemplate(),extractTemplate()

    >>APIs?

    >> How will the copyTemplate() API change?

    >>

    >[KK]  Region API changes are already added to FS. Template related API

    >changes will be part of object store work and should probably be

    >discussed in that spec.

    >>

    >> 10. DB Changes:

    >> * Can you please document all DB related changes? New tables and

    >>existing  table changes?

    >>

    >[KK] Sure, I'll add DB changes to spec.

    >> 11. SSVM behavior changes:

    >> * Are there any SSVM behavior changes?

    >> * If a VM is being launched in Zone 1 whose template is in secondary

    >>storage  accessible to zone 1 but physically located in zone 2, would

    >>the SSVM from  zone 1 be able to fetch template from secondary storage

    >>in zone 2?

    >>

    >[KK]  This again should be part of object store related work and

    >discussed separately.

    >>

    >> 12. I understand that the EC2 SOAP support requires another

    >>authentication  mechanism. Assuming we will support this as well.

    >> 

    >[KK] I'm not sure about this requirement

    >

    >> 13. Upgrade Support:

    >> * Assuming we will support all current zones to be in 1 region (with

    >>zone-

    >> wide secondary storage)

    >> * Assuming we will support mix-and-match use case where users can

    >>pick  which zones belong to which regions?

    >> * How will the DB be replicated and split apart?

    >> * Assuming we support mix-and-match, please document the steps that

    >>the  admins would have to go through

    >>

    >[KK] DB replication and disabling certain zones require manual steps.

    >This will be documented in detail as part of upgrade procedure.

    >>

    >> 14. You have mentioned some details in FS related to authentication.

    >>Can you

    >> elaborate this or remove it?

    >[KK] UI should display drop-down list of Regions and when a new Region

    >is selected, end_point should change to the selected Region without

    >requiring authentication.

    >>

    >> Regards,

    >> Manan Shah


  3. Hi Kishan,

    Wanted to confirm that there is going to be no change in the overall functionality even after starting to use “events framework” for syncing data between regions ?

    Will all the meta data relating to regions still be needed ? Are all of these parameters still needed when using “events framework” mechanism for syncing ? Could you explain at high level how different is this approach compared to how it was implemented initially thru API calls ?

    Also , I have the initial set of test cases posted at - https://cwiki.apache.org/confluence/display/CLOUDSTACK/AWS+Regions+Test+Plan . Test plan is still in progress.
    Thanks
    Sangeetha

  4. Kishan,

    Had the following questions:

    1. Why is there a need for getUser API ? When will this get used ?

    2. Can you include the response parameters for all the new API calls ?

    3. In case of update/delete action from a region that is not the owner , API call will succeed as long as it has been able to forward the request to owner region and the owner region actually executed the command . Even after receiving a success from the API call, the changes will not be committed in this region. This seems like a confusing experience . Can we expect any message indicating what has happened to the user?

    -Thanks
    Sangeetha

  5. Since CloudStack is NOT responsible for propagation of domain/account/user data between regions , Why should we still maintain the concept of being able to perform delete/edit operations only by owners ? Is there a need to have a concept of a region being the owner for domain/account/user ?

  6. In Sample Workflow - Single Region Section ,
    The following mention of region_id needs to be removed - "All accounts/users/domains will have region_id as 1".

    Wanted to confirm that we should have the encryption_type, management_server_key and database_key be the same across different regions?

    cloud-setup-databases cloud:<dbpassword>@localhost --deploy-as=root:<password> -e <encryption_type> -m <management_server_key> -k <database_key>

    If this is the case , Can you please update the FS with this restriction ?