Introduction

Many global parameters set values, limits, or boolean flags for various operations, but each of these parameters affects all zones, clusters, pools, accounts, or storage at once.

Admins need some of these parameters at a finer granularity, so that values can be customized at each level.

Zone level parameters

The following 8 parameters need to be changed to the 'per zone' level.

1) pool.storage.allocated.capacity.disablethreshold: Percentage (as a value between 0 and 1) of allocated storage utilization above which allocators will disable using the pool for low allocated storage available

2) pool.storage.capacity.disablethreshold: Percentage (as a value between 0 and 1) of storage utilization above which allocators will disable using the pool for low storage available

3) vm.allocation.algorithm: 'random', 'firstfit', 'userdispersing', 'userconcentratedpod_random', 'userconcentratedpod_firstfit' : Order in which hosts within a cluster will be considered for VM/volume allocation

4) network.throttling.rate: Default data transfer rate in megabits per second allowed in network.

5) router.template.id:

           This parameter was not in use, so it is removed and replaced by 5 parameters, one per hypervisor.

            router.template.xen: Name of the default router template on XenServer

            router.template.kvm: Name of the default router template on KVM

            router.template.vmware: Name of the default router template on VMware

            router.template.hyperv: Name of the default router template on Hyper-V

            router.template.lxc: Name of the default router template on LXC

These initially default to the SYSTEM template.

Admins can register new routing templates, which can then be used to deploy routers. These templates are marked as ROUTING type in the DB. The register/update template APIs are changed so that an admin can register ROUTING templates.

A registered routing template is used only for deploying routers (not SSVM or CPVM). SSVM and CPVM use the default system template.
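As a rough illustration of how a router template could be chosen per hypervisor, the sketch below looks up the zone-level value first and falls back to the global one. The dictionaries, the function name `router_template_name`, and the template names are hypothetical stand-ins for the data_center_details table and the global configuration, not the actual implementation.

```python
from typing import Dict

# Hypervisor suffixes taken from the router.template.* parameter names above.
HYPERVISORS = ("xen", "kvm", "vmware", "hyperv", "lxc")


def router_template_name(hypervisor: str,
                         zone_details: Dict[str, str],
                         global_config: Dict[str, str]) -> str:
    """Return the router template name for a hypervisor in one zone.

    zone_details stands in for the zone's rows in data_center_details;
    global_config stands in for the global configuration (both hypothetical).
    """
    if hypervisor not in HYPERVISORS:
        raise ValueError(f"unsupported hypervisor: {hypervisor}")
    key = f"router.template.{hypervisor}"
    zone_value = zone_details.get(key)
    # A missing zone-level entry means the global default applies.
    return zone_value if zone_value is not None else global_config[key]


# Illustrative data only: these template names are made up.
global_config = {f"router.template.{h}": f"system-{h}" for h in HYPERVISORS}
zone_details = {"router.template.kvm": "custom-kvm-router"}
```

With this data, a KVM router in the zone would use the registered `custom-kvm-router` template, while every other hypervisor would keep its default.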

6) Guest Domain Suffix (guest.domain.suffix): Default domain name for VMs inside virtualized networks fronted by the router

7) External DNS Usage (use.external.dns): Bypass internal DNS, use external dns1 and dns2

8) storage.cleanup.interval: The interval (in seconds) to wait before running the storage cleanup thread

Schema:

We will add these zone-level parameters to the data_center_details table as name-value pairs.

Cluster level parameters

CPU/RAM notifications and capacity thresholds need to be changed to cluster level, so that these values can be defined or modified for each cluster individually.

The parameters include:

1) cluster.cpu.allocated.capacity.disablethreshold: Percentage (as a value between 0 and 1) of cpu utilization above which allocators will disable using the cluster for low cpu available

2) cluster.cpu.allocated.capacity.notificationthreshold: Percentage (as a value between 0 and 1) of cpu utilization above which alerts will be sent about low cpu available

3) cluster.memory.allocated.capacity.disablethreshold: Percentage (as a value between 0 and 1) of memory utilization above which allocators will disable using the cluster for low memory available

4) cluster.memory.allocated.capacity.notificationthreshold: Percentage (as a value between 0 and 1) of memory utilization above which alerts will be sent about low memory available

5) cluster.storage.allocated.capacity.notificationthreshold: Percentage (as a value between 0 and 1) of allocated storage utilization above which alerts will be sent about low storage available

6) cluster.storage.capacity.notificationthreshold: Percentage (as a value between 0 and 1) of storage utilization above which alerts will be sent about low storage available

Schema:

We will add these cluster-level parameters to the cluster_details table as key-value pairs.
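The threshold semantics described above can be sketched as a simple comparison of utilization against the two per-cluster values. This is only an illustration; the function name `capacity_flags` and its arguments are assumptions, and the real checks live in the capacity and alert code paths.

```python
from typing import Tuple


def capacity_flags(used: float, total: float,
                   notification_threshold: float,
                   disable_threshold: float) -> Tuple[bool, bool]:
    """Return (send_alert, disable_allocation) for one cluster resource.

    Both thresholds are fractions between 0 and 1, as in the
    cluster.*.notificationthreshold / cluster.*.disablethreshold parameters.
    """
    if total <= 0:
        raise ValueError("total capacity must be positive")
    utilization = used / total
    # Each flag is raised when utilization is above its own threshold.
    return utilization > notification_threshold, utilization > disable_threshold
```

For example, a cluster at 80% CPU allocation with a notification threshold of 0.75 and a disable threshold of 0.85 would raise an alert but would still accept new allocations.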

Account level parameters

The following parameters need to be made configurable per account:

1) allow.public.user.templates: If false, users will not be able to create public templates

2) remote.access.vpn.client.iprange: The range of ips to be allocated to remote access vpn clients. The first ip in the range is used by the VPN server

Schema:

We will add these account-level parameters to the account_details table as key-value pairs.

Storage level parameters

There is only one parameter that needs to change to storage level.

1) storage.overprovisioning.factor: Used for storage overprovisioning calculation; available storage will be (actualStorageSize * overprovisioningfactor)

Schema:

We will add this storage-level parameter to the storage_pool_details table as a key-value pair.
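A small worked example of the overprovisioning calculation, assuming the per-pool value (if any) overrides the global factor. The function name and the dictionary standing in for storage_pool_details are hypothetical.

```python
from typing import Dict


def available_storage_bytes(actual_size_bytes: int,
                            pool_details: Dict[str, str],
                            global_factor: float) -> int:
    """available storage = actualStorageSize * overprovisioning factor."""
    raw = pool_details.get("storage.overprovisioning.factor")
    # Fall back to the global factor when the pool has no override.
    factor = float(raw) if raw is not None else global_factor
    return int(actual_size_bytes * factor)
```

A 1000-byte pool with a pool-level factor of 2 would report 2000 bytes available, while a pool without an override and a global factor of 1.5 would report 1500.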

API Changes

We use the existing updateConfiguration and listConfigurations APIs to update/list the configuration parameters at any level (global/zone/cluster/pool/account).

We introduce 4 additional parameters corresponding to the 4 scopes (zone/cluster/storagepool/account).

 - If no id is given, the call updates/lists the global configuration.

 - If the id of a zone/cluster/pool/account is given, the call updates/lists the corresponding parameter at that level.

 - When updating a parameter with a given id, the scope of the parameter is first validated against the id provided.

 - The update then checks the resource details table and updates the entry there; if no entry is present, the value is fetched from the global configuration parameters and an entry is created in the details table.

 - If the value is set to null at a particular scope, the global value is used.
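The rules above can be sketched as a small in-memory model. This is a minimal illustration, not the actual implementation: the class `ScopedConfig`, the `PARAM_SCOPE` map, and the dictionaries standing in for the details tables are all assumptions.

```python
from typing import Dict, Optional, Tuple

# Assumed subset of parameters and the scope each may be overridden at.
PARAM_SCOPE = {
    "network.throttling.rate": "zone",
    "cluster.cpu.allocated.capacity.disablethreshold": "cluster",
    "storage.overprovisioning.factor": "storagepool",
    "allow.public.user.templates": "account",
}


class ScopedConfig:
    def __init__(self, global_values: Dict[str, str]) -> None:
        self.global_values = dict(global_values)
        # (scope, resource id) -> {name: value}; stands in for the *_details tables.
        self.details: Dict[Tuple[str, int], Dict[str, Optional[str]]] = {}

    def update(self, name: str, value: Optional[str],
               scope: Optional[str] = None,
               resource_id: Optional[int] = None) -> None:
        if scope is None:
            # No id given: update the global configuration.
            self.global_values[name] = value
            return
        # Validate that the parameter may be set at the requested scope.
        if PARAM_SCOPE.get(name) != scope:
            raise ValueError(f"{name} cannot be set at scope {scope}")
        # Create the details entry if it does not exist yet, then set it.
        self.details.setdefault((scope, resource_id), {})[name] = value

    def get(self, name: str, scope: Optional[str] = None,
            resource_id: Optional[int] = None) -> Optional[str]:
        if scope is not None:
            value = self.details.get((scope, resource_id), {}).get(name)
            if value is not None:
                return value
        # Missing or null at the given scope: use the global value.
        return self.global_values[name]
```

Storing `None` for a scoped value and resolving it at read time keeps the "null means global" rule in one place, so a later change to the global value is picked up automatically.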

API name: updateConfiguration
Existing API parameters:
    name: the name of the configuration
    value: the value of the configuration
New optional API parameters:
    zoneid: ID of the zone
    clusterid: ID of the cluster
    storageid: ID of the storage pool
    accountid: ID of the account
API response: updateconfigurationresponse with scope and updated value

API name: listConfigurations
Existing API parameters:
    category: lists configurations by category
    name: lists configuration by name
New optional API parameters:
    zoneid: ID of the zone
    clusterid: ID of the cluster
    storageid: ID of the storage pool
    accountid: ID of the account
API response: listconfigurationsresponse with scope and value at the corresponding scope of the parameters

API name: registerTemplate/updateTemplate
New optional API parameters:
    isrouting: true if the template type is routing, i.e. if the template is used to deploy a router

The API change in registerTemplate and updateTemplate supports the newly introduced zone/global level parameters router.template.xen/vmware/kvm/lxc/hyperv.

The optional parameter isrouting can be set only by root admin.
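To make the new optional parameters concrete, the sketch below builds the query string for a scoped configuration call. The helper `build_config_query` is hypothetical, the rule that at most one scope id may be passed is an assumption, and authentication/signing of the request is omitted.

```python
from typing import Optional
from urllib.parse import urlencode

# The four new optional scope parameters listed above.
SCOPE_PARAMS = {"zoneid", "clusterid", "storageid", "accountid"}


def build_config_query(command: str, name: Optional[str] = None,
                       value: Optional[str] = None, **scope_ids: str) -> str:
    """Build a query string for updateConfiguration / listConfigurations."""
    given = {k: v for k, v in scope_ids.items() if v is not None}
    unknown = set(given) - SCOPE_PARAMS
    if unknown:
        raise ValueError(f"unknown scope parameter(s): {sorted(unknown)}")
    if len(given) > 1:
        # Assumption: a call targets a single scope at a time.
        raise ValueError("pass at most one scope id")
    params = {"command": command}
    if name is not None:
        params["name"] = name
    if value is not None:
        params["value"] = value
    params.update(given)
    # Sort for a stable, readable parameter order.
    return urlencode(sorted(params.items()))
```

Omitting every scope id produces a plain global call, which matches the rule that a missing id means the global configuration.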

Here are the global config parameters and their proposed granularity.

Each entry below lists, in order: the global parameter, its description, the proposed change, development notes, and any comments or clarifications needed.

Global parameter: allow.public.user.templates
Description: If false, users will not be able to create public templates.
Proposed change: Per account, overrides the global value, e.g. for resellers, not end customers.
Development: The value is added to the account_details table. Whenever a public template is created, the account_details table must be checked before proceeding.

Global parameter: storage.overprovisioning.factor
Description: Used for storage overprovisioning calculation; available storage will be (actualStorageSize * storage.overprovisioning.factor).
Proposed change: Per storage (per sysvol/datavol usage), overrides the global value.
Development: The storage_pool_details table will hold the key-value pair for the storage overprovisioning factor. The per-storage overprovisioning factor must be considered when checking whether the storage pool has enough space for a volume, and when creating or updating the capacity entry for the storage pool.

Global parameter: remote.access.vpn.client.iprange
Description: The range of IPs to be allocated to remote access VPN clients. The first IP in the range is used by the VPN server.
Proposed change: Per account, overrides the global value.
Development: The account_details table will hold the key-value pair. RemoteAccessVPNManager needs to take the value from the account_details table, validate it, and then use it.

Global parameter: storage.cleanup.interval
Description: The interval (in seconds) to wait before running the storage cleanup thread.
Proposed change: Per AZ, e.g. for a private/reseller AZ to offer differing storage service.
Comments: Do we need to run different storage GC threads for different zones? Or should we start a single thread with the global value and, within it, check for each zone whether its interval has elapsed by comparing the stored last cleanup time of the zone with the current time? If it has elapsed, clean up the storage; otherwise skip the zone.

Global parameter: CPU/Memory/Storage Notification and Capacity Thresholds:
    1) cluster.cpu.allocated.capacity.disablethreshold
    2) cluster.cpu.allocated.capacity.notificationthreshold
    3) cluster.memory.allocated.capacity.disablethreshold
    4) cluster.memory.allocated.capacity.notificationthreshold
    5) pool.storage.allocated.capacity.disablethreshold
    6) pool.storage.capacity.disablethreshold
Proposed change: Per cluster
Development: CPU and memory notification and capacity threshold values per cluster can be stored in the cluster_details table, and these values can be used during capacity checking and when raising alerts. Since CPU and RAM overcommit were made cluster-level as part of the CPU and RAM overcommit feature, changing the notification and capacity thresholds to per cluster is consistent with that.


Global parameter: VM Allocation Algorithm: vm.allocation.algorithm
Description: If 'random', hosts within a pod will be randomly considered for VM/volume allocation. If 'firstfit', they will be considered on a first-fit basis.
Proposed change: Per zone
Development: The data_center_details table holds the name-value pairs of these parameters. During VM/volume allocation, the value corresponding to the zone is to be considered. The logic needs to be changed for volume allocation too, since volume allocation is also based on this parameter.

Global parameter: VR Network Throttling Rate: network.throttling.rate
Description: Default data transfer rate in megabits per second allowed in network.
Proposed change: Per zone
Development: For the domR's guest and public networks, the network throttling rate is taken from the corresponding guest virtual network offering. If the value is not specified in the network offering, the global parameter network.throttling.rate should be used. However, there is currently a bug: guest and public networks always take the network rate from the global parameter. Apart from this, the value needs to be stored per zone, and the router VM life cycle needs to consider the zone-level parameter. This value can be stored in the data_center_details table.
Comments: For user VMs on a non-default network, the throttling rate is taken from the network offering; if it is not specified there, the global value network.throttling.rate is used. For the default network, the value is taken from vm.network.throttling.rate. Since both parameters (network.throttling.rate, vm.network.throttling.rate) are considered for guest VM networks, is it fine to change only network.throttling.rate to per zone?

Global parameter: Router Template ID: router.template.id
Description: Default ID for template
Proposed change: Per zone
Development: In the current implementation, the details of the templates are stored in the vm_templates table. These include all system VM templates for all hypervisors. During router VM deployment, the list of templates is fetched based on the hypervisor type and the first template in the list is taken; this parameter is not otherwise used to select the template. So the task is to make this parameter work and provide the value per zone.
Design: In the data_center_details table we maintain the router.template.id for each hypervisor. During router VM deployment, the template id is taken from this table based on the hypervisor type.

Global parameter: Guest Domain Suffix: guest.domain.suffix
Description: Default domain name for VMs inside virtualized networks fronted by the router
Proposed change: Per zone
Development: The value can be stored in the data_center_details table, and the zone-level value from that table can be used during guest network and VPC creation.


Global parameter: External DNS Usage: use.external.dns
Description: Bypass internal DNS, use external dns1 and dns2
Proposed change: Per zone
Development: The value can be stored in the data_center_details table. When finalizing the virtual machine's profile, the value can be taken for the zone corresponding to the deploy destination of the VM.
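The single-thread option raised in the storage.cleanup.interval entry could look roughly like the following. This is only a sketch of that idea under stated assumptions: the function name, the use of plain dictionaries for per-zone intervals and last cleanup times, and timestamps in seconds are all hypothetical.

```python
from typing import Dict, List, Optional


def zones_due_for_cleanup(now: float,
                          zone_intervals: Dict[int, Optional[float]],
                          last_cleanup: Dict[int, float],
                          global_interval: float) -> List[int]:
    """Return the zones whose cleanup interval has elapsed at time `now`.

    zone_intervals maps zone id -> per-zone storage.cleanup.interval in
    seconds, or None when the zone uses the global value.
    """
    due = []
    for zone_id, interval in zone_intervals.items():
        effective = interval if interval is not None else global_interval
        last = last_cleanup.get(zone_id)
        # A zone that was never cleaned is always due.
        if last is None or now - last >= effective:
            due.append(zone_id)
    return due


def run_cleanup_tick(now: float,
                     zone_intervals: Dict[int, Optional[float]],
                     last_cleanup: Dict[int, float],
                     global_interval: float) -> List[int]:
    """One pass of the single GC thread: clean due zones, skip the rest."""
    due = zones_due_for_cleanup(now, zone_intervals, last_cleanup, global_interval)
    for zone_id in due:
        # The actual storage cleanup for the zone would run here.
        last_cleanup[zone_id] = now
    return due
```

In this design the thread's own wake-up period bounds how precisely each zone's interval is honored, which is one trade-off against running a separate thread per zone.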

Some other redundant parameters:

Each entry below lists, in order: the global parameter, its description, the proposed change, development notes, and any comments.

Global parameter: capacity.skipcounting.hours
Description: Time (in seconds) to wait before releasing a VM's CPU and memory when the VM is in stopped state
Proposed change: Per cluster, overrides the global value.
Development: The value needs to be maintained in the cluster_details table. While updating the capacity for a host, the value in cluster_details needs to be checked against the time since the VM was stopped.

Global parameter: cpu.capacity.threshold
Description: Percentage (as a value between 0 and 1) of CPU utilization above which alerts will be sent about low CPU available.
Proposed change: Per cluster, overrides the global value.
Comments: This is a bit ambiguous. If we change this to cluster level, then the CPUCapacityDisableThreshold value also needs to be changed to cluster level, as the two are related.

Global parameter: network.gc.interval
Description: Seconds to wait before checking for networks to shutdown
Proposed change: Per cluster, overrides the global value.
Comments: This doesn't make sense.

Global parameter: network.gc.wait
Description: Time (in seconds) to wait before shutting down a network that's not in use
Proposed change: Per cluster, overrides the global value.
Comments: This doesn't make sense.

Global parameter: network.redundantrouter
Proposed change: Per account (also per project?), overrides the global value or via network offering.
Comments: Redundant router has been deprecated.

Global parameter: vm.allocation.algorithm
Description: If 'random', hosts within a pod will be randomly considered for VM/volume allocation. If 'firstfit', they will be considered on a first-fit basis.
Proposed change: Per cluster and per account, as additional granularity to override the global value.
Development: The value will be stored in the cluster_details table and the account_details table. The APIs that need to be changed are addCluster, updateCluster, addAccount and updateAccount.
Comments: Which one is more granular, cluster or account? Which gets priority when both are set to different values? Why is this at cluster/account level and not zone level? What is the impact of changing this parameter?

Global parameter: remote.access.vpn.psk.length
Description: The length of the IPsec preshared key (minimum 8, maximum 256)
Proposed change: Per account, overrides the global value.
Development: The account_details table will hold the key-value pair. RemoteAccessVPNManagerImpl needs to take the value from the account_details table, validate it, and then use it.

Global parameter: remote.access.vpn.user.limit
Description: The maximum number of VPN users that can be created per account
Proposed change: Per account, overrides the global value.
Development: The account_details table will hold the key-value pair. RemoteAccessVPNManagerImpl needs to take the value from the account_details table, validate it, and then use it.

Global parameter: capacity.skipcounting.hours
Description: Time (in seconds) to wait before releasing a VM's CPU and memory when the VM is in stopped state
Proposed change: Per cluster, overrides the global value.
Development: The cluster_details table will hold the value. In CapacityManagerImpl, while updating a host's capacity, we can look up the cluster the host belongs to and get the capacity.skipcounting.hours value for that cluster.
