Configuring live migration - Live migration | Virtualization

Configuring live migration limits and timeouts
Configure live migration for heavy workloads
Additional resources
Live migration policies
- Creating a live migration policy by using the command line
Additional resources

You can configure live migration settings to ensure that the migration processes do not overwhelm the cluster.

You can configure live migration policies to apply different migration configurations to groups of virtual machines (VMs).

Configuring live migration limits and timeouts

Configure live migration limits and timeouts for the cluster by updating the HyperConverged custom resource (CR), which is located in the kubevirt-hyperconverged namespace.

Procedure

Edit the HyperConverged CR and add the necessary live migration parameters:

$ oc edit hyperconverged kubevirt-hyperconverged -n kubevirt-hyperconverged

Example configuration file

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: kubevirt-hyperconverged
spec:
  liveMigrationConfig:
    bandwidthPerMigration: 64Mi (1)
    completionTimeoutPerGiB: 800 (2)
    parallelMigrationsPerCluster: 5 (3)
    parallelOutboundMigrationsPerNode: 2 (4)
    progressTimeout: 150 (5)
    allowPostCopy: false (6)

1	Bandwidth limit of each migration, where the value is the quantity of bytes per second. For example, a value of `2048Mi` means 2048 MiB/s. Default: `0`, which is unlimited.
2	The migration is canceled if it has not completed in this time, in seconds per GiB of memory. For example, a VM with 6GiB memory times out if it has not completed migration in 4800 seconds. If the `Migration Method` is `BlockMigration`, the size of the migrating disks is included in the calculation.
3	Number of migrations running in parallel in the cluster. Default: `5`.
4	Maximum number of outbound migrations per node. Default: `2`.
5	The migration is canceled if memory copy fails to make progress in this time, in seconds. Default: `150`.
6	If a VM is running a heavy workload and the memory dirty rate is too high, this can prevent the migration from one node to another from converging. To prevent this, you can enable post copy mode. By default, `allowPostCopy` is set to `false`.

You can restore the default value for any spec.liveMigrationConfig field by deleting that key/value pair and saving the file. For example, delete progressTimeout: <value> to restore the default progressTimeout: 150.

Configure live migration for heavy workloads

When migrating a VM running a heavy workload (for example, database processing) with higher memory dirty rates, you need a higher bandwidth to complete the migration.

If the dirty rate is too high, the migration from one node to another does not converge. To prevent this, enable post copy mode.

Post copy mode triggers if the initial pre-copy phase does not complete within the defined timeout. During post copy, the VM CPUs pause on the source host while transferring the minimum required memory pages. Then the VM CPUs activate on the destination host, and the remaining memory pages transfer into the destination node at runtime.

Configure live migration for heavy workloads by updating the HyperConverged custom resource (CR), which is located in the kubevirt-hyperconverged namespace.

Procedure

Edit the HyperConverged CR and add the necessary parameters for migrating heavy workloads:

$ oc edit hyperconverged kubevirt-hyperconverged -n kubevirt-hyperconverged

Example configuration file

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: kubevirt-hyperconverged
spec:
  liveMigrationConfig:
    bandwidthPerMigration: 0Mi (1)
    completionTimeoutPerGiB: 150 (2)
    parallelMigrationsPerCluster: 5 (3)
    parallelOutboundMigrationsPerNode: 1 (4)
    progressTimeout: 150 (5)
    allowPostCopy: true (6)

1	Bandwidth limit of each migration, where the value is the quantity of bytes per second. The default is `0`, which is unlimited.
2	The migration is canceled if it is not completed in this time, and triggers post copy mode, when post copy is enabled. This value is measured in seconds per GiB of memory. You can lower `completionTimeoutPerGiB` to trigger post copy mode earlier in the migration process, or raise the `completionTimeoutPerGiB` to trigger post copy mode later in the migration process.
3	Number of migrations running in parallel in the cluster. The default is `5`. Keeping the `parallelMigrationsPerCluster` setting low is better when migrating heavy workloads.
4	Maximum number of outbound migrations per node. Configure a single VM per node for heavy workloads.
5	The migration is canceled if memory copy fails to make progress in this time. This value is measured in seconds. Increase this parameter for large memory sizes running heavy workloads.
6	Use post copy mode when memory dirty rates are high to ensure the migration converges. Set `allowPostCopy` to `true` to enable post copy mode.

Optional: If your main network is too busy for the migration, configure a secondary, dedicated migration network.

Post copy mode can impact performance during the transfer, and should not be used for critical data, or with unstable networks.

Additional resources

Configuring a dedicated network for live migration

Live migration policies

You can create live migration policies to apply different migration configurations to groups of VMs that are defined by VM or project labels.

You can create live migration policies by using the OKD web console.

Creating a live migration policy by using the command line

You can create a live migration policy by using the command line. KubeVirt applies the live migration policy to selected virtual machines (VMs) by using any combination of labels:

VM labels such as size, os, or gpu
Project labels such as priority, bandwidth, or hpc-workload

For the policy to apply to a specific group of VMs, all labels on the group of VMs must match the labels of the policy.

If multiple live migration policies apply to a VM, the policy with the greatest number of matching labels takes precedence.

If multiple policies meet this criteria, the policies are sorted by alphabetical order of the matching label keys, and the first one in that order takes precedence.

Procedure

Edit the VM object to which you want to apply a live migration policy, and add the corresponding VM labels.

Open the YAML configuration of the resource:
```
$ oc edit vm <vm_name>
```

Adjust the required label values in the .spec.template.metadata.labels section of the configuration. For example, to mark the VM as a production VM for the purposes of migration policies, add the kubevirt.io/environment: production line:

apiVersion: migrations.kubevirt.io/v1alpha1
kind: VirtualMachine
metadata:
  name: <vm_name>
  namespace: default
  labels:
    app: my-app
    environment: production
spec:
  template:
    metadata:
      labels:
        kubevirt.io/domain: <vm_name>
        kubevirt.io/size: large
        kubevirt.io/environment: production
# ...

Save and exit the configuration.

Configure a MigrationPolicy object with the corresponding labels. The following example configures a policy that applies to all VMs that are labeled as production:

apiVersion: migrations.kubevirt.io/v1alpha1
kind: MigrationPolicy
metadata:
  name: <migration_policy>
spec:
  selectors:
    namespaceSelector: (1)
      hpc-workloads: "True"
      xyz-workloads-type: ""
    virtualMachineInstanceSelector: (2)
      kubevirt.io/environment: "production"

1	Specify project labels.
2	Specify VM labels.

Create the migration policy by running the following command:
```
$ oc create -f <migration_policy>.yaml
```

Additional resources

Configuring a dedicated Multus network for live migration