×

Extending a timeout allows complex or resource-intensive processes to complete successfully without premature termination. This configuration can reduce errors, retries, or failures.

Ensure that you balance timeout extensions in a logical manner so that you do not configure excessively long timeouts that might hide underlying issues in the process. Consider and monitor an appropriate timeout value that meets the needs of the process and the overall system performance.

The following OADP timeouts show instructions of how and when to implement these parameters:

Restic timeout

The spec.configuration.nodeAgent.timeout parameter defines the Restic timeout. The default value is 1h.

Use the Restic timeout parameter in the nodeAgent section for the following scenarios:

  • For Restic backups with total PV data usage that is greater than 500GB.

  • If backups are timing out with the following error:

    level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete"
Procedure
  • Edit the values in the spec.configuration.nodeAgent.timeout block of the DataProtectionApplication custom resource (CR) manifest, as shown in the following example:

    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
     name: <dpa_name>
    spec:
      configuration:
        nodeAgent:
          enable: true
          uploaderType: restic
          timeout: 1h
    # ...

Velero resource timeout

resourceTimeout defines how long to wait for several Velero resources before timeout occurs, such as Velero custom resource definition (CRD) availability, volumeSnapshot deletion, and repository availability. The default is 10m.

Use the resourceTimeout for the following scenarios:

  • For backups with total PV data usage that is greater than 1TB. This parameter is used as a timeout value when Velero tries to clean up or delete the Container Storage Interface (CSI) snapshots, before marking the backup as complete.

    • A sub-task of this cleanup tries to patch VSC and this timeout can be used for that task.

  • To create or ensure a backup repository is ready for filesystem based backups for Restic or Kopia.

  • To check if the Velero CRD is available in the cluster before restoring the custom resource (CR) or resource from the backup.

Procedure
  • Edit the values in the spec.configuration.velero.resourceTimeout block of the DataProtectionApplication CR manifest, as in the following example:

    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
     name: <dpa_name>
    spec:
      configuration:
        velero:
          resourceTimeout: 10m
    # ...

Velero default item operation timeout

defaultItemOperationTimeout defines how long to wait on asynchronous BackupItemActions and RestoreItemActions to complete before timing out. The default value is 1h.

Use the defaultItemOperationTimeout for the following scenarios:

  • Only with Data Mover 1.2.x.

  • To specify the amount of time a particular backup or restore should wait for the Asynchronous actions to complete. In the context of OADP features, this value is used for the Asynchronous actions involved in the Container Storage Interface (CSI) Data Mover feature.

  • When defaultItemOperationTimeout is defined in the Data Protection Application (DPA) using the defaultItemOperationTimeout, it applies to both backup and restore operations. You can use itemOperationTimeout to define only the backup or only the restore of those CRs, as described in the following "Item operation timeout - restore", and "Item operation timeout - backup" sections.

Procedure
  • Edit the values in the spec.configuration.velero.defaultItemOperationTimeout block of the DataProtectionApplication CR manifest, as in the following example:

    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
     name: <dpa_name>
    spec:
      configuration:
        velero:
          defaultItemOperationTimeout: 1h
    # ...

Data Mover timeout

timeout is a user-supplied timeout to complete VolumeSnapshotBackup and VolumeSnapshotRestore. The default value is 10m.

Use the Data Mover timeout for the following scenarios:

  • If creation of VolumeSnapshotBackups (VSBs) and VolumeSnapshotRestores (VSRs), times out after 10 minutes.

  • For large scale environments with total PV data usage that is greater than 500GB. Set the timeout for 1h.

  • With the VolumeSnapshotMover (VSM) plugin.

  • Only with OADP 1.1.x.

Procedure
  • Edit the values in the spec.features.dataMover.timeout block of the DataProtectionApplication CR manifest, as in the following example:

    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
     name: <dpa_name>
    spec:
      features:
        dataMover:
          timeout: 10m
    # ...

CSI snapshot timeout

CSISnapshotTimeout specifies the time during creation to wait until the CSI VolumeSnapshot status becomes ReadyToUse, before returning error as timeout. The default value is 10m.

Use the CSISnapshotTimeout for the following scenarios:

  • With the CSI plugin.

  • For very large storage volumes that may take longer than 10 minutes to snapshot. Adjust this timeout if timeouts are found in the logs.

Typically, the default value for CSISnapshotTimeout does not require adjustment, because the default setting can accommodate large storage volumes.

Procedure
  • Edit the values in the spec.csiSnapshotTimeout block of the Backup CR manifest, as in the following example:

    apiVersion: velero.io/v1
    kind: Backup
    metadata:
     name: <backup_name>
    spec:
     csiSnapshotTimeout: 10m
    # ...

Item operation timeout - restore

ItemOperationTimeout specifies the time that is used to wait for RestoreItemAction operations. The default value is 1h.

Use the restore ItemOperationTimeout for the following scenarios:

  • Only with Data Mover 1.2.x.

  • For Data Mover uploads and downloads to or from the BackupStorageLocation. If the restore action is not completed when the timeout is reached, it will be marked as failed. If Data Mover operations are failing due to timeout issues, because of large storage volume sizes, then this timeout setting may need to be increased.

Procedure
  • Edit the values in the Restore.spec.itemOperationTimeout block of the Restore CR manifest, as in the following example:

    apiVersion: velero.io/v1
    kind: Restore
    metadata:
     name: <restore_name>
    spec:
     itemOperationTimeout: 1h
    # ...

Item operation timeout - backup

ItemOperationTimeout specifies the time used to wait for asynchronous BackupItemAction operations. The default value is 1h.

Use the backup ItemOperationTimeout for the following scenarios:

  • Only with Data Mover 1.2.x.

  • For Data Mover uploads and downloads to or from the BackupStorageLocation. If the backup action is not completed when the timeout is reached, it will be marked as failed. If Data Mover operations are failing due to timeout issues, because of large storage volume sizes, then this timeout setting may need to be increased.

Procedure
  • Edit the values in the Backup.spec.itemOperationTimeout block of the Backup CR manifest, as in the following example:

    apiVersion: velero.io/v1
    kind: Backup
    metadata:
     name: <backup_name>
    spec:
     itemOperationTimeout: 1h
    # ...