Issues with Velero and admission webhooks - OADP Application backup and restore | Backup and restore

Restoring workarounds for Velero backups that use admission webhooks
- Restoring Knative resources
- Restoring IBM AppConnect resources
OADP plugins known issues
- Velero plugin panics during imagestream backups due to a missing secret
- OpenShift ADP Controller segmentation fault
Velero plugins returning "received EOF, stopping recv loop" message

Velero has limited abilities to resolve admission webhook issues during a restore. If you have workloads with admission webhooks, you might need to use an additional Velero plugin or make changes to how you restore the workload.

Typically, workloads with admission webhooks require you to create a resource of a specific kind first. This is especially true if your workload has child resources because admission webhooks typically block child resources.

For example, creating or restoring a top-level object such as service.serving.knative.dev typically creates child resources automatically. If you do this first, you will not need to use Velero to create and restore these resources. This avoids the problem of child resources being blocked by an admission webhook that Velero might use.

Restoring workarounds for Velero backups that use admission webhooks

You need additional steps to restore resources for several types of Velero backups that use admission webhooks.

Restoring Knative resources

You might encounter problems using Velero to back up Knative resources that use admission webhooks.

You can avoid such problems by restoring the top level Service resource first whenever you back up and restore Knative resources that use admission webhooks.

Procedure

Restore the top level service.serving.knavtive.dev Service resource:

$ velero restore <restore_name> \
  --from-backup=<backup_name> --include-resources \
  service.serving.knavtive.dev

Restoring IBM AppConnect resources

If you experience issues when you use Velero to a restore an IBM® AppConnect resource that has an admission webhook, you can run the checks in this procedure.

Procedure

Check if you have any mutating admission plugins of kind: MutatingWebhookConfiguration in the cluster:
```
$ oc get mutatingwebhookconfigurations
```
Examine the YAML file of each kind: MutatingWebhookConfiguration to ensure that none of its rules block creation of the objects that are experiencing issues. For more information, see the official Kubernetes documentation.
Check that any spec.version in type: Configuration.appconnect.ibm.com/v1beta1 used at backup time is supported by the installed Operator.

OADP plugins known issues

The following section describes known issues in OpenShift API for Data Protection (OADP) plugins:

Velero plugin panics during imagestream backups due to a missing secret

When the backup and the Backup Storage Location (BSL) are managed outside the scope of the Data Protection Application (DPA), the OADP controller, meaning the DPA reconciliation does not create the relevant oadp-<bsl_name>-<bsl_provider>-registry-secret.

When the backup is run, the OpenShift Velero plugin panics on the imagestream backup, with the following panic error:

024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item"
backup=openshift-adp/<backup name> error="error executing custom action (groupResource=imagestreams.image.openshift.io,
namespace=<BSL Name>, name=postgres): rpc error: code = Aborted desc = plugin panicked:
runtime error: index out of range with length 1, stack trace: goroutine 94…

Workaround to avoid the panic error

To avoid the Velero plugin panic error, perform the following steps:

Label the custom BSL with the relevant label:

$ oc label backupstoragelocations.velero.io <bsl_name> app.kubernetes.io/component=bsl

After the BSL is labeled, wait until the DPA reconciles.

You can force the reconciliation by making any minor change to the DPA itself.
When the DPA reconciles, confirm that the relevant oadp-<bsl_name>-<bsl_provider>-registry-secret has been created and that the correct registry data has been populated into it:
```
$ oc -n openshift-adp get secret/oadp-<bsl_name>-<bsl_provider>-registry-secret -o json | jq -r '.data'
```

OpenShift ADP Controller segmentation fault

If you configure a DPA with both cloudstorage and restic enabled, the openshift-adp-controller-manager pod crashes and restarts indefinitely until the pod fails with a crash loop segmentation fault.

You can have either velero or cloudstorage defined, because they are mutually exclusive fields.

If you have both velero and cloudstorage defined, the openshift-adp-controller-manager fails.
If you have neither velero nor cloudstorage defined, the openshift-adp-controller-manager fails.

For more information about this issue, see OADP-1054.

OpenShift ADP Controller segmentation fault workaround

You must define either velero or cloudstorage when you configure a DPA. If you define both APIs in your DPA, the openshift-adp-controller-manager pod fails with a crash loop segmentation fault.

Velero plugins returning "received EOF, stopping recv loop" message

Velero plugins are started as separate processes. After the Velero operation has completed, either successfully or not, they exit. Receiving a received EOF, stopping recv loop message in the debug logs indicates that a plugin operation has completed. It does not mean that an error has occurred.

Additional resources