×

You can back up and restore etcd on a hosted cluster on Amazon Web Services (AWS) to fix failures.

Taking a snapshot of etcd for a hosted cluster

To back up etcd for a hosted cluster, you must take a snapshot of etcd. Later, you can restore etcd by using the snapshot.

This procedure requires API downtime.

Procedure
  1. Pause reconciliation of the hosted cluster by entering the following command:

    $ oc patch -n clusters hostedclusters/<hosted_cluster_name> -p '{"spec":{"pausedUntil":"true"}}' --type=merge
  2. Stop all etcd-writer deployments by entering the following command:

    $ oc scale deployment -n <hosted_cluster_namespace> --replicas=0 kube-apiserver openshift-apiserver openshift-oauth-apiserver
  3. To take an etcd snapshot, use the exec command in each etcd container by entering the following command:

    $ oc exec -it <etcd_pod_name> -n <hosted_cluster_namespace> -- env ETCDCTL_API=3 /usr/bin/etcdctl --cacert /etc/etcd/tls/etcd-ca/ca.crt --cert /etc/etcd/tls/client/etcd-client.crt --key /etc/etcd/tls/client/etcd-client.key --endpoints=localhost:2379 snapshot save /var/lib/data/snapshot.db
  4. To check the snapshot status, use the exec command in each etcd container by running the following command:

    $ oc exec -it <etcd_pod_name> -n <hosted_cluster_namespace> -- env ETCDCTL_API=3 /usr/bin/etcdctl -w table snapshot status /var/lib/data/snapshot.db
  5. Copy the snapshot data to a location where you can retrieve it later, such as an S3 bucket. See the following example.

    The following example uses signature version 2. If you are in a region that supports signature version 4, such as the us-east-2 region, use signature version 4. Otherwise, when copying the snapshot to an S3 bucket, the upload fails.

    Example
    BUCKET_NAME=somebucket
    FILEPATH="/${BUCKET_NAME}/${CLUSTER_NAME}-snapshot.db"
    CONTENT_TYPE="application/x-compressed-tar"
    DATE_VALUE=`date -R`
    SIGNATURE_STRING="PUT\n\n${CONTENT_TYPE}\n${DATE_VALUE}\n${FILEPATH}"
    ACCESS_KEY=accesskey
    SECRET_KEY=secret
    SIGNATURE_HASH=`echo -en ${SIGNATURE_STRING} | openssl sha1 -hmac ${SECRET_KEY} -binary | base64`
    
    oc exec -it etcd-0 -n ${HOSTED_CLUSTER_NAMESPACE} -- curl -X PUT -T "/var/lib/data/snapshot.db" \
      -H "Host: ${BUCKET_NAME}.s3.amazonaws.com" \
      -H "Date: ${DATE_VALUE}" \
      -H "Content-Type: ${CONTENT_TYPE}" \
      -H "Authorization: AWS ${ACCESS_KEY}:${SIGNATURE_HASH}" \
      https://${BUCKET_NAME}.s3.amazonaws.com/${CLUSTER_NAME}-snapshot.db
  6. To restore the snapshot on a new cluster later, save the encryption secret that the hosted cluster references.

    1. Get the secret encryption key by entering the following command:

      $ oc get hostedcluster <hosted_cluster_name> -o=jsonpath='{.spec.secretEncryption.aescbc}'
      {"activeKey":{"name":"<hosted_cluster_name>-etcd-encryption-key"}}
    2. Save the secret encryption key by entering the following command:

      $ oc get secret <hosted_cluster_name>-etcd-encryption-key -o=jsonpath='{.data.key}'

      You can decrypt this key when restoring a snapshot on a new cluster.

Next steps

Restore the etcd snapshot.

Restoring an etcd snapshot on a hosted cluster

If you have a snapshot of etcd from your hosted cluster, you can restore it. Currently, you can restore an etcd snapshot only during cluster creation.

To restore an etcd snapshot, you modify the output from the create cluster --render command and define a restoreSnapshotURL value in the etcd section of the HostedCluster specification.

The --render flag in the hcp create command does not render the secrets. To render the secrets, you must use both the --render and the --render-sensitive flags in the hcp create command.

Prerequisites

You took an etcd snapshot on a hosted cluster.

Procedure
  1. On the aws command-line interface (CLI), create a pre-signed URL so that you can download your etcd snapshot from S3 without passing credentials to the etcd deployment:

    ETCD_SNAPSHOT=${ETCD_SNAPSHOT:-"s3://${BUCKET_NAME}/${CLUSTER_NAME}-snapshot.db"}
    ETCD_SNAPSHOT_URL=$(aws s3 presign ${ETCD_SNAPSHOT})
  2. Modify the HostedCluster specification to refer to the URL:

    spec:
      etcd:
        managed:
          storage:
            persistentVolume:
              size: 4Gi
            type: PersistentVolume
            restoreSnapshotURL:
            - "${ETCD_SNAPSHOT_URL}"
        managementType: Managed
  3. Ensure that the secret that you referenced from the spec.secretEncryption.aescbc value contains the same AES key that you saved in the previous steps.