Redeploying Certificates

Overview
Checking Certificate Expirations
Redeploying Certificates

Overview

OKD uses certificates to provide secure connections for the following components:

masters (API server and controllers)
etcd
nodes
registry
router

You can use Ansible playbooks provided with the installer to automate checking expiration dates for cluster certificates. Playbooks are also provided to automate backing up and redeploying these certificates, which can fix common certificate errors.

Possible use cases for redeploying certificates include:

The installer detected the wrong host names and the issue was identified too late.
The certificates are expired and you need to update them.
You have a new CA and want to create certificates using it instead.

Checking Certificate Expirations

You can use the installer to warn you about any certificates expiring within a configurable window of days and notify you about any certificates that have already expired. Certificate expiry playbooks use the Ansible role openshift_certificate_expiry.

Certificates examined by the role include:

Master and node service certificates
Router and registry service certificates from etcd secrets
Master, node, router, registry, and kubeconfig files for cluster-admin users
etcd certificates (including embedded)

Role Variables

The openshift_certificate_expiry role uses the following variables:

Table 1. Core Variables
Variable Name	Default Value	Description
`openshift_certificate_expiry_config_base`	`/etc/origin`	Base OKD configuration directory.
`openshift_certificate_expiry_warning_days`	`30`	Flag certificates that will expire in this many days from now.
`openshift_certificate_expiry_show_all`	`no`	Include healthy (non-expired and non-warning) certificates in results.

Table 2. Optional Variables
Variable Name	Default Value	Description
`openshift_certificate_expiry_generate_html_report`	`no`	Generate an HTML report of the expiry check results.
`openshift_certificate_expiry_html_report_path`	`/tmp/cert-expiry-report.html`	The full path for saving the HTML report.
`openshift_certificate_expiry_save_json_results`	`no`	Save expiry check results as a JSON file.
`openshift_certificate_expiry_json_results_path`	`/tmp/cert-expiry-report.json`	The full path for saving the JSON report.

Running Certificate Expiration Playbooks

The OKD installer provides a set of example certificate expiration playbooks, using different sets of configuration for the openshift_certificate_expiry role.

These playbooks must be used with an inventory file that is representative of the cluster. For best results, run ansible-playbook with the -v option.

Using the easy-mode.yaml example playbook, you can try the role out before tweaking it to your specifications as needed. This playbook:

Produces JSON and stylized HTML reports in /tmp/.
Sets the warning window very large, so you will almost always get results back.
Includes all certificates (healthy or not) in the results.

easy-mode.yaml Playbook

- name: Check cert expirys
  hosts: nodes:masters:etcd
  become: yes
  gather_facts: no
  vars:
    openshift_certificate_expiry_warning_days: 1500
    openshift_certificate_expiry_save_json_results: yes
    openshift_certificate_expiry_generate_html_report: yes
    openshift_certificate_expiry_show_all: yes
  roles:
    - role: openshift_certificate_expiry

To run the easy-mode.yaml playbook:

$ ansible-playbook -v -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/certificate_expiry/easy-mode.yaml

Other Example Playbooks

The other example playbooks are also available to run directly out of the /usr/share/ansible/openshift-ansible/playbooks/certificate_expiry/ directory.

Table 3. Other Example Playbooks
File Name	Usage
*default.yaml*	Produces the default behavior of the `openshift_certificate_expiry` role.
*html_and_json_default_paths.yaml*	Generates HTML and JSON artifacts in their default paths.
*longer_warning_period.yaml*	Changes the expiration warning window to 1500 days.
*longer-warning-period-json-results.yaml*	Changes the expiration warning window to 1500 days and saves the results as a JSON file.

To run any of these example playbooks:

$ ansible-playbook -v -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/certificate_expiry/<playbook>
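
The role variables listed in Tables 1 and 2 can also be overridden for a single run rather than by editing a playbook. The following is only a sketch and is not shipped with the installer: it assumes the override is passed as an Ansible extra variable with -e (in JSON form so the value stays an integer), and expiry_check_cmd is a hypothetical helper name. It prints the command instead of running it so that the command can be reviewed first.

```shell
#!/bin/sh
# Directory that holds the example playbooks (path taken from this document).
EXPIRY_PLAYBOOK_DIR=/usr/share/ansible/openshift-ansible/playbooks/certificate_expiry

# expiry_check_cmd INVENTORY PLAYBOOK [WARNING_DAYS]
# Hypothetical helper: prints (does not run) an ansible-playbook command line
# that overrides the warning window. WARNING_DAYS defaults to 30, the
# documented default of openshift_certificate_expiry_warning_days.
expiry_check_cmd() {
    inventory=$1
    playbook=$2
    days=${3:-30}
    printf "ansible-playbook -v -i %s -e '{\"openshift_certificate_expiry_warning_days\": %s}' %s/%s\n" \
        "$inventory" "$days" "$EXPIRY_PLAYBOOK_DIR" "$playbook"
}
```

For example, expiry_check_cmd /etc/ansible/hosts default.yaml 60 prints a command that runs default.yaml with a 60-day warning window.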

Output Formats

As noted above, the check report can be produced in two formats: JSON for machine parsing, or a stylized HTML page for easy skimming.

HTML Report

An example of an HTML report is provided with the installer. You can open the following file in your browser to view it:

/usr/share/ansible/openshift-ansible/roles/openshift_certificate_expiry/examples/cert-expiry-report.html

JSON Report

There are two top-level keys in the saved JSON results: data and summary.

The data key is a hash where the keys are the names of each host examined and the values are the check results for the certificates identified on each respective host.

The summary key is a hash that summarizes the total number of certificates:

examined on the entire cluster
that are OK
expiring within the configured warning window
already expired

For an example of the full JSON report, see /usr/share/ansible/openshift-ansible/roles/openshift_certificate_expiry/examples/cert-expiry-report.json.

The summary from the JSON data can be easily checked for warnings or expirations using a variety of command-line tools. For example, using grep you can look for the word summary and print out the two lines after the match (-A2):

$ grep -A2 summary /tmp/cert-expiry-report.json
    "summary": {
        "warning": 16,
        "expired": 0

If available, the jq tool can also be used to pick out specific values. The first two examples below show how to select just one value, either warning or expired. The third example shows how to select both values at once:

$ jq '.summary.warning' /tmp/cert-expiry-report.json
16

$ jq '.summary.expired' /tmp/cert-expiry-report.json
0

$ jq '.summary.warning,.summary.expired' /tmp/cert-expiry-report.json
16
0
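
Where jq is not installed, the same two counts can be turned into an exit status with standard tools. This is a minimal sketch, not part of the role: summary_value and check_report are hypothetical helper names, and the parsing assumes the pretty-printed layout shown in the grep output above, with one key per line inside the summary object.

```shell
#!/bin/sh
# summary_value FILE KEY
# Prints the integer stored under KEY inside the "summary" object.
# Assumes one key per line, as in the grep output shown in this section.
summary_value() {
    sed -n '/"summary"/,/}/p' "$1" |
        sed -n "s/.*\"$2\": *\([0-9][0-9]*\).*/\1/p" |
        head -n 1
}

# check_report FILE
# Returns 2 if any certificate has expired, 1 if any certificate is inside
# the warning window, and 0 otherwise. A missing key is treated as 0.
check_report() {
    expired=$(summary_value "$1" expired)
    warning=$(summary_value "$1" warning)
    [ "${expired:-0}" -gt 0 ] && return 2
    [ "${warning:-0}" -gt 0 ] && return 1
    return 0
}
```

A scheduled job could call check_report /tmp/cert-expiry-report.json after the playbook run and raise an alert on any non-zero status.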

Redeploying Certificates

Use the following playbooks to redeploy master, etcd, node, registry, and router certificates on all relevant hosts. You can redeploy all of them at once using the current CA, redeploy certificates for specific components only, or redeploy a newly generated or custom CA on its own.

Just like the certificate expiry playbooks, these playbooks must be run with an inventory file that is representative of the cluster.

In particular, the inventory must specify or override all host names and IP addresses set via the following variables such that they match the current cluster configuration:

openshift_hostname
openshift_public_hostname
openshift_ip
openshift_public_ip
openshift_master_cluster_hostname
openshift_master_cluster_public_hostname

The playbooks you need are provided by:

# yum install atomic-openshift-utils

The validity (length in days until they expire) for any certificates auto-generated while redeploying can be configured via Ansible as well. See Configuring Certificate Validity.

OKD CA and etcd certificates expire after five years. Signed OKD certificates expire after two years.
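
To see how close an individual certificate file is to the end of its validity period without running a playbook, openssl can be queried directly. This is a sketch that assumes openssl is installed on the host; expires_within_days is a hypothetical helper name, and it relies on openssl x509 -checkend, which exits with status 0 only when the certificate is still valid after the given number of seconds.

```shell
#!/bin/sh
# expires_within_days CERT_FILE DAYS
# Succeeds (status 0) if the PEM certificate in CERT_FILE expires within
# DAYS days or has already expired. A file that openssl cannot read is
# also reported as expiring, which errs on the side of caution.
expires_within_days() {
    if openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1" >/dev/null 2>&1; then
        return 1    # still valid after the window
    fi
    return 0        # expiring, expired, or unreadable
}
```

For example, expires_within_days /etc/origin/master/ca.crt 30 && echo "CA expires within 30 days" prints the message only when the CA certificate is inside the 30-day window.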

Redeploying All Certificates Using the Current OKD and etcd CA

The redeploy-certificates.yml playbook does not regenerate the OKD CA certificate. Instead, new master, etcd, node, registry, and router certificates are created and signed with the current CA certificate.

This also includes serial restarts of:

etcd
master services
node services

To redeploy master, etcd, and node certificates using the current OKD CA, run this playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-certificates.yml

Redeploying a New or Custom OKD CA

The redeploy-openshift-ca.yml playbook redeploys the OKD CA certificate by generating a new CA certificate and distributing an updated bundle to all components including client kubeconfig files and the node’s database of trusted CAs (the CA-trust).

This also includes serial restarts of:

master services
node services
docker

Additionally, you can specify a custom CA certificate when redeploying certificates instead of relying on a CA generated by OKD.

When the master services are restarted, the registry and routers can continue to communicate with the master without being redeployed because the master’s serving certificate is unchanged and the CA that the registry and routers hold is still valid.

To redeploy a newly generated or custom CA:

If you want to use a custom CA, set the following variable in your inventory file. To use the current CA, skip this step.

# Configure custom ca certificate
# NOTE: CA certificate will not be replaced with existing clusters.
# This option may only be specified when creating a new cluster or
# when redeploying cluster certificates with the redeploy-certificates
# playbook.
openshift_master_ca_certificate={'certfile': '</path/to/ca.crt>', 'keyfile': '</path/to/ca.key>'}

If the CA certificate is issued by an intermediate CA, the bundled certificate must contain the full chain (the intermediate and root certificates) for the CA in order to validate child certificates.

For example:

$ cat intermediate/certs/intermediate.cert.pem \
      certs/ca.cert.pem >> intermediate/certs/ca-chain.cert.pem
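
Because the example above appends with >>, running it a second time against an existing ca-chain.cert.pem duplicates the chain. A quick sanity check is to count the certificates in the bundle. The helper name count_certs below is hypothetical, and the sketch only counts PEM BEGIN markers; it does not validate the chain.

```shell
#!/bin/sh
# count_certs BUNDLE
# Prints the number of PEM certificates in BUNDLE by counting BEGIN markers.
# The "--" stops grep from treating the leading dashes as options.
count_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}
```

For a CA issued by a single intermediate, count_certs intermediate/certs/ca-chain.cert.pem would be expected to print 2 (the intermediate and the root).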

Run the redeploy-openshift-ca.yml playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml

With the new OKD CA in place, you can then use the redeploy-certificates.yml playbook at your discretion whenever you want to redeploy certificates signed by the new CA on all components.
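
If you choose to reissue the component certificates immediately after replacing the CA, the two playbooks can be chained so that the second runs only when the first succeeds. The following is a sketch under that assumption: redeploy_ca_then_certs and the ANSIBLE_PLAYBOOK override are hypothetical conveniences, not part of the installer.

```shell
#!/bin/sh
# Playbook directory taken from the commands in this section.
REDEPLOY_DIR=/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster
# Command used to run playbooks; set it to "echo" for a dry run.
ANSIBLE_PLAYBOOK=${ANSIBLE_PLAYBOOK:-ansible-playbook}

# redeploy_ca_then_certs INVENTORY
# Runs redeploy-openshift-ca.yml and, only if it succeeds,
# redeploy-certificates.yml against the same inventory.
redeploy_ca_then_certs() {
    "$ANSIBLE_PLAYBOOK" -i "$1" "$REDEPLOY_DIR/redeploy-openshift-ca.yml" || return 1
    "$ANSIBLE_PLAYBOOK" -i "$1" "$REDEPLOY_DIR/redeploy-certificates.yml"
}
```

Setting ANSIBLE_PLAYBOOK=echo before calling redeploy_ca_then_certs <inventory_file> prints the arguments of both invocations in order without touching the cluster.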

Redeploying a New etcd CA

The redeploy-etcd-ca.yml playbook redeploys the etcd CA certificate by generating a new CA certificate and distributing an updated bundle to all etcd peers and master clients.

This also includes serial restarts of:

etcd
master services

The redeploy-etcd-ca.yml playbook is only available for OKD v3.5.91-1 and above.

To redeploy a newly generated etcd CA:

Run the redeploy-etcd-ca.yml playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-etcd-ca.yml

With the new etcd CA in place, you can then use the redeploy-etcd-certificates.yml playbook at your discretion whenever you want to redeploy certificates signed by the new etcd CA on etcd peers and master clients. Alternatively, you can use the redeploy-certificates.yml playbook to redeploy certificates for OKD components in addition to etcd peers and master clients.

Redeploying Master Certificates Only

The redeploy-master-certificates.yml playbook only redeploys master certificates. This also includes serial restarts of master services.

To redeploy master certificates, run this playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-master-certificates.yml

After running this playbook, regenerate any service signing certificate or key pairs by deleting existing secrets that contain service serving certificates or removing and re-adding annotations to appropriate services.

Redeploying etcd Certificates Only

The redeploy-etcd-certificates.yml playbook only redeploys etcd certificates including master client certificates.

This also includes serial restarts of:

etcd
master services

To redeploy etcd certificates, run this playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-etcd-certificates.yml

Redeploying Node Certificates Only

The redeploy-node-certificates.yml playbook only redeploys node certificates. This also includes serial restarts of node services.

To redeploy node certificates, run this playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-node-certificates.yml

Redeploying Registry or Router Certificates Only

The redeploy-registry-certificates.yml and redeploy-router-certificates.yml playbooks replace installer-created certificates for the registry and router. If custom certificates are in use for these components, see Redeploying Custom Registry or Router Certificates to replace them manually.

Redeploying Registry Certificates Only

To redeploy registry certificates, run the following playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-registry-certificates.yml

Redeploying Router Certificates Only

To redeploy router certificates, run the following playbook, specifying your inventory file:

$ ansible-playbook -i <inventory_file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/redeploy-router-certificates.yml

Redeploying Custom Registry or Router Certificates

When nodes are evacuated due to a redeployed CA, registry and router pods are restarted. If the registry and router certificates were not also redeployed with the new CA, this can cause outages because they cannot reach the masters using their old certificates.

The playbooks for redeploying certificates cannot redeploy custom registry or router certificates, so to address this issue, you can manually redeploy the registry and router certificates.

Redeploying Registry Certificates Manually

To redeploy registry certificates manually, you must add new registry certificates to a secret named registry-certificates, then redeploy the registry:

Switch to the default project for the remainder of these steps:
```
$ oc project default
```

If your registry was initially created on OKD 3.1 or earlier, it may still be using environment variables to store certificates (which has been deprecated in favor of using secrets).

Run the following and look for the OPENSHIFT_CA_DATA, OPENSHIFT_CERT_DATA, OPENSHIFT_KEY_DATA environment variables:
```
$ oc env dc/docker-registry --list
```

If they do not exist, skip this step. If they do, create the following ClusterRoleBinding:

$ cat <<EOF |
apiVersion: v1
groupNames: null
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: registry-registry-role
roleRef:
  kind: ClusterRole
  name: system:registry
subjects:
- kind: ServiceAccount
  name: registry
  namespace: default
userNames:
- system:serviceaccount:default:registry
EOF
oc create -f -

Then, run the following to remove the environment variables:

$ oc env dc/docker-registry OPENSHIFT_CA_DATA- OPENSHIFT_CERT_DATA- OPENSHIFT_KEY_DATA- OPENSHIFT_MASTER-

Set the following environment variables locally to make later commands less complex:

$ REGISTRY_IP=`oc get service docker-registry -o jsonpath='{.spec.clusterIP}'`
$ REGISTRY_HOSTNAME=`oc get route/docker-registry -o jsonpath='{.spec.host}'`

Create new registry certificates:

$ oc adm ca create-server-cert \
    --signer-cert=/etc/origin/master/ca.crt \
    --signer-key=/etc/origin/master/ca.key \
    --hostnames=$REGISTRY_IP,docker-registry.default.svc,docker-registry.default.svc.cluster.local,$REGISTRY_HOSTNAME \
    --cert=/etc/origin/master/registry.crt \
    --key=/etc/origin/master/registry.key \
    --signer-serial=/etc/origin/master/ca.serial.txt
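
If the registry has no route, REGISTRY_HOSTNAME is empty and the --hostnames value above ends in a trailing comma. One way to guard against that is to build the list in a small function. This is a sketch only, and registry_hostnames is a hypothetical helper name.

```shell
#!/bin/sh
# registry_hostnames CLUSTER_IP [ROUTE_HOST]
# Prints the comma-separated value for --hostnames. ROUTE_HOST is appended
# only when it is non-empty, which avoids a trailing comma when the
# registry is not exposed through a route.
registry_hostnames() {
    names="$1,docker-registry.default.svc,docker-registry.default.svc.cluster.local"
    if [ -n "${2:-}" ]; then
        names="$names,$2"
    fi
    printf '%s\n' "$names"
}
```

The result can then be passed as --hostnames="$(registry_hostnames "$REGISTRY_IP" "$REGISTRY_HOSTNAME")".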

Run oc adm commands only from the first master listed in the Ansible host inventory file, by default /etc/ansible/hosts.

Update the registry-certificates secret with the new registry certificates:

$ oc secret new registry-certificates \
    /etc/origin/master/registry.crt \
    /etc/origin/master/registry.key \
    -o json | oc replace -f -

Redeploy the registry:
```
$ oc deploy dc/docker-registry --latest
```

Redeploying Router Certificates Manually

To redeploy router certificates manually, you must add new router certificates to a secret named router-certs, then redeploy the router:

Switch to the default project for the remainder of these steps:
```
$ oc project default
```
If your router was initially created on OKD 3.1 or earlier, it might still use environment variables to store certificates, a practice that has been deprecated in favor of using a service serving certificate secret.
1. Run the following command and look for the OPENSHIFT_CA_DATA, OPENSHIFT_CERT_DATA, and OPENSHIFT_KEY_DATA environment variables:
  $ oc env dc/router --list
2. If those variables exist, create the following ClusterRoleBinding:

$ cat <<EOF |
apiVersion: v1
groupNames: null
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: router-router-role
roleRef:
  kind: ClusterRole
  name: system:router
subjects:
- kind: ServiceAccount
  name: router
  namespace: default
userNames:
- system:serviceaccount:default:router
EOF
oc create -f -

3. If those variables exist, run the following command to remove them:
  $ oc env dc/router OPENSHIFT_CA_DATA- OPENSHIFT_CERT_DATA- OPENSHIFT_KEY_DATA- OPENSHIFT_MASTER-
Obtain a certificate.
- If you use an external Certificate Authority (CA) to sign your certificates, create a new certificate and provide it to OKD by following your internal processes.
- If you use the internal OKD CA to sign certificates, run the following commands. The resulting certificate is internally signed, so it is trusted only by clients that trust the OKD CA.

  $ cd /root
  $ mkdir cert ; cd cert
  $ oc adm ca create-server-cert \
      --signer-cert=/etc/origin/master/ca.crt \
      --signer-key=/etc/origin/master/ca.key \
      --signer-serial=/etc/origin/master/ca.serial.txt \
      --hostnames='*.hostnames.for.the.certificate' \
      --cert=router.crt \
      --key=router.key

  These commands generate the following files:
  - A new certificate named router.crt.
  - A copy of the signing CA certificate chain, /etc/origin/master/ca.crt. This chain can contain more than one certificate if you use intermediate CAs.
  - A corresponding private key named router.key.

Create a new file that concatenates the generated certificates:

$ cat router.crt /etc/origin/master/ca.crt router.key > router.pem
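
Before the secret is replaced, it may be worth confirming that router.pem really contains both a certificate and a private key. The check below is a sketch that only looks for PEM block markers and does not verify that the key matches the certificate; pem_has_cert_and_key is a hypothetical helper name.

```shell
#!/bin/sh
# pem_has_cert_and_key FILE
# Succeeds if FILE contains at least one certificate block and at least
# one private key block (for example "RSA PRIVATE KEY" or "PRIVATE KEY").
pem_has_cert_and_key() {
    grep -q -- '-----BEGIN CERTIFICATE-----' "$1" &&
        grep -q -- 'PRIVATE KEY-----' "$1"
}
```

For example, pem_has_cert_and_key router.pem || echo "router.pem is incomplete" reports a file that is missing either part.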

Before you generate a new secret, back up the current one:

$ oc export secret router-certs > ~/old-router-certs-secret.yaml

Create a new secret to hold the new certificate and key, and replace the contents of the existing secret. In the following command, router.pem is the file that contains the concatenation of the certificates that you generated:
```
$ oc create secret tls router-certs --cert=router.pem \
    --key=router.key -o json --dry-run | \
    oc replace -f -
```
Redeploy the router:
```
$ oc rollout latest dc/router
```
When routers are initially deployed, an annotation is added to the router’s service that automatically creates a service serving certificate secret named router-metrics-tls.

To redeploy the router-metrics-tls certificate manually, trigger re-creation of the service serving certificate by removing the annotations from the router service, deleting the existing secret, and then re-adding the annotations:

Remove the following annotations from the router service:

$ oc annotate service router \
    service.alpha.openshift.io/serving-cert-secret-name- \
    service.alpha.openshift.io/serving-cert-signed-by-

Remove the existing router-metrics-tls secret.
```
$ oc delete secret router-metrics-tls
```

Re-add the annotations:

$ oc annotate service router \
    service.alpha.openshift.io/serving-cert-secret-name=router-metrics-tls