You can configure your OKD cluster to use Red Hat Gluster Storage as persistent storage for containerized applications. Two deployment solutions are available when using Red Hat Gluster Storage: a containerized storage cluster or a dedicated storage cluster. This topic focuses mainly on the persistent volume plug-in solution using a dedicated Red Hat Gluster Storage cluster.
Starting with the Red Hat Gluster Storage 3.1 update 3 release, you can deploy containerized Red Hat Gluster Storage directly on OKD. Containerized Red Hat Gluster Storage converged with OKD addresses the use case where containerized applications require both shared file storage and the flexibility of a converged infrastructure with compute and storage instances being scheduled and run from the same set of hardware.
Step-by-step instructions for this containerized solution are provided separately in the following Red Hat Gluster Storage documentation:
OKD offers container-native storage (CNS), which makes it easier for OKD users to fulfill their storage needs. With CNS, users and administrators can run storage and application pods together on the same infrastructure, sharing the same resources.
See Container-Native Storage for OpenShift Container Platform for configuring CNS as part of an OKD cluster.
Building environment storage can influence the time it takes for an application to start. For example, if the application pod requires a persistent volume claim (PVC), allow extra time for the CNS volume to be created and bound to the corresponding PVC. This affects how long the application pod takes to start.
Creation time of CNS volumes scales linearly up to 100 volumes. In the latest tests, each volume took approximately 6 seconds to be created, allocated, and bound to a pod.
All tests were performed on one trusted storage pool (TSP), using hardware configuration for CNS per the Container-Native Storage for OpenShift Container Platform guidelines.
Dynamic storage provisioning and storage classes were also configured and used when provisioning the PVC.
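As a rough planning aid, the linear behavior described above can be turned into simple arithmetic. The sketch below assumes that the approximate 6 seconds per volume holds for your environment and that volumes are provisioned one after another; the function name is hypothetical, and the figure should be replaced with your own measurements.

```shell
# Estimate total provisioning time, in seconds, for a number of CNS volumes.
# Assumes linear scaling, which the tests above only cover up to 100 volumes.
estimate_provision_seconds() {
    volumes=$1
    per_volume=${2:-6}   # seconds per volume; 6 is the figure from the tests
    if [ "$volumes" -gt 100 ]; then
        echo "no test data beyond 100 volumes" >&2
        return 1
    fi
    echo $((volumes * per_volume))
}

estimate_provision_seconds 40    # prints 240
```

The second argument overrides the per-volume time, for example `estimate_provision_seconds 40 8` if your own tests show slower provisioning.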
Deleting a PVC that is used by an application pod triggers the deletion of the CNS volume that was used by the PVC.
PVCs disappear immediately from the oc get pvc output. However, the time to delete and recycle CNS volumes depends on the number of CNS volumes. In the latest tests, the deletion time of CNS volumes scaled linearly up to 100 volumes.
Deletion time does not affect application users. CNS deletion behavior is a guide that helps CNS storage administrators estimate approximately how long it takes for CNS volumes to be removed from a CNS cluster.
The recommended memory requirements are 32 GB per OKD node hosting CNS pods.
Follow the planning guidelines when planning hardware for a CNS storage environment to ensure that you have enough memory.
If you have a dedicated Red Hat Gluster Storage cluster available in your environment, you can configure OKD’s Gluster volume plug-in. The dedicated storage cluster delivers persistent Red Hat Gluster Storage file storage for containerized applications over the network. The applications access storage served out from the storage clusters through common storage protocols.
You can also use Heketi to dynamically provision volumes in a dedicated Red Hat Gluster Storage cluster. See Managing Volumes Using Heketi in the Red Hat Gluster Storage 3.3 Administration Guide for more information.
This solution is a conventional deployment where containerized compute applications run on an OKD cluster. The remaining sections in this topic provide the step-by-step instructions for the dedicated Red Hat Gluster Storage solution.
This topic presumes some familiarity with OKD and GlusterFS:
See the Persistent Storage topic for details on the OKD PV framework in general.
See the Red Hat Gluster Storage 3.3 Administration Guide for more on GlusterFS.
High availability of storage in the infrastructure is left to the underlying storage provider.
The following requirements must be met to create a supported integration of Red Hat Gluster Storage and OKD.
The following table lists the supported versions of OKD with Red Hat Gluster Storage Server.
Red Hat Gluster Storage | OKD |
---|---|
3.1.3 | 3.1 or later |
The environment requirements for OKD and Red Hat Gluster Storage are described in this section.
All installations of Red Hat Gluster Storage must have valid subscriptions to Red Hat Network channels and Subscription Management repositories.
Red Hat Gluster Storage installations must adhere to the requirements laid out in the Red Hat Gluster Storage 3.3 Installation Guide.
Red Hat Gluster Storage installations must be completely up to date with the latest patches and upgrades. Refer to the Red Hat Gluster Storage 3.3 Installation Guide to upgrade to the latest version.
The versions of OKD and Red Hat Gluster Storage integrated must be compatible, according to the information in Supported Operating Systems.
A fully-qualified domain name (FQDN) must be set for each hypervisor and Red Hat Gluster Storage server node. Ensure that correct DNS records exist, and that the FQDN is resolvable via both forward and reverse DNS lookup.
All installations of OKD must have valid subscriptions to Red Hat Network channels and Subscription Management repositories.
OKD installations must adhere to the requirements laid out in the Installation and Configuration documentation.
The OKD cluster must be up and running.
A user with cluster-admin permissions must be created.
All OKD nodes on RHEL systems must have the glusterfs-fuse RPM installed, which should match the version of Red Hat Gluster Storage server running in the containers. For more information on installing glusterfs-fuse, see Native Client in the Red Hat Gluster Storage 3.3 Administration Guide.
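The forward and reverse DNS requirement listed above can be spot-checked from any node. The following is a minimal sketch, assuming that getent is available and that each address maps back to exactly the name you query; the function name is hypothetical.

```shell
# Check that a host name resolves to an address and that the address
# resolves back to the same name. getent uses the system resolver (NSS),
# so results reflect /etc/hosts as well as DNS.
check_fqdn() {
    fqdn=$1
    # Forward lookup: first address returned for the name.
    addr=$(getent hosts "$fqdn" | awk 'NR==1 {print $1}')
    if [ -z "$addr" ]; then
        echo "forward lookup failed for $fqdn" >&2
        return 1
    fi
    # Reverse lookup: canonical name returned for that address.
    name=$(getent hosts "$addr" | awk 'NR==1 {print $2}')
    if [ "$name" != "$fqdn" ]; then
        echo "reverse lookup for $addr returned '$name', expected '$fqdn'" >&2
        return 1
    fi
    echo "$fqdn <-> $addr"
}
```

Run it once per storage server and hypervisor, for example `check_fqdn gluster1.example.com`, where the host name is a placeholder for one of your own.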
To provision GlusterFS volumes using the dedicated storage cluster solution, the following are required:
An existing storage device in your underlying infrastructure.
A distinct list of servers (IP addresses) in the Gluster cluster, to be defined as endpoints.
A service, to persist the endpoints (optional).
An existing Gluster volume to be referenced in the persistent volume object.
glusterfs-fuse installed on each schedulable OKD node in your cluster:
$ yum install glusterfs-fuse
Persistent volumes (PVs) and persistent volume claims (PVCs) can share volumes across a single project. While the GlusterFS-specific information contained in a PV definition could also be defined directly in a pod definition, doing so does not create the volume as a distinct cluster resource, making the volume more susceptible to conflicts.
An endpoints definition defines the GlusterFS cluster as EndPoints and includes the IP addresses of your Gluster servers. The port value can be any numeric value within the accepted range of ports. Optionally, you can create a service that persists the endpoints.
Define the following service:
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster # (1)
spec:
  ports:
  - port: 1
1 | This name must be defined in the endpoints definition. If using a service, then the endpoints name must match the service name. |
Save the service definition to a file, for example gluster-service.yaml, then create the service:
$ oc create -f gluster-service.yaml
Verify that the service was created:
$ oc get services
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
glusterfs-cluster 172.30.205.34 <none> 1/TCP <none> 44s
Define the Gluster endpoints:
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster # (1)
subsets:
- addresses:
  - ip: 192.168.122.221 # (2)
  ports:
  - port: 1
- addresses:
  - ip: 192.168.122.222 # (2)
  ports:
  - port: 1 # (3)
1 | This name must match the service name from step 1. |
2 | The ip values must be the actual IP addresses of a Gluster server, not fully-qualified host names. |
3 | The port number is ignored. |
Save the endpoints definition to a file, for example gluster-endpoints.yaml, then create the endpoints:
$ oc create -f gluster-endpoints.yaml
endpoints "glusterfs-cluster" created
Verify that the endpoints were created:
$ oc get endpoints
NAME ENDPOINTS AGE
docker-registry 10.1.0.3:5000 4h
glusterfs-cluster 192.168.122.221:1,192.168.122.222:1 11s
kubernetes 172.16.35.3:8443 4d
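For clusters with more than a couple of Gluster servers, the endpoints definition can be generated from a list of IP addresses instead of being written by hand. The sketch below is one way to do this, not part of the product; the function name is hypothetical, and it mirrors the layout shown above with one subset per address and a placeholder port of 1.

```shell
# Print an Endpoints definition with one subset per Gluster server address.
# Usage: render_endpoints NAME IP [IP...]
render_endpoints() {
    name=$1
    shift
    printf '%s\n' "apiVersion: v1" "kind: Endpoints" "metadata:" \
        "  name: $name" "subsets:"
    for ip in "$@"; do
        # The port is required by the API but ignored by the plug-in.
        printf '%s\n' "- addresses:" "  - ip: $ip" "  ports:" "  - port: 1"
    done
}

render_endpoints glusterfs-cluster 192.168.122.221 192.168.122.222
```

Redirect the output to a file such as gluster-endpoints.yaml and create it with oc create -f as shown earlier. The name passed in must still match the service name if you use a service.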
GlusterFS does not support the 'Recycle' reclaim policy.
Next, define the PV in an object definition before creating it in OKD:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gluster-default-volume # (1)
spec:
  capacity:
    storage: 2Gi # (2)
  accessModes: # (3)
  - ReadWriteMany
  glusterfs: # (4)
    endpoints: glusterfs-cluster # (5)
    path: myVol1 # (6)
    readOnly: false
  persistentVolumeReclaimPolicy: Retain # (7)
1 | The name of the volume. This is how it is identified via persistent volume claims or from pods. |
2 | The amount of storage allocated to this volume. |
3 | accessModes are used as labels to match a PV and a PVC. They currently do not define any form of access control. |
4 | The volume type being used, in this case the glusterfs plug-in. |
5 | The endpoints name that defines the Gluster cluster created in Creating Gluster Endpoints. |
6 | The Gluster volume that will be accessed, as shown in the gluster volume status command. |
7 | The volume reclaim policy Retain indicates that the volume will be preserved after the pods accessing it terminate. For GlusterFS, the accepted values include Retain and Delete. |
Endpoints are namespaced. Each project accessing the Gluster volume needs its own endpoints.
Save the definition to a file, for example gluster-pv.yaml, and create the persistent volume:
$ oc create -f gluster-pv.yaml
Verify that the persistent volume was created:
$ oc get pv
NAME LABELS CAPACITY ACCESSMODES STATUS CLAIM REASON AGE
gluster-default-volume <none> 2147483648 RWX Available 2s
Developers request GlusterFS storage by referencing either a PVC or the Gluster volume plug-in directly in the volumes section of a pod spec. A PVC exists only in the user’s project and can only be referenced by pods within that project. Any attempt to access a PV across projects causes the pod to fail.
Create a PVC that will bind to the new PV:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-claim
spec:
  accessModes:
  - ReadWriteMany # (1)
  resources:
    requests:
      storage: 1Gi # (2)
1 | accessModes do not enforce security, but rather act as labels to match a PV to a PVC. |
2 | This claim will look for PVs offering 1Gi or greater capacity. |
Save the definition to a file, for example gluster-claim.yaml, and create the PVC:
$ oc create -f gluster-claim.yaml
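Once the claim exists, a pod in the same project can mount it by name. The following sketch writes a minimal pod definition that references gluster-claim; the pod name, container name, image, command, and mount path are hypothetical placeholders, not values taken from this topic.

```shell
# Write a minimal pod definition that mounts the gluster-claim PVC.
# Everything except the claim name is a placeholder to adapt.
cat > gluster-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gluster-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: gluster-vol
      mountPath: /mnt/gluster
  volumes:
  - name: gluster-vol
    persistentVolumeClaim:
      claimName: gluster-claim
EOF
```

The pod could then be created with oc create -f gluster-pod.yaml. Whether the container can actually read and write the mount depends on the permission and SELinux considerations covered later in this topic.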
PVs and PVCs make sharing a volume across a project simpler. The Gluster-specific information contained in the PV definition can also be defined directly in a pod specification.
This section covers Gluster volume security, including matching permissions and SELinux considerations. Understanding the basics of POSIX permissions, process UIDs, supplemental groups, and SELinux is presumed.
See the full Volume Security topic before implementing Gluster volumes.
As an example, assume that the target Gluster volume, HadoopVol, is mounted under /mnt/glusterfs/ with the following POSIX permissions and SELinux labels:
$ ls -lZ /mnt/glusterfs/
drwxrwx---. yarn hadoop system_u:object_r:fusefs_t:s0 HadoopVol
$ id yarn
uid=592(yarn) gid=590(hadoop) groups=590(hadoop)
In order to access the HadoopVol volume, containers must match the SELinux label and either run with a UID of 592 or have 590 in their supplemental groups. The OKD GlusterFS plug-in mounts the volume in the container with the same POSIX ownership and permissions found on the target Gluster mount, namely the owner will be 592 and the group ID will be 590. However, the container is not run with its effective UID equal to 592, nor with its GID equal to 590, which is the desired behavior. Instead, a container’s UID and supplemental groups are determined by Security Context Constraints (SCCs) and the project defaults.
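The POSIX side of this rule can be expressed as a small check. The sketch below is illustrative only: it hard-codes the ownership from the HadoopVol example (mode drwxrwx---, owner 592, group 590), uses a hypothetical function name, and ignores SELinux and root privileges.

```shell
# Decide whether a process can access HadoopVol, given its UID followed
# by its group IDs (primary and supplemental).
can_access_hadoopvol() {
    uid=$1
    shift
    # Owner bits are rwx, so UID 592 is enough on its own.
    [ "$uid" -eq 592 ] && return 0
    # Group bits are rwx, so membership in GID 590 is also enough.
    for gid in "$@"; do
        [ "$gid" -eq 590 ] && return 0
    done
    # Other bits are ---, so everyone else is denied.
    return 1
}
```

For example, `can_access_hadoopvol 1000050000 590` succeeds because of the group, while `can_access_hadoopvol 1000050000 0` fails. The UID 1000050000 here is an arbitrary stand-in for a project-assigned UID.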
Configure Gluster volume access by using supplemental groups, assuming it is not an option to change permissions on the Gluster mount. Supplemental groups in OKD are used for shared storage, such as GlusterFS. In contrast, block storage, such as Ceph RBD or iSCSI, uses the fsGroup SCC strategy and the fsGroup value in the pod’s securityContext.
Use supplemental group IDs instead of user IDs to gain access to persistent storage. Supplemental groups are covered further in the full Volume Security topic.
The group ID on the target Gluster mount example above is 590. Therefore, a pod can define that group ID using supplementalGroups under the pod-level securityContext definition. For example:
spec:
  containers:
    - name:
    ...
  securityContext: # (1)
    supplementalGroups: [590] # (2)
1 | securityContext must be defined at the pod level, not under a specific container. |
2 | An array of GIDs defined at the pod level. |
Assuming there are no custom SCCs that satisfy the pod’s requirements, the pod matches the restricted SCC. This SCC has the supplementalGroups strategy set to RunAsAny, meaning that any supplied group IDs are accepted without range checking.
As a result, the above pod will pass admission and can be launched. However, if group ID range checking is desired, use a custom SCC, as described in pod security and custom SCCs. A custom SCC can be created to define minimum and maximum group IDs, enforce group ID range checking, and allow a group ID of 590.
User IDs can be defined in the container image or in the pod definition. The full Volume Security topic covers controlling storage access based on user IDs, and should be read prior to setting up GlusterFS persistent storage.
Use supplemental group IDs instead of user IDs to gain access to persistent storage.
In the target Gluster mount example above, the container needs a UID set to 592, so the following can be added to the pod definition:
spec:
  containers: # (1)
  - name:
  ...
    securityContext:
      runAsUser: 592 # (2)
1 | Pods contain a securityContext specific to each container and a pod-level securityContext, which applies to all containers defined in the pod. |
2 | The UID defined on the Gluster mount. |
With the default project and the restricted SCC, a pod’s requested user ID of 592 will not be allowed, and the pod will fail. This is because:
The pod requests 592 as its user ID.
All SCCs available to the pod are examined to see which SCC will allow a user ID of 592.
Because all available SCCs use MustRunAsRange for their runAsUser strategy, UID range checking is required.
592 is not included in the SCC or project’s user ID range.
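The range check in the steps above comes down to a simple comparison. The sketch below assumes a project UID range written as a start and a length, in the form used by the openshift.io/sa.scc.uid-range project annotation; the values 1000000000 and 10000 are example numbers, not read from a real cluster, and the function name is hypothetical.

```shell
# Sketch of a MustRunAsRange check: is UID inside [start, start + length)?
uid_in_range() {
    uid=$1
    start=$2
    length=$3
    [ "$uid" -ge "$start" ] && [ "$uid" -lt $((start + length)) ]
}

# 592 falls outside the example range, so the request would be rejected.
if uid_in_range 592 1000000000 10000; then
    echo "UID 592 allowed"
else
    echo "UID 592 rejected"
fi
```

With these example values the script prints "UID 592 rejected". A custom SCC whose range includes 592 would make the same check pass.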
Do not modify the predefined SCCs. Instead, create a custom SCC so that minimum and maximum user IDs are defined, UID range checking is still enforced, and the UID of 592 will be allowed.
See the full Volume Security topic for information on controlling storage access in conjunction with using SELinux.
By default, SELinux does not allow writing from a pod to a remote Gluster server.
To enable writing to GlusterFS volumes with SELinux enforcing on each node, run:
$ sudo setsebool -P virt_sandbox_use_fusefs on
The -P option makes the boolean persistent between reboots.
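To confirm the setting on a node, the current value of the boolean can be read back with getsebool. The helper below is a small sketch with a hypothetical function name; it relies on getsebool printing lines of the form "name --> on" or "name --> off".

```shell
# Succeed only if the virt_sandbox_use_fusefs boolean is currently on.
fusefs_writes_enabled() {
    # The third field of the getsebool output is the state (on or off).
    state=$(getsebool virt_sandbox_use_fusefs 2>/dev/null | awk '{print $3}')
    [ "$state" = "on" ]
}
```

A typical use is `fusefs_writes_enabled || echo "boolean is off on $(hostname)"`, run on each node that schedules pods using Gluster volumes.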