By Annette Clewett and Husnain Bustam

Hopefully by now you've seen that with the release of Red Hat OpenShift Container Platform 3.10 we've rebranded our container-native storage (CNS) offering as Red Hat OpenShift Container Storage (OCS). Versioning remains sequential (i.e., OCS 3.10 is the follow-on to CNS 3.9).

OCS 3.10 introduces important features for container-based storage with OpenShift. Arbiter volume support allows a volume to keep only two full replica copies of the data while still providing split-brain protection, yielding roughly 30% savings in storage infrastructure versus a replica-3 volume. This release also hardens block support for backing OpenShift infrastructure services. In addition to arbiter volumes, major operational improvements give you the ability to monitor provisioned storage consumption, expand persistent volume (PV) capacity without application downtime, and use a more intuitive naming convention for PVs.

For easy evaluation of these features, an OpenShift Container Platform evaluation subscription now includes access to OCS evaluation binaries and subscriptions.

New features

Now let’s dive deeper into the new features of the OCS 3.10 release:

  • Prometheus OCS volume metrics: Volume consumption metrics (e.g., volume capacity, available space, number of inodes in use, number of inodes free) are now available in Prometheus for OCS. These metrics can be used to monitor storage capacity and consumption trends and to take timely action so that applications are not impacted.
  • Heketi topology and configuration metrics: Available from the Heketi HTTP metrics service endpoint, these metrics can be viewed using Prometheus or curl http://<heketi_service_route>/metrics. They can be used to query Heketi health, the number of nodes, the number of devices, device usage, and cluster count.
  • Online expansion of provisioned storage: You can now expand OCS-backed PVs within OpenShift by editing the corresponding claim (oc edit pvc <claim_name>) with the new desired capacity (spec → resources → requests → storage: new value).
  • Custom volume naming: Before this release, the names of dynamically provisioned GlusterFS volumes were auto-generated with a random UUID. Now, by adding a custom volume name prefix, the GlusterFS volume name includes the namespace or project as well as the claim name, making it much easier to map a volume to a particular workload.
  • Arbiter volumes: Arbiter volumes allow for reduced storage consumption and better performance across the cluster while still providing the redundancy and reliability expected of GlusterFS.

Volume and Heketi metrics

As of OCP 3.10 and OCS 3.10, the following metrics are available in Prometheus (the heketi_* metrics can also be retrieved directly with curl http://<heketi_service_route>/metrics):

kubelet_volume_stats_available_bytes: Number of available bytes in the volume
kubelet_volume_stats_capacity_bytes: Capacity in bytes of the volume
kubelet_volume_stats_inodes: Maximum number of inodes in the volume
kubelet_volume_stats_inodes_free: Number of free inodes in the volume
kubelet_volume_stats_inodes_used: Number of used inodes in the volume
kubelet_volume_stats_used_bytes: Number of used bytes in the volume
heketi_cluster_count: Number of clusters
heketi_device_brick_count: Number of bricks on device
heketi_device_count: Number of devices on host
heketi_device_free: Amount of free space available on the device
heketi_device_size: Total size of the device
heketi_device_used: Amount of space used on the device
heketi_nodes_count: Number of nodes on the cluster
heketi_up: Verifies if heketi is running
heketi_volumes_count: Number of volumes on cluster
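
These metrics can be combined in Prometheus to watch consumption trends and alert before a volume fills up. The following PromQL expression is only a sketch; the app-storage namespace is an example, and you would substitute the project that owns the claims you care about:

# Percentage of capacity consumed, per PVC, in the app-storage project
100 * kubelet_volume_stats_used_bytes{namespace="app-storage"}
  / kubelet_volume_stats_capacity_bytes{namespace="app-storage"}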

Populating Heketi metrics in Prometheus requires additional configuration of the Heketi service. You must add the prometheus.io annotations shown in the following commands:

# oc annotate svc heketi-storage prometheus.io/scheme=http
# oc annotate svc heketi-storage prometheus.io/scrape=true
# oc describe svc heketi-storage
Name:           heketi-storage
Namespace:      app-storage
Labels:         glusterfs=heketi-storage-service
                heketi=storage-service
Annotations:    description=Exposes Heketi service
                prometheus.io/scheme=http
                prometheus.io/scrape=true
Selector:       glusterfs=heketi-storage-pod
Type:           ClusterIP
IP:             172.30.90.87
Port:           heketi  8080/TCP
TargetPort:     8080/TCP

Populating Heketi metrics in Prometheus also requires additional configuration of the Prometheus configmap. As shown in the following, you must modify the Prometheus configmap with the namespace of the Heketi service and restart the prometheus-0 pod:

# oc get svc --all-namespaces | grep heketi
app-storage      heketi-storage       ClusterIP 172.30.90.87  <none>  8080/TCP
# oc get cm prometheus -o yaml -n openshift-metrics
....
- job_name: 'kubernetes-service-endpoints'
   ...
   relabel_configs:
     # only scrape infrastructure components
     - source_labels: [__meta_kubernetes_namespace]
       action: keep
       regex: 'default|logging|metrics|kube-.+|openshift|openshift-.+|app-storage'
# oc scale --replicas=0 statefulset.apps/prometheus
# oc scale --replicas=1 statefulset.apps/prometheus
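
Once the prometheus-0 pod is running again, the heketi_* metrics should appear in Prometheus. As a quick sanity check, you can also pull them straight from the endpoint; this assumes a route (or otherwise reachable address) exists for the heketi-storage service referenced earlier:

# curl -s http://<heketi_service_route>/metrics | grep ^heketi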

Online expansion of GlusterFS volumes and custom naming

First, let's discuss what's needed to allow expansion of GlusterFS volumes. This opt-in feature is enabled by setting the parameter allowVolumeExpansion to "true" in the OCS StorageClass and by enabling the ExpandPersistentVolumes feature gate. You can then dynamically resize storage volumes attached to containerized applications without needing to first detach and then attach a storage volume with increased capacity, which enhances application availability and uptime.

Enable the ExpandPersistentVolumes feature gate on all master nodes:

# vim /etc/origin/master/master-config.yaml
kubernetesMasterConfig:
  apiServerArguments:
    feature-gates:
    - ExpandPersistentVolumes=true
# /usr/local/bin/master-restart api
# /usr/local/bin/master-restart controllers

This release also supports custom volume naming. Setting the StorageClass parameter volumenameprefix: myPrefix causes new OCS PVs to be named with the prefix, project name/namespace, claim name, and a UUID (<myPrefix>_<namespace>_<claimname>_UUID), making volumes much easier to identify in the GlusterFS backend.

Predictable volume names also make it easier to automate day-2 admin tasks such as backup and recovery and applying policies based on the volume nomenclature.

The following StorageClass enables both online expansion of OCS/GlusterFS PVs and custom volume naming.

# oc get sc glusterfs-storage -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
parameters:
  resturl: http://heketi-storage-storage.apps.ose-master.example.com
  restuser: admin
  secretName: heketi-storage-admin-secret
  secretNamespace: storage
  volumenameprefix: gf ❶
allowVolumeExpansion: true ❷
provisioner: kubernetes.io/glusterfs
reclaimPolicy: Delete

❶ Custom volume name support: <volumenameprefixstring>_<namespace>_<claimname>_UUID
❷ Parameter needed for online expansion or resize of GlusterFS PVs
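
For reference, a claim against this StorageClass is an ordinary PVC; the claim name, project, and size below are purely illustrative. With the gf prefix above, the resulting GlusterFS volume would be named along the lines of gf_my-project_my-claim_<UUID>:

# cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: glusterfs-storage
# oc create -f pvc.yaml -n my-project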

Be aware that PV expansion is not supported for block volumes, only for file volumes.

Expanding a volume starts with editing the PVC field "requests: storage" (under spec → resources) with the new desired size. For example, to grow a 1 GiB PV to 2 GiB, edit the claim with the new value; the PV is automatically resized, and the new 2 GiB size is reflected in OCP, heketi-cli, and gluster commands. The expansion process creates another replica set, converting the 3-way replicated volume into a distributed-replicated volume (2x3 bricks instead of 1x3).
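
As an illustration, continuing with the hypothetical my-claim PVC above, the resize can be requested with oc edit or a single patch:

# oc patch pvc my-claim -n my-project \
    -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'
# oc get pvc my-claim -n my-project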

GlusterFS arbiter volumes

Arbiter volume support is new to OCS 3.10 and has the following advantages:

  • An arbiter volume is still a 3-way replicated volume for highly available storage.
  • Arbiter bricks do not store file data; they only store file names, structure, and metadata.
  • Arbiter uses client quorum to compare this metadata with the metadata of other nodes to ensure consistency of the volume and prevent split-brain conditions.
  • Using Heketi commands, it is possible to control arbiter brick placement using tagging so that all arbiter bricks are on the same node (see the sketch after this list).
  • With control of arbiter brick placement, the ‘arbiter’ node can have limited storage compared to other nodes in the cluster.
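
The tagging itself is done with heketi-cli against the Heketi service. The commands below are only a sketch: node IDs come from heketi-cli node list, and the arbiter tag values shown (required on the dedicated arbiter node, disabled on the data nodes) should be checked against the OCS documentation for your release.

# export HEKETI_CLI_SERVER=http://<heketi_service_route>
# heketi-cli node list
# heketi-cli node settags <arbiter_node_id> arbiter:required
# heketi-cli node settags <data_node_id> arbiter:disabled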

The following example has two gluster volumes configured across 5 nodes to create two 3-way arbitrated replicated volumes, with the arbiter bricks on a dedicated arbiter node.

In order to use arbiter volumes with OCP workloads, an additional parameter must be added to the GlusterFS StorageClass: volumeoptions: user.heketi.arbiter true. The following StorageClass enables online expansion of GlusterFS PVs, custom volume naming, and arbiter volumes.

# oc get sc glusterfs-storage -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
parameters:
  resturl: http://heketi-storage-storage.apps.ose-master.example.com
  restuser: admin
  secretName: heketi-storage-admin-secret
  secretNamespace: storage
  volumenameprefix: gf ❶
  volumeoptions: user.heketi.arbiter true ❸
allowVolumeExpansion: true ❷
provisioner: kubernetes.io/glusterfs
reclaimPolicy: Delete

❶ Custom volume name support: <volumenameprefixstring>_<namespace>_<claimname>_UUID
❷ Parameter needed for online expansion or resize of GlusterFS volumes
❸ Enable arbiter volume support in the StorageClass. All PVs created from this StorageClass will be 3-way arbitrated replicated volumes.
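
To confirm the layout of a provisioned volume, you can inspect it from one of the GlusterFS pods; the pod and volume names below are placeholders. An arbitrated volume reports its brick count as "1 x (2 + 1)", with the third brick on the arbiter node holding only metadata:

# oc rsh <glusterfs_pod> gluster volume info <volume_name>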

Want to learn more?

For hands-on experience combining OpenShift and OCS, check out our test drive, a free, in-browser lab experience that walks you through using both. Also, check out this short video explaining why using OCS with OpenShift is the right choice for the container storage infrastructure. For details on running OCS 3.10 with OCP 3.10, click here.