Running OpenShift Container Storage 3.10 with Red Hat OpenShift Container Platform 3.10

By Annette Clewett and Jose A. Rivera

With the release of Red Hat OpenShift Container Platform 3.10, we’ve officially rebranded what used to be referred to as Red Hat Container-Native Storage (CNS) as Red Hat OpenShift Container Storage (OCS). Versioning remains sequential (i.e., OCS 3.10 is the follow-on to CNS 3.9). You’ll continue to have the convenience of deploying OCS 3.10 in a single step as part of the normal OpenShift deployment process, and an OpenShift Container Platform (OCP) evaluation subscription now includes access to OCS evaluation binaries and subscriptions.

OCS 3.10 introduces an important feature for container-based storage with OpenShift. Arbiter volume support allows for only two replica copies of the data while still providing split-brain protection and ~30% savings in storage infrastructure versus a replica-3 volume. This release also hardens block support for backing OpenShift infrastructure services. Detailed information on the value and use of OCS 3.10 features can be found here.

OCS 3.10 installation with OCP 3.10 Advanced Installer

Let’s now take a look at the installation of OCS with the OCP Advanced Installer. OCS can provide persistent storage both for OCP’s infrastructure applications (e.g., integrated registry, logging, and metrics) and for general application data consumption. Typically, both options are used in parallel, resulting in two separate OCS clusters being deployed in a single OCP environment. It’s also possible to use a single OCS cluster for both purposes.

The following is an example of a partial inventory file with selected options for deploying one OCS cluster for applications and an additional OCS cluster for infrastructure workloads like registry, logging, and metrics storage. When using these options for your deployment, values with specific sizes (e.g., openshift_hosted_registry_storage_volume_size=10Gi) or node selectors (e.g., node-role.kubernetes.io/infra=true) should be adjusted for your particular deployment needs.

If you’re planning to use gluster-block volumes for logging and metrics, they can now be installed when OCP is installed. (Of course, they can also be installed later.)

[OSEv3:children]
...
nodes
glusterfs
glusterfs_registry

[OSEv3:vars]
...      
# registry
openshift_hosted_registry_storage_kind=glusterfs       
openshift_hosted_registry_storage_volume_size=10Gi   
openshift_hosted_registry_selector="node-role.kubernetes.io/infra=true"

# logging
openshift_logging_install_logging=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=50Gi
openshift_logging_es_cluster_size=3
openshift_logging_es_pvc_storage_class_name='glusterfs-registry-block'
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}

# metrics
openshift_metrics_install_metrics=true
openshift_metrics_storage_kind=dynamic
openshift_metrics_storage_volume_size=20Gi
openshift_metrics_cassandra_pvc_storage_class_name='glusterfs-registry-block'
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}

# Container image to use for glusterfs pods
openshift_storage_glusterfs_image="registry.access.redhat.com/rhgs3/rhgs-server-rhel7:v3.10"

# Container image to use for gluster-block-provisioner pod
openshift_storage_glusterfs_block_image="registry.access.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7:v3.10"

# Container image to use for heketi pods
openshift_storage_glusterfs_heketi_image="registry.access.redhat.com/rhgs3/rhgs-volmanager-rhel7:v3.10"
 
# OCS storage cluster for applications
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=false   

# OCS storage cluster for OpenShift infrastructure
openshift_storage_glusterfs_registry_namespace=infra-storage  
openshift_storage_glusterfs_registry_storageclass=false       
openshift_storage_glusterfs_registry_block_deploy=true   
openshift_storage_glusterfs_registry_block_host_vol_create=true    
openshift_storage_glusterfs_registry_block_host_vol_size=200   
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false

...
[nodes]
ose-app-node01.ocpgluster.com openshift_node_group_name="node-config-compute"
ose-app-node02.ocpgluster.com openshift_node_group_name="node-config-compute"
ose-app-node03.ocpgluster.com openshift_node_group_name="node-config-compute"
ose-app-node04.ocpgluster.com openshift_node_group_name="node-config-compute"
ose-infra-node01.ocpgluster.com openshift_node_group_name="node-config-infra"
ose-infra-node02.ocpgluster.com openshift_node_group_name="node-config-infra"
ose-infra-node03.ocpgluster.com openshift_node_group_name="node-config-infra"

[glusterfs]
ose-app-node01.ocpgluster.com glusterfs_zone=1 glusterfs_devices='[ "/dev/xvdf" ]'   
ose-app-node02.ocpgluster.com glusterfs_zone=2 glusterfs_devices='[ "/dev/xvdf" ]'
ose-app-node03.ocpgluster.com glusterfs_zone=3 glusterfs_devices='[ "/dev/xvdf" ]'
ose-app-node04.ocpgluster.com glusterfs_zone=1 glusterfs_devices='[ "/dev/xvdf" ]'

[glusterfs_registry]
ose-infra-node01.ocpgluster.com glusterfs_zone=1 glusterfs_devices='[ "/dev/xvdf" ]'
ose-infra-node02.ocpgluster.com glusterfs_zone=2 glusterfs_devices='[ "/dev/xvdf" ]'
ose-infra-node03.ocpgluster.com glusterfs_zone=3 glusterfs_devices='[ "/dev/xvdf" ]'

Inventory file options explained

The first section of the inventory file defines the host groups the installation will be using. We’ve defined two new groups: (1) glusterfs and (2) glusterfs_registry. The settings for these groups start with openshift_storage_glusterfs_ and openshift_storage_glusterfs_registry_, respectively. In each group, the nodes that will make up the OCS cluster are listed, and the devices ready for exclusive use by OCS are specified (glusterfs_devices=).

The first group of hosts in glusterfs specifies a cluster for general-purpose application storage and will, by default, come with the StorageClass glusterfs-storage to enable dynamic provisioning. For high availability of storage, it’s very important to have four nodes for the general-purpose application cluster, glusterfs.
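As a quick sanity check of the node counts discussed above, a small shell helper can count the hosts listed under one inventory group before the installer is run. This is only a minimal sketch and not part of openshift-ansible: the function name count_group_hosts is hypothetical, and it assumes one host per line, with blank lines and lines starting with # ignored.

```shell
# Hypothetical helper: count host lines under one [group] of an INI-style inventory.
# Assumes one host per line; blank lines and comment lines are skipped.
count_group_hosts() {
    # $1: path to the inventory file, $2: group name without brackets
    awk -v group="[$2]" '
        /^\[/ { in_group = ($1 == group); next }   # a new section header starts
        in_group && NF && $1 !~ /^#/ { n++ }       # non-empty, non-comment host line
        END { print n + 0 }
    ' "$1"
}

# Example (the inventory path is a placeholder):
# count_group_hosts <path_to_inventory_file> glusterfs
```

A result below four for glusterfs would be worth a second look, given the high-availability recommendation above.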

The second group, glusterfs_registry, specifies a cluster that will host a single, statically deployed PersistentVolume for use exclusively by a hosted registry that can scale. This cluster will not offer a StorageClass for file-based PersistentVolumes with the options and values as they are currently configured (openshift_storage_glusterfs_registry_storageclass=false). This cluster will also support gluster-block (openshift_storage_glusterfs_registry_block_deploy=true). PersistentVolume creation can be done via StorageClass glusterfs-registry-block (openshift_storage_glusterfs_registry_block_storageclass=true). Special attention should be given to choosing the size for openshift_storage_glusterfs_registry_block_host_vol_size. This is the hosting volume for gluster-block devices that will be created for logging and metrics. Make sure that the size can accommodate all these block volumes and that you have sufficient storage if another hosting volume must be created.
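The sizing advice above can be checked with simple arithmetic. The following sketch adds up the gluster-block claims that the example inventory would request and compares the total with the hosting volume size. It assumes one PVC per Elasticsearch node and a single Cassandra PVC, and it ignores any overhead on the hosting volume itself; the function names are hypothetical.

```shell
# Sum of gluster-block claims requested by logging and metrics, in GiB.
# Assumption: one PVC per Elasticsearch node and a single Cassandra PVC.
block_space_needed() {
    # $1: Elasticsearch cluster size, $2: Elasticsearch PVC size, $3: metrics PVC size
    echo $(( $1 * $2 + $3 ))
}

# Succeeds when the requested space fits in the hosting volume.
fits_in_host_vol() {
    # $1: space needed, $2: value of ..._block_host_vol_size
    [ "$1" -le "$2" ]
}

needed=$(block_space_needed 3 50 20)   # values taken from the inventory above
if fits_in_host_vol "$needed" 200; then
    echo "OK: ${needed} GiB requested, 200 GiB hosting volume"
else
    echo "Too small: ${needed} GiB requested" >&2
fi
```

With the example values, 3 × 50 + 20 = 170, which fits in the 200 configured for the hosting volume with some room to spare.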

If you want to tune the installation, more options are available in the Advanced Installation. To automate the generation of the inventory file options shown previously, check out the newly available red-hat-storage tool called “CNS Inventory file Creator,” or CIC. The CIC tool creates CNS inventory file options for OCP 3.9 and OCS inventory file options for OCP 3.10. CIC asks a series of questions about the OpenShift hosts, the storage devices, and the sizes of PersistentVolumes for registry, logging, and metrics, and it has baked-in checks to help make sure the OCP installation will be successful. The tool is currently in an alpha state, and we’re looking for feedback. Download it from the GitHub repository openshift-cic.

Single OCS cluster installation

Again, it is possible to support both general-application storage and infrastructure storage in a single OCS cluster. To do this, the inventory file options change slightly for logging and metrics, because with only one cluster the gluster-block StorageClass is glusterfs-storage-block. The registry PV will be created on this single cluster if the second cluster, [glusterfs_registry], does not exist. For high availability, it’s very important to have four nodes for this cluster. Also, special attention should be given to choosing the size for openshift_storage_glusterfs_block_host_vol_size. This is the hosting volume for gluster-block devices that will be created for logging and metrics. Make sure that the size can accommodate all these block volumes and that you have sufficient storage if another hosting volume must be created.

[OSEv3:children]
...
nodes
glusterfs

[OSEv3:vars]
...      
# registry
...

# logging
openshift_logging_install_logging=true
...
openshift_logging_es_pvc_storage_class_name='glusterfs-storage-block'
... 

# metrics
openshift_metrics_install_metrics=true
...
openshift_metrics_cassandra_pvc_storage_class_name='glusterfs-storage-block'

...

# OCS storage cluster for applications
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_create=true
openshift_storage_glusterfs_block_host_vol_size=100
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
...

[nodes]

ose-app-node01.ocpgluster.com openshift_node_group_name="node-config-compute"   
ose-app-node02.ocpgluster.com openshift_node_group_name="node-config-compute" 
ose-app-node03.ocpgluster.com openshift_node_group_name="node-config-compute" 
ose-app-node04.ocpgluster.com openshift_node_group_name="node-config-compute" 

[glusterfs]
ose-app-node01.ocpgluster.com glusterfs_zone=1 glusterfs_devices='[ "/dev/xvdf" ]'   
ose-app-node02.ocpgluster.com glusterfs_zone=2 glusterfs_devices='[ "/dev/xvdf" ]'
ose-app-node03.ocpgluster.com glusterfs_zone=3 glusterfs_devices='[ "/dev/xvdf" ]'
ose-app-node04.ocpgluster.com glusterfs_zone=1 glusterfs_devices='[ "/dev/xvdf" ]'

OCS 3.10 uninstall

With the OCS 3.10 release, the uninstall.yml playbook can be used to remove all gluster and heketi resources. This might come in handy when there are errors in inventory file options that cause the gluster cluster to deploy incorrectly.

If you’re removing an OCS installation that is currently being used by any applications, you should remove those applications before removing OCS, because they will lose access to storage. This includes infrastructure applications like registry, logging, and metrics that have PV claims created using the glusterfs-storage and glusterfs-storage-block StorageClass resources.
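To find the claims that still depend on those StorageClass resources, the output of oc get pvc can be filtered by class name. The following is a sketch under stated assumptions: the helper name claims_using_classes is hypothetical, and it expects three whitespace-separated columns on standard input (namespace, claim name, StorageClass name), such as the custom-columns output shown in the usage comment.

```shell
# Hypothetical helper: print "<namespace>/<claim>" for every claim whose
# StorageClass is in the given list. Input columns: namespace, name, class.
claims_using_classes() {
    # $1: space-separated list of StorageClass names
    awk -v classes="$1" '
        BEGIN {
            n = split(classes, list, " ")
            for (i = 1; i <= n; i++) wanted[list[i]] = 1
        }
        ($3 in wanted) { print $1 "/" $2 }
    '
}

# Usage against a live cluster:
# oc get pvc --all-namespaces --no-headers \
#   -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,SC:.spec.storageClassName \
#   | claims_using_classes "glusterfs-storage glusterfs-storage-block"
```

If you deployed the two-cluster layout shown earlier, add glusterfs-registry-block to the list as well.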

You can remove logging and metrics resources by re-running the deployment playbooks like this:

ansible-playbook -i <path_to_inventory_file> \
  -e "openshift_logging_install_logging=false" \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml

ansible-playbook -i <path_to_inventory_file> \
  -e "openshift_metrics_install_metrics=false" \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-metrics/config.yml

Make sure to manually remove any logging or metrics PersistentVolumeClaims. The associated PersistentVolumes will be deleted automatically.

If the registry is using a glusterfs PersistentVolume, remove it with the following commands:

oc delete deploymentconfig docker-registry
oc delete pvc registry-claim
oc delete pv registry-volume
oc delete service glusterfs-registry-endpoints

If you’re running the uninstall.yml playbook because a deployment failed, run it with the following variables to wipe the storage devices for both glusterfs and glusterfs_registry before attempting the OCS installation again.

ansible-playbook -i <path_to_inventory_file> \
  -e "openshift_storage_glusterfs_wipe=true" \
  -e "openshift_storage_glusterfs_registry_wipe=true" \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/uninstall.yml

OCS 3.10 post installation for applications, registry, logging and metrics

You can add OCS clusters and resources to an existing OCP install using the following command. This same process can be used if OCS has been uninstalled due to errors.

ansible-playbook -i <path_to_inventory_file> \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/config.yml

After the new cluster(s) is created and validated, you can deploy the registry using a newly created glusterfs ReadWriteMany volume. Run this playbook to create the registry resources:

ansible-playbook -i <path_to_inventory_file> \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/config.yml

You can now deploy logging and metrics resources by re-running these deployment playbooks:

ansible-playbook -i <path_to_inventory_file> \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml

ansible-playbook -i <path_to_inventory_file> \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-metrics/config.yml

Want to learn more?

For hands-on experience combining OpenShift and OCS, check out our test drive, a free, in-browser lab experience that walks you through using both. Also, watch this short video explaining why to use OCS with OCP. Detailed information on the value and use of OCS 3.10 features can be found here.

Improved volume management for Red Hat OpenShift Container Storage 3.10

By Annette Clewett and Husnain Bustam

Hopefully by now you’ve seen that with the release of Red Hat OpenShift Container Platform 3.10 we’ve rebranded our container-native storage (CNS) offering as Red Hat OpenShift Container Storage (OCS). Versioning remains sequential (i.e., OCS 3.10 is the follow-on to CNS 3.9).

OCS 3.10 introduces important features for container-based storage with OpenShift. Arbiter volume support allows for only two replica copies of the data while still providing split-brain protection and ~30% savings in storage infrastructure versus a replica-3 volume. This release also hardens block support for backing OpenShift infrastructure services. In addition to arbiter volumes, major operational improvements give you the ability to monitor provisioned storage consumption, expand persistent volume (PV) capacity without downtime to the application, and use a more intuitive naming convention for PVs.

For easy evaluation of these features, an OpenShift Container Platform evaluation subscription now includes access to OCS evaluation binaries and subscriptions.

New features

Now let’s dive deeper into the new features of the OCS 3.10 release:

  • Prometheus OCS volume metrics: Volume consumption metrics (e.g., volume capacity, available space, number of inodes in use, number of inodes free) are now available in Prometheus for OCS. Use these metrics to monitor storage capacity and consumption trends and to take timely action so that applications are not impacted.
  • Heketi topology and configuration metrics: Available from the Heketi HTTP metrics service endpoint, these metrics can be viewed using Prometheus or curl http://<heketi_service_route>/metrics. These metrics can be used to query heketi health, number of nodes, number of devices, device usage, and cluster count.
  • Online expansion of provisioned storage: You can now expand OCS-backed PVs within OpenShift by editing the corresponding claim (oc edit pvc <claim_name>) with the new desired capacity (spec → resources → requests → storage: new value).
  • Custom volume naming: Before this release, the names of dynamically provisioned GlusterFS volumes were auto-generated with a random UUID. Now, by adding a custom volume name prefix, the GlusterFS volume name will include the namespace or project as well as the claim name, thereby making it much easier to map to a particular workload.
  • Arbiter volumes: Arbiter volumes allow for reduced storage consumption and better performance across the cluster while still providing the redundancy and reliability expected of GlusterFS.
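As an illustration of how the Heketi metrics mentioned above might be used from the command line, the following sketch computes per-device utilization from heketi_device_used and heketi_device_size in the text output of the metrics endpoint. The helper name device_usage_percent is hypothetical, and the sketch treats the whole label set of each sample as an opaque key, so it makes no assumption about which labels Heketi attaches.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and print
# "<labels> <percent used>" for each device reported by Heketi.
device_usage_percent() {
    awk '
        /^heketi_device_(size|used)[{]/ {
            lb = index($0, "{"); rb = index($0, "}")
            metric = substr($0, 1, lb - 1)
            labels = substr($0, lb, rb - lb + 1)   # label set, used as the key
            if (metric == "heketi_device_size") size[labels] = $NF + 0
            else                                 used[labels] = $NF + 0
        }
        END {
            for (l in size)
                if (size[l] > 0 && (l in used))
                    printf "%s %.1f\n", l, 100 * used[l] / size[l]
        }
    ' | sort
}

# Usage against a live cluster:
# curl -s http://<heketi_service_route>/metrics | device_usage_percent
```

Both metrics are reported in the same unit, so the ratio is unit-free.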

Volume and Heketi metrics

As of OCP 3.10 and OCS 3.10, the following metrics are available in Prometheus (the heketi_* metrics can also be retrieved by executing curl http://<heketi_service_route>/metrics):

kubelet_volume_stats_available_bytes: Number of available bytes in the volume
kubelet_volume_stats_capacity_bytes: Capacity in bytes of the volume
kubelet_volume_stats_inodes: Maximum number of inodes in the volume
kubelet_volume_stats_inodes_free: Number of free inodes in the volume
kubelet_volume_stats_inodes_used: Number of used inodes in the volume
kubelet_volume_stats_used_bytes: Number of used bytes in the volume
heketi_cluster_count: Number of clusters
heketi_device_brick_count: Number of bricks on device
heketi_device_count: Number of devices on host
heketi_device_free: Amount of free space available on the device
heketi_device_size: Total size of the device
heketi_device_used: Amount of space used on the device
heketi_nodes_count: Number of nodes on the cluster
heketi_up: Verifies if heketi is running
heketi_volumes_count: Number of volumes on cluster

Populating Heketi metrics in Prometheus requires additional configuration of the Heketi service. You must add the prometheus.io/scheme and prometheus.io/scrape annotations using the following commands:

# oc annotate svc heketi-storage prometheus.io/scheme=http
# oc annotate svc heketi-storage prometheus.io/scrape=true
# oc describe svc heketi-storage
Name:           heketi-storage
Namespace:      app-storage
Labels:         glusterfs=heketi-storage-service
                heketi=storage-service
Annotations:    description=Exposes Heketi service
                prometheus.io/scheme=http
                prometheus.io/scrape=true
Selector:       glusterfs=heketi-storage-pod
Type:           ClusterIP
IP:             172.30.90.87
Port:           heketi  8080/TCP
TargetPort:     8080/TCP

Populating Heketi metrics in Prometheus also requires additional configuration of the Prometheus configmap. As shown in the following, you must add the namespace of the Heketi service to the Prometheus configmap and then restart the prometheus-0 pod:

# oc get svc --all-namespaces | grep heketi
app-storage      heketi-storage       ClusterIP 172.30.90.87  <none>  8080/TCP
# oc get cm prometheus -o yaml -n openshift-metrics
....
- job_name: 'kubernetes-service-endpoints'
   ...
   relabel_configs:
     # only scrape infrastructure components
     - source_labels: [__meta_kubernetes_namespace]
       action: keep
       regex: 'default|logging|metrics|kube-.+|openshift|openshift-.+|app-storage'
# oc scale --replicas=0 statefulset.apps/prometheus
# oc scale --replicas=1 statefulset.apps/prometheus
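Because Prometheus anchors relabel regexes at both ends, the namespace in the keep rule must match the Heketi namespace exactly. A minimal sketch for checking a namespace against the regex shown above before restarting Prometheus follows; the function name namespace_is_scraped is hypothetical, and grep -x stands in for the full-match behavior.

```shell
# The keep regex from the relabel_configs section above.
KEEP_REGEX='default|logging|metrics|kube-.+|openshift|openshift-.+|app-storage'

# Succeeds when the namespace would be kept. grep -x requires the whole line
# to match, mirroring the anchoring that Prometheus applies to relabel regexes.
namespace_is_scraped() {
    printf '%s\n' "$1" | grep -Eqx "$KEEP_REGEX"
}

for ns in app-storage appstorage kube-system; do
    if namespace_is_scraped "$ns"; then
        echo "$ns: scraped"
    else
        echo "$ns: dropped"
    fi
done
```

A near miss such as appstorage is dropped, which is an easy mistake to make when the namespace is typed by hand.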

Online expansion of GlusterFS volumes and custom naming

First, let’s discuss what’s needed to allow expansion of GlusterFS volumes. This opt-in feature is enabled by configuring the StorageClass for OCS with the parameter allowVolumeExpansion set to “true” and by enabling the feature gate ExpandPersistentVolumes. You can then dynamically resize storage volumes attached to containerized applications without needing to first detach and then attach a storage volume with increased capacity, which enhances application availability and uptime.

Enable the ExpandPersistentVolumes feature gate on all master nodes:

# vim /etc/origin/master/master-config.yaml
kubernetesMasterConfig:
  apiServerArguments:
    feature-gates:
    - ExpandPersistentVolumes=true
# /usr/local/bin/master-restart api
# /usr/local/bin/master-restart controllers

This release also supports a custom volume name prefix. Setting the StorageClass parameter volumenameprefix (for example, volumenameprefix: myPrefix) allows easier identification of volumes in the GlusterFS backend.

New OCS PVs are then created with a name built from the volume name prefix, project name/namespace, claim name, and UUID (<myPrefix>_<namespace>_<claimname>_UUID), which makes it easier to automate day-2 admin tasks such as backup and recovery or applying policies based on a predetermined volume naming scheme.
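For day-2 automation, the naming scheme can be parsed back into its parts. The sketch below recovers the namespace and claim name from a volume name of the form <myPrefix>_<namespace>_<claimname>_UUID. The helper name parse_volume_name is hypothetical. It relies on the fact that Kubernetes namespace and claim names cannot contain underscores, and it peels fields off from the right so that the prefix itself may contain them.

```shell
# Hypothetical helper: print "<namespace> <claimname>" for a GlusterFS volume
# named <prefix>_<namespace>_<claimname>_<UUID>. Runs in a subshell so the
# working variables do not leak into the caller.
parse_volume_name() (
    name=${1%_*}          # drop the trailing _<UUID>
    claim=${name##*_}     # last remaining field is the claim name
    name=${name%_*}       # drop _<claimname>
    namespace=${name##*_} # last remaining field is the namespace
    printf '%s %s\n' "$namespace" "$claim"
)

# The UUID below is made up for illustration.
parse_volume_name gf_myproject_mysql-data_6f1a2b3c-0000-4000-8000-123456789abc
```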

In this StorageClass, support for both online expansion of OCS/GlusterFS PVs and custom volume naming has been added.

# oc get sc glusterfs-storage -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
parameters:
  resturl: http://heketi-storage-storage.apps.ose-master.example.com
  restuser: admin
  secretName: heketi-storage-admin-secret
  secretNamespace: storage
  volumenameprefix: gf ❶
allowVolumeExpansion: true ❷
provisioner: kubernetes.io/glusterfs
reclaimPolicy: Delete

❶ Custom volume name support: <volumenameprefixstring>_<namespace>_<claimname>_UUID
❷ Parameter needed for online expansion or resize of GlusterFS PVs

Be aware that PV expansion is not supported for block volumes, only for file volumes.
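Before attempting a resize, it can be useful to confirm that a StorageClass meets both conditions: it is the file-based GlusterFS class and it has allowVolumeExpansion set. The following is a rough sketch that inspects StorageClass YAML on standard input with grep. The helper name sc_supports_expansion is hypothetical, and the check assumes the top-level keys appear unindented, as in the listing above.

```shell
# Hypothetical helper: read StorageClass YAML on stdin and succeed only when it
# is a file-based GlusterFS class with allowVolumeExpansion: true.
sc_supports_expansion() (
    yaml=$(cat)
    printf '%s\n' "$yaml" | grep -Eq '^allowVolumeExpansion:[[:space:]]*true[[:space:]]*$' &&
    printf '%s\n' "$yaml" | grep -Eq '^provisioner:[[:space:]]*kubernetes\.io/glusterfs[[:space:]]*$'
)

# Usage against a live cluster:
# oc get sc glusterfs-storage -o yaml | sc_supports_expansion && echo "expansion enabled"
```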

Expanding a volume starts with editing the PVC and setting the “requests: storage” field (under spec → resources) to the new size. For example, to expand a 1GiB PV to 2GiB, change the value to 2Gi. The PV is then resized automatically, and the new 2GiB size is reflected in OCP, heketi-cli, and gluster commands. The expansion process creates another replica set and converts the 3-way replicated volume into a distributed-replicated volume with 2×3 instead of 1×3 bricks.
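The same change can be scripted instead of made interactively. As a minimal sketch, the helper below builds the patch document for a new size, which can then be passed to oc patch. The function name pvc_resize_patch is hypothetical, and the claim name in the usage comment is a placeholder.

```shell
# Hypothetical helper: build a patch that sets spec.resources.requests.storage.
pvc_resize_patch() {
    # $1: new size, for example 2Gi
    printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}\n' "$1"
}

# Usage against a live cluster:
# oc patch pvc <claim_name> -p "$(pvc_resize_patch 2Gi)"
pvc_resize_patch 2Gi
```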

GlusterFS arbiter volumes

Arbiter volume support is new to OCS 3.10 and has the following advantages:

  • An arbiter volume is still a 3-way replicated volume for highly available storage.
  • Arbiter bricks do not store file data; they only store file names, structure, and metadata.
  • Arbiter uses client quorum to compare this metadata with metadata of other nodes to ensure consistency of the volume and prevent split brain conditions.
  • Using Heketi commands, it is possible to control arbiter brick placement using tagging so that all arbiter bricks are on the same node.
  • With control of arbiter brick placement, the ‘arbiter’ node can have limited storage compared to other nodes in the cluster.
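The storage savings follow from the brick layout described above. As a back-of-the-envelope sketch, the raw capacity needed for file data is the usable capacity times the number of full data copies: three for replica-3 and two with an arbiter. The function name is hypothetical, and the calculation deliberately ignores the space taken by the metadata-only arbiter bricks, so it gives an upper bound on the savings.

```shell
# Raw capacity needed for file data, ignoring the metadata-only arbiter bricks.
raw_capacity() {
    # $1: usable capacity (GiB), $2: number of full data copies
    echo $(( $1 * $2 ))
}

replica3=$(raw_capacity 100 3)   # three full copies
arbiter=$(raw_capacity 100 2)    # two full copies plus an arbiter brick
echo "savings: $(( (replica3 - arbiter) * 100 / replica3 ))%"
```

The upper bound of about one third is consistent with the ~30% figure quoted for this release once the arbiter bricks are accounted for.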

The following example has two gluster volumes configured across 5 nodes to create two 3-way arbitrated replicated volumes, with the arbiter bricks on a dedicated arbiter node.

To use arbiter volumes with OCP workloads, an additional parameter must be added to the GlusterFS StorageClass: volumeoptions: user.heketi.arbiter true. In this StorageClass, support for online expansion of GlusterFS PVs, custom volume naming, and arbiter volumes has been added.

# oc get sc glusterfs-storage -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
parameters:
  resturl: http://heketi-storage-storage.apps.ose-master.example.com
  restuser: admin
  secretName: heketi-storage-admin-secret
  secretNamespace: storage
  volumenameprefix: gf ❶
  volumeoptions: user.heketi.arbiter true ❸
allowVolumeExpansion: true ❷
provisioner: kubernetes.io/glusterfs
reclaimPolicy: Delete

❶ Custom volume name support: <volumenameprefixstring>_<namespace>_<claimname>_UUID
❷ Parameter needed for online expansion or resize of GlusterFS volumes
❸ Enable arbiter volume support in the StorageClass. All the PVs created from this StorageClass will be 3-way arbitrated replicated volume.

Want to learn more?

For hands-on experience combining OpenShift and OCS, check out our test drive, a free, in-browser lab experience that walks you through using both. Also, check out this short video explaining why using OCS with OpenShift is the right choice for the container storage infrastructure. For details on running OCS 3.10 with OCP 3.10, click here.

Breaking down data silos with Red Hat infrastructure

By Brent Compton, Senior Director, Technical Marketing, Red Hat Cloud Storage and HCI

Breaking down barriers to innovation.
Breaking down data silos.

These are arguably two of the top items on many enterprises’ wish lists. In the world of analytics infrastructure, people have described a solution to these needs as “multi-tenant workload isolation with shared storage.” Several public-cloud-based analytics solutions exist to provide this. However, many large Red Hat customers are doing large-scale analytics in their own data centers and were unable to solve these problems with their on-premises analytic infrastructure solutions. They turned to Red Hat private cloud platforms as their analytics infrastructure and achieved just this: multi-tenant workload isolation with shared storage. To be clear, Red Hat is not providing these customers with analytics tools. Instead, it is welcoming these analytics tools onto the same Red Hat infrastructure platforms running much of the rest of their other enterprise workloads.

Traditional on-premises analytics infrastructures do not provide on-demand provisioning for short-running analytics workloads, frequently needed by data scientists. In addition, traditional HDFS-based infrastructures do not share storage between analytics clusters. As such, traditional analytics infrastructures often don’t meet the competing needs of multiple teams needing different types of clusters, all with access to common data sets. Individual teams can end up competing for the same set of cluster resources, causing congestion in busy analytics clusters, leading to frustration and delays in getting insights from their data.

As a result, a team may demand their own separate analytics cluster so their jobs aren’t competing for resources with other teams, and so they can tailor their cluster to their own workload needs. Without a shared storage repository, this can lead to multiple analytic cluster silos, each with its own copy of data. Net result? Cost duplication and the burden of maintaining and tracking multiple data set copies.

An answer to these challenges? Bring your analytics workloads onto a common, scalable infrastructure.

Red Hat has seen customers solve these challenges by breaking down traditional Hadoop silos and bringing analytics workloads onto a common, private cloud infrastructure running in today’s enterprise datacenters. At its core is Red Hat Ceph Storage, our massively scalable, software-defined object storage platform, which enables organizations to more easily share large-scale data sets between analytics clusters. The on-demand provisioning of virtualized analytics clusters is enabled through Red Hat OpenStack Platform. Additionally, early adopters are deploying Apache Spark in Kubernetes-orchestrated, container-based clusters via Red Hat OpenShift Container Platform. Delivery and support are provided by the IT experts at Red Hat Consulting based on documented leading practices to help establish an optimal architecture for our clients’ unique requirements.

Key benefits to customers

Agility

  • Get answers faster. By enabling teams to elastically provision their own dedicated analytics compute resources via Red Hat OpenStack Platform, teams have avoided cluster resource competition in order to better meet service-level agreements (SLAs). And teams can spin up these new analytics clusters without lengthy data-hydration delays (made possible by accessing shared data sets on Red Hat Ceph Storage).
  • Remove roadblocks. Empower teams of data scientists to use the analytics tools/versions they need through dynamically provisioned data labs and workload clusters (while still accessing shared data sets).
  • Hybrid cloud versatility. Enable your query authors to use the same S3 syntax in their queries, whether running on a private cloud or public cloud. Spark and other popular analytics tools can use the Hadoop S3A client to access data in S3-compatible object storage, in place of native HDFS. Ceph is the most popular S3-compatible open-source object storage backend for OpenStack.

Cost/risk reduction

  • Cut costs associated with data set duplication. In traditional Hadoop/Spark HDFS clusters, data is not shared. If a data scientist wants to analyze data sets that exist in two different clusters, they may need to copy data sets from one cluster to the other. This can result in duplicate costs for multi-PB data sets that must be copied among many analytics clusters.
  • Reduce risks of maintaining duplicate data sets. Maintaining duplicate data sets is not only time-consuming and prone to error; it can also result in incomplete or inaccurate insights being derived from stale data.
  • Scale costs based on requirements. In traditional Hadoop/Spark HDFS clusters, capacity is added by procuring more HDFS nodes with a fixed ratio of CPU and storage capacity. With Red Hat data analytics infrastructure, customers can provision compute servers separately from a common storage pool and thus can scale each resource according to need. By freeing storage capacity from compute cores previously locked together, companies can scale storage capacity costs independently of compute costs according to need.

Innovation for today’s data needs

As data continues to grow, organizations should have a supporting infrastructure that can break down data silos and enable teams to access and use information in more agile ways. Red Hat platforms can foster greater agility, efficiency, and savings–a nice combination for today’s data-driven organizations looking to build analytics applications across the open hybrid cloud.

You can also find our blog post that covers other news from the Strata conference and upstream community projects here. For more details on empirical test results, see here. For a video whiteboard of these topics, see here. Finally, to learn more, visit www.redhat.com/bigdata.

Introducing Red Hat Gluster Storage 3.4: Feature overview

By Anand Paladugu, Principal Product Manager

We’re pleased to announce that Red Hat Gluster Storage 3.4 is now Generally Available!

Since this release is a full rebase with the upstream, it consolidates many bug fixes, thus giving you a greater degree of overall stability for both container storage and traditional file serving use cases. Given that Red Hat OpenShift Container Storage is based on Red Hat Gluster Storage, these fixes will also be embedded in the 3.10 release of OpenShift Container Storage. To enable you to refresh your Red Hat Enterprise Linux (RHEL) 6-based Red Hat Gluster Storage installations, this release supports upgrading your Red Hat Gluster Storage servers from RHEL 6 to RHEL 7. Last, you can now deploy Red Hat Gluster Storage Web Administrator with minimal resources, which also offers robust and feature-rich monitoring capabilities.

Here is an overview of the new features delivered in Red Hat Gluster Storage 3.4:

Support for upgrading Red Hat Gluster Storage from RHEL 6 to RHEL 7

Many customers like to ensure they’re on the latest and greatest RHEL in their infrastructures. Two scenarios are now supported for upgrading RHEL servers in a Red Hat Gluster Storage deployment from RHEL 6 to RHEL 7:

  1. Red Hat Gluster Storage version is <= 3.3.x and the underlying RHEL version is <= latest version of 6.x. The upgrade process updates Red Hat Gluster Storage to version 3.4 and the underlying RHEL version to the latest version of RHEL 7.
  2. Red Hat Gluster Storage version is 3.4 and the underlying RHEL version is the latest version of 6.x. The upgrade process keeps the Red Hat Gluster Storage version at 3.4 and upgrades the underlying RHEL version to the latest version of RHEL 7.

macOS client support

Mac workstations continue to make inroads into corporate infrastructures. Red Hat Gluster Storage 3.4 supports macOS as a Server Message Block (SMB) client and thereby allows customers to map SMB shares backed by Red Hat Gluster Storage in the macOS Finder.

Punch hole support for third-party applications

The “punch hole” feature frees up physical disk space when portions of a file are de-referenced. For example, suppose you’ve used 20 GB of disk space to back up a file, and some portions of the file are later de-referenced due to data deduplication. Without punch hole support, the full 20 GB remains occupied on the underlying physical disk. With punch hole support, however, third-party applications can “punch a hole” corresponding to the de-referenced portions of the file, thereby freeing up physical disk space. This helps reduce the storage costs associated with backing up and archiving virtual machines (VMs).

Subdirectory exports using the Gluster Fuse protocol now fully supported

Beginning with Red Hat Gluster Storage 3.4, subdirectory export using Fuse is fully supported. This feature provides namespace isolation: a single Gluster volume can be shared with many clients, each mounting only a subset of the volume namespace (i.e., a subdirectory). You can also export a subdirectory of an already exported volume to use the space left in the volume for a different project.
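On the client side, a subdirectory mount differs from a whole-volume mount only in the mount source. The sketch below prints the command rather than running it, assuming the host:/volume/subdirectory form of the source. The function name, host name, volume name, and paths are all hypothetical, and any server-side access settings for the subdirectory are out of scope here.

```shell
# Hypothetical helper: print the FUSE mount command for a subdirectory of a
# Gluster volume. Nothing is mounted here; review the output, then run it as root.
subdir_mount_cmd() {
    # $1: server, $2: volume name, $3: subdirectory (no leading slash), $4: mount point
    printf 'mount -t glusterfs %s:/%s/%s %s\n' "$1" "$2" "$3" "$4"
}

subdir_mount_cmd gluster1.example.com projects team-a /mnt/team-a
```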

Red Hat Gluster Storage web admin enhancements

The Web Administration tool delivers browser-based graphing, trending, monitoring, and alerting for Red Hat Gluster Storage in the enterprise. This latest Red Hat Gluster Storage release optimizes this web admin tool to consume fewer resources and allow greater scaling to monitor larger clusters than in the past.

Faster directory lookups using the Gluster NFS-Ganesha server

In Red Hat Gluster Storage 3.4, the Readdirp API is extended and enhanced to return handles along with directory stats as part of its reply, thereby reducing NFS operations latency.

In internal testing, performance gains were observed for all directory operations when compared to Red Hat Gluster Storage 3.3.1. For example, make directory operations improved by up to 31%, file create operations improved by up to 42%, and file read operations improved by up to 150%.

Want to learn more?

For hands-on experience with Red Hat Gluster Storage, check out our test drive.