GBase 8a MPP Cluster Multi-Instance Best Practices

1 Overview

1.1 Overview

When GBase 8a MPP Cluster is deployed on high-configuration servers or NUMA (Non-Uniform Memory Access) architecture servers with only one database instance per server, hardware resources are often underutilized. For example:

When the server memory exceeds 300GB, a single database instance struggles to utilize all the memory effectively.
When the server CPU has more than 40 logical cores, the performance of a single database instance does not scale linearly with the number of cores.
When using NUMA architecture servers with multiple NUMA nodes, frequent cross-node memory access by a single database instance leads to suboptimal performance.
A single database instance cannot fully leverage the capabilities of new hardware such as SSDs and NVMe drives.

GBase 8a MPP Cluster V9.5.3 officially supports multi-instance deployment. Deploying multiple database instances on a single server addresses these resource utilization issues and improves cluster performance. In actual tests, multi-instance deployment on NUMA architecture and high-configuration servers improved cluster performance significantly, by more than 100% compared with single-instance deployment.

This document introduces the installation process, configuration recommendations, and management methods for GBase 8a MPP Cluster V9.5.3 in multi-instance deployment scenarios.

1.2 Terminology

The terms used in this document are explained as follows:

Multi-instance: Deploying multiple data cluster nodes on a single physical server, also known as multi-instance deployment. Each data cluster node on the server is referred to as a database instance.

NUMA: Non-Uniform Memory Access.

gcware node: Management node of the cluster, used for sharing cluster state information among gcluster nodes.

gcluster node: Scheduling node of the cluster, responsible for SQL parsing, SQL optimization, distributed execution plan generation, and execution scheduling.

data node: Node of the data cluster, also known as a gnode. The data cluster consists of data nodes, which serve as the storage and computation units for data.

2 Multi-Instance Installation Deployment

2.1 Deployment Plan

In a multi-instance deployment, GBase 8a MPP Cluster installs multiple data nodes on each server. Each data node must be configured with a unique IP address, and nodes are distinguished by these IP addresses. At most one gcluster node and one gcware node can be installed per physical server.

Before deploying the cluster, the following tasks need to be planned and completed:

1) Evaluate and determine the number of instances to be deployed on each server:

Based on the number of NUMA nodes, memory size, cluster size, and business scenarios (load), evaluate the number of database instances to be deployed on each server. It is generally recommended to deploy no more than 4 database instances on a physical server, with each instance having at least 64GB of available memory.

2) IP address resource planning:

Apply for an IP address for each database instance for internal communication within the cluster. It is recommended to configure multiple network cards on the physical server and bind them in load-balancing mode.

3) Disk planning:

Different database instances should use different disk groups for disk I/O isolation. For example, configure a RAID5 disk group for each database instance.

4) Determine cluster architecture:

It is recommended to have an odd number of gcware nodes and gcluster nodes, with a maximum of one gcware node and one gcluster node per physical server. It is suggested to deploy gcware and gcluster nodes on a single NUMA node, separate from data nodes.

5) Ensure server and OS environment meet GBase 8a cluster installation requirements:

Refer to the GBase 8a MPP Cluster product manual.

2.2 Cluster Installation

Example server IPs for installation:

Server 1:

IP1: 192.168.146.20
IP2: 192.168.146.40

Server 2:

IP3: 192.168.146.21
IP4: 192.168.146.41

Current Cluster Version Limitations

The installation package only checks for RedHat, SUSE, and CentOS systems. Other systems need manual adjustments to bypass related checks.
Supports Python versions 2.6 and 2.7, not Python 3.
A physical machine can have only one coordinator and one gcware, and they must share the same IP.

2.2.1 Configure Multiple IPs

For servers with multiple 10Gbps network cards, bind multiple network cards to ensure high network availability and maximum bandwidth.

For each database instance, configure an IP address on a single network card or bound network cards. Example configuration:

vim /etc/sysconfig/network-scripts/ifcfg-p6p2
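A minimal sketch of the relevant ifcfg entries is shown below; the device name, addresses, and netmask are illustrative and must be adapted to your environment:

DEVICE=p6p2
TYPE=Ethernet
BOOTPROTO=static
ONBOOT=yes
# Physical IP of the server (illustrative)
IPADDR=192.168.146.20
NETMASK=255.255.255.0
# First additional (virtual) IP, used by the second database instance (illustrative)
IPADDR1=192.168.146.40
NETMASK1=255.255.255.0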

In the configuration above, IPADDR1 is the first virtual IP address and NETMASK1 is the subnet mask for the first virtual IP; follow the same pattern (IPADDR2/NETMASK2, and so on) for subsequent virtual IPs. Each NETMASKn must match the NETMASK of the physical IP; if this parameter is not added, the subnet allocated for the virtual IP might differ from that of the physical IP.

When multiple network cards are not bound, configure one network card for each instance, as shown in the example below:
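A sketch of two separate interface files, assuming hypothetical device names p6p1 and p6p2 and the instance IPs used in this example:

vim /etc/sysconfig/network-scripts/ifcfg-p6p1
DEVICE=p6p1
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.146.20
NETMASK=255.255.255.0

vim /etc/sysconfig/network-scripts/ifcfg-p6p2
DEVICE=p6p2
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.146.40
NETMASK=255.255.255.0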

Restart the network service to apply changes:

service network restart
# or
systemctl restart network

2.2.2 Prepare for Installation

Follow the GBase 8a cluster installation steps as outlined in the product manual. Before formal installation:

Create a gbase user on each server:

useradd gbase
passwd gbase

Upload and extract the installation files:

tar xjf GBase8a_MPP_Cluster-NoLicense-9.5.3.17-redhat7.3-x86_64.tar.bz2
chown -R gbase:gbase gcinstall

Copy and execute SetSysEnv.py on all servers to configure environment variables:

scp SetSysEnv.py root@192.168.146.21:/opt
python SetSysEnv.py --installPrefix=/opt --dbaUser=gbase

Adjust the permissions of the installation path to allow the gbase user to write:

drwxr-x---. 6 gbase gbase 157 Jan 28 18:59 opt
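For example, assuming /opt is the installation prefix, the ownership and permissions shown above can be set with:

chown gbase:gbase /opt
chmod 750 /opt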

Modify the installation configuration file demo.options:
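A minimal sketch of the key demo.options entries for the four-instance plan in this example is shown below; the passwords are placeholders, and the exact set of required parameters depends on the installer version, so keep the remaining parameters from the shipped template and follow the product manual:

installPrefix = /opt
# One coordinator (gcluster) per physical server, sharing the IP of the gcware node
coordinateHost = 192.168.146.20,192.168.146.21
# All database instances (data nodes), one entry per instance IP
dataHost = 192.168.146.20,192.168.146.40,192.168.146.21,192.168.146.41
existCoordinateHost =
existDataHost =
dbaUser = gbase
dbaGroup = gbase
dbaPwd = gbase
rootPwd = rootpassword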

2.2.3 Execute Installation

As the gbase user, run the installation:

python gcinstall.py --silent=demo.options

2.2.4 Obtain License

If installing the no-license version, skip this section.

1) Fingerprint Collection:
Use any instance IP from the multi-instance server to obtain the fingerprint:

./gethostsid -n 192.168.146.20,192.168.146.21 -u gbase -p gbase -f hostsfingers.txt

2) Generate License:
Email the hostsfingers.txt file to the vendor to obtain the license.

3) Import License:
Import the license to all instances:

./License -n 192.168.146.20,192.168.146.21,192.168.146.40,192.168.146.41 -u gbase -p gbase -f gbase.lic

2.2.5 Cluster Initialization

Configure distribution and execute initialization as per the product manual.

gcadmin createvc vc.xml
gcadmin distribution gcChangeInfo.xml p 1 d 1 vc vc1

Ensure primary and standby data slices are on different physical servers for data high availability.
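For example, grouping the two instances of each physical server into one rack in gcChangeInfo.xml lets the distribution place primary and standby slices on different servers. A sketch, assuming the instance IPs used above (the exact schema is described in the product manual):

<?xml version="1.0" encoding="utf-8"?>
<servers>
  <!-- rack 1: both instances on physical server 1 -->
  <rack>
    <node ip="192.168.146.20"/>
    <node ip="192.168.146.40"/>
  </rack>
  <!-- rack 2: both instances on physical server 2 -->
  <rack>
    <node ip="192.168.146.21"/>
    <node ip="192.168.146.41"/>
  </rack>
</servers>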

2.3 NUMA Binding

For high-configuration servers with NUMA architecture, it is recommended to evenly allocate NUMA nodes to different GBase instances. For example, on a server with 8 NUMA nodes running two GBase instances, bind 4 NUMA nodes to each instance.

2.3.1 View NUMA Groups

For servers with NUMA architecture, you need to install numactl in advance on each server as follows:
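For example, on RHEL/CentOS the package can be installed with yum (assuming a configured repository):

yum install -y numactl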

Note: numactl must be installed, because it is required for starting the cluster services with NUMA binding.

Use the numastat command to view the NUMA groups of the server. The following examples show configurations for 4 NUMA nodes and 8 NUMA nodes. Depending on the number of NUMA nodes, you can allocate 1 NUMA node per instance (IP), 2 NUMA nodes per instance (IP), or 4 NUMA nodes per instance (IP), etc.

For example, on a server with 4 NUMA nodes the output shows columns node0 through node3; on a server with 8 NUMA nodes, node0 through node7.
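An illustrative sketch of numastat output on a server with 4 NUMA nodes (the figures below are placeholders, not measured values):

                           node0           node1           node2           node3
numa_hit                 5678901         5432109         5123456         4987654
numa_miss                      0               0               0               0
numa_foreign                   0               0               0               0
interleave_hit             23456           23412           23398           23401
local_node               5650000         5410000         5100000         4960000
other_node                 28901           22109           23456           27654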

2.3.2 Bind GBase 8a Instances to NUMA Nodes

For servers that deploy only data nodes (GBase instances), it is recommended to divide the server's NUMA nodes (CPUs and memory) evenly among the data nodes. For servers that deploy both data nodes and gcluster/gcware nodes, it is advisable to deploy the gcluster and gcware nodes on the same NUMA node.

The binding between GBase 8a instances and NUMA nodes is configured by modifying the gcluster_services script. After a multi-instance installation, the cluster service startup command gcluster_services resolves to the gcluster_services script of one of the instances (either a gnode or a gcluster instance), so you can add the binding commands to the gcluster_services file of a specific instance, for example the file under IP/gnode/server/bin.

There are two methods to start the database service thereafter:

Method 1: Use the modified gcluster_services file every time you start the database service.

cd IP/gnode/server/bin
./gcluster_services all start

Method 2: Copy the modified gcluster_services file to replace the gcluster_services files under all instances (IP/gnode/server/bin/gcluster_services and IP/gcluster/server/bin/gcluster_services). Subsequently, use the regular cluster startup command:

gcluster_services all start
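A sketch of Method 2, assuming a hypothetical installation prefix of /opt and the instance IPs used in this example (adjust the paths to your environment):

# Replace the gnode copy of the other instance on this server
cp /opt/192.168.146.20/gnode/server/bin/gcluster_services /opt/192.168.146.40/gnode/server/bin/gcluster_services
# Replace the gcluster copy (only present on the instance that hosts the gcluster node)
cp /opt/192.168.146.20/gnode/server/bin/gcluster_services /opt/192.168.146.20/gcluster/server/bin/gcluster_services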

Example: for a server with 8 NUMA nodes, where each instance is bound to 4 NUMA nodes:

Choose any instance's gnode/server/bin/gcluster_services file and modify the following sections:

1) Original section around line 410:

Modify to:

Note:

numactl --membind=nodes program (nodes specifies the NUMA nodes from which memory is allocated, e.g., 0, 1, or other node numbers; program can be an absolute path or a service startup script)

numactl --cpunodebind=nodes program (nodes specifies the NUMA nodes whose CPUs the program is allowed to run on, followed by the program)
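As a generic illustration of this pattern (not the literal contents of gcluster_services, which vary by version; $GBASE_BIN is a placeholder for the instance's bin directory), a startup line can be wrapped as follows to bind an instance to NUMA nodes 0-3:

# Hypothetical original form: the script launches the instance daemon directly, e.g.
#   "$GBASE_BIN/gbased" &
# Modified form: prefix the launch with numactl to bind CPU and memory to NUMA nodes 0-3
numactl --cpunodebind=0,1,2,3 --membind=0,1,2,3 "$GBASE_BIN/gbased" &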

2) Original section around line 500 (the code executed for gcluster_services all start):

Modify to:

3) Original section around line 450 (the code executed for gcluster_services gbase|syncserver start):

Modify to:

After making these changes to the files above, restart the cluster service:

cd IP/gnode/server/bin
./gcluster_services all start

You can verify the NUMA binding effect using the following command:

numastat `pidof gbased`

For example, the following shows the binding effect on a server with 2 instances and 2 NUMA nodes, where each instance is bound to 1 NUMA node:

[root@pst_w61 config]$ numastat `pidof gbased`
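An illustrative sketch of the expected output follows (PIDs and values are placeholders): with a correct binding, each gbased process's memory usage is concentrated almost entirely on its bound node.

Per-node process memory usage (in MBs) for PID 35027 (gbased)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                      1843.21            0.35         1843.56
Stack                        0.23            0.00            0.23
Private                    756.48            1.12          757.60
----------------  --------------- --------------- ---------------
Total                     2599.92            1.47         2601.39

Per-node process memory usage (in MBs) for PID 35102 (gbased)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                         0.41         1798.77         1799.18
Stack                        0.00            0.21            0.21
Private                      0.96          742.55          743.51
----------------  --------------- --------------- ---------------
Total                        1.37         2541.53         2542.90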