2 Node PowerHA/HACMP setup using NIM (NFS RG)

This post briefly describes how to set up a 2-node PowerHA 6.1 cluster with NFS as a Resource Group.

1. Pre Installation Requisites
2. Installing PowerHA using NIM
3. IP/Network Configuration
4. Storage Setup
5. Create cluster
6. NFS Setup
7. Monitoring and status commands.

1. Pre-Installation Fileset Requisites:

The following filesets should be installed on both nodes before installing PowerHA, as stated in Redbook SG24-7739-00:

bos.adt.lib
bos.adt.libm
bos.adt.syscalls
bos.net.tcp.client
bos.net.tcp.server
bos.rte.SRC
bos.rte.libc
bos.rte.libcfg
bos.rte.libcur
bos.rte.libpthreads
bos.rte.odm
bos.data
bos.rte
bos.rte.lvm
bos.clvm.enh
bos.net.nfs.server
bos.net.nfs.client
rsct.compat.basic.hacmp
rsct.compat.clients.hacmp
rsct.basic.rte

To check whether the filesets listed above are installed, you can use the simple script below (copy and paste it into a file on each node).

 

#!/bin/ksh

# Tested on IBM AIX 6.1 

# set -x

# powerha_preq.sh  Version 0.3

# PowerHA pre-install fileset requisite check.
# Created by JJ aixdoc.wordpress.com
##########################################################################################################
# Purpose:                                                                                               #
# This script will check if all filesets required by PowerHA6.1 as a pre-install requisite are installed.#
# The fileset list is according to IBM Redbook SG24-7739-00.                                             #
##########################################################################################################

# Disclaimer: This script is provided on the basis of "as is" and without any expressed or implied warranty
# The AUTHOR of this script is not responsible for any damage or loss arising out of use of this script.

# This is the list of filesets that should be installed prior to PowerHA 6.1 installation:

list="
bos.adt.lib
bos.adt.libm
bos.adt.syscalls
bos.net.tcp.client
bos.net.tcp.server
bos.rte.SRC
bos.rte.libc
bos.rte.libcfg
bos.rte.libcur
bos.rte.libpthreads
bos.rte.odm
bos.data
bos.rte
bos.rte.lvm
bos.clvm.enh
bos.net.nfs.server
bos.net.nfs.client
rsct.compat.basic.hacmp
rsct.compat.clients.hacmp
rsct.basic.rte
"

# Check if tmp file exists, if yes delete the file.

if [[ -a /tmp/pre_ha_list.out ]]
  then 
     rm /tmp/pre_ha_list.out
fi

# Create and fill the tmp file with  the list of prereq. filesets so that each fileset name is on a new line.

echo $list | tr ' \t' '\n' | tr -s '\n' > /tmp/pre_ha_list.out

# Check if the secondary tmp file does exist, if so delete the file. This file will be filled with the list of installed filesets, for later comparison.

if [[ -a /tmp/pre_ha_installed.out ]]
  then 
     rm  /tmp/pre_ha_installed.out
fi

# Create a list of filesets that are PowerHA pre-install requisites and that are already installed. 

touch /tmp/pre_ha_installed.out

for name in `cat /tmp/pre_ha_list.out`
do
   lslpp -l $name > /dev/null 2>&1 && echo $name >> /tmp/pre_ha_installed.out
done

# Check for tmp file existence, delete if yes.

if [[ -a /tmp/ha_st.tmp ]]
    then 
       rm /tmp/ha_st.tmp
fi

# Compare both lists and report any missing filesets.

echo ""
echo ""
diff /tmp/pre_ha_list.out /tmp/pre_ha_installed.out | grep "^<" | sed 's/^< //' > /tmp/ha_st.tmp

if [[ -s /tmp/ha_st.tmp ]]
  then
     echo "The following pre-requisite filesets are NOT installed:"
     echo " "
     cat /tmp/ha_st.tmp
     echo " "
     echo "You need to install the filesets listed above before you start with the PowerHA 6.1 installation."
     echo "     According to Redbook SG24-7739-00."
  else
     echo "All PowerHA 6.1 pre-requisite filesets are installed on this system."
     echo "     According to Redbook SG24-7739-00."
     echo " "
     echo " "
     echo "This is the list of filesets installed: "
     sleep 2
     echo " "
     echo $list | tr ' \t' '\n' | tr -s '\n'
     echo " "
fi

# Delete temp files

rm /tmp/pre_ha_list.out
rm /tmp/pre_ha_installed.out
rm /tmp/ha_st.tmp
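
Make the script executable and run it on both nodes, for example:

# chmod +x powerha_preq.sh
# ./powerha_preq.sh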

2. Installing PowerHA using NIM

Now we will create an LPP source on the NIM master server that will contain the PowerHA filesets, which we will later install on both nodes.

       Operation               Attribute
       |                       |
# nim -o define -t lpp_source -a location=/export2/HA_LP -a server=master -a source=/i/PowerHA/installp/ppc HA6_LPP
                 |                                                                                          |      
                 Type                                                                         LPP_Source name

Preparing to copy install images (this will take several minutes)...

....

Since NIM did not fill the new LPP source with all filesets, I ran an update operation against the LPP source:

#  nim -o update -a packages=all -a source=/i/PowerHA/installp/ppc HA6_LPP

/export2/HA_LP/installp/ppc/cluster.msg.Ja_JP.hativoli.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.Ja_JP.es.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.Ja_JP.cspoc.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.Ja_JP.assist.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.En_US.hativoli.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.En_US.es.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.En_US.cspoc.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.msg.En_US.assist.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.man.en_US.es.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.man.en_US.assist.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.license.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.hativoli.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.worksheets.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.server.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.plugins.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.nfs.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.cspoc.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.client.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.cfs.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.es.assist.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.doc.en_US.es.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.doc.en_US.assist.6.1.0.0.I
/export2/HA_LP/installp/ppc/cluster.adt.es.6.1.0.0.I

# lsnim -l HA6_LPP
HA6_LPP:
   class       = resources
   type        = lpp_source
   arch        = power
   Rstate      = ready for use
   prev_state  = unavailable for use
   location    = /export2/HA_LP
   alloc_count = 0
   server      = master

To get a clean list of filesets contained in the LPP_Source use:

       Operation            Remove lines beginning with '#'        Print from line 3 onward
       |                    |                                      |
# nim -o showres HA6_LPP | sed -e '/^#/d' | grep -v '^$' | awk 'NR > 2 { print $1 }'
                 |                          |                              |
                 LPP Source name            Removes empty lines            Print the 1st field

Define a NIM Machine Group:

       Operation                           Hostname
       |                                   |
# nim -o define -t mac_group -a add_member=power1 -a add_member=power2 NIM_G1
                 |                                                     |
                 Type                                         Group Name
# lsnim -l NIM_G1
NIM_G1:
   class   = groups
   type    = mac_group
   member1 = power1
   member2 = power2

Now allocate the LPP_Source to the NIM machine group and initiate the installation on the machine group.

       Operation                        NIM Machine Group Name
       |                                |
# nim -o allocate -a lpp_source=HA6_LPP NIM_G1
                    |
                    Attribute

# nim -o cust -a filesets=all -a lpp_source=HA6_LPP -a accept_licenses=yes NIM_G1

+-----------------------------------------------------------------------------+
                      Initiating "cust" Operation
+-----------------------------------------------------------------------------+
 Allocating resources ...
Initiating the cust operation on machine 1 of 2: power1 ...

 Initiating the cust operation on machine 2 of 2: power2 ...

+-----------------------------------------------------------------------------+
                      "cust" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 power1                  INITIATED
 power2                  INITIATED

Note: Use the lsnim command to monitor progress of “INITIATED”
targets by viewing their NIM database definition.

# lsnim -l power2
power2:
   class           = machines
   type            = standalone
   connect         = nimsh
   platform        = chrp
   netboot_kernel  = 64
   if1             = EN_0 power2 0
   cable_type1     = tp
   Cstate          = customization is being performed
   prev_state      = customization is being performed
   Mstate          = currently running
   lpp_source      = HA6_LPP
   nim_script      = nim_script
   cpuid           = 00C4489D4C00
   control         = master
   Cstate_result   = success
   installed_image = MK_P2_61_07_02

The install log can be found in /var/adm/ras/nim.installp on the respective node.

* The hativoli filesets do not have to be installed if Tivoli is not used. If you attempt to install these filesets, they will fail due to missing dependencies.
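
To confirm that the PowerHA filesets actually landed on each node, a quick check (the exact levels depend on your install media):

# lslpp -l "cluster.es.*"

The listed filesets should be in the COMMITTED (or APPLIED) state.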

After installation reboot both nodes.

3. IP/Network Configuration
Like all other cluster components, the network part should be redundant to avoid SPOFs (Single Points Of Failure). In this scenario each node has 2 NICs. Each interface has a so-called boot/base IP, that is, a default/standard IP assigned to the adapter by the OS during boot.
The persistent IP is node-bound; if an adapter on a node should fail, the persistent IP will move to the second adapter. The service IP is part of a Resource Group, is highly available, and is used for communication between the clients and the node/cluster.
In this configuration IP Address Takeover (IPAT) via aliasing is used: the persistent and the service IP will be configured as IP aliases.

Subnet configuration restrictions depend on the PowerHA setup; they are described in the redbook SG24-7739-00. All the subnets in a PowerHA setup should be in one VLAN.

For IP interface configuration you may refer to the “IP Configuration in AIX” post.

Add the boot addresses to the /usr/es/sbin/cluster/etc/rhosts file on both nodes (a sample is shown after the /etc/hosts example below).
The /etc/hosts file should be configured on each node and be identical.

# cat /etc/hosts
# Boot IP
192.168.2.5     power1   # Boot1 P1
192.168.2.6     power1b  # Boot2 P1

192.168.2.7     power2   # Boot1 P2
192.168.2.8     power2b  # Boot2 P2

# Persistent
192.168.3.66    power1p # Persistent P1
192.168.3.77    power2p # Persistent P2

# Service
192.168.3.10  hapower # Service IP
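
For reference, the /usr/es/sbin/cluster/etc/rhosts file mentioned above simply lists the boot addresses, one per line. With the addresses from this example it would look like this on both nodes:

# cat /usr/es/sbin/cluster/etc/rhosts
192.168.2.5
192.168.2.6
192.168.2.7
192.168.2.8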

4. Storage Setup
Detailed storage subsystem setup is out of scope for this post. In short, redundant adapters, redundant switches and MPIO should be configured.
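
A couple of quick checks that are useful here (a sketch, assuming Fibre Channel attached storage with AIX MPIO):

# lsdev -Cc adapter | grep fcs
# lspath -l hdisk1

Each shared disk should show more than one Enabled path, going through different adapters.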

5. Create Cluster

Choose a cluster name and add the nodes in the following smit menu:

# smitty hacmp
  > Initialization and Standard Configuration
    > Configure an HACMP Cluster and Nodes

5.1 Add a network to the PowerHA configuration and add the network interfaces.

Heartbeating over IP aliases will be used, as described here.

# smitty hacmp
 > Extended Configuration
  > Extended Topology Configuration
   > Configure HACMP Networks
     > Add a Network to the HACMP Cluster

                                  Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Network Name                                       [net_ether_01]
* Network Type                                        ether
* Netmask(IPv4)/Prefix Length(IPv6)                  [255.255.255.0]
* Enable IP Address Takeover via IP Aliases          [Yes]                                                         +
  IP Address Offset for Heartbeating over IP Aliases [192.168.4.1]

Add ethernet interfaces to the cluster

# smitty hacmp
 > Extended configuration
   > Extended Topology configuration
   > Configure HACMP Communication Interfaces/Devices
      > Add Communication Interface/Device > Discovered > Communication Interface

      | # power1 / net_ether_01                                                  |
      |         en0               power1                            192.168.2.5  |
      |         en2               power1b                           192.168.2.6  |
      |                                                                          |
      | # power2 / net_ether_01                                                  |
      |         en0               power2                            192.168.2.7  |
      |         en2               power2b                           192.168.2.8  |

5.2 Configure Disk Heartbeat

Ensure that the disk used as the heartbeat device has the same PVID on both nodes.
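
A quick check (assuming the heartbeat disk is called hdisk_hb on both nodes):

# lspv | grep hdisk_hb

The PVID in the second column of the lspv output must be identical on power1 and power2.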

Create an Enhanced Concurrent VG on the first node.

An Enhanced Concurrent VG is necessary because both nodes need read and write access to this disk in order to have a functional heartbeat mechanism.

        Not vary on at boot
        |          VG Name   Disk device name
        |          |         |
# mkvg -n -s 4 -C -y hb_vg hdisk_hb
           |    | 
           |    Enhanced Concurrent Volume Group
            Partition size 
hb_vg
mkvg: This concurrent capable volume group must be varied on manually.

Now import the VG on the second node

         
# importvg -y hb_vg hdisk_hb

synclvodm: No logical volumes in volume group hb_vg.
hb_vg
0516-783 importvg: This imported volume group is concurrent capable.
        Therefore, the volume group must be varied on manually.

Create a non-IP heartbeat (diskhb) network:

# smitty hacmp
Extended Configuration
 > Extended Topology Configuration
  > Configure HACMP Networks
    > Add a Network to the HACMP Cluster

                                    Add a Serial Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Network Name                                       [net_diskhb_01]
* Network Type                                        diskhb

Add communication devices to the cluster: the hdisk_hb device from both nodes.

# smitty hacmp
 > Extended Configuration
  > Extended Topology Configuration
    > Configure HACMP Communication Interfaces/Devices
      > Add Communication Interfaces/Devices

                                     Configure HACMP Communication Interfaces/Devices

  | Select Point-to-Point Pair of Discovered Communication Devices to Add     |
  |                                                                          |
  | Move cursor to desired item and press F7.                                |
  |     ONE OR MORE items can be selected.                                   |
  | Press Enter AFTER making all selections.                                 |
  |   # Node                              Device   Pvid                      |
  |     power1                            hdisk1   00cf405e0a662a8f          |
  |     power2                            hdisk1   00cf405e0a662a8f          |
  |     power1                            hdisk2   00c4489d0a494dbc          |
  |     power2                            hdisk2   00c4489d0a494dbc          |
  | >   power1                            hdisk_hb 00c4489d0a55f6f0       |
  | >   power2                            hdisk_hb 00c4489d0a55f6f0       |

The cltopinfo command shows the configuration:

# /usr/es/sbin/cluster/utilities/cltopinfo
Cluster Name: CL_01
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 2 node(s) and 3 network(s) defined
NODE power1:
        Network net_diskhb_01
                power1_hdisk_hb_01      /dev/hdisk_hb
        Network net_ether_01
                power1  192.168.2.5
                power1b 192.168.2.6
NODE power2:
        Network net_diskhb_01
                power2_hdisk_hb_01      /dev/hdisk_hb
        Network net_ether_01
                power2  192.168.2.7
                power2b 192.168.2.8

No resource groups defined

Test the heartbeat disk network communication.

We will send data from the second node and receive it on the first node.
For this to work it is necessary to initiate the receive operation on the first node first.

                          Specifies the device name   Receive mode
                                          |              |
(node 1) /usr/sbin/rsct/bin # ./dhb_read -p hdisk_hb -r
 HB CLASSIC MODE
 First node byte offset: 61440
Second node byte offset: 62976
Handshaking byte offset: 65024
       Test byte offset: 64512

Receive Mode:
Waiting for response . . .
Magic number = 0x87654321
Magic number = 0x87654321
Magic number = 0x87654321

Now on the second node initiate the transmit operation.

                          Specifies the device name   Transmit mode
                                          |              |
(node 2) /usr/sbin/rsct/bin # ./dhb_read -p hdisk_hb -t
HB CLASSIC MODE
 First node byte offset: 61440
Second node byte offset: 62976
Handshaking byte offset: 65024
       Test byte offset: 64512

Transmit Mode:
Magic number = 0x87654321
Detected remote utility in receive mode.  Waiting for response . . .
Magic number = 0x87654321
Magic number = 0x87654321
Link operating normally

The message “Link operating normally” should appear on both nodes.
Swap the transmit and receive roles and test again, as shown below.
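
For the second test run the roles are simply reversed; again, start the receive side first:

(node 2) /usr/sbin/rsct/bin # ./dhb_read -p hdisk_hb -r
(node 1) /usr/sbin/rsct/bin # ./dhb_read -p hdisk_hb -t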

If you choose to use multi-node disk heartbeat, it is necessary to create a Resource Group for the heartbeat that is online on both nodes.
More information can be found in the redbook SG24-7739-00 on page 586.

5.3 Create the Service IP Label

# smitty hacmp
  > Initialization and Standard Configuration
    > Configure Resources to Make Highly Available
      > Configure Service IP Labels/Addresses
        > Add a Service IP Label/Address

5.4 Add Persistent IPs

# smitty hacmp
 > Extended Configuration
   > Extended Topology Configuration
     > Configure HACMP Persistent Node IP Labels/Addresses
      > Add Persistent IP Label/Address

Verify and synchronize the cluster, correct any errors that appear, and repeat until the verification and synchronization ends with OK.

# smitty hacmp
 > Extended Configuration 
   > Extended Verification and Synchronization

Now we can start the cluster services on both nodes.

# smitty hacmp
 > System Management (C-SPOC)
  > HACMP Services
   > Start Cluster Services

                                      Start Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Start now, on system restart or both                now                                       +
  Start Cluster Services on these nodes              [power2,power1]                            +
* Manage Resource Groups                              Manually                                  +
  BROADCAST message at startup?                       false                                     +
  Startup Cluster Information Daemon?                 true                                      +
  Ignore verification errors?                         false                                     +
  Automatically correct errors found during           Interactively                             +
  cluster start?

This will take a while. If you selected to start the Cluster Information Daemon, you can get a cluster status overview after all services have been started.
Both nodes and their interfaces are up, and the hapower service label is up as well, but we still need to create a resource group.

# /usr/es/sbin/cluster/clstat -r1 
                clstat - HACMP Cluster Status Monitor
                -------------------------------------

Cluster: CL_01  (1082309238)
Tue Jun 26 12:07:23 GMT+02:00 2012
                State: UP               Nodes: 2
                SubState: STABLE

        Node: power1            State: UP
           Interface: power1 (2)                Address: 192.168.2.5
                                                State:   UP
           Interface: power1b (2)               Address: 192.168.2.6
                                                State:   UP
           Interface: power1_hdisk_hb_01 (1)            Address: 0.0.0.0
                                                State:   UP
           Interface: hapower (2)               Address: 192.168.3.10
                                                State:   UP

        Node: power2            State: UP
           Interface: power2 (2)                Address: 192.168.2.7
                                                State:   UP
           Interface: power2b (2)               Address: 192.168.2.8
                                                State:   UP

One thing that is missing in the clstat output is the IPs used for heartbeating via aliases. In the ifconfig output we can see them: 192.168.4.1 and 192.168.5.1.
Details about this topic are here:

# ifconfig -a
en0: flags=1e080863,c0
        inet 192.168.2.5 netmask 0xffffff00 broadcast 192.168.2.255
        inet 192.168.3.66 netmask 0xffffff00 broadcast 192.168.3.255
      > inet 192.168.4.1 netmask 0xffffff00 broadcast 192.168.4.255
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en2: flags=4e080863,80
        inet 192.168.2.6 netmask 0xffffff00 broadcast 192.168.2.255
        inet 192.168.3.10 netmask 0xffffff00 broadcast 192.168.3.255
      > inet 192.168.5.1 netmask 0xffffff00 broadcast 192.168.5.255

5.5 Create an Enhanced Concurrent VG (ECVG) on the source node for the NFS Resource Group.

An Enhanced Concurrent Volume Group is varied on on all cluster member nodes, but only the node that holds the
Resource Group online has read/write access to the VG. The other node has only passive (read) access.

Note: [ For dynamic changes on Enhanced Concurrent Volume Groups to work correctly,
it is required to have gsclvmd, topsvcs, grpsvcs, and emsvcs running while
performing maintenance. This means the HACMP services must run ]

        Not vary on at boot
        | Major Number   VG Name      
        |          |     |             
# mkvg -n -s 4 -C -V 50 -y HA_VG hdisk1 hdisk2
           |    | 
           |    Enhanced Concurrent Volume Group
           Partition size

HA_VG      
mkvg: This concurrent capable volume group must be varied on manually.

 # lspv
hdisk0          00c0e90dce6c290a                    rootvg          active              
hdisk1          00cf405e795187cc                    HA_VG                               
hdisk2          00cf405e795ab841                    HA_VG

The VG must have the same Major Number on both nodes.
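
To pick a major number that is free on both nodes before creating the VG, and to verify it afterwards, you can use (a sketch):

# lvlstmajor
# ls -l /dev/HA_VG

lvlstmajor lists the free major numbers on the node it is run on; ls -l /dev/HA_VG shows the major number actually in use, which must be the same on both nodes.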

Now import the VG on the destination node:

# importvg -V 50 -y HA_VG hdisk1   
synclvodm: No logical volumes in volume group HA_VG.
HA_VG
0516-783 importvg: This imported volume group is concurrent capable.
        Therefore, the volume group must be varied on manually.
 # lspv
hdisk0          00cf405ea25f92ed                    rootvg          active              
hdisk1          00cf405e795187cc                    HA_VG                               
hdisk2          00cf405e795ab841                    HA_VG

On the source node, vary on the VG in concurrent mode:

            
# varyonvg -c HA_VG
# lspv  
hdisk0          00c0e90dce6c290a                    rootvg          active              
hdisk1          00cf405e795187cc                    HA_VG           concurrent          
hdisk2          00cf405e795ab841                    HA_VG           concurrent          

# lsvg HA_VG
VOLUME GROUP:       HA_VG                    VG IDENTIFIER:  00cf405e00004c0000000137e1b7032d
VG STATE:           active                   PP SIZE:        4 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      1022 (4088 megabytes)
MAX LVs:            256                      FREE PPs:       1022 (4088 megabytes)
LVs:                0                        USED PPs:       0 (0 megabytes)
OPEN LVs:           0                        QUORUM:         2 (Enabled)
TOTAL PVs:          2                        VG DESCRIPTORS: 3
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         2                        AUTO ON:        no
>>Concurrent:         Enhanced-Capable         Auto-Concurrent: Disabled
>>VG Mode:            Concurrent                               
Node ID:            1                        Active Nodes:       
MAX PPs per VG:     32512                                     
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable 
PV RESTRICTION:     none                     INFINITE RETRY: no

Repeat the above on the destination node, as shown below.
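
On the destination node this means running the same commands as above:

# varyonvg -c HA_VG
# lspv
# lsvg HA_VG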

To check if the VG is seen by both cluster nodes from the PowerHA perspective, run:

# /usr/es/sbin/cluster/cspoc/cl_ls_shared_vgs -c -C  
#Volume Group    Resource Group                 Node List
 HA_VG             power2,power1
 hb_vg             power2,power1

Mirror the VG.

            Create exact mapping
            |
# mirrorvg -m HA_VG hdisk2
                |     |
                |     Disk to mirror to
                VG Name

In this scenario, where 2 disks are used in the VG as a mirror, the quorum will be disabled (set to 1).

Note: Check the quorum considerations if different configurations are used (here and here).
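
To see the current quorum setting of the VG, or to change it manually, you can use (a sketch):

# lsvg HA_VG | grep QUORUM
# chvg -Qn HA_VG

chvg -Qn disables quorum checking for the VG; chvg -Qy enables it again.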

Check if the Logical Partitions of the Logical Volumes are mapped to both disks.

# lslv -l nfs_lv
nfs_lv:/N_share
PV                COPIES        IN BAND       DISTRIBUTION
hdisk1            020:000:000   100%          000:020:000:000:000 
hdisk2            020:000:000   100%          000:020:000:000:000 
                                              |   |   |   |   |
                                              |   |   |   |   inner edge
                                              |   |   |   inner middle
                                              |   |   center
                                              |   outer middle
                                              outer edge

The COPIES and DISTRIBUTION values are in LPs; the lslv output description is here and here.

6. NFS Setup

6.1 Prerequisites

The following fileset needs to be installed so that PowerHA can work with NFS:

# lslpp -l | grep cluster.es.nfs.rte
  cluster.es.nfs.rte         6.1.0.0  COMMITTED  ES NFS Support
  cluster.es.nfs.rte         6.1.0.0  COMMITTED  ES NFS Support

Check if the portmap daemon is running:

#  lssrc -s portmap
Subsystem         Group            PID          Status 
 portmap          portmap          4522122      active
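
If portmap is not active, it can be started via the SRC:

# startsrc -s portmap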

Create a mount point for the shared FS that we will create later, and also a mount point for the so-called stable storage that is required by NFSv4, as described here:

# mkdir /N_share
# mkdir /stable

For NFSv4 the NFS domain needs to be configured on all nodes (chnfsdom without an argument displays the current domain; with an argument it sets it):

 
# chnfsdom 
# chnfsdom [new domain name]

6.2 Create Logical Volumes and Filesystems

# smitty hacmp
 > System Management (C-SPOC)
  > Storage 
   > Logical Volumes 
    > Add a Logical Volume

Check:

# lsvg -l HA_VG
HA_VG:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
nfs_lv              jfs2       300     300     1    closed/syncd  N/A
nfs_stable_lv       jfs2       128     128     1    closed/syncd  N/A

Now create a filesystem on the LVs.

Note: If the volume group is varied on in concurrent access mode, you will not be able to create logical volumes. A concurrent-capable volume group must be varied on in nonconcurrent access mode to create logical volumes on it. For details click here and see Step 5.
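
If you run into this while working on the VG manually, the VG can temporarily be varied on in non-concurrent mode (a sketch; the resource group must be offline and the VG varied off on all nodes first):

# varyoffvg HA_VG
# varyonvg HA_VG
  ... create the logical volumes / filesystems ...
# varyoffvg HA_VG
# varyonvg -c HA_VG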

# smitty hacmp
 > System Management (C-SPOC)
   > Storage 
     > File Systems
      > Add a Filesystem

Check

# lsvg -l HA_VG
HA_VG:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
nfs_lv              jfs2       300     300     1    closed/syncd  /N_share
nfs_stable_lv       jfs2       128     128     1    closed/syncd  /stable

Start and stop scripts.

In the case of NFS these come already provided with PowerHA:

# ls -la /usr/es/sbin/cluster/apps/clas_nfsv4
total 16
drwxr-xr-x    2 root     system          256 Sep 17 2009  .
drwxr-xr-x    4 root     system          256 Sep 17 2009  ..
-rwxr--r--    1 root     system         2703 Sep 17 2009  start
-rwxr--r--    1 root     system         1922 Sep 17 2009  stop

# ls -la /usr/es/sbin/cluster/apps/clam_nfsv4
total 56
drwxr-xr-x    2 root     system          256 Sep 17 2009  .
drwxr-xr-x    4 root     system          256 Sep 17 2009  ..
-rwxr--r--    1 root     system        25870 Sep 17 2009  monitor

6.3 Create the Application server using:

# smitty hacmp
 > Extended Configuration
  > Extended Resource Configuration 
   > HACMP Extended Resources Configuration
    > Configure HACMP Applications Servers
      > Add an Application Server
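
For this setup the Add Application Server panel would be filled in roughly as follows (a sketch; the name NFS is what the Resource Group later refers to, and the scripts are the NFSv4 start/stop scripts listed in 6.2):

  Server Name                   [NFS]
  Start Script                  [/usr/es/sbin/cluster/apps/clas_nfsv4/start]
  Stop Script                   [/usr/es/sbin/cluster/apps/clas_nfsv4/stop]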

6.4 Configure application server monitor:

# smitty hacmp
 > Extended Configuration
  > Extended Resource Configuration
   > HACMP Extended Resources Configuration
    > Configure HACMP Applications Servers
      > Configure HACMP Application Monitoring
         > Configure Custom Application Monitors
          > Add a Custom Application Monitor

The app server monitor script is in /usr/es/sbin/cluster/apps/clam_nfsv4

6.5 Now create the Resource Group

# smitty hacmp
 > Extended Configuration
   > HACMP Extended Resources Configuration
    > HACMP Extended Resource Group Configuration
      > Add a Resource Group

After the Resource Group is created, go back to the menu path above and select:
> Change/Show Resources and Attributes for a Resource Group

                        Change/Show All Resources and Attributes for a Custom Resource Group
[TOP]     
  Resource Group Name                                 RG_NFS
  Participating Nodes (Default Node Priority)         power1 power2

  Startup Policy                                      Online On Home Node Only
  Fallover Policy                                     Fallover To Next Priority Node In The List
  Fallback Policy                                     Never Fallback

  Service IP Labels/Addresses                        [hapower]                                                    +
  Application Servers                                [NFS]                                                        +

  Volume Groups                                      [HA_VG]                                                      +
  Use forced varyon of volume groups, if necessary    false                                                       +
  Automatically Import Volume Groups                  false                                                       +

  Filesystems (empty is ALL for VGs specified)       [/N_share /stable]                                           +
  Filesystems/Directories to Export (NFSv4)          [/N_share]                                                   +
  Stable Storage Path (NFSv4)                        [/stable]                                                    +
  Filesystems/Directories to NFS Mount               [/home/N_share;/N_share]

Now we will export the filesystem:

# smit mknfsexp
[TOP]                                                   [Entry Fields]
* Pathname of directory to export                    [/N_share]                                                   /
  Anonymous UID                                      [-2]                                                          
  Public filesystem?                                  no                                                          +
* Export directory now, system restart or both        both                                                        +
  Pathname of alternate exports file                 [/usr/es/sbin/cluster/etc/exports]                            
  Allow access by NFS versions                       []                                                           +
  External name of directory (NFS V4 access only)    []
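
To check what ended up in the HACMP exports file, and (once the resource group is online) what is actually being exported, you can use:

# cat /usr/es/sbin/cluster/etc/exports
# exportfs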

If HACMP services are running, stop them, and then verify and synchronize the cluster:

# smitty hacmp
 > Extended Configuration
  > Extended Verification and Synchronization

Correct any errors that might have occurred.

Then start the cluster services again.

6.6 Bring the Resource Group online

# smitty hacmp 
> System Management (C-SPOC) 
  > Resource Groups and Applications 
   > Bring a Resource Group Online

Run clstat.

# /usr/es/sbin/cluster/utilities/clstat

             clstat - HACMP Cluster Status Monitor
                -------------------------------------

Cluster: CL_01  (1082292739)
Thu Jun 28 18:40:53 GMT+02:00 2012
                State: UP               Nodes: 2
                SubState: STABLE

        Node: power1            State: UP
           Interface: power1 (2)                Address: 192.168.2.5
                                                State:   UP
           Interface: power1b (2)               Address: 192.168.2.6
                                                State:   UP
           Interface: power1_hdisk_hb_01 (1)            Address: 0.0.0.0
                                                State:   UP
           Interface: hapower (2)               Address: 192.168.3.10
                                                State:   UP
           Resource Group: RG_NFS                       State:  On line

        Node: power2            State: UP
           Interface: power2 (2)                Address: 192.168.2.7
                                                State:   UP
           Interface: power2b (2)               Address: 192.168.2.8
                                                State:   UP
           Interface: power2_hdisk_hb_01 (1)            Address: 0.0.0.0
                                                State:   UP

6.7 Move RG_NFS to another node

# smitty hacmp
 > System Management (C-SPOC)
  > Resource Groups and Applications
   > Move a Resource Group to Another Node / Site
    > Move Resource Groups to Another Node

Select the Resource Group and node to which you want to move the RG.

Verify with clstat

          clstat - HACMP Cluster Status Monitor
                -------------------------------------

Cluster: CL_01  (1082292739)
Thu Jun 28 18:45:13 GMT+02:00 2012
                State: UP               Nodes: 2
                SubState: STABLE

        Node: power1            State: UP
           Interface: power1 (2)                Address: 192.168.2.5
                                                State:   UP
           Interface: power1b (2)               Address: 192.168.2.6
                                                State:   UP
           Interface: power1_hdisk_hb_01 (1)            Address: 0.0.0.0
                                                State:   UP

        Node: power2            State: UP
           Interface: power2 (2)                Address: 192.168.2.7
                                                State:   UP
           Interface: power2b (2)               Address: 192.168.2.8
                                                State:   UP
           Interface: power2_hdisk_hb_01 (1)            Address: 0.0.0.0
                                                State:   UP
           Interface: hapower (2)               Address: 192.168.3.10
                                                State:   UP
           Resource Group: RG_NFS                       State:  On line

7. Monitoring and status commands

Monitoring the heartbeats

# lssrc -ls topsvcs
Subsystem         Group            PID     Status
 topsvcs          topsvcs          7602274 active
Network Name   Indx Defd  Mbrs  St   Adapter ID      Group ID
net_ether_01_0 [ 0] 2     2     S    192.168.5.2     192.168.5.2    
net_ether_01_0 [ 0] en2              0x47ec7559      0x47ec755a
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 5615 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 6680 ICMP 0 Dropped: 0
NIM's PID: 9306116
net_ether_01_1 [ 1] 2     2     S    192.168.4.2     192.168.4.2    
net_ether_01_1 [ 1] en0              0x47ec755b      0x47ec755c
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 5616 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 6682 ICMP 0 Dropped: 0
NIM's PID: 6095076
diskhb_0       [ 2] 2     2     S    255.255.10.1    255.255.10.1   
diskhb_0       [ 2] rhdisk_hb        0x87ec7558      0x87ec755d
HB Interval = 2.000 secs. Sensitivity = 4 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent    : 2685 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 2535 ICMP 0 Dropped: 0
NIM's PID: 6750456
  2 locally connected Clients with PIDs:
haemd(6881296) hagsd(7471334) 
  Fast Failure Detection available but off.
  Dead Man Switch Enabled:
     reset interval = 1 seconds
     trip  interval = 20 seconds
  Client Heartbeating Disabled.
  Configuration Instance = 7
  Daemon employs no security
  Segments pinned: Text Data.
  Text segment size: 862 KB. Static data segment size: 1497 KB.
  Dynamic data segment size: 8449. Number of outstanding malloc: 176
  User time 0 sec. System time 0 sec.
  Number of page faults: 0. Process swapped out 0 times.
  Number of nodes up: 2. Number of nodes down: 0.

Show cluster services

# /usr/es/sbin/cluster/utilities/clshowsrv -v
Status of the RSCT subsystems used by HACMP:
Subsystem         Group            PID          Status 
 topsvcs          topsvcs          7602274      active
 grpsvcs          grpsvcs          7471334      active
 grpglsm          grpsvcs                       inoperative
 emsvcs           emsvcs           6881296      active
 emaixos          emsvcs                        inoperative
 ctrmc            rsct             4980888      active

Status of the HACMP subsystems:
Subsystem         Group            PID          Status 
 clcomdES         clcomdES         4063436      active
 clstrmgrES       cluster          8650824      active

Status of the optional HACMP subsystems:
Subsystem         Group            PID          Status 
 clinfoES         cluster          6553794      active

Show all PowerHA interfaces:

# /usr/es/sbin/cluster/utilities/cltopinfo -i
IP Label                         Network          Type     Node             Address                                  If      Netmask          Prefix Length     
=========                        =======          ====     ====             =======                                  ====    =======          =============     
power1_hdisk_hb_01               net_diskhb_02    diskhb   power1           /dev/hdisk_hb                            hdisk_hb                  
hapower                          net_ether_01     ether    power1           192.168.3.10                                     255.255.255.0    24
power1b                          net_ether_01     ether    power1           192.168.2.6                              en2     255.255.255.0    24
power1                           net_ether_01     ether    power1           192.168.2.5                              en0     255.255.255.0    24
power2_hdisk_hb_01               net_diskhb_02    diskhb   power2           /dev/hdisk_hb                            hdisk_hb                  
hapower                          net_ether_01     ether    power2           192.168.3.10                                     255.255.255.0    24
power2                           net_ether_01     ether    power2           192.168.2.7                              en0     255.255.255.0    24
power2b                          net_ether_01     ether    power2           192.168.2.8                              en2     255.255.255.0    24

Show missed heartbeats

# /usr/es/sbin/cluster/utilities/cltopinfo -m

Interface Name      Adapter       Total Missed  Current Missed
                    Address        Heartbeats     Heartbeats
--------------------------------------------------------------
en2                 192.168.5.1             0              0
en0                 192.168.4.1             0              0
rhdisk_hb          255.255.10.0             0              0

Cluster Services Uptime:        0 days 1 hours 37 minutes

Show overall info

# /usr/es/sbin/cluster/utilities/cltopinfo
Cluster Name: CL_01
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 2 node(s) and 3 network(s) defined
NODE power1:
        Network net_diskhb_01
        Network net_diskhb_02
                power1_hdisk_hb_01      /dev/hdisk_hb
        Network net_ether_01
                hapower 192.168.3.10
                power1b 192.168.2.6
                power1  192.168.2.5
NODE power2:
        Network net_diskhb_01
        Network net_diskhb_02
                power2_hdisk_hb_01      /dev/hdisk_hb
        Network net_ether_01
                hapower 192.168.3.10
                power2  192.168.2.7
                power2b 192.168.2.8

Resource Group RG_NFS
        Startup Policy   Online On Home Node Only
        Fallover Policy  Fallover To Next Priority Node In The List
        Fallback Policy  Never Fallback
        Participating Nodes      power1 power2
        Service IP Label                 hapower

Total Heartbeats Missed:        0
Cluster Topology Start Time:    06/28/2012 17:16:19

Application Monitor status:

# /usr/es/sbin/cluster/utilities/clRGinfo -m
---------------------------------------------------------------------------------------------------------------------
Group Name     Group State                  Application state            Node          
---------------------------------------------------------------------------------------------------------------------
RG_NFS         ONLINE                                                    power2        
 clas_nfsv4                                  ONLINE MONITORED

Cluster Status

# lssrc -ls clstrmgrES
Current state: ST_STABLE
sccsid = "@(#)36    1.135.1.97 src/43haes/usr/sbin/cluster/hacmprd/main.C, hacmp.pe, 53haes_r610, 0933A_hacmp610 8/8/09 14:44:29"
i_local_nodeid 1, i_local_siteid -1, my_handle 2
ml_idx[1]=0     ml_idx[2]=1     
There are 0 events on the Ibcast queue
There are 0 events on the RM Ibcast queue
CLversion: 11
local node vrmf is 6100
cluster fix level is "0"
The following timer(s) are currently active:
Current DNP values
DNP Values for NodeId - 1  NodeName - power1
    PgSpFree = 522328  PvPctBusy = 0  PctTotalTimeIdle = 96.952674
DNP Values for NodeId - 2  NodeName - power2
    PgSpFree = 129529  PvPctBusy = 3  PctTotalTimeIdle = 94.480065

Possible cluster states:

ST_INIT The cluster is configured but not active on this node.
ST_STABLE The cluster services are running with resources online.
ST_JOINING The cluster node is joining the cluster.
ST_VOTING The cluster nodes are voting to decide event execution.
ST_RP_RUNNING The cluster is running a recovery program.
RP_FAILED A recovery program event script has failed.
ST_BARRIER Clstrmgr is in between events waiting at the barrier.
ST_CBARRIER Clstrmgr is exiting a recovery program.
ST_UNSTABLE The cluster is unstable, usually due to an event error.
