Thursday, July 21, 2016

XFS Filesystem

I was studying the XFS filesystem and came across some good and easily understandable docs. Below are the extracts.


The XFS file system was developed as a journaling file system that uses B-tree structures to allocate data as fast as possible.
One of the major design goals was support for large files and large file systems. The maximum file size currently supported is 2 exabytes,
and the maximum file system size is 8 exabytes.

The direct I/O option guarantees that a file is not buffered in the buffer cache, but is written to disk immediately after it has been committed.
XFS also offers guaranteed-rate I/O, which guarantees a minimum I/O bandwidth for certain file systems.

Features of XFS Filesystem


1. Journaling

Journaling is a capability that ensures consistency of data in the file system despite any power outage or system crash that may occur. XFS provides journaling for file system metadata,
where file system updates are first written to a serial journal before the actual disk blocks are updated.
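
As a quick illustration, the journal normally lives inside the filesystem, but it can also be placed on a separate device at mkfs time. A minimal sketch, assuming hypothetical devices /dev/sdb1 (data) and /dev/sdc1 (log) and a placeholder mount point /mnt/data:

#mkfs.xfs -l logdev=/dev/sdc1,size=64m /dev/sdb1
#mount -o logdev=/dev/sdc1 /dev/sdb1 /mnt/data        >>>> the same logdev option must be passed at mount time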

2. Allocation Groups

XFS file systems are internally partitioned into allocation groups, which are equally sized linear regions within the file system. Files and directories can span allocation groups. Each allocation group manages its own inodes and free space separately,
providing scalability and parallelism so multiple threads and processes can perform I/O operations on the same file system simultaneously.
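
For example, the number of allocation groups can be chosen at mkfs time and inspected later with xfs_info; the device name and mount point below are placeholders only:

#mkfs.xfs -d agcount=16 /dev/mapper/mpathX            >>>> create the filesystem with 16 allocation groups
#xfs_info /mnt/data | grep agcount                    >>>> agcount/agsize show how an existing filesystem is divided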

3. Striped allocation

If an XFS file system is to be created on a striped RAID array, a stripe unit can be specified when the file system is created. This maximizes throughput by ensuring that data allocations,
 inode allocations and the internal log (the journal) are aligned with the stripe unit.
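
For instance, on a RAID array with a 64 KiB stripe unit across 4 data disks, the geometry could be passed like this (values and device name are illustrative only):

#mkfs.xfs -d su=64k,sw=4 /dev/mapper/mpathX           >>>> su = stripe unit, sw = number of data disks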

4. Variable block sizes

When many small files are expected, a small block size would typically maximize capacity, but for a system dealing mainly with large files,
 a larger block size can provide a performance efficiency advantage.
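
The block size is selected with the -b option of mkfs.xfs, for example (device name is a placeholder; 4 KiB is the usual default, and on Linux the block size traditionally cannot exceed the kernel page size):

#mkfs.xfs -b size=4096 /dev/mapper/mpathX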

5. Delayed allocation

When a file is written to the buffer cache, rather than allocating extents for the data, XFS simply reserves the appropriate number of file system blocks for the data held in memory. The actual block allocation occurs only when the data is finally flushed to disk.
This improves the chance that the file will be written in a contiguous group of blocks, reducing fragmentation problems and increasing performance.
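
Delayed allocation happens automatically for buffered writes; if needed, the related speculative preallocation can be influenced with the allocsize mount option (shown here only as a sketch with placeholder names, it rarely needs tuning):

#mount -o allocsize=64m /dev/mapper/mpathX /mnt/data   >>>> hint for the preferred buffered-write preallocation size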

6. Direct I/O

For applications requiring high throughput to disk, XFS provides a direct I/O implementation that allows non-cached I/O operations to be performed directly against userspace buffers. Data is transferred between the application's buffer and the disk using DMA,
which gives access to the full I/O bandwidth of the underlying disk devices.
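
Applications request this behaviour by opening files with the O_DIRECT flag; from the shell the same path can be exercised with dd, for example (paths are placeholders):

#dd if=/dev/zero of=/mnt/data/testfile bs=1M count=1024 oflag=direct    >>>> bypasses the page cache for the writes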



Figure: time spent removing large files

Figure: comparison of block device, XFS, ext4, and ext3 when writing a large file


How to create new FS/VG/LV in suse linux
=========================================

Pre-requisites:

1. Capture the current state: the output of df -h; fdisk -l; multipath -ll; vgs; pvs; lvs.
2. Take a backup of /etc/fstab (for example as shown below).
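
Both pre-requisites can be captured in one go; the output file names below are just a suggestion:

#{ df -h; fdisk -l; multipath -ll; vgs; pvs; lvs; } > /root/pre_change_$(date +%F).out 2>&1
#cp -p /etc/fstab /etc/fstab.bak_$(date +%F)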


Steps:

1. Once the storage team allocates the LUNs, note down the LUN-ID provided by them.
   Suppose LUN-ID = AB0004lm0000000008n00876d00005e9e

   Run the below command to detect it at the server level:

#rescan-scsi-bus.sh

2. Validate that the new LUN has been detected.
    #ls -ltr /dev/mapper/    - check the latest (last) entry to cross-check
    #multipath -ll | grep "AB0004lm0000000008n00876d00005e9e"
    #ls -l /dev/disk/by-id/ | grep -i "AB0004lm0000000008n00876d00005e9e"

3. Once you are able to find the new LUN, note down its logical name (the /dev/mapper/ multipath device name).



4. Create the PV on the new device.

 # pvcreate /dev/mapper/mpathdi

5. Create the volume group.
 #vgcreate abhidata3vg /dev/mapper/mpathdi

6. Once done, create the LV.
#lvcreate -L 20G -n lvabhidata3 abhidata3vg

7. Here we are going to create an xfs filesystem.
   #mkfs.xfs /dev/mapper/abhidata3vg-lvabhidata3

8. Create the directory on which you want to mount the new FS (for example /oracle/SQ7/sapdata3; here we use /aks) and mount the filesystem.

#mkdir /aks
#mount /dev/abhidata3vg/lvabhidata3 /aks

9. To make the change permanent, add an entry for this filesystem in /etc/fstab:

  /dev/abhidata3vg/lvabhidata3    /aks    xfs    rw,noatime,nodiratime,barrier=0    0 0

  The options above (rw,noatime,nodiratime,barrier=0) are the ones we normally prefer for xfs filesystems.
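
To confirm the fstab entry is correct without a reboot, the filesystem can be unmounted, remounted via fstab, and then checked (using the same /aks mount point as above):

#umount /aks
#mount -a                  >>>> mounts everything listed in /etc/fstab; any error here means the entry needs fixing
#df -hP /aks
#xfs_info /aks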

Friday, July 15, 2016

HACMP Failover Test Scenarios


                          

      CLUSTER FAILOVER TEST SCENARIOS IN AIX ENVIRONMENT

This document covers the Cluster Failover Test Scenarios in an AIX environment.

In AIX, we normally have three ways of performing failover testing:
1.       Manual failover by moving the Resource Group
2.       Automatic failover by abruptly halting the nodes
3.       Failover testing by removing the attached hardware (disabling the NICs, cables, etc.)




Important points that need to be validated as a System Administrator before performing any failover test:

1. Data backup should be handy.

2. Cluster snapshot should be taken.

3. Configuration backup should be taken (including the RG attributes and FS details).

4. If a cross-mount is configured, verify the exports file and compare it with the filesystems that are cross-mounted (a quick check is shown after this list).
    In one case we noticed that a cluster filesystem was mounted as a normal NFS mount, leading to an issue while performing the failover test. The cluster will look for entries in the file "/usr/es/sbin/cluster/etc/exports", if it exists, to mount and unmount the FS.

5. During a failover test, if the RGs go into an error state, there are cases where the cluster will not allow you to execute any cluster commands. In that case you may need to reboot the nodes, so keep the required teams updated that a reboot of both nodes may be needed if issues occur.
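
A quick way to cross-check point 4 on each node (standard AIX/HACMP commands, paths as mentioned above):

#cat /usr/es/sbin/cluster/etc/exports      >>>> exports list the cluster will use, if this file exists
#exportfs                                  >>>> what the node is currently exporting
#mount | grep -i nfs                       >>>> confirm no cluster filesystem is hand-mounted as a plain NFS mount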



    Manual Failover Testing by moving the RGs

Steps :
1.  Open console sessions to both the nodes.
2.  Verify the Resource Group availability on the nodes before the failover test.
               Command to be used #/usr/es/sbin/cluster/utilities/clRGinfo
# clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node          
-----------------------------------------------------------------------------
RES_01     ONLINE                   node1      >>>>>.    RG (RES_01) currently active on node1
                  OFFLINE                  node2       

RES_02     ONLINE                    node2       
                  OFFLINE                   node1 

3.   In this case, we are going to manually move the resource group (RES_GRP_01) from node1 to node2.
4.    From node1, run the command #smitty clstop
                  node1# smitty clstop
                               Stop Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Stop now, on system restart or both                 now                    +
  Stop Cluster Services on these nodes               [node1 ]                +           >>>>>>   select the node
  BROADCAST cluster shutdown?                         true                   +
* Select an Action on Resource Groups            Move Resource Groups      >>>>>  need to select this option for  manual failover


5. The next screen will ask for the Resource Group to move and the node to move it to. Select the appropriate Resource Group and press Enter; it will start the failover.

6. From node2, verify the RG status using the command #/usr/es/sbin/cluster/utilities/clRGinfo
1st probable output


     # clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State                  Node          
-----------------------------------------------------------------------------
RES_01     OFFLINE                       node1       
                  ACQUIRING                  node2         >>>>>>>>>>       failover initiated and node2 is acquiring the Resource group    

RES_02     ONLINE                       node2       
                   OFFLINE                      node1 


2nd probable output

# clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State            Node          
-----------------------------------------------------------------------------
RES_GRP_01     OFFLINE               node1       
                            ONLINE                 node2     Failover completed successfully ,node2 has acquired Resource Group  (RES_GRP_01)

RES_GRP_02     ONLINE                  node2       
                            OFFLINE                 node1 

Note: When stopping the cluster on node1, the first thing executed is the cluster stop script. It brings down the applications and unmounts all application filesystems. If your application stop script is not able to stop all application processes, some filesystems cannot be unmounted and the failover fails. When all resources are down on node1, HACMP starts to bring up all resources on node2. The application start script is the last thing HACMP does.

7. Verify the status of the cluster using the command #lssrc -ls clstrmgrES. It should be in the "stable" state; if so, everything is fine (see the example output after this list).
8. Perform the server-level health check to validate that the FS and cluster IPs have moved successfully.
9. Inform the APP/DB team to start the APP/DB services or validate the APP/DB status after the failover.
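
Example check for step 7 (the exact output format can vary by HACMP/PowerHA level, but a stable cluster manager typically reports ST_STABLE):

#lssrc -ls clstrmgrES | grep -i state
Current state: ST_STABLE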


  Forcing an automatic failover by abruptly halting the active node (typically not recommended, but an option)
HACMP is intelligent enough to differentiate between a deliberate shutdown and an abrupt shutdown of a node due to a hardware failure. When forcing a failover by bringing down the active node, the shutdown and reboot commands will not trigger a failover;
                                 only the halt command will force the automatic RG failover from the server end.

1.       Login to node1 and run the command #halt -q as the root user. This will bring down node1 abruptly and force the RG available on node1 to automatically fail over to node2.
2.       Login to node2 and verify the Resource Group status on node2 using the below command.

# clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State            Node          
-----------------------------------------------------------------------------
RES_01     OFFLINE               node1       
                   ONLINE                node2           Failover completed successfully ,node2 has acquired Resource Group  (RES_01)

RES_02     ONLINE                  node2       
                   OFFLINE                node1 

3.       Verify that all the filesystems and IPs are available on node2 after the automatic failover.
4.       Inform the APP/DB team to validate the APP/DB status and startup (if applicable).