Saturday, September 19, 2015

Introduction to GPFS Filesystem




  • IBM introduced the GPFS filesystem in 1998.
  • GPFS is a high-performance clustered file system developed by IBM.

  • GPFS provides concurrent, high-speed file access to applications executing on multiple nodes of a cluster.

  • It is a high-performance shared-disk file system that can provide fast data access from all nodes in a homogeneous or heterogeneous cluster of servers running the AIX, Linux, or Windows operating systems.

  • All nodes in a GPFS cluster have the same GPFS journaled filesystem mounted, allowing multiple nodes to be active at the same time on the same data (a quick way to check this is shown below).
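
For instance, here is a hedged sketch of how to list the cluster members and check which nodes currently have a filesystem mounted (the filesystem name gpfs0 is only an example):

# mmlscluster                  >>        lists the nodes that are members of the GPFS cluster
# mmlsmount gpfs0 -L           >>        lists the nodes on which the filesystem gpfs0 is currently mounted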




GPFS Filesystem internals 


A file system (or stripe group) consists of a set of disks that are used to store file data, file metadata, and structures used by GPFS itself, including quota files and GPFS recovery logs.


                 How does the GPFS filesystem work?

Whenever a disk is added to a GPFS filesystem, a file system descriptor is written on it. The file system descriptor is written at a fixed position
on each disk, which helps GPFS identify the disk and its place in the file system.

The filesystem descriptor contains file system specifications and information about the state of the file system.


The GPFS filesystem uses the concepts of inodes, indirect blocks, and data blocks to store and access data on the disks.


                 What is metadata?

Inodes and indirect blocks are considered metadata.
The metadata for each file is stored in its inode and contains information such as the file size, ownership, permissions, and last-modification timestamp.

For faster access, the inode of a small file also contains the addresses of all the disk blocks that hold the file's data; larger files use indirect blocks to reference their data blocks.


You can control which disks GPFS uses for storing metadata when creating the file system using the mmcrfs command or
when modifying the file system at a later time by issuing the mmchdisk command.
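
As a hedged sketch (the device name, mount point, block size, and descriptor file path are only examples, and the exact mmcrfs syntax varies slightly between GPFS versions), metadata placement can be set at creation time by listing each disk with its DiskUsage value in a descriptor file and passing that file to mmcrfs:

# mmcrfs gpfs0 -F /tmp/disk-desc.txt -A yes -B 256K -T /gpfs0
        >>  gpfs0 is the device name, -A yes enables automatic mount, -B sets the block size and -T the mount point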


How to define which disk will be used for storing the metadata ?


As already discussed, the format of the disk descriptor file is:

Diskname:::Diskusage:FailureGroup::StoragePool:

The DiskUsage field decides what kind of data will be stored on the disk.

Below are the options that can be used.

  • dataAndMetadata     >>        indicates that the disk stores both data and metadata
  • dataOnly                   >>        indicates that the disk stores only data
  • metadataOnly            >>       indicates that the disk stores only metadata
  • descOnly                   >>        indicates that the disk holds only a copy of the file system descriptor



         

We can also use the same options with the mmchdisk command to change the disk usage of an existing disk.


But after changing the disk usage parameter with the mmchdisk command, we need to run the mmrestripefs command with the -r option to re-allocate the data
as per the new disk parameter. This is an online activity, but running the mmrestripefs command is I/O intensive, so it should be executed when the I/O load is
low.

ex. mmchdisk gpfs0 change -d "gpfsnsd:::dataOnly"

After this, confirm whether the change has been applied successfully using the command below:
mmlsdisk gpfs0
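
A hedged sketch of the follow-up restripe after such a change (to be run when the I/O load is low):

# mmrestripefs gpfs0 -r       >>  re-allocates the data so that it conforms to the new disk usage settings
# mmlsdisk gpfs0              >>  confirm the disk usage and availability after the restripe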


GPFS and memory


GPFS uses three areas of memory:


  •  memory allocated from the kernel heap, 
  • memory allocated within the daemon segment, and 
  • shared segments accessed from both the daemon and the kernel.


Memory allocated from the kernel heap
GPFS uses kernel memory for control structures such as vnodes and related structures that establish the necessary relationship with the operating system.

Memory allocated within the daemon segment
GPFS uses daemon segment memory for file system manager functions. Because of that, the file system manager node requires more daemon memory, since token states for the entire file system are initially stored there.

File system manager functions requiring daemon memory include:

  • Structures that persist for the execution of a command
  • Structures that persist for I/O operations
  • States related to other nodes



Shared segments accessed from both the daemon and the kernel

Shared segments consist of both pinned and unpinned memory that is allocated at daemon startup.
The initial values are the system defaults. However, you can change these values later using the mmchconfig command.


The pinned memory is called the pagepool and is configured by setting the pagepool cluster configuration parameter.
This pinned area of memory is used for storing file data and for optimizing the performance of various data access patterns


In a non-pinned area of the shared segment, GPFS keeps information about open and recently opened files. This information is held in two forms:
    1.  The full inode cache
    2.  The stat cache



Pinned  memory


GPFS  uses pinned memory (also called pagepool memory) for storing file data and metadata in support of I/O operations.
With some access patterns, increasing the amount of pagepool memory can increase I/O performance


Increased pagepool memory can be useful in the following cases:
  • There are frequent writes that can be overlapped with application execution.
  • There is frequent reuse of file data that can fit in the pagepool.
  • The I/O pattern contains sequential reads large enough that prefetching the data improves performance.


Pinned memory regions cannot be swapped out to disk, which means that GPFS will always consume at least the value of pagepool in system memory.


Non-pinned memory
There are two levels of cache used to store file metadata:

Inode cache
The inode cache contains copies of inodes for open files and for some recently used files that are no longer open.
The maxFilesToCache parameter controls the number of inodes cached by GPFS.

Every open file on a node consumes space in the inode cache.
Additional space in the inode cache is used to store the inodes for recently used files in case another application needs that data.

The number of open files can exceed the value defined by the maxFilesToCache parameter, so that applications can continue to operate. However,
when the maxFilesToCache number is exceeded, recently opened files are no longer cached, and only the inode data of currently open files is kept in the cache.


Stat cache
The stat cache contains enough information to respond to inquiries about the file and open it, but not enough information to read from it or write to it.

A stat cache entry consumes significantly less memory than a full inode. The default size of the stat cache is four times the maxFilesToCache parameter.

This value may be changed through the maxStatCache parameter on the mmchconfig command.
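
As a hedged sketch (the values are only illustrative, and depending on the GPFS level a pagepool change may require the GPFS daemon to be recycled before it takes effect), these parameters are inspected and changed with mmlsconfig and mmchconfig:

# mmlsconfig pagepool                                      >>  show the current pagepool value
# mmchconfig pagepool=2G                                   >>  change the pagepool size
# mmchconfig maxFilesToCache=4000,maxStatCache=16000       >>  change the inode cache and stat cache sizes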



Monday, September 14, 2015

Adding the space or disks in GPFS Filesystem

          Steps to add the disks to the filesystem

step 1 : Before adding disks to GPFS, take the details of the existing GPFS disks using the commands below.


       # mmlsnsd
         File system   Disk name    NSD servers
         --------------------------------------------------------------------------
          gpfs0         nsd08        (directly attached)

          gpfs0         nsd09        (directly attached)


      #mmlsnsd -m  >> this gives the details of the corresponding device and NSD volume ID for each disk.
 

Step 2 : Before adding the disk to the GPFS filesystem, we need to create the
         GPFS disk (NSD) using the command mmcrnsd.

         For creating an NSD we need to create a disk descriptor file. The format of the file is as follows;
         it is not necessary to define all the fields.
     

         disk-Name:Primaryserver:backupserver:diskusage:failuregroup:desiredname:storagepool
       
  I am going to add hdisk1, hdisk2, hdisk3, hdisk4, hdisk5 and hdisk6 to the filesystem gpfs0.


    Create the file /tmp/abhi/gpfs-disks.txt .

hdisk1:::dataAndMetadata::nsd01::
hdisk2:::dataAndMetadata::nsd02::
hdisk3:::dataAndMetadata::nsd03::
hdisk4:::dataAndMetadata::nsd04::
hdisk5:::dataAndMetadata::nsd05::
hdisk6:::dataAndMetadata::nsd06::



#mmcrnsd -F /tmp/abhi/gpfs-disks.txt

mmcrnsd: Processing disk hdisk1
mmcrnsd: Processing disk hdisk2
mmcrnsd: Processing disk hdisk3
mmcrnsd: Processing disk hdisk4
mmcrnsd: Processing disk hdisk5
mmcrnsd: Processing disk hdisk6
mmcrnsd: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.

Once the command is successful, we can see the NSD names corresponding to the disks in the lspv output.



# lspv
hdisk0          00c334b6af00e77b                    rootvg          active
hdisk1          none                                nsd01
hdisk2          none                                nsd02
hdisk3          none                                nsd03
hdisk4          none                                nsd04
hdisk5          none                                nsd05
hdisk6          none                                nsd06
hdisk8          none                                nsd08
hdisk9          none                                nsd09


Also, we need to verify using the mmlsnsd command.

# mmlsnsd
 File system   Disk name    NSD servers
--------------------------------------------------------------------------
 gpfs0         nsd08        (directly attached)

 gpfs0         nsd09        (directly attached)

(free disk)   nsd01        (directly attached)

(free disk)   nsd02        (directly attached)

(free disk)   nsd03        (directly attached)

(free disk)   nsd04        (directly attached)

(free disk)   nsd05        (directly attached)

(free disk)   nsd06        (directly attached)


step 3 : After this we need to add the disks to the filesystem.

Before adding the disks to the GPFS filesystem, we need to create a disk descriptor file.
Since we already defined some of the parameters while creating the NSDs, there is no need to define them again here.
                  Only the fields "diskname", diskusage, failure group and storagepool need to be defined.

By default a GPFS cluster will have one storage pool, "system", but we can define more storage pools as per our requirements.

diskname:::diskusage:failuregroup::storagepool:

    cat /tmp/abhi/gpfs-disk.txt
nsd01:::dataAndMetadata:-1::system
nsd02:::dataAndMetadata:-1::system
nsd03:::dataAndMetadata:-1::system
nsd04:::dataAndMetadata:-1::system
nsd05:::dataAndMetadata:-1::system
nsd06:::dataAndMetadata:-1::system

#mmadddisk gpfs0 -F /tmp/abhi/gpfs-disk.txt -r  >>> the -r option is used here to re-balance the data across all the disks, including the new ones

Note: Rebalancing of data is an I/O-intensive job. It is not preferred to use this option during peak load.

Once added, verify the new filesystem size using df -gt and also the output of #mmlsnsd.

# mmlsnsd
 File system   Disk name    NSD servers
--------------------------------------------------------------------------
 gpfs0         nsd08        (directly attached)

 gpfs0         nsd09        (directly attached)

 gpfs0         nsd01        (directly attached)

 gpfs0         nsd02        (directly attached)

 gpfs0         nsd03        (directly attached)

 gpfs0         nsd04        (directly attached)

 gpfs0         nsd05        (directly attached)

 gpfs0         nsd06        (directly attached)
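
In addition to df, a hedged way to check the capacity from the GPFS side is the mmdf command, which reports free and used space per NSD and per storage pool:

# mmdf gpfs0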





Wednesday, July 29, 2015

CPU ---MONITORING & PERFORMANCE & TUNING


The central processing unit (CPU) of a computer is a piece of hardware that carries out the instructions of a computer program. It performs the basic arithmetical, logical, and input/output operations of a computer system. The CPU is like the brain of the computer: every instruction, no matter how simple, has to go through the CPU.

A typical CPU has a number of components.
1. ALU - the arithmetic logic unit, which performs simple arithmetic and logical operations

    2. CU - the control unit, which manages the various components of the computer. It reads and interprets instructions from memory and transforms them into a series of signals to activate other parts of the computer. The control unit calls upon the arithmetic logic unit to perform the necessary calculations.

    3. Cache - keeps recently (or frequently) requested data in a place where it is easily accessible, avoiding the delay associated with reading it from RAM (explained further below).



What is CPU Processor Clock Speed ?

A processor's clock speed measures one thing -- how many times per second the processor has the opportunity to do something.

Ex. A 2.3 GHz processor's clock ticks 2.3 billion times per second, while a 2.6 GHz processor's clock ticks 2.6 billion times per second. All things being equal, the 2.6 GHz chip should be approximately 13 percent faster.



                    What is CPU Caching ?
CPU caching keeps recently (or frequently) requested data in a place where it is easily accessible. This avoids the delay associated with reading data from RAM.
                 
  • A CPU cache places a small amount of memory directly on the CPU. This memory is much faster than the system RAM because it operates at the CPU's speed rather than the system bus speed. The idea behind the cache is that chip makers assume that if data has been requested once, there's a good chance it will be requested again. Placing the data on the cache makes it accessible faster.


 WHY IS CACHE REQUIRED FOR BETTER PERFORMANCE?
 The CPU accesses data from memory, and it is connected to memory through the system bus. The clock speed of the CPU is much higher than the speed of the system bus, so to complete any request the CPU has to wait for data to be fetched over the slower bus; as a result, the request-processing power of the CPU is impacted.
                                                    To overcome this latency, the concept of CPU caching was introduced. The cache sits on the processor chip, stores recently or frequently requested data, and is many times faster than fetching the data from memory. Since the required data is often already available in the cache, the CPU does not have to wait for memory, and the request-processing speed increases.

Typically there are now 3 layers of cache on modern CPU cores:

    L1 cache is very small and very tightly bound to the actual processing units of the CPU; it can typically fulfil data requests within 3 CPU clock ticks. L1 cache tends to be around 4-32 KB depending on CPU architecture and is split between instruction and data caches.

    L2 cache is generally larger but a bit slower and is generally tied to a CPU core. Recent processors tend to have 512 KB of cache per core, and this cache makes no distinction between instructions and data; it is a unified cache.

    L3 cache tends to be shared by all the cores present on the CPU and is much larger and slower again, but it is still a lot faster than going to main memory.


Note : CPU performance also depends largely on the sizes of the L1, L2 and L3 caches.



 Performance metrics in terms of CPU performance

latency

The time that one system component spends waiting for another component in order to complete the entire task. Latency can be defined as wasted time. In networking discussions, latency is defined as the travel time of a packet from source to destination.

response time

The time between the submission of a request and the completion of the response.

response time = service time + wait time

service time

The time between the initiation and completion of the response to a request.

throughput

The number of requests processed per unit of time.

wait time

The time between the submission of the request and the initiation of the response.



Response Time

Because response time equals service time plus wait time, you can increase performance in this area by:

    Reducing wait time

    Reducing service time
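
A small worked example (the numbers are only illustrative): if a request spends 5 ms waiting in the run queue and then needs 15 ms of service time, its response time is 5 ms + 15 ms = 20 ms, so reducing either component reduces the response time.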

Understanding different aspects of CPU service time


  Suppose an LPAR has 2 dedicated physical CPUs allocated to it (no SMT enabled / single-threaded mode). Each CPU will then be processing one request at a time, so there is effectively no wait time and the least possible service time, which in turn improves the application response time.

                                                                   In the other case, suppose the LPAR is assigned 0.4 CPU of entitlement and 2 virtual CPUs, and SMT-2 is enabled; that means there are 2 threads per virtual CPU. Each virtual CPU is entitled to 20 ms per dispatch cycle per core. If requests from both threads of virtual CPU 1 are queued in the run queue at the same time, the thread with the higher priority will be dispatched for execution first, and the CPU dispatcher and scheduler will decide when to give a time slice to the other thread as per the scheduling algorithms. If the primary physical CPU is not able to provide the time slice to the thread, a context switch happens and the request is executed by another physical CPU of the same pool. That means the service time will increase either way, and this in turn increases the application response time.
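
A hedged sketch of the standard AIX commands usually used to watch these CPU metrics on an LPAR (the intervals and counts are only examples):

# vmstat 2 5        >>  the r column shows the run queue; us/sy/id/wa show user, system, idle and I/O-wait CPU time
# lparstat 2 5      >>  shows the physical CPU consumed (physc) and the entitlement used (%entc) by the LPAR
# mpstat -s 2 5     >>  shows how the load is spread across the SMT threads of each virtual CPU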




Thursday, April 09, 2015

Cluster issue(netmon.cf )--solved

                           HACMP CLUSTER ISSUE.

The HACMP cluster nodes each have only one Ethernet adapter, which is virtual.


           IP configuration on both nodes

node-1
 en0:
 boot-ip        : 192.168.3.7
 persistent-ip  : 10.1.1.16

node-2
 en0:
 boot-ip        : 192.168.3.8
 persistent-ip  : 10.1.1.18

Problem statement


After performing the re-configuration, when we started the cluster, one node was getting rebooted automatically.


              Understanding the exact issue.

1. We verified the cluster logs (the hacmp.out and cluster.log files) and found the error messages below on both the nodes.

       hacmp.out log entry on "node-1"

dec  5 23:07:52 node-1 user:notice HACMP for AIX: EVENT START: fail_interface node-1 192.168.3.7    >>>>>>>>>>>>>>>>> this indicates that there is some issue with boot-ip interface
dec  5 23:07:52 node-1 user:notice HACMP for AIX: EVENT COMPLETED: fail_interface node-1 192.168.3.7 0
dec  5 23:07:59 node-1 local0:crit clstrmgrES[7143544]: Sun dec  5 23:07:59 announcementCb: Called, state=ST_STABLE, provider token 1
dec  5 23:07:59 node-1 local0:crit clstrmgrES[7143544]: Sun dec  5 23:07:59 announcementCb: GsToken 2, AdapterToken -1, rm_GsToken 1
dec  5 23:07:59 node-1 local0:crit clstrmgrES[7143544]: Sun dec  5 23:07:59 announcementCb: GRPSVCS announcment code=512; exiting
dec  5 23:07:59 node-1 local0:crit clstrmgrES[7143544]: Sun dec  5 23:07:59  CHECK FOR FAILURE OF RSCT SUBSYSTEMS (topsvcs or grpsvcs) >>> this indicates that there can be a heartbeat issue.
dec  5 23:07:59 node-1 daemon:err|error haemd[13041862]: LPP=PSSP,Fn=emd_gsi.c,SID=1.4.1.37,L#=1395,                                     haemd: 2521-032 Cannot dispatch group services (1). >>> This again indicates that there is some issue with the boot IPs.
dec  5 23:07:59 node-1 user:notice HACMP for AIX: clexit.rc : Unexpected termination of clstrmgrES.
dec  5 19:08:00 node-1 user:notice HACMP for AIX: clexit.rc : Halting system immediately!!!


                      Cluster.log error entry

cllsstbys: No communication interfaces found.


Similar error messages were received on the other node as well.


Now the question arises: why are both the nodes complaining that the boot-ip interfaces are down?


1. So we re-validated the cluster configuration, tried a ping test and also performed a heartbeat functionality test from both the nodes to figure out the exact issue.
                           The cluster configuration was fine, the offline synchronization was happening successfully, the heartbeat links were operational and the ping test was successful.

 2. Then we started verifying the cluster-related configuration files, and at last we were successful in finding the root cause.


                 Root Cause


 After going through the netmon.cf file, which is normally used in virtualized cluster environments, we found the following entries:

        Node-1

!REQD !ALL 192.168.2.8

  The node-1 Ethernet adapter will be considered up only if it is able to ping 192.168.2.8.

  Here was the issue: the entry in the netmon.cf file was incorrect, i.e. the wrong IP was mentioned in the file. Because of this entry, node-1 was trying to reach the IP address 192.168.2.8; since this IP does not exist and is unreachable, the cluster marked the interface as down.
         Node-2

!REQD !ALL 192.168.2.7

 The node-2 Ethernet adapter will be considered up only if it is able to ping 192.168.2.7.


     Here was the same issue with the entry in the netmon.cf file: since the IP 192.168.2.7 was also not reachable, the cluster was marking this interface as down in the cluster log as well.


This led to a condition where each node thought that the other node was not reachable and tried to grab the RGs; to maintain data integrity, it then rebooted the other node.



                   Solution provided


We modified the entries in the netmon.cf file on both the nodes as follows:

 Node-1

!REQD !ALL 192.168.3.8


Node-2

!REQD !ALL 192.168.3.7
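
Before synchronizing, a hedged sanity check is to confirm from each node that its netmon.cf target actually answers (addresses as per the configuration above):

# ping -c 3 192.168.3.8      >> run from node-1; must succeed
# ping -c 3 192.168.3.7      >> run from node-2; must succeed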

After that we synchronized the cluster and started it, and the issue was resolved.


Note : Removing the netmon.cf file altogether would also have resolved the issue.

Friday, March 13, 2015

0505-121:alt_disk_install error & 0516-082 -----solved

0516-082:/usr/sbin/lchangelv: Unable to access a special device file.

0505-121:alt_disk_install error


0516-082:/usr/sbin/lchangelv: Unable to access a special device file.
Execute redefinevg and synclvodm to build correct environment.


              Understanding the exact issue first

 The lchangelv command is a low-level command used to change LV parameters. As per the error, lchangelv is not able to access the special device files, which means the issue seems to be with the device files and possibly with the ODM as well.

                  Below are the steps performed  


     1. For understanding, I ran alt_clone again and tried to figure out the major numbers of the device files related to alt_clone, and got the details below:


                                             major , minor number
brw-rw----    1 root     system       40,  9   Mar 14 06:32 alt_hd10opt
brw-rw----    1 root     system       40, 14   Mar 14 06:32 alt_fslv01







                                                                                                             
 2. Verified the existing device files with major number 40 to figure out whether those device files were actually in use (see the sketch below).
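
A hedged sketch of how such a check can be done (the major number 40 comes from the listing above):

# ls -l /dev | grep "^b" | grep " 40,"      >> list the block special files whose major number is 40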

3. Copied all the device files with major number 40 to a different folder as a backup.

4. Ran alt_clone again, but got the same error. However, I noticed that the minor numbers were again starting with 20, which means there are also some ODM entries affecting this.

5. Since every device file will have an entry in CuDvDr, I started searching for the major number 40 in it:
odmget -q value1=40 CuDvDr
6. Once again cross-verified the values and, after confirmation, removed them:
odmdelete -q value1=40 -o CuDvDr

7. Now my CuDvDr class is clean. But we need to verify all the other object classes:

#odmget CuAt | grep alt
#odmget CuDv | grep alt
#odmget CuDvDr | grep alt
#odmget CuDep | grep alt

#odmget -q parent=altinst_rootvg -o CuDv








8. Found that entries were present only in CuDv and CuDep. Hence removed those entries using the commands below.


odmdelete -q parent=altinst_rootvg -o CuDv
odmdelete -q name=altinst_rootvg -o CuDep



9. Started alt_clone again and it completed successfully.

Wednesday, January 21, 2015

LINUX Disk Storage Management & LVM Storage Management

       LINUX Disk Storage Management & LVM Storage Management




                     Disk Storage  Management
                    ------------------------------------------

Physical disks are represented in LINUX as /dev/sda (SCSI disks) and /dev/hda (IDE disks).

Suppose there are three SCSI disks connected to the server. They will appear on the server as:

1st  Disk --/dev/sda
2nd Disk --/dev/sdb
3rd Disk --/dev/sdc



A valid block device could be one of two types of entries:

    A mapped device — A logical volume in a volume group, for example, /dev/mapper/VolGroup00-LogVol02.

    A static device — A traditional storage volume, for example, /dev/hdbX or /dev/sdaX, where hdb and sda are storage device names and X is the partition number.

What is a Partition?

The physical disk can be divided into one or more logical disks. These logical disks are known as partitions.


The idea is that if you have one hard disk, and want to have, say, two operating systems on it, you can divide the disk into two partitions. Each operating system uses its partition as it wishes and doesn't touch the other ones. This way the two operating systems can co-exist peacefully on the same hard disk. Without partitions one would have to buy a hard disk for each operating system.


*On an IDE drive you can have up to 63 partitions: 3 primary and 60 logical (contained in one extended partition).

*On a SCSI drive the maximum number of partitions is 15.

Ex. -- Suppose you want to create 4 partitions on the new disk /dev/sdb assigned to the server.

After partitioning, the newly created partitions will appear as
 /dev/sdb1, /dev/sdb2, /dev/sdb3 and /dev/sdb4



What is Extended Partition ?


An extended partition is the only kind of partition that can have multiple partitions inside. Think of it like a box that contains other boxes, the logical partitions.

 The extended partition can't store anything itself; it's just a holder for logical partitions.
The extended partition is a way to get around the fact that you can only have four primary partitions on a drive. You can put lots of logical partitions inside it.
 
 What is Logical Partition?

Logical partitions are partitions that are created by dividing up the extended partition.


                       MBR(Master Boot Record)

The MBR is a small program that is executed when a computer begins to boot up (i.e., start up) in order to find the operating system and load parts of it into memory.

The first sector of the disk is the master boot record (MBR).


 The master boot record contains a small program that:

1.  Reads the partition table and checks which partition is active (that is, marked bootable).
2.  Reads the first sector of that partition, the partition's boot sector (the MBR is also a boot sector, but it has a special status and therefore a special name). This boot sector contains another small program that reads the first part of the operating system stored on that partition (assuming it is bootable), and then starts it.


Understanding the Partitioning Concept


There are two ways of partitioning the disk :

 1. Standard Partitions using parted
 2. LVM Partition Management


 Standard Partitions using parted

The parted utility is used in Linux for partitioning disks; in particular it is needed for disks larger than 2 TB, since they require a GPT partition table.

By default, the parted package is included when installing Red Hat Enterprise Linux.

 Using the parted utility , we can perform below tasks.

    a)  View the existing partition table

    b)  Change the size of existing partitions

    c)  Add partitions from free space or additional hard drives 



                          Viewing the Partition Table
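
A hedged example of viewing the partition table of a disk (the device name /dev/sdb is only an example):

# parted /dev/sdb print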






            Creating a Partition

 For creating a partition on a new disk, first we need to label the disk.

From the partition table, determine the start and end points of the new partition and what partition type it should be.
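
A hedged sketch, assuming a brand-new disk /dev/sdb with an msdos label (the sizes are only examples):

# parted /dev/sdb mklabel msdos                      >> label the new disk first
# parted /dev/sdb mkpart primary ext4 1MiB 20GiB     >> create a primary partition between the chosen start and end points
# parted /dev/sdb print                              >> verify the new partition
# mkfs.ext4 /dev/sdb1                                >> create a filesystem on it before use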







 Removing a Partition
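
A hedged example (partition number 4 is illustrative):

# parted /dev/sdb print      >> note the number of the partition to be removed
# parted /dev/sdb rm 4       >> remove partition number 4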


 Creation of swap Partition using parted
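
A hedged sketch (the device, sizes and resulting partition number /dev/sdb2 are only examples):

# parted /dev/sdb mkpart primary linux-swap 20GiB 24GiB    >> create the partition with the linux-swap type
# mkswap /dev/sdb2                                         >> initialize it as swap space
# swapon /dev/sdb2                                         >> enable it; add an /etc/fstab entry to make it persistent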

 Creating a LVM Partition using parted
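
A hedged sketch (the partition number and the volume group / logical volume names are only examples):

# parted /dev/sdb mkpart primary 24GiB 40GiB     >> create the partition
# parted /dev/sdb set 3 lvm on                   >> set the lvm flag on partition number 3
# pvcreate /dev/sdb3                             >> initialize it as an LVM physical volume
# vgcreate datavg /dev/sdb3                      >> create a volume group on it
# lvcreate -L 5G -n datalv datavg                >> carve a 5 GB logical volume out of the volume group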


 Creation of boot partition using parted
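
A hedged sketch (partition number 1 is illustrative):

# parted /dev/sdb set 1 boot on      >> mark partition 1 as bootable
# parted /dev/sdb print              >> the boot flag should now appear against the partition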






Saturday, January 17, 2015

REDHAT LINUX BASICS -STARTUP

                    LINUX -OPERATING SYSTEM.

Linus Torvalds released Linux in 1991 under the GPL.

What is kernel?

The kernel is called the heart of the operating system. The kernel is the program that acts as the chief of operations between the hardware and the rest of the system.

There are many functionalities that are handled by the kernel. Below is a list of some critical functionalities:
 1. Starting & stopping other programs
 2. Handling requests for memory
 3. Accessing disks
 4. Managing network connections etc.


Kernels are basically of two types:

1. Monolithic ----- provides all the services that applications need.
                              EX: Linux uses a monolithic kernel.

2. Micro kernel --- consists of a small core set of services. It needs other modules to be loaded to perform other functions.
                               EX: Windows.



LINUX distributions are classified into two groups:

1. Commercial     --  This type of distribution tends to have a longer release cycle. Commercial vendors also generally offer support for their distribution at a certain cost. EX -- Red Hat, SUSE

2. Non-commercial  -- The company offers the non-commercial distribution basically for testing of its software. Several non-commercial distributions are also backed by support.
Ex: Debian, Fedora, Ubuntu



LINUX Licences:

GNU General Public License (GPL)  --- The GPL states that the software released under it is free. It is acceptable to take the software and resell it for your own profit, but when reselling it, if any changes are made to the code, you need to release the full source code including the changes under the GPL, and the new source code will also be under the GPL. EX: Red Hat

BSD & Apache  -- These types of licences allow the user to modify the source code without disclosing the changes made to it.


------------------------------------------------------------------------------------------------

Basic Linux System Administration Tasks:

1. User Management
2. Logical Volume Management
3. Network Management
4. Device Management
5. Package Management

 --------------------------------------------------------------------------------------

User Management In Linux

1.  Every file or program under Linux is owned by a user.
2.  Each user will be having a unique User ID(UID).
3.  Root user is known as super user which can do all the tasks in linux.
     By default the UID for root user is "0" .
4.  System users normally have UIDs from 0 to 499. Manually created users will have UIDs after that.

 5. All the user information in Linux is kept in text files.

Below are the files where the user's information is kept.

1.  /etc/passwd -- this file stores the user name, password field, UID, GID, GECOS, home directory and login shell information

2. /etc/shadow ---  this file stores the encrypted password information for user accounts.

Why was the /etc/shadow file required if it was possible to keep the passwords in the /etc/passwd file alone?

 Ans: As we all know, the /etc/passwd file is readable by all users, and this was leading to a security threat, since it was easy for attackers to try to crack the encrypted passwords. So to handle this, Linux introduced the /etc/shadow file, which is readable only by the root user or other privileged programs that require access to that information.



How to create a user 

Using the "useradd" command we create users in Linux.
Whenever the useradd command is run, the defaults from the ASCII text file "/etc/default/useradd" are applied.

Content of /etc/default/useradd

# useradd defaults file
GROUP=100
HOME=/home
INACTIVE=-1
EXPIRE=
SHELL=/bin/bash
SKEL=/etc/skel
CREATE_MAIL_SPOOL=yes

 *The above-mentioned parameters are automatically applied once the useradd command is executed.

By default, a group will also be created for the new user .




Changing the default values(changing the /etc/default/useradd parameters)


When invoked with only the -D option, useradd will display the current default values

[root@abhi ~]# useradd -D
GROUP=100
HOME=/home
INACTIVE=-1
EXPIRE=
SHELL=/bin/bash
SKEL=/etc/skel
CREATE_MAIL_SPOOL=yes






--------------------------------------------------------------------------------------------------------
Below help page of linux will be helpful in using the useradd command:



Usage: useradd [options]USER-NAME

Options:
  -b,   --base-dir BASE_DIR             base directory for the home directory of the   new account
  -c, --comment COMMENT               GECOS field of the new account
  -d, --home-dir HOME_DIR            home directory of the new account
  -e, --expiredate EXPIRE_DATE    expiration date of the new account(The date is specified in the format YYYY-MM-DD.)
  -f, --inactive INACTIVE                 password inactivity period of the new account
  -g, --gid GROUP                            name or ID of the primary group of the new account
  -G, --groups GROUPS                   list of supplementary groups of the new account
  -m, --create-home                          create the user's home directory 
  -M, --no-create-home                      do not create the user's home directory
  -p, --password PASSWORD           encrypted password of the new account
  -r, --system                                     create a system account
  -s, --shell SHELL                            login shell of the new account
  -u, --uid UID                                    user ID of the new account
  -U, --user-group                              create a group with the same name as the user

--------------------------------------------------------------------------------------------------
Example

1. #useradd test

 This will create a user ID and its home directory. The home directory will be "/home/user-id" by default.

 2. # useradd -d /home/test  -p test123  test

Here we are creating a user test with home directory "/home/test", and the password string stored in /etc/shadow will be literally "test123".

The "-p" parameter is not recommended unless you pass an already encrypted password (for example one created with crypt); see the sketch below the /etc/shadow output.

[root@abhi ~]# cat /etc/shadow |grep -i test
test:test123:16452:1:90:7:::
[root@abhi ~]#
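
A hedged sketch of passing an already encrypted password instead (openssl is used here only for illustration, and the user name testuser is hypothetical):

# useradd -d /home/testuser -p "$(openssl passwd -1 'test123')" testuser      >> the hashed string, not the plain text, lands in /etc/shadow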


3. Creation of a system user account with UID 510 and primary group ID 500. A system user account will not have a home directory, but the account will have no ageing (i.e. it never expires) by default.
[root@abhi ~]# useradd -u 510 -g 500 -r test

**** This is helpful when a customer requests a user account for collecting some details, which can't create any files or directories except in /tmp.


4. If you want to create a system user with a home directory, you need to use the -m option.
#useradd -r -m test

5. Creating a user ID whose GECOS is "test user". The user account expires on 2015-12-18, and the password will become inactive 5 days after it expires.

[root@abhi ~]# useradd -c "test user" -e 2015-12-18  -f 5 test2

[root@abhi ~]# cat /etc/passwd |grep -i test2
test2:x:501:501:test user:/home/test2:/bin/bash






Content of /etc/shadow file after this:

[root@abhi ~]# cat /etc/shadow|grep test2
test2:!!:16452:1:90:7:5:16787:

Note:   -f 0 means that the account will be disabled as soon as the password expires

            -f -1 means that the password-inactivity feature will be disabled for this user.



Changing the base-dir (HOME) parameter in the /etc/default/useradd file
[root@abhi ~]# useradd -D -b /home/test
[root@abhi ~]# useradd -D
GROUP=100
HOME=/home/test
INACTIVE=-1
EXPIRE=
SHELL=/bin/bash
SKEL=/etc/skel
CREATE_MAIL_SPOOL=yes
[root@abhi~]#





----------------------------------------------------------------------------------------------------------------------
                    How to remove a user .

We are using the command "userdel" to remove the user.




# userdel test       -----removes the user from the system (including the entries in the /etc/passwd & /etc/shadow files). It will not remove the user's home directory.
#userdel -r test     -----removes the user definition and also the home directory of the user.
#userdel -f -r test  -----removes the user definition, home directory and other definitions of the user forcefully, even if the user is still logged in.

               Changing the attributes of user

We can change the attributes of users using the "usermod" command.

Below are the options available for usermod command

Usage: usermod [options] LOGIN

Options:
  -c, --comment  COMMENT                   new value of the GECOS field
  -d, --home HOME_DIR                           new home directory for the user account
  -e, --expiredate EXPIRE_DATE             set account expiration date to EXPIRE_DATE
  -f, --inactive INACTIVE                         set password inactive after expiration to INACTIVE
  -g, --gid GROUP                                    force use GROUP as new primary group
  -G, --groups GROUPS                            new list of supplementary GROUPS
   -l, --login NEW_LOGIN                       new value of the login name
  -L, --lock                                                lock the user account
  -m, --move-home                                    move contents of the home directory to the
                                                                  new location (use only with -d)
   -s, --shell SHELL                                  new login shell for the user account
  -u, --uid UID                                          new UID for the user account
  -U, --unlock                                            unlock the user account



#usermod -L test  ---locks the user account
#usermod -U test ----unlocks the user account
# usermod -u 505 test ---changes the UID of the user
# usermod -g admin test  --changes the primary group of user test to admin
#usermod -G users,admin,system test  -- sets users, admin & system as the supplementary groups of user "test"
#usermod -e 2015-12-18  -f 5 test2     -- sets the account expiry date of the user test2 to 18th Dec 2015 and the password to become inactive 5 days after it expires
#usermod -a -G aks test -- appends the user to the group aks (-a must be used together with -G)
#usermod -m -d /etc/test test ---moves the home directory and its contents to the new location /etc/test for user test
---------------------------------------------------------------------------------------------------------------------


                            How to create a group

We can create a group using the command "groupadd". Group details are stored in the files /etc/group and /etc/gshadow.

#groupadd aks   ---creates a group named "aks"
#groupadd -g 508 abhi ---creates a group abhi with GID 508

                        How to delete a group

we can delete a group using the groupdel command

#groupdel aks

                       Modifying group attributes

Group Attributes are modified using the command "groupmod"

#groupmod -g 510 abhi -- changing the GID for group abhi
#groupmod -n test abhi ---changing the group name from "abhi" to "test"

------------------------------------------------------------------------------------------------------------------------------------



 Some tips on applying Security Hardening   for users.

1. Setting the password policies for particular user 



Listing the current password policies applied to user "test"
#chage -l test

Last password change                                                           : Jan 17, 2015
Password expires                                                                   : Apr 17, 2015
Password inactive                                                                  : never
Account expires                                                                     : never
Minimum number of days between password change        : 0
Maximum number of days between password change        : 90
Number of days of warning before password expires         : 7


 Setting the Maximum parameter to 90: the user will be prompted to change the password after 90 days.

  # chage -M 90 test
  
       


#chage -W 8 test --  start warning the user 8 days before the password expires