Tuesday, September 12, 2017

Dynamic Routing - gated services on AIX


In TCP/IP, routing can be one of two types:

1.  Static routing
2.  Dynamic routing

With static routing, you maintain the routing table manually using the route command. Static routing is practical for a single network communicating with one or two other networks.

* Note - As your network begins to communicate with more networks, the number of gateways increases, and so does the time and effort required to maintain the routing table manually.

With dynamic routing, daemons update the routing table automatically. Routing daemons continuously receive information broadcast by other routing daemons, and so continuously update the routing table.



In AIX, TCP/IP provides two daemons for use in dynamic routing:

1.  routed daemon
2.  gated daemon

The gated daemon supports:

 a) Routing Information Protocol (RIP) and Routing Information Protocol Next Generation (RIPng)
 b) Exterior Gateway Protocol (EGP)
 c) Border Gateway Protocol (BGP) and BGP4+
 d) Defense Communications Network Local-Network Protocol (HELLO)
 e) Open Shortest Path First (OSPF)
 f) Simple Network Management Protocol (SNMP), among others


Routing daemons can operate in one of two modes:
1.  Passive
2.  Active

In active mode, routing daemons both broadcast routing information about their local network periodically to gateways and hosts, and receive routing information from hosts and gateways.

In passive mode, routing daemons receive routing information from hosts and gateways, but do not attempt to keep remote gateways updated (they do not advertise their own routing information).

Dynamic routing daemons must be run in passive (quiet) mode when run on a host that is not a gateway.

I recently came across an environment where gated services were used with the OSPF routing protocol.

This was something new for me, so I started reading PDFs and blogs to understand the exact concepts.

The most important point is that if you want to understand the complete configuration, you first need to understand the routing protocol, how it works, and its network terms.





Now let us go through the basic concepts of the OSPF routing protocol that will be helpful in the configuration.


OSPF 

  • Dynamic routing protocol
  • Link-state technology
  • Runs over IP, protocol 89
  • Designed by the IETF for TCP/IP
  • Supports VLSM - variable-length subnet masks, so subnets of different sizes can be used
  • Multi-vendor - it is an open standard protocol and is widely supported by vendors
  • Fast rerouting - OSPF detects changes in the topology, such as link failures, and converges on a new loop-free routing structure within seconds
  • Minimises routing protocol traffic
  • Low bandwidth requirements
  • Supports different types of areas
  • Route summarisation and authentication
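
To make the "link state" and "fast rerouting" points a little more concrete, here is a minimal sketch of the shortest-path calculation that a link-state protocol runs over its view of the topology. It uses Dijkstra's algorithm; the four-router topology, the router names and the link costs are all made up for illustration, and this is not how gated itself is implemented.

```python
import heapq


def shortest_path_costs(links, source):
    """Dijkstra's algorithm: lowest total cost from source to each reachable router."""
    costs = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, router = heapq.heappop(queue)
        if cost > costs.get(router, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, link_cost in links.get(router, {}).items():
            new_cost = cost + link_cost
            if new_cost < costs.get(neighbour, float("inf")):
                costs[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return costs


# Hypothetical topology: router -> {neighbour: link cost}.
topology = {
    "A": {"B": 10, "C": 1},
    "B": {"A": 10, "D": 1},
    "C": {"A": 1, "D": 1},
    "D": {"B": 1, "C": 1},
}

before = shortest_path_costs(topology, "A")

# Simulate a failure of the C-D link and recompute the paths.
del topology["C"]["D"]
del topology["D"]["C"]
after = shortest_path_costs(topology, "A")

print("before failure:", before)
print("after failure: ", after)
```

In this toy example, router A reaches B through C and D while the C-D link is up, and falls back to the direct but more expensive A-B link once that link is removed.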


 Under construction  ....  


Sunday, September 10, 2017

Network performance .. Some points


Recently I was trying to understand an issue in which the customer complained that their network connections were getting dropped.
The network team worked on it for a long time, and then came to the Unix team to look at it from the server end as well.

Since it was a virtualized environment, we started looking from the network end first. We also asked the application team to let us know how these connections are set up.

We expected that some tuning would be required at both ends to resolve the issue.

Network stats
=============

108038312 packets received
    67173530 acks (for 3510816000 bytes)
    295731 duplicate acks
    0 acks for unsent data
    97425484 packets (2215095896 bytes) received in-sequence
    22985 completely duplicate packets (28717295 bytes)
    0 old duplicate packets
    8552 packets with some dup. data (5423403 bytes duped)
    8332754 out-of-order packets (461387377 bytes)
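
To put the counters above in proportion, here is a small sketch that turns them into percentages. The numbers are copied from the output above; the helper name `percent` and the choice of denominators are my own, and the results are only a rough indication because "packets received" also counts ACK-only packets.

```python
# Counters copied from the network stats above.
packets_received = 108038312
acks = 67173530
duplicate_acks = 295731
completely_duplicate = 22985
partially_duplicate = 8552
out_of_order = 8332754


def percent(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


out_of_order_pct = percent(out_of_order, packets_received)
duplicate_pct = percent(completely_duplicate + partially_duplicate, packets_received)
dup_ack_pct = percent(duplicate_acks, acks)

print(f"out-of-order packets: {out_of_order_pct}% of packets received")
print(f"duplicate packets:    {duplicate_pct}% of packets received")
print(f"duplicate acks:       {dup_ack_pct}% of acks")
```

On these figures the out-of-order packets come to roughly 7.7% of the packets received, while the duplicate packets and duplicate acks are each well under 1%, so the out-of-order count is the one that stands out.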

What is the reason for these out-of-order and duplicate packets at the receiving end?

There are several possible causes:

1. Network congestion
2. The adapter (EtherChannel) configuration
3. The adapter buffers, etc.

The Adapter Configuration
=========================
In our scenario, the EtherChannel is configured as link aggregation, but with “round-robin” as the algorithm.

Let us first understand the round-robin algorithm.


Round-Robin: All outgoing traffic is spread evenly across all of the adapters in the EtherChannel. It provides the highest bandwidth optimization for the AIX server system. Round-robin distribution is the ideal way to utilize all the links equally, but we should also consider that it introduces the potential for out-of-order packets at the receiving system.

The out-of-order packets and duplicate acks can all be due to the EtherChannel “round-robin” algorithm, or they may indicate other network issues.
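
The reordering effect can be illustrated with a toy model. This is only a sketch under my own assumptions: packet i is sent at time i on link i modulo the number of links, each link has a fixed (made-up) delay, and the receiver sees packets in order of arrival time. It does not model the actual AIX EtherChannel implementation.

```python
def arrival_order(num_packets, link_delays):
    """Send packets round-robin over the links; return sequence numbers in arrival order."""
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(link_delays)           # round-robin link selection
        arrival_time = seq + link_delays[link]  # packet seq is sent at time seq
        arrivals.append((arrival_time, seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_out_of_order(order):
    """Count packets that arrive after a packet with a higher sequence number."""
    highest_seen = -1
    late = 0
    for seq in order:
        if seq < highest_seen:
            late += 1
        else:
            highest_seen = seq
    return late


# Two links with equal delay versus two links with different (hypothetical) delays.
equal = arrival_order(8, [1.0, 1.0])
skewed = arrival_order(8, [1.0, 3.5])

print("equal delays: ", equal, "out of order:", count_out_of_order(equal))
print("skewed delays:", skewed, "out of order:", count_out_of_order(skewed))
```

With equal delays the packets arrive in the order they were sent. As soon as one link is slower than the other, packets sent on the slow link are overtaken by later packets on the fast link, which the receiver counts as out-of-order.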



Delayed TCP Acknowledgements
============================

We also noticed that a lot of TCP ACK packets are being delayed. This is normal TCP/IP behavior on UNIX, but it can be an issue for applications that demand high performance (fast response times).

This is normally customized at the application level, but AIX also gives us an option to address it. The “TCP_NODELAY” socket option is disabled by default, which means the TCP Nagle algorithm is used on network transmissions and delays the sending of small successive packets.

The Nagle algorithm means that a TCP connection can have only one outstanding unacknowledged small segment. This delays the sending of further small packets until either the acknowledgement is received or TCP can bundle up more data into a full segment. Setting tcp_nodelay to 1 is a dynamic change and can improve response time.

In some cases this is very helpful for applications that demand fast response times, but it increases CPU overhead and may lead to network congestion.
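
At the application level, disabling the Nagle algorithm usually means setting the TCP_NODELAY socket option on the connection. Below is a minimal sketch using the standard Python socket module; the function name `make_low_latency_socket` is just an illustrative choice, and whether an application should do this depends on its traffic pattern.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with the Nagle algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value disables Nagle, so small writes are sent immediately.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# Read the option back to confirm it is set on this socket.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
print("TCP_NODELAY enabled:", nodelay_enabled)
sock.close()
```

Setting the option per socket limits the change to the connections that actually need the lower latency, rather than changing the behavior for all traffic.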



Before reaching a conclusion, we also need to validate several other parameters.

.... Under construction....