Recently I was trying to understand an issue in which a customer complained that their network connections were getting dropped.
The network team worked on it for a long time, and then came to the Unix team to look at it from the server end as well.
Since it was a virtualized environment, we started looking from the network end first. We also asked the application team to let us know how these connections are set up.
We expected that some tuning would be required at both ends to resolve the issue.
Network stats
=============
108038312 packets received
67173530 acks (for 3510816000 bytes)
295731 duplicate acks
0 acks for unsent data
97425484 packets (2215095896 bytes) received in-sequence
22985 completely duplicate packets (28717295 bytes)
0 old duplicate packets
8552 packets with some dup. data (5423403 bytes duped)
8332754 out-of-order packets (461387377 bytes)
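To put the counters above in proportion, the raw numbers can be turned into percentages. The following is a minimal sketch in Python; the dictionary keys and the helper names `pct` and `summarize` are made up for illustration, and only the values are copied from the output above.

```python
# Counters copied from the TCP statistics shown above.
# The key names are illustrative, not part of any tool's output format.
stats = {
    "packets_received": 108_038_312,
    "acks": 67_173_530,
    "duplicate_acks": 295_731,
    "completely_duplicate_packets": 22_985,
    "out_of_order_packets": 8_332_754,
}


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize(s: dict) -> dict:
    """Express the interesting counters relative to their totals."""
    return {
        "out_of_order_pct": pct(s["out_of_order_packets"], s["packets_received"]),
        "duplicate_ack_pct": pct(s["duplicate_acks"], s["acks"]),
        "duplicate_packet_pct": pct(
            s["completely_duplicate_packets"], s["packets_received"]
        ),
    }


if __name__ == "__main__":
    for name, value in summarize(stats).items():
        print(f"{name}: {value:.2f}%")
```

With these numbers, about 7.7% of the received packets arrived out of order, while duplicate acks and fully duplicate packets are well under 1% of their totals. The out-of-order counter is therefore the one that stands out.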
What is the reason for these out-of-order and duplicate packets at the receiving end?
There are a few possible causes:
1. Network congestion.
2. The adapter (EtherChannel) configuration.
3. The adapter buffers, etc.
The Adapter Configuration
=====================
In our scenario, the EtherChannel is configured as a link aggregation, but with “round-robin” as the distribution algorithm.
Let us first understand the round-robin algorithm.
Round-Robin: All outgoing traffic is spread evenly across all of the adapters in the EtherChannel. It provides the highest bandwidth optimization for the AIX server system. While round-robin distribution is the ideal way to utilize all the links equally, it also introduces the potential for out-of-order packets at the receiving system.
The out-of-order packets and duplicate acks can all be due to the EtherChannel “round-robin” configuration, or they may indicate other network issues.
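Why round-robin can reorder packets is easier to see with a toy model. The sketch below is not how AIX implements EtherChannel. It assumes two links with different, made-up one-way delays and counts how many packets of a single flow arrive after a later-sent packet has already arrived. The function name `count_out_of_order` and all delay values are hypothetical.

```python
from typing import Callable, Sequence


def count_out_of_order(
    n_packets: int,
    link_delays: Sequence[float],
    pick_link: Callable[[int], int],
    send_gap: float = 1.0,
) -> int:
    """Count packets that arrive after a packet with a higher sequence number.

    Packet `seq` is sent at time seq * send_gap on the link chosen by
    pick_link(seq) and arrives link_delays[link] time units later.
    """
    arrivals = []
    for seq in range(n_packets):
        link = pick_link(seq)
        arrivals.append((seq * send_gap + link_delays[link], seq))
    arrivals.sort()  # order in which the receiver sees the packets

    highest_seen = -1
    out_of_order = 0
    for _, seq in arrivals:
        if seq < highest_seen:
            out_of_order += 1
        else:
            highest_seen = seq
    return out_of_order


if __name__ == "__main__":
    delays = [1.0, 3.5]  # hypothetical per-link delays
    # Round-robin: alternate links packet by packet.
    print("round-robin:", count_out_of_order(10, delays, lambda seq: seq % 2))
    # Whole flow pinned to one link.
    print("single link:", count_out_of_order(10, delays, lambda seq: 0))
```

In this model the round-robin case reorders packets as soon as the links differ in delay by more than the gap between sends, while keeping the whole flow on one link never does. That is the trade-off behind round-robin's even link utilization.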
We also noticed that a lot of TCP ack packets are getting delayed. This is normal TCP/IP behavior in UNIX, but it can be an issue for applications that demand high performance (fast response time).
This is normally customized at the application level, but AIX also provides an option to overcome it. The “TCP_NODELAY” socket option is disabled by default, which means the TCP Nagle algorithm is applied to network transmissions and delays the sending of small successive packets.
The Nagle algorithm means that a TCP connection can have only one small segment outstanding that has not yet been acknowledged. This delays sending further packets until either the acknowledgement is received or TCP can bundle up more data into a full segment. Setting tcp_nodelay to 1 is a dynamic change and can improve response time.
This is sometimes very helpful in getting the network throughput needed by applications that demand fast response times, but it increases CPU overhead and may lead to network congestion.
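At the application level, the change described above is a single socket option. The following is a minimal sketch in Python of how an application might disable the Nagle algorithm on its own sockets. The helper names `make_low_latency_socket` and `nagle_disabled` are made up for illustration, and this shows only the per-socket option, not the AIX system-wide setting.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small segments immediately instead of
    # holding them back until earlier data is acknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is set on the given socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_low_latency_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

Setting the option per socket limits the extra small packets to the connections that actually need the lower latency. That keeps the CPU and congestion cost mentioned above contained.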
Before reaching a conclusion, we also need to validate several other parameters... Under construction...