TCP performance

I just found a presentation I created where I was trying to explain to a company exec that increasing the bandwidth between the US and Malaysia would not increase file transfer performance unless we did some tuning on the servers at each end. In fact, we found that if we tuned the servers we did not have to increase the bandwidth at all, hence saving on network costs. The exec’s realization that the speed of light is fixed was interesting: we cannot reduce latency by increasing bandwidth, only by moving the end users closer to the data or the data closer to the end users. If the distance between the two is fixed, then the latency is fixed. Even things like WAN accelerators do not change the latency; they use some smart TCP manipulation, such as adjusting send/receive window sizes or using persistent connections so you don’t suffer the TCP connection handshake time.

The issue comes down to the way TCP works over high latency links, though it can have a dramatic effect on short links as well. TCP expects an acknowledgment for every window of data sent; in extreme cases it may be looking for an ack for every packet. From looking at lots of centralized hosting systems, a server accepting a request from a client would start by sending a single packet, then engage TCP slow start and double the send window every round trip, up to the send buffer size. On modern systems this is not a real issue, as operating systems such as Windows 10 or Linux auto-scale their send/receive buffers; Windows will scale to over 3MB when performing file transfers or moving large blocks of data. The issue comes with older systems, or systems whose send/receive buffers default to smaller sizes and that do not turn on TCP window scaling by default; IBM’s AIX and OSA-Express cards are examples of this.
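
For illustration, here is a minimal Python sketch of setting a socket’s send/receive buffers by hand, the kind of tuning described above. The 3MB figure mirrors the Windows 10 auto-scaling ceiling mentioned; note that on Linux, explicitly setting these options disables the kernel’s auto-tuning for that socket, so only do it where the defaults are too small:

```python
import socket

BUF_SIZE = 3 * 1024 * 1024  # 3MB, roughly what Windows 10 auto-scales to

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Set buffers before connect()/listen() so the window scale factor
# negotiated in the TCP handshake reflects them. On Linux this also
# switches off send/receive buffer auto-tuning for this socket.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF_SIZE)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF_SIZE)

# Read back what the kernel actually granted (Linux doubles the value
# it reports, and sysctls such as net.core.wmem_max cap the request).
print("send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("recv buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```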

It is very easy to calculate the throughput you will achieve: send buffer size (in bytes) / round-trip latency (in seconds) = throughput in bytes/sec. Multiply by 8 if you want bits/sec.
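
As a minimal sketch, here is that formula in Python (the function name is my own, for illustration):

```python
def max_throughput(buffer_bytes: int, rtt_seconds: float) -> float:
    """Ceiling on TCP throughput when the send window is capped by the
    send buffer: at most one buffer's worth of data per round trip."""
    return buffer_bytes / rtt_seconds  # bytes per second
```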

A 16K send buffer with a network loop delay of 20ms works out to

16 * 1024 / 0.02 = 819,200 bytes/sec

Not great if the company had invested in Gig links to achieve high throughput. You will also note that there is no mention of bandwidth in the equation. It basically does not matter how much bandwidth you have: if the servers or workstations are not tuned, you will get poor application performance.

If the system has 64K buffers the result is 3.276MB/sec.

And if the system has 3MB buffers the result is 153.6MB/sec.

Still nowhere near the Gig speeds the company was looking for. So you can imagine the effect if the end user is on the other side of the planet, or just across the Pacific, where loop delays of 200ms to 300ms are common between systems hosted in the US or Europe and their end users.

The best case, with 3MB send buffers but a 250ms loop delay, is only 12.288MB/sec.
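
Plugging those scenarios into the same formula reproduces the numbers above. A minimal, self-contained sketch; I am taking 3MB as 3000 * 1024 bytes, which matches the figures quoted:

```python
def max_throughput(buffer_bytes: int, rtt_seconds: float) -> float:
    return buffer_bytes / rtt_seconds  # bytes per second

scenarios = [
    ("16K buffer,  20ms", 16 * 1024, 0.020),
    ("64K buffer,  20ms", 64 * 1024, 0.020),
    ("3MB buffer,  20ms", 3000 * 1024, 0.020),
    ("3MB buffer, 250ms", 3000 * 1024, 0.250),
]
for label, buf, rtt in scenarios:
    bps = max_throughput(buf, rtt)
    print(f"{label}: {bps:>13,.0f} bytes/sec ({bps / 1e6:.3f} MB/sec)")

# Output:
#   16K buffer,  20ms:       819,200 bytes/sec (0.819 MB/sec)
#   64K buffer,  20ms:     3,276,800 bytes/sec (3.277 MB/sec)
#   3MB buffer,  20ms:   153,600,000 bytes/sec (153.600 MB/sec)
#   3MB buffer, 250ms:    12,288,000 bytes/sec (12.288 MB/sec)
```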

Over the years I saw many critical systems and database updates perform poorly because people thought the network was at fault. It was brought home when we worked to move a system from Japan to Australia. Naturally the users in Japan were concerned that their performance was going to degrade due to the greater latency. We used Netdata to model the performance hit and found it to be about 7% worse; we reported that the users would get 10% poorer performance so that when the system was moved they had forewarning. We moved the system, but tuned the TCP settings and OS parameters where we could, and the result was a 22% increase in performance for the users in Japan. We were asked why we did not tune the system while it was hosted in Japan.

There are two other things to keep in mind when you are suffering poor system performance. The fact that Windows 10 can send over 3MB before it starts to look for an ack is what Cisco calls a microburst. If the network you are running on is not capable of absorbing these microbursts, as in the routers or switches have buffers smaller than the data received, they will start dropping packets, generating retransmissions and poorer performance.
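
A rough back-of-the-envelope sketch of why that hurts. The interface speeds and the 1MB switch buffer here are assumed figures for illustration, not measurements:

```python
BURST_BYTES = 3 * 1024 * 1024   # one Windows 10 send window arriving at once
IN_RATE = 10e9 / 8              # burst arrives at 10Gbit/s (bytes/sec) - assumed
OUT_RATE = 1e9 / 8              # egress drains at 1Gbit/s (bytes/sec) - assumed
PORT_BUFFER = 1 * 1024 * 1024   # assumed 1MB of egress buffering on the switch

burst_time = BURST_BYTES / IN_RATE   # how long the burst lasts on the wire
drained = OUT_RATE * burst_time      # bytes forwarded while the burst arrives
backlog = BURST_BYTES - drained      # bytes the switch must queue

print(f"burst lasts {burst_time * 1000:.2f}ms, "
      f"peak backlog {backlog / 2**20:.2f}MB vs {PORT_BUFFER / 2**20:.0f}MB of buffer")
if backlog > PORT_BUFFER:
    print("queue overflows -> drops -> retransmissions and poorer performance")
```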

The other thing to discuss is the performance of the receiving station, whether a high-end server or an end user workstation. That machine may have small receive buffers, small application memory, or a slow processor, although it is rare today to have insufficient CPU cycles. It is very interesting to watch a file transfer to a receiver that is performing badly: you can sit there and watch TCP progressively reduce its receive window and transmit that reduction back to the sending station. Finally the receiver can no longer move data out of the TCP receive buffers to the application buffers fast enough and totally closes the TCP window. All data transfer stops, yet nearly everyone will blame the network.
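
You can watch this flow-control collapse on a single machine. Below is a minimal Python sketch (loopback only, so latency is near zero, and the tiny buffer sizes are deliberate assumptions to make the effect immediate): a receiver that drains data too slowly lets its buffers fill, the advertised window closes, and the sender grinds to a halt:

```python
import socket
import threading
import time

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Small buffers so the window closes quickly; set on the listener so the
# accepted connection inherits them before window scaling is negotiated.
listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 16 * 1024)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 16 * 1024)
sender.connect(listener.getsockname())
conn, _ = listener.accept()

def slow_receiver() -> None:
    # The application drains only 4KB per second, far slower than the
    # sender: the receive buffer fills and TCP shrinks the advertised
    # window back toward zero.
    while True:
        if not conn.recv(4096):
            return
        time.sleep(1.0)

threading.Thread(target=slow_receiver, daemon=True).start()

sender.setblocking(False)
sent = 0
try:
    while True:
        sent += sender.send(b"x" * 4096)  # raises once the buffers fill
        time.sleep(0.01)
except BlockingIOError:
    # Both buffers are full and the peer's window is closed: TCP itself
    # has stopped the transfer, and the network is not at fault.
    print(f"transfer stalled after {sent:,} bytes - the receiver, not the network")
```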

Not certain how I got here, but in over 8 years of spending every day looking at application and system performance I think I only found the network at fault once, and then only due to a bad decision on the part of the network designers. Application and system performance is not simple; you have to view the big picture: what is the system doing from end to end, and is what is being reported a symptom or the fault? I am really enjoying putting some of these issues down on paper; if you get tired of me rambling, don’t read it.

