Debugging Lower Throughputs
TR-398 is one of the most comprehensive test plans for access point validation, covering scenarios such as baseline performance, coverage, multi-client behaviour, stability, latency, and roaming. While running the TR-398 test suite on a Wi-Fi 6 AP, I started noticing consistent failures in the 5 GHz band.
For example, consider the 6.2.2 Maximum Throughput Test, which measures the maximum throughput of the DUT with a single station active. The test uses TCP traffic, and the station is configured for 80 MHz in the 5 GHz band. According to the pass/fail criteria defined by the Broadband Forum, an 802.11ax 80 MHz 2x2 configuration should achieve 880 Mbps.
To investigate further, I ran throughput tests across all four traffic types (TCP and UDP, in both downlink and uplink directions) and observed noticeably lower throughput with TCP traffic. I then repeated the same tests with a reference AP in the same test environment; the test results are below:
Note: Both APs support only 1G LAN and 2x2 NSS.
As shown above, the 5 GHz TCP downlink throughput is about 100 Mbps lower compared to the reference AP. To identify the cause of this reduced throughput, I captured OTA traffic and analysed the packet capture while running the traffic using LANforge (Candela box).
Packet capture observations:
I then analysed the data lengths of the QoS data frames.
We might wonder how data frame lengths larger than 1500 bytes are observed, and why some frames reach around 3000 and some around 5000 or 9000 bytes.
This is due to frame aggregation in Wi-Fi, which improves efficiency by combining multiple packets into a single transmission. Wi-Fi supports two types of aggregation: A-MSDU and A-MPDU.
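The frame lengths mentioned above can be roughly reproduced with some A-MSDU arithmetic. The sketch below is not taken from the capture: it assumes every MSDU carries a full 1500-byte IP packet plus an 8-byte LLC/SNAP header, and that each A-MSDU subframe adds a 14-byte header and is padded to a 4-byte boundary (the last subframe is left unpadded). The function name amsdu_length is made up for illustration.

```python
# Rough A-MSDU size estimate. Assumptions (not from the capture):
# full 1500-byte IP packets, 8-byte LLC/SNAP per MSDU,
# 14-byte A-MSDU subframe header, 4-byte padding on all but the last subframe.
LLC_SNAP_BYTES = 8
SUBFRAME_HEADER_BYTES = 14
IP_MTU_BYTES = 1500


def amsdu_length(msdu_count, ip_packet_bytes=IP_MTU_BYTES):
    """Return the approximate A-MSDU length in bytes for msdu_count MSDUs."""
    if msdu_count < 1:
        raise ValueError("msdu_count must be at least 1")
    msdu_bytes = LLC_SNAP_BYTES + ip_packet_bytes
    subframe_bytes = SUBFRAME_HEADER_BYTES + msdu_bytes
    # Round every subframe except the last up to a multiple of 4 bytes.
    padded_bytes = (subframe_bytes + 3) // 4 * 4
    return padded_bytes * (msdu_count - 1) + subframe_bytes


if __name__ == "__main__":
    for count in (1, 2, 3, 6):
        print(count, "MSDU(s):", amsdu_length(count), "bytes")
```

Under these assumptions, two MSDUs come to 3046 bytes and six MSDUs to 9142 bytes, which is in the same range as the lengths of around 3000 and 9000 bytes seen in the capture.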
Both the DUT and the reference AP perform A-MSDU and A-MPDU aggregation.
A-MSDU Observations:
As shown in the packet capture snapshot below, the DUT uses A-MSDU aggregation with two MSDUs in a single A-MSDU.
The reference AP, by contrast, aggregates 6 MSDUs per A-MSDU, which reduces overhead and improves throughput compared to the DUT.
A-MPDU Observations:
Although both APs support A-MSDU and A-MPDU aggregation, their aggregation behaviour differs noticeably. The DUT mostly aggregates two MSDUs per A-MSDU and sends a considerably higher percentage of smaller A-MPDUs, whereas the reference AP aggregates up to six MSDUs per A-MSDU and builds a higher percentage of larger A-MPDUs.
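One way to put numbers on this kind of comparison is to tally the data lengths exported from a capture. The sketch below is only an illustration: the length lists are hypothetical, not the measured values, and the 1524-byte subframe size assumes full 1500-byte IP packets. The helper names estimate_msdu_count and msdu_count_share are made up for this example.

```python
from collections import Counter

# Assumed size of one padded A-MSDU subframe carrying a 1500-byte IP packet.
SUBFRAME_BYTES = 1524


def estimate_msdu_count(data_length, subframe_bytes=SUBFRAME_BYTES):
    """Estimate how many MSDUs a QoS data frame of this length carries."""
    return max(1, round(data_length / subframe_bytes))


def msdu_count_share(data_lengths):
    """Return {estimated MSDU count: percentage of frames}."""
    counts = Counter(estimate_msdu_count(length) for length in data_lengths)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {count: 100.0 * seen / total for count, seen in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical data lengths, for illustration only.
    hypothetical_dut = [3046, 3046, 3046, 1522]
    hypothetical_reference = [9142, 9142, 9142, 3046]
    print("DUT:", msdu_count_share(hypothetical_dut))
    print("Reference:", msdu_count_share(hypothetical_reference))
```

The same tally could be applied to A-MPDU sizes by swapping the bucketing function for one based on aggregate length.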
Due to this, the reference AP transmits more payload in each transmission, reducing per-packet overhead and improving airtime utilization. In contrast, the DUT sends smaller aggregated frames more frequently, which increases overhead and limits throughput.
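The effect of aggregate size on airtime can be illustrated with a simplified model in which every transmission pays a fixed overhead (preamble, channel access, Block Ack exchange) regardless of payload size. All numbers below are assumptions for illustration, not measurements: a PHY rate of 1201 Mbps (HE 80 MHz, 2 spatial streams, MCS 11, 0.8 µs guard interval), a fixed overhead of 150 µs, and two hypothetical aggregate sizes.

```python
def airtime_efficiency(payload_bytes, phy_rate_mbps, overhead_us):
    """Fraction of a transmission's airtime spent on payload.

    Simplified model: airtime = payload time + fixed per-transmission overhead.
    """
    payload_us = payload_bytes * 8 / phy_rate_mbps  # bits / (Mbit/s) = microseconds
    return payload_us / (payload_us + overhead_us)


if __name__ == "__main__":
    PHY_RATE_MBPS = 1201.0  # assumed: HE80, 2 streams, MCS 11, 0.8 us GI
    OVERHEAD_US = 150.0     # assumed fixed overhead per transmission
    # Hypothetical aggregate sizes, for illustration only.
    for label, size in (("smaller aggregate", 64_000), ("larger aggregate", 256_000)):
        eff = airtime_efficiency(size, PHY_RATE_MBPS, OVERHEAD_US)
        print(f"{label}: {eff:.0%} of airtime carries payload")
```

In this model the larger aggregate always spends a higher share of airtime on payload, because the fixed overhead is spread over more bytes. Real behaviour also depends on retries, TXOP limits, and rate adaptation, which the sketch ignores.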
Even though both APs operate at similar RSSI, MCS, bandwidth, and NSS, the difference in A-MSDU and A-MPDU aggregation efficiency accounts for the lower throughput observed with the DUT. [I'm still analysing TCP uplink behaviour and will share the uplink observations soon.]
Have you observed similar scenarios where aggregation significantly impacted throughput, even under similar RSSI and MCS conditions? Let’s discuss.
Deepika Balla Thank you for the analysis, which provides a lot of insight into aggregation. In the test results, UDP does not suffer much; only TCP shows a drop, which you attribute to the lower aggregation (the 12% for the DUT in the graph). Could the drop be due to TCP protocol overhead? Are the TCP window scaling configurations the same? Could you also check whether the sliding window shrinks during transmission because of missing TCP ACKs? Is the TCP system configuration the same on the DUT and the reference AP? When analysing UDP, can you confirm whether the same 12% issue occurs? If not, it is most likely a higher-layer issue. This looks interesting, as UDP is not suffering but TCP is.
Are the MPDU or MSDU aggregation settings configurable on the DUT and the market AP?
Deepika Balla Really nice analysis showcasing the power of aggregation, which is the unsung "hero" of better throughput. High PHY rates get a lot of attention, but without proper aggregation, throughput performance lags. Interestingly, in your observations the UDP DL performance of the DUT is better than that of your reference AP. How was the aggregation in that case? If it is different, does the MAC aggregation treat TCP and UDP differently? Did you notice BA bitmap sizes greater than 64? Any inputs on this would be useful.
This is a really insightful post! Thanks for it! Apart from the same NSS, RSSI and other similar conditions, could the difference be due to the chipset used for the 5 GHz radio in the reference and DUT APs? Or is it the same chipset/board, with almost similar configs from an antenna/Tx point of view?
Thanks for sharing. How did you identify that aggregation was the root cause instead of retries or bandwidth limitations?