Practical Optimization of Network Performance via Linux Kernel Tuning

Optimizing network performance is a recurring challenge in environments that demand high efficiency in data communication. The Linux kernel offers a range of tunable parameters that allow for improvements in the TCP/IP stack’s operation, without the need for deep source code modifications. Adjustments to receive and transmit buffers, backlog control, and the selection of congestion control algorithms can significantly impact throughput, latency, and connection stability.

This article describes the progress of an experimental project using a Raspberry Pi 5 running Debian GNU/Linux 12, directly connected to a Windows notebook via Ethernet cable. The main goal is to provide a detailed account of the changes made, the testing environment, the challenges faced, and the partial results obtained so far—providing a basis for evaluating the impact of these tunings in the context of high-speed local networks.


Description of the Changes Made

The project focused on the practical optimization of the Linux kernel's TCP/IP network stack by adjusting critical performance parameters. All changes were made through the /etc/sysctl.conf configuration file on Debian GNU/Linux 12 running on the Raspberry Pi 5.

The main changes implemented were:

  • TCP buffer adjustment: Increased the maximum receive and transmit socket buffer sizes (net.core.rmem_max and net.core.wmem_max) and adjusted the minimum, default, and maximum TCP buffer sizes (net.ipv4.tcp_rmem and net.ipv4.tcp_wmem).
  • Network interface backlog: Increased the value of net.core.netdev_max_backlog, letting the kernel queue more incoming packets ahead of protocol processing and reducing drops during traffic bursts.
  • TCP window scaling: Ensured the net.ipv4.tcp_window_scaling parameter was enabled (it is on by default in modern kernels) to allow TCP windows larger than 64 KB and better utilization of high-speed links.
  • Congestion control algorithm: Switched from the default algorithm (Cubic) to the BBR algorithm by loading the tcp_bbr module and updating the net.ipv4.tcp_congestion_control setting.

All changes were performed via configuration files and kernel modules, without recompiling or directly modifying kernel source code; a representative configuration fragment is sketched below.
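
For reference, a sketch of what such an /etc/sysctl.conf fragment could look like. The numeric values are illustrative assumptions, as the article does not list the exact sizes used:

    # Maximum socket buffer sizes, in bytes (illustrative values)
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    # TCP autotuning ranges: min, default, max, in bytes (illustrative values)
    net.ipv4.tcp_rmem = 4096 131072 16777216
    net.ipv4.tcp_wmem = 4096 16384 16777216
    # Incoming packets queued per CPU ahead of protocol processing
    net.core.netdev_max_backlog = 5000
    # Allow TCP windows larger than 64 KB (already the default on modern kernels)
    net.ipv4.tcp_window_scaling = 1
    # Requires the tcp_bbr module to be loaded
    net.ipv4.tcp_congestion_control = bbr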


Technical Approach

The strategy prioritized adjusting Linux kernel parameters using only the system’s native configuration resources, avoiding any source code modifications or kernel recompilation. This approach ensures practicality, rapid implementation, easy reversal, and reproducibility on other Linux-based systems.

Parameter selection was based on relevance to TCP/IP performance in high-speed, low-latency networks: receive and transmit buffers cap how much data each connection can keep in flight, the backlog controls how many incoming packets can be queued without loss, and the congestion control algorithm determines how the sending rate adapts to network conditions.
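
As a rough sizing guide (an editorial illustration, not a figure from the experiment), the bandwidth-delay product estimates the minimum buffer needed to keep a link fully utilized:

    BDP = bandwidth x round-trip time
        = 1 Gbit/s x 1 ms            (illustrative values for a direct gigabit link)
        = 10^9 bit/s x 10^-3 s
        = 10^6 bits, i.e. about 125 KB

Buffers well above the BDP give TCP autotuning headroom; on a direct link the real round-trip time is far below 1 ms, which is one reason the defaults already perform well in this setup.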

Each modification was evaluated individually and in combination, in order to identify isolated improvements as well as the cumulative effect of the tuning.
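
Because every knob is plain sysctl state, each change can be applied, inspected, and reverted at runtime without a reboot. A minimal sketch (the revert value shown is the customary kernel default, assumed here):

    # Apply everything declared in /etc/sysctl.conf
    sudo sysctl -p
    # Inspect a single parameter
    sysctl net.ipv4.tcp_congestion_control
    # Revert one knob on the fly (1000 is the usual kernel default)
    sudo sysctl -w net.core.netdev_max_backlog=1000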


Tools and Testing Environment

The experiments were conducted in a controlled setup consisting of:

  • Hardware: Raspberry Pi 5 (ARM Cortex-A76, 8 GB RAM) and a modern Windows notebook.
  • Operating Systems: Debian GNU/Linux 12 (Bookworm) on the Pi, Windows 11 on the notebook.
  • Connectivity: Direct Cat5e Ethernet connection, static IP addressing on both devices.
  • Network Configuration: On Windows, the adapter was set to a static IP and the firewall was configured to allow TCP port 5201 (used by iperf3).
  • Measurement Tools: iperf3 (server on the Raspberry Pi, client on Windows) to measure TCP throughput; tests were repeated multiple times, before and after tuning.

All configurations and tests were executed from the terminal, allowing full control and straightforward documentation of the commands used; representative invocations are shown below.
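
A sketch of the commands involved; the client IP address and the firewall rule name are assumed examples, not values quoted from the tests:

    # On the Raspberry Pi (server side)
    iperf3 -s
    # On Windows, allow inbound TCP 5201 through the firewall (run as administrator)
    netsh advfirewall firewall add rule name="iperf3" dir=in action=allow protocol=TCP localport=5201
    # On the Windows notebook (client side), run a 30-second test reporting every 5 s
    iperf3 -c 192.168.1.10 -t 30 -i 5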


Challenges Faced

Several difficulties and limitations arose during implementation and testing:

  • Point-to-point network configuration: Setting up the direct network between the Raspberry Pi and the notebook required manual IP configuration on both devices, which may be unintuitive for less experienced users.
  • Windows firewall adjustment: The Windows firewall initially blocked traffic on the port used by iperf3, requiring explicit permission for TCP port 5201.
  • Loading the BBR module: The BBR algorithm was not enabled by default in the kernel, requiring manual loading and configuration so that it activates automatically on reboot (see the sketch after this list).
  • Monitoring and analysis: Ensuring reliable results required multiple measurements and validation, as small variations could occur due to background processes or temporary fluctuations.
  • Limited observable gain in local networks: While the tuning brought improvements, the direct, sub-millisecond local link left little room for modern algorithms such as Cubic and BBR to differentiate themselves; BBR's advantages show up mainly on paths with meaningful latency, loss, or bufferbloat.
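
A sketch of how the BBR module can be loaded immediately and persisted across reboots on Debian, using standard mechanisms (the file name is an assumed example):

    # Load BBR immediately
    sudo modprobe tcp_bbr
    # Have it loaded automatically at boot via the modules-load.d mechanism
    echo "tcp_bbr" | sudo tee /etc/modules-load.d/tcp_bbr.conf
    # Verify that BBR is now selectable
    sysctl net.ipv4.tcp_available_congestion_control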


Tests Performed and Results

To assess the impact of the adjustments, several throughput tests were conducted using iperf3. Experiments were performed in three main scenarios (the one-line switch between scenarios 2 and 3 is sketched after the list):

  1. Default configuration: All kernel parameters at original values.
  2. Tuned buffers and backlog with Cubic: Buffer and backlog tuning with the default congestion control algorithm (Cubic).
  3. Tuned buffers and backlog with BBR: The same buffer and backlog tuning, but switching the congestion control algorithm to BBR.
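
Switching between scenarios 2 and 3 takes a single sysctl change; a sketch, since the article does not quote the exact commands used:

    # Scenario 2: buffers and backlog tuned, default Cubic
    sudo sysctl -w net.ipv4.tcp_congestion_control=cubic
    # Scenario 3: same tuning, BBR instead
    sudo sysctl -w net.ipv4.tcp_congestion_control=bbr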

Typical results:

  • Default configuration: Average throughput around 767 Mbits/s.
  • Tuning with Cubic: Average throughput increased to about 797 Mbits/s.
  • Tuning with BBR: Average throughput similar to the previous scenario, close to 794 Mbits/s.

These results show that tuning buffers, backlog, and window scaling provided approximately a 4% increase in local network throughput. The difference between Cubic and BBR was negligible in this specific local setup, but the process shows how simple adjustments can boost TCP/IP performance on Linux.



Conclusion

This project demonstrates that practical adjustments to Linux kernel network parameters can yield tangible gains in throughput and stability for TCP connections—even in low-latency local environments. The tuning performed—covering buffers, backlog, window scaling, and congestion control algorithm selection—proved effective and easy to implement.

Despite some limitations of the test environment, this study consolidates easily replicable optimization procedures that require no kernel source changes. The next phases aim to extend the analysis to more complex scenarios and build a robust foundation for future recommendations in high-demand Linux network environments.
