MTU Issue? Nope, It is LRO with Bridge and Bond


This one bugged me for a while because it was so misleading to debug. Most of the time, when faced with connection loss on larger packets, one immediately thinks: damn it, bitten again by PMTU, have to fix the MTU all the way in and out. Recently I got the exact same behaviour on an OpenStack cluster with VLAN provider networking on 10GbE bonds (port channels/LACP/bond mode 4).

Debug Symptom

On a node accessible from the target, create two files:

head -c 1600 /dev/urandom > 1600.txt
head -c 500 /dev/urandom > 500.txt

1600 bytes is big enough to cause problems with fragmentation (the default MTU is 1500 almost everywhere); increase the file size if you have jumbo frames enabled on your path.
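As a side note, you can also probe the path MTU directly with ping and the don't-fragment bit; the payload size just has to account for the IP and ICMP headers. A minimal sketch of the arithmetic (the target host is left out here, as above):

```shell
# Max ICMP payload that fits in one unfragmented frame:
# MTU minus IPv4 header (20 bytes) minus ICMP header (8 bytes).
mtu=1500
icmp_payload=$((mtu - 20 - 8))
echo "max unfragmented ICMP payload for MTU $mtu: $icmp_payload"
# With DF set, anything larger fails fast instead of stalling like TCP:
#   ping -M do -s $icmp_payload <target>
```

If the oversized ping fails with "Message too long" you have a real MTU problem; in our case it did not, which was the first hint that something else was going on.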

Then copy the files, e.g. via scp.

You will see that copying the small file succeeds:

scp admin@ .
admin@'s password:
500.txt                                   100%  500     0.5KB/s   00:00

but copying the large file stalls:

scp admin@ .
admin@'s password:
1600.txt                                                                                                                                    0%    0     0.0KB/s - stalled -^C

This is how an MTU problem reveals itself.

I checked everything and played with various fixes that had helped in the past.

Things like:

iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

did not help.
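For context, MSS clamping works because the TCP MSS is the MTU minus the IP and TCP headers, so the rule rewrites the MSS in SYN packets down to what the path can carry. A quick sketch of the numbers involved (assuming IPv4 and a TCP header without options):

```shell
# TCP MSS = MTU - IPv4 header (20 bytes) - TCP header (20 bytes, no options)
mtu=1500
mss=$((mtu - 20 - 20))
echo "MSS for MTU $mtu: $mss"
```

Since clamping the MSS did nothing here, the problem could not really be an MSS/MTU mismatch on the path.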

A Hint from a Friend - LRO Findings

Then I got a hint from a friend (Thore Bahr) to have a look into rx-vlan-offload. I did so, but that did not help.

But this triggered further investigation and deeper digging into the root cause.

Finally it revealed itself, derived from some other observations:

=> There is an issue with the LRO setting and bonding with Intel ixgbe adapters. We have to turn off LRO.

From the Base Driver for the Intel(R) Ethernet 10 Gigabit PCI Express Family of Adapters README:


WARNING: The ixgbe driver compiles by default with the LRO (Large Receive Offload) feature enabled. This option offers the lowest CPU utilization for receives, but is completely incompatible with routing/ip forwarding and bridging. If enabling ip forwarding or bridging is a requirement, it is necessary to disable LRO using compile time options as noted in the LRO section later in this document. The result of not disabling LRO when combined with ip forwarding or bridging can be low throughput or even a kernel panic.
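To check whether an interface is actually driven by ixgbe, the driver name can be read from sysfs. A hedged sketch (the interface name is a placeholder; a physical NIC such as p3p1 exposes its driver as a symlink, while virtual interfaces have no driver link):

```shell
iface=lo   # placeholder; use your bond slave here, e.g. p3p1
drv_link="/sys/class/net/$iface/device/driver"
if [ -e "$drv_link" ]; then
    # Physical NICs expose their kernel driver as a sysfs symlink.
    driver=$(basename "$(readlink "$drv_link")")
else
    driver="none (virtual interface)"
fi
echo "$iface driver: $driver"
```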

Change Offload Settings with ethtool

First, try to get a connection to prove it is not working:

$ ip netns exec qdhcp-9d444bee-0395-47d9-ae7e-ae315c25e088 ssh

Change the settings with ethtool -K <adapter> lro off

$ ethtool -K p3p1 lro off
$ ethtool -K p3p2 lro off
$ ethtool -k p3p1
Offload parameters for eth6:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
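When rolling this out to many nodes it helps to script the verification. A minimal sketch that parses `ethtool -k`-style output; the input is simulated here with the line captured above, since the real command needs the actual adapter:

```shell
# Extract the LRO state from "ethtool -k"-style output.
get_lro_state() {
    awk -F': ' '/^large-receive-offload:/ {print $2}'
}
# Simulated input; on a real node pipe in:  ethtool -k p3p1
state=$(get_lro_state <<'EOF'
large-receive-offload: off
EOF
)
if [ "$state" = "off" ]; then
    echo "OK: LRO disabled"
else
    echo "WARNING: LRO still enabled"
fi
```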

This was NOT working before; now it works:

$ ip netns exec qdhcp-9d444bee-0395-47d9-ae7e-ae315c25e088 ssh 'uptime'
Warning: Permanently added '' (ECDSA) to the list of known hosts.
 17:34pm  up 10 days  3:20,  0 users,  load average: 0.00, 0.01, 0.05

Make it Permanent

Add the following option to the interface's network config:

ETHTOOL_OPTIONS='-K iface lro off'

We have to add this to the automation for all interfaces in the bond.
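A sketch of what that automation boils down to: enumerate the bond's slaves from sysfs and disable LRO on each. The bond name bond0 is an assumption; this version only prints the commands (drop the echo to actually apply them):

```shell
# Slave interfaces of a bond are listed by the bonding driver in sysfs.
slaves_file=/sys/class/net/bond0/bonding/slaves
if [ -r "$slaves_file" ]; then
    slaves=$(cat "$slaves_file")
else
    slaves="p3p1 p3p2"   # fall back to the interfaces from this post
fi
for iface in $slaves; do
    echo ethtool -K "$iface" lro off
done
```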

Finally, since we use Chef, we can push this setting down to all compute nodes:

knife ssh roles:*comp* -- ethtool -K p3p1 lro off
knife ssh roles:*comp* -- ethtool -k p3p1 | grep large
large-receive-offload: off
large-receive-offload: off
[ .. snipped .. ]
large-receive-offload: off