KN-1012 (MT7981) fw/5.0.12: hardware PPE again breaks long-lived outbound UDP — same class as topic 1171 on MT7620. WebRTC/voice drops every ~13–15 min
Question
Antony
Device: Keenetic Giga KN-1012 (MediaTek MT7981 / Filogic 820, aarch64)
Firmware: KeeneticOS 5.00.C.12.0
Symptom
With "ppe hardware" enabled (the default), a long-lived WebRTC UDP media session (Opus/RTP audio, ~50 pps, to a remote SFU) degrades reliably every ~13–15 minutes: the far side stops hearing us, the client reconnects, the cycle repeats. It reproduces with two independent WebRTC clients — a Chromium-based browser client and a native one — so it is not application-specific.
Localization (two-sided nf_conntrack capture, LAN + WAN, taken live through the moment of failure):
• LAN-IN: packets from the PC keep flowing at ~50 pps — the PC is sending fine;
• WAN-OUT: the translated outbound flow (external IP → SFU) freezes, i.e. the outbound direction of the connection stops being forwarded;
• downlink (inbound from the SFU) stays alive;
• the conntrack entry is tagged [FASTNAT].
So the hardware offload stops forwarding the outbound half of an active UDP session while the inbound half is still alive. This is not congestion (RTT stays ~30 ms, jitter low) and not an idle timeout (traffic is continuous at ~50 pps).
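A minimal sketch of how this check could be automated, assuming cumulative per-direction packet counters are sampled at regular intervals. The `Sample` fields, the `min_pps` threshold and the synthetic data are illustrative assumptions, not values taken from the actual capture.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t: float      # seconds since capture start
    lan_in: int   # cumulative packets seen from the PC on the LAN side
    wan_out: int  # cumulative packets forwarded out of the WAN side
    wan_in: int   # cumulative packets received from the SFU


def find_one_way_stall(samples, min_pps=10.0):
    """Return the start time of the first interval in which LAN-in and the
    downlink are still active but WAN-out has stopped, or None."""
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.t - prev.t
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        lan_rate = (cur.lan_in - prev.lan_in) / dt
        out_rate = (cur.wan_out - prev.wan_out) / dt
        in_rate = (cur.wan_in - prev.wan_in) / dt
        if lan_rate >= min_pps and in_rate >= min_pps and out_rate == 0:
            return prev.t
    return None


# Synthetic example: 50 pps in both directions, WAN-out frozen after 840 s.
stalled = [Sample(t, 50 * t, 50 * min(t, 840), 50 * t) for t in range(0, 901, 10)]
healthy = [Sample(t, 50 * t, 50 * t, 50 * t) for t in range(0, 901, 10)]

stall_at = find_one_way_stall(stalled)
no_stall = find_one_way_stall(healthy)
```

Requiring the downlink to stay active is what separates this signature from an ordinary loss of connectivity, where both directions would stop.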
Root-cause isolation — exactly one variable changed:
• ppe hardware ON → failure guaranteed every ~13–15 min (many cycles in a row);
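• no ppe hardware, no ppe software (both engines off) → 22.5 min call, zero call drops, 0.018% packet loss;
• ppe software only (hardware off) → 34 min with no drops, networkScore = EXCELLENT, 0.024% packet loss; the call ran through two consecutive ~13–15 min windows in which it had always failed with ppe hardware on.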
Nothing else was changed — toggling ppe hardware deterministically breaks / fixes the path.
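As a rough sanity check on the run lengths above, the sketch below converts each clean run into packets sent, packets lost and the number of full failure windows it spanned. It assumes a constant 50 pps in one direction and a worst-case 15 min failure period; both are simplifications of the "~50 pps" and "~13–15 min" figures, and the function name is made up for illustration.

```python
import math

PPS = 50                    # assumed constant packet rate, one direction
WORST_CASE_PERIOD_MIN = 15  # upper end of the observed ~13-15 min failure period


def summarize(duration_min, loss_pct):
    """Return (packets sent, packets lost, full failure windows spanned)."""
    sent = round(duration_min * 60 * PPS)
    lost = sent * loss_pct / 100
    windows = math.floor(duration_min / WORST_CASE_PERIOD_MIN)
    return sent, lost, windows


both_off = summarize(22.5, 0.018)       # run with hardware and software PPE off
software_only = summarize(34.0, 0.024)  # run with software PPE only
```

Under these assumptions the 22.5 min run covers one full worst-case window with roughly 12 packets lost out of 67,500, and the 34 min run covers two windows with roughly 24 lost out of 102,000.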
This is a known class of bug
The same issue was reported and acknowledged by Keenetic years ago on the older MT7620 platform — topic 1171 (in Russian): https://forum.keenetic.com/topic/1171-
A Keenetic developer (ndm) acknowledged it was their oversight: the hardware path did not account for checksum changes inside the UDP stream. It was fixed in KeeneticOS 2.09.A.0.0-3 by routing UDP through ppe software while keeping TCP on ppe hardware.
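For background on why checksums matter on a NAT fast path: when the source address or port of a UDP packet is rewritten, the UDP checksum (which covers the pseudo-header) has to be updated as well, typically with the incremental method from RFC 1624. The sketch below illustrates that general technique only. It is not Keenetic's code and makes no claim about the exact fault on either platform; the addresses, ports and payload are invented for the example.

```python
def fold16(total: int) -> int:
    """Fold carries back into the low 16 bits (one's-complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(words: list[int]) -> int:
    """Internet checksum over a list of 16-bit words."""
    return ~fold16(sum(words)) & 0xFFFF


def incremental_update(csum: int, old: int, new: int) -> int:
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m') for one changed 16-bit word."""
    return ~fold16((~csum & 0xFFFF) + (~old & 0xFFFF) + new) & 0xFFFF


# Pseudo-header plus UDP header and payload, checksum field left out.
# Illustrative values only (documentation address ranges, made-up ports).
words = [
    0xC0A8, 0x010A,  # source IP 192.168.1.10 (LAN side)
    0xC633, 0x6414,  # destination IP 198.51.100.20
    0x0011, 0x000C,  # protocol 17 (UDP), UDP length 12
    0xC350, 0x0D96,  # source port 50000, destination port 3478
    0x000C,          # UDP length 12 (header field)
    0x6869, 0x2121,  # 4-byte payload
]

stale = full_checksum(words)

# NAT rewrite: source IP -> 203.0.113.7, source port -> 40000.
rewrites = {0: 0xCB00, 1: 0x7107, 6: 0x9C40}
translated = list(words)
patched = stale
for idx, new_word in rewrites.items():
    patched = incremental_update(patched, translated[idx], new_word)
    translated[idx] = new_word
```

After the loop, `patched` matches a full recomputation over the translated packet, while the untouched `stale` value does not; a receiver would discard a packet that still carried the stale checksum.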
On the new MT7981 platform the symptom is back: either this is a regression, or the automatic "UDP → software PPE" routing introduced in 2.09 is not applied on the KN-1012. OpenWrt has an analogous hardware-flow-offload defect reported for the same MT7981 SoC, issue #19449 "hardware flow offloading working abnormally": https://github.com/openwrt/openwrt/issues/19449
The general "HW offload breaks long-lived UDP" class is tracked in OpenWrt issue #17915: https://github.com/openwrt/openwrt/issues/17915
Request
Please either restore the automatic routing of UDP through software PPE (as done for MT7620 in 2.09), or expose a CLI/UI switch to disable UDP offload while keeping TCP hardware offload. Right now the only workaround is a full "no ppe hardware", which gives up gigabit LAN acceleration.
Notes
The single deciding variable is the state of ppe hardware — firmware, client and network are otherwise identical across the failing and working runs, and it reproduces across independent WebRTC clients (browser-based and native). Happy to provide a self-test diagnostic file or run any additional repro on request.