Antony

Everything posted by Antony

  1. Device: Keenetic Giga KN-1012 (MediaTek MT7981 / Filogic 820, aarch64)
     Firmware: KeeneticOS 5.00.C.12.0

     Symptom

     With "ppe hardware" enabled (the default), a long-lived WebRTC UDP media session (Opus/RTP audio, ~50 pps, to a remote SFU) degrades reliably every ~13–15 minutes: the far side stops hearing us, the client reconnects, and the cycle repeats. It reproduces with two independent WebRTC clients, a Chromium-based browser client and a native one, so it is not application-specific.

     Localization (two-sided nf_conntrack capture, LAN + WAN, taken live through the moment of failure):

     • LAN-IN: packets from the PC keep flowing at ~50 pps, so the PC is sending fine;
     • WAN-OUT: the outbound translation (external IP → SFU) freezes, i.e. the outbound half of the 5-tuple stops being forwarded;
     • downlink (inbound from the SFU) stays alive;
     • the conntrack entry is tagged [FASTNAT].

     So the hardware offload stops forwarding the outbound half of an active UDP session while the inbound half is still alive. This is not congestion (RTT stays ~30 ms, jitter low) and not an idle timeout (traffic is continuous at ~50 pps).

     Root-cause isolation (exactly one variable changed):

     • ppe hardware ON → failure guaranteed every ~13–15 min (many cycles in a row);
     • no ppe hardware (both off) → 22.5 min call, zero drops, 0.018% loss;
     • ppe software only → 34 min clean, networkScore = EXCELLENT, 0.024% loss, surviving two windows where it previously always failed.

     Nothing else was changed: toggling ppe hardware deterministically breaks / fixes the path.

     This is a known class of bug

     The same issue was reported and acknowledged by Keenetic years ago on the older MT7620 platform, in topic 1171 (in Russian): https://forum.keenetic.com/topic/1171-

     A Keenetic developer (ndm) acknowledged it was their oversight: the hardware path did not account for checksum changes inside the UDP stream. It was fixed in KeeneticOS 2.09.A.0.0-3 by routing UDP through ppe software while keeping TCP on ppe hardware.

     On the new MT7981 platform the symptom is back: either this is a regression, or the automatic "UDP → software PPE" routing from 2.09 is not applied on the KN-1012.

     The same MT7981 SoC has an analogous hardware-flow-offload defect in OpenWrt, issue #19449 "hardware flow offloading working abnormally": https://github.com/openwrt/openwrt/issues/19449

     The general "HW offload breaks long-lived UDP" class is #17915: https://github.com/openwrt/openwrt/issues/17915

     Request

     Please either restore the automatic routing of UDP through software PPE (as done for MT7620 in 2.09), or expose a CLI/UI switch to disable UDP offload while keeping TCP hardware offload. Right now the only workaround is a full "no ppe hardware", which gives up gigabit LAN acceleration.

     Notes

     The single deciding variable is the state of ppe hardware. Firmware, client and network are otherwise identical across the failing and working runs, and the failure reproduces across independent WebRTC clients (browser-based and native). Happy to provide a self-test diagnostic file or run any additional repro on request.
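For anyone who wants to repeat the two-sided localization from the report, here is a minimal sketch of the comparison it relies on: find the first moment where the WAN-OUT side goes silent while LAN-IN is still flowing. It is not the tooling used for the report. The function names, the thresholds (`min_lan_pps`, `stall_seconds`) and the input format (plain lists of packet timestamps in seconds, one list per capture point) are assumptions made for illustration, and the data at the bottom is synthetic.

```python
from collections import Counter
from typing import Iterable, Optional


def per_second_counts(timestamps: Iterable[float]) -> Counter:
    """Bucket packet timestamps (in seconds) into whole-second bins."""
    return Counter(int(t) for t in timestamps)


def find_outbound_stall(
    lan_in: Iterable[float],
    wan_out: Iterable[float],
    min_lan_pps: int = 25,   # hypothetical threshold: LAN side counts as "flowing"
    stall_seconds: int = 3,  # hypothetical threshold: WAN silence that counts as a stall
) -> Optional[int]:
    """Return the first second at which WAN-OUT carries no packets for
    `stall_seconds` consecutive seconds while LAN-IN keeps flowing,
    or None if no such window exists."""
    lan = per_second_counts(lan_in)
    wan = per_second_counts(wan_out)
    if not lan:
        return None

    run = 0  # length of the current "LAN alive, WAN silent" streak
    for sec in range(min(lan), max(lan) + 1):
        if lan[sec] >= min_lan_pps and wan[sec] == 0:
            run += 1
            if run == stall_seconds:
                return sec - stall_seconds + 1
        else:
            run = 0
    return None


# Synthetic example: the sender emits 50 pps for 20 s,
# but nothing leaves the WAN side from t = 12 s onwards.
lan_ts = [i / 50 for i in range(50 * 20)]
wan_ts = [t for t in lan_ts if t < 12.0]

print(find_outbound_stall(lan_ts, wan_ts))  # 12
print(find_outbound_stall(lan_ts, lan_ts))  # None
```

Requiring several consecutive silent seconds, rather than a single empty bin, keeps ordinary jitter or a short burst of loss from being reported as a stall. Real captures would need the two timestamp lists extracted per flow (same 5-tuple before and after NAT) before this comparison means anything.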