QUOTE(Anime4000 @ Jun 1 2025, 04:25 AM)
It just disconnects by itself.
Could someone be doing checks on the BNG late at night on a weekend? But I don't think they work this late.
OK, I got a different theory.
I lean towards a problem with the BNG rather than the last-mile distribution network, based on the following facts:
1. The OLT cannot do HQoS.
2. The OLT is purely layer 2 and cannot differentiate IPv4 from IPv6.
3. Your problem occurs only with IPv4 and not IPv6, on every ONU regardless of PPTP or VEIP, so it cannot be a layer 2 issue.
4. Only the BNG does layer 3.
My guess is...
Since this is a new BNG, maybe they didn't configure the TCAM partitioning correctly. I don't know the hardware or its specs, or even whether it uses TCAM or HBM. But let's just assume it is a classic service router that uses TCAM.
The IPv4 routing table is very large, and if the TCAM partition is configured incorrectly, it will overflow and the BNG will fall back to software processing. The only way to know for sure is to check the BNG log for TCAM exceptions. At least on Cisco, that's what shows up in the log.
On IPv6, the routing table is much smaller, so it fits without a problem and packets just fly through.
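To make the idea concrete, here is a rough Python sketch of what I mean by a partition overflowing. Everything in it is made up for illustration: the class name, the partition sizes and the route counts are assumptions, not numbers from any real BNG or vendor. It only shows the arithmetic: routes beyond the partition's capacity do not fit in hardware and would have to be handled in software.

```python
from dataclasses import dataclass


@dataclass
class TcamPartition:
    """Hypothetical model of one TCAM partition (not a vendor API)."""
    name: str
    capacity: int  # hardware entries allocated to this address family (assumed)
    routes: int    # routes the control plane wants to install (assumed)

    @property
    def in_hardware(self) -> int:
        # Only as many routes as the partition can hold are forwarded in hardware.
        return min(self.routes, self.capacity)

    @property
    def overflow(self) -> int:
        # Anything beyond capacity would fall back to software processing.
        return max(0, self.routes - self.capacity)


def report(p: TcamPartition) -> str:
    if p.overflow:
        return f"{p.name}: {p.overflow} routes do not fit, software fallback"
    return f"{p.name}: all {p.routes} routes in hardware"


# Illustrative numbers only: a too-small IPv4 partition, a roomy IPv6 one.
ipv4 = TcamPartition("IPv4", capacity=512_000, routes=1_000_000)
ipv6 = TcamPartition("IPv6", capacity=256_000, routes=200_000)

for partition in (ipv4, ipv6):
    print(report(partition))
```

With these assumed numbers, IPv4 overflows while IPv6 fits entirely, which is the pattern that would match an IPv4-only problem.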
Now if this actually happens, it is undefined behavior territory. They might get route flapping. Will PPPoE disconnect? Not sure; behavior is undefined when the system is overloaded.
There might be another cause: CGNAT state tracking. Even if you have a public IP, there might still be state tracking. Maybe it is a configuration issue, maybe it is a software bug, maybe it is just how it works. But I really haven't come across a CGNAT state tracking issue causing PPP sessions to be killed.
Or maybe the supervisor / line card is just broken. Everything is just a guess.
But the main focus should be on the BNG, based on what happened to you.
EDIT:
Just to mention, at least on a Cisco, when the TCAM overflows, latency increases and bandwidth drops. Exactly like what you observed.
This post has been edited by kwss: Jun 1 2025, 04:45 AM