So it’s been almost 2 years since I last dumped my ideas into a post on this site. I need a side project for 2026, so this may turn into a long-term, 5-year hobby project: trying to build a virtual BNG…
Why do people make BNGs sound so complicated?
A lot of networking people I speak with about BNGs tend to think that BNGs/subscriber management is a lost art, and that it takes black magic to fully understand how a BNG works. If you start with the basic concepts of subscriber management at the control plane layer, you end up breaking it into 3 main parts.
- Subscriber Access Termination Protocol (IPoE vs PPPoE)
- AAA
- Subscriber Management
Let’s explore these 3 parts at a high level.
Subscriber Access Termination Protocol (IPoE vs PPPoE)
A BNG serving IPoE clients needs to track the DHCP DORA process so that it can build the RADIUS Access-Request as soon as a DHCP Discover arrives. That means going deep into the DHCP packet and using specific fields to build, firstly, the subscriber metadata (eg. xid+mac for dhcpv4 and xid+duid for dhcpv6) and, secondly, the RADIUS authentication request (eg. User-Name from the mac address, pppoe agent relay id, dhcpv4 option 82 or dhcpv6 option 18/37, etc… this is typically configurable via some Subscriber/AAA framework on the vendor device). Furthermore, during the DHCP process we may need to interact with the data plane layer to build components like a virtual interface, the subscriber IP in the FIB/forwarding plane, QoS profiles, and other bits, which then ties into the subscriber management process.
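To make the xid+mac idea concrete, here is a minimal sketch of pulling both fields out of a raw DHCPv4 payload. The offsets come from the fixed BOOTP header layout in RFC 2131; the `SubscriberKey` type and `KeyFromDHCPv4` function are names I made up for illustration, not something from an existing library or from the PoC itself.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// SubscriberKey is a hypothetical lookup key for a pending DHCPv4 session.
type SubscriberKey struct {
	XID uint32
	MAC string
}

// dhcpv4FixedLen is the size of the fixed BOOTP header (RFC 2131),
// i.e. everything before the magic cookie and the options.
const dhcpv4FixedLen = 236

// KeyFromDHCPv4 reads xid (bytes 4-7) and chaddr (starting at byte 28)
// from a DHCPv4 payload, meaning the bytes after the UDP header.
func KeyFromDHCPv4(payload []byte) (SubscriberKey, error) {
	if len(payload) < dhcpv4FixedLen {
		return SubscriberKey{}, errors.New("short DHCPv4 packet")
	}
	hlen := int(payload[2])
	if payload[1] != 1 || hlen != 6 { // htype 1 = Ethernet, 6-byte MAC
		return SubscriberKey{}, errors.New("not an Ethernet client")
	}
	xid := binary.BigEndian.Uint32(payload[4:8])
	mac := net.HardwareAddr(payload[28 : 28+hlen])
	return SubscriberKey{XID: xid, MAC: mac.String()}, nil
}

func main() {
	// Hand-built header standing in for a real DHCP Discover.
	pkt := make([]byte, dhcpv4FixedLen)
	pkt[0], pkt[1], pkt[2] = 1, 1, 6 // op=BOOTREQUEST, htype=Ethernet, hlen=6
	binary.BigEndian.PutUint32(pkt[4:8], 0xdeadbeef)
	copy(pkt[28:], []byte{0x02, 0x00, 0x00, 0xaa, 0xbb, 0xcc})

	key, err := KeyFromDHCPv4(pkt)
	fmt.Println(key, err)
}
```

A struct like this can be used directly as a Go map key, which is handy when the Offer comes back and has to be matched to the Discover that triggered it.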
As the DHCP (or PPP(oE)) process continues, we may need to program the IPv4 address, IPv6 address and/or delegated prefix into the data/forwarding plane so that return traffic from the internet->subscriber can be given all the right headers and reach the right customer. For IPoE we need to build the layer 2 ethernet headers (we will talk about VLANs/QinQ at a later point); for PPPoE we need to build the PPP encapsulation and the PPPoE headers, like the session id.
Remember, high level… Later we will dive into how to get a DHCP/PPPoE packet from the NIC into a user space application for testing purposes, and then explore ways to offload this packet classification onto the NIC itself, before the packet even enters the Linux kernel. So let’s move on to AAA.
Note: Do you see how a lot of BNG functions are control plane level? Most, if not all, vendors punt control packets like DHCP and PPPoE from the ASICs up into user space applications, which then process these packets and perform the subscriber management. DHCP Discovers don’t need to be processed at millions of packets a second, but ordinary non-DHCP/data plane traffic does if we want line rate speeds.
AAA
Focusing on just Authentication for now, if we take the IPoE example of the DHCP Discover packet, the typical process is to inspect the DHCP packet and pick out a few fields, which the operator (ISP) configures in the CLI under the subscriber management hierarchy, to use when crafting the RADIUS authentication request. Eg. an Access-Request whose User-Name is set from the dhcpv4 option 82 remote-id, or from the agent-remote-id in the PPPoE PADR packet.
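As a rough illustration of the option 82 case, the sketch below walks the DHCPv4 options, finds the Relay Agent Information option (RFC 3046) and returns sub-option 2 (remote-id) as the User-Name. The function names are hypothetical, and treating the remote-id as printable text is an assumption: plenty of access nodes put binary data in there, so a real implementation would probably hex-encode it or make the format configurable.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	optPad          = 0
	optEnd          = 255
	optRelayAgent   = 82 // Relay Agent Information (RFC 3046)
	subOptCircuitID = 1
	subOptRemoteID  = 2
)

var (
	errNotFound  = errors.New("option not found")
	errTruncated = errors.New("truncated option")
)

// findOption scans DHCPv4 options (the bytes after the magic cookie).
func findOption(opts []byte, code byte) ([]byte, error) {
	for i := 0; i < len(opts); {
		c := opts[i]
		if c == optEnd {
			break
		}
		if c == optPad { // pad is a single byte with no length
			i++
			continue
		}
		if i+1 >= len(opts) {
			return nil, errTruncated
		}
		l := int(opts[i+1])
		if i+2+l > len(opts) {
			return nil, errTruncated
		}
		if c == code {
			return opts[i+2 : i+2+l], nil
		}
		i += 2 + l
	}
	return nil, errNotFound
}

// findSubOption scans option 82 sub-options, which are plain
// code/length/value triples without pad or end markers.
func findSubOption(b []byte, code byte) ([]byte, error) {
	for i := 0; i+2 <= len(b); {
		l := int(b[i+1])
		if i+2+l > len(b) {
			return nil, errTruncated
		}
		if b[i] == code {
			return b[i+2 : i+2+l], nil
		}
		i += 2 + l
	}
	return nil, errNotFound
}

// UserNameFromRemoteID returns option 82 sub-option 2 as a string.
// Assumption: the remote-id is printable text.
func UserNameFromRemoteID(opts []byte) (string, error) {
	relay, err := findOption(opts, optRelayAgent)
	if err != nil {
		return "", err
	}
	rid, err := findSubOption(relay, subOptRemoteID)
	if err != nil {
		return "", err
	}
	return string(rid), nil
}

func main() {
	// Made-up options: message type Discover, then option 82 carrying
	// circuit-id "eth1" and remote-id "cust42".
	opts := []byte{
		53, 1, 1,
		optRelayAgent, 14,
		subOptCircuitID, 4, 'e', 't', 'h', '1',
		subOptRemoteID, 6, 'c', 'u', 's', 't', '4', '2',
		optEnd,
	}
	fmt.Println(UserNameFromRemoteID(opts))
}
```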
If we then want to speak to the subscriber RADIUS systems, this doesn’t happen in hardware; it typically just runs over a UDP socket inside the subscriber management daemon or a separate AAA daemon. Because UDP is fire and forget, we need to keep the socket open (with timeouts) and actually match up the return traffic ourselves while we wait for the Access-Accept or Access-Reject to come back. At this point our DHCP Discover is still held in memory/buffered, ready to be fired over to the DHCP server when we get that Access-Accept.
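Here is a rough sketch of that "UDP socket with a timeout" idea using only the Go standard library. It shows the RFC 2865 packet framing and the read deadline, nothing more. To keep it short it leaves out things a real RADIUS server will insist on: there is no User-Password or Message-Authenticator attribute, the response authenticator is not verified against the shared secret, and there are no retransmits. The function names and the server address are placeholders of mine.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
	"net"
	"time"
)

const (
	codeAccessRequest = 1
	codeAccessAccept  = 2
	codeAccessReject  = 3
	attrUserName      = 1
)

// buildAccessRequest encodes the RADIUS framing: code, identifier,
// length, 16-byte request authenticator, then a User-Name attribute.
func buildAccessRequest(id byte, userName string) ([]byte, error) {
	if len(userName) == 0 || len(userName) > 253 {
		return nil, errors.New("bad User-Name length")
	}
	pkt := make([]byte, 20, 20+2+len(userName))
	pkt[0] = codeAccessRequest
	pkt[1] = id
	if _, err := rand.Read(pkt[4:20]); err != nil {
		return nil, err
	}
	pkt = append(pkt, attrUserName, byte(2+len(userName)))
	pkt = append(pkt, userName...)
	binary.BigEndian.PutUint16(pkt[2:4], uint16(len(pkt)))
	return pkt, nil
}

// authenticate sends the request and waits for a reply until the timeout.
// It returns true on Access-Accept and false on Access-Reject.
func authenticate(server string, req []byte, timeout time.Duration) (bool, error) {
	conn, err := net.Dial("udp", server)
	if err != nil {
		return false, err
	}
	defer conn.Close()

	if _, err := conn.Write(req); err != nil {
		return false, err
	}
	if err := conn.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return false, err
	}
	buf := make([]byte, 4096)
	n, err := conn.Read(buf)
	if err != nil {
		return false, err // includes the timeout case
	}
	if n < 20 || buf[1] != req[1] {
		return false, errors.New("malformed or mismatched reply")
	}
	switch buf[0] {
	case codeAccessAccept:
		return true, nil
	case codeAccessReject:
		return false, nil
	}
	return false, fmt.Errorf("unexpected RADIUS code %d", buf[0])
}

func main() {
	req, err := buildAccessRequest(1, "cust42")
	if err != nil {
		fmt.Println("build:", err)
		return
	}
	// 127.0.0.1:1812 is a placeholder; point it at a lab RADIUS server.
	ok, err := authenticate("127.0.0.1:1812", req, 500*time.Millisecond)
	fmt.Println("accepted:", ok, "err:", err)
}
```

The identifier byte is what lets us pair a reply with the buffered Discover it belongs to, so in practice it would be allocated per outstanding request rather than hard-coded.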
Let’s quickly jump into Subscriber Management and then show you what I’ve done already (yes, bngblaster returns 10/10 IPoE sessions for my DIY BNG project built purely in Go, how awesome is that?!?!).
Subscriber Management
Ok, so the previous 2 parts are technically integrated into most vendor routing solutions, so… what areas do we need to explore to make a successful subscriber management architecture? Exploring some ideas may take us deep down the rabbit hole, so I’ll keep it high level again with IPoE. It’s just much simpler, and 100% what I’ll be implementing first before PPPoE.
- DHCP Discover comes in and we hold it in memory while our AAA implementation authorizes the user onto the network. We then forward the Discover to a DHCP server, almost like a buffered DHCP relay (or process it via a DHCP daemon if we are a local DHCP server), and the Offer comes back.
- Once we see the DHCP Offer come back we can build the virtual subscriber, at least partially in memory… At this point we could record the metadata about the IP address and build the subscriber in a database running entirely in memory. We know that the rest of the DHCP process should be the subscriber broadcasting a Request for the IP sent in the Offer, so we could now ask the dataplane to program some forwarding information about this client (eg. ethernet mac address, IP address, vrf from RADIUS, etc…).
- Once a subscriber is working at the data plane level, we need to handle DHCP renewals/releases, accounting and CoA. But again, from a super high level without all the bells and whistles, that’s pretty much a working BNG (for the minimal IPoE ping between A and B implementation).
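The three steps above map fairly naturally onto a small state machine backed by an in-memory table. The sketch below is one possible shape for it, not how any vendor (or the PoC) actually does it: the state names, the `Table` type and keying purely on MAC address are all assumptions made to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type State int

const (
	StateAuthPending State = iota // Discover buffered, waiting on RADIUS
	StateAuthorized               // Access-Accept received, Discover relayed
	StateOffered                  // Offer seen from the DHCP server
	StateBound                    // Request/Ack done, forwarding programmed
)

// Session is a hypothetical in-memory subscriber record.
type Session struct {
	MAC      string
	XID      uint32
	IP       string
	State    State
	Discover []byte // buffered Discover, released on Access-Accept
}

// Table is a mutex-protected session store keyed by MAC address.
type Table struct {
	mu       sync.Mutex
	sessions map[string]*Session
}

func NewTable() *Table { return &Table{sessions: make(map[string]*Session)} }

// OnDiscover buffers a copy of the Discover and starts a new session.
func (t *Table) OnDiscover(mac string, xid uint32, raw []byte) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.sessions[mac] = &Session{
		MAC: mac, XID: xid, State: StateAuthPending,
		Discover: append([]byte(nil), raw...),
	}
}

// advance moves a session from one state to the next, rejecting
// events that arrive out of order.
func (t *Table) advance(mac string, from, to State, update func(*Session)) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	s, ok := t.sessions[mac]
	if !ok {
		return errors.New("unknown subscriber")
	}
	if s.State != from {
		return fmt.Errorf("session in state %d, expected %d", s.State, from)
	}
	if update != nil {
		update(s)
	}
	s.State = to
	return nil
}

// OnAccessAccept hands back the buffered Discover so it can be relayed.
func (t *Table) OnAccessAccept(mac string) ([]byte, error) {
	var discover []byte
	err := t.advance(mac, StateAuthPending, StateAuthorized, func(s *Session) {
		discover, s.Discover = s.Discover, nil
	})
	return discover, err
}

// OnOffer records the address the DHCP server offered.
func (t *Table) OnOffer(mac, ip string) error {
	return t.advance(mac, StateAuthorized, StateOffered, func(s *Session) { s.IP = ip })
}

// OnAck marks the session bound; dataplane programming would hook in here.
func (t *Table) OnAck(mac string) error {
	return t.advance(mac, StateOffered, StateBound, nil)
}

// Get returns a copy of the session for inspection.
func (t *Table) Get(mac string) (Session, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s, ok := t.sessions[mac]
	if !ok {
		return Session{}, false
	}
	return *s, true
}

func main() {
	tbl := NewTable()
	mac := "02:00:00:aa:bb:cc"
	tbl.OnDiscover(mac, 0xdeadbeef, []byte{0x01})
	discover, _ := tbl.OnAccessAccept(mac)
	fmt.Println("relay", len(discover), "byte Discover to the DHCP server")
	_ = tbl.OnOffer(mac, "100.64.0.10")
	_ = tbl.OnAck(mac)
	s, _ := tbl.Get(mac)
	fmt.Println(s.IP, s.State == StateBound)
}
```

An Access-Reject, a lease expiry or a Release would simply delete the entry; renewals, accounting and CoA are left out on purpose.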
Ok, so I am getting bored writing this, because I want to show you an initial PoC before we dive into what we could achieve by writing our own BNG control plane on top of existing network stacks that handle the actual forwarding and routing. The idea is not to build a new forwarding plane or routing stack, but simply to integrate BNG functionality into an existing project.
The PoC
Goals:
- Get DHCP packets into our Go app
- Authenticate against an external RADIUS server based on some DHCP options (option 82 sub-option 2)
- Relay DHCP Discovers and Requests to an external DHCP server
- Build virtual interfaces on Linux so ARP/ICMP works for dhcpv4
DHCP packets into Go
I thought I would explore a bit more with eBPF here rather than building a tap: we want to be able to take the packet, perform some processing, and then decide what actually happens to it. If we tried to intercept these DHCP packets traditionally, we’d have a few socket options. A raw IP socket (SOCK_RAW) is a problem for VLAN/QinQ based IPoE because we don’t get access to the layer 2 headers, so we’d need an AF_PACKET socket instead. We could even set up a normal DGRAM (UDP) socket, which means the DHCP packet also goes through Netfilter (eg. iptables). That is technically the way it should be done, since an actual TCP/UDP socket receives the original packet whereas AF_PACKET only gets a copy; AF_PACKET is technically just a tap/sniffer. Either way, the processing ends up happening in user space, the furthest point in the operating system that has access to the packet, and with a UDP socket it has already been processed through the whole Linux TCP/IP stack by the time it gets there.
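Whichever hook ends up handing us the raw frame, the first job is the same: skip any VLAN/QinQ tags and decide whether this is a DHCP packet worth punting. The sketch below does that check in plain Go on a byte slice. It assumes the frame arrives with its tags intact, only handles IPv4, and ignores IP fragments; the `Classify` function and `Classified` type are illustrative names, not part of any library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	etherTypeIPv4  = 0x0800
	etherTypeVLAN  = 0x8100 // 802.1Q C-tag
	etherTypeQinQ  = 0x88a8 // 802.1ad S-tag
	protoUDP       = 17
	dhcpServerPort = 67
)

// Classified holds what we learned about a frame.
type Classified struct {
	VLANs  []uint16 // outermost tag first
	IsDHCP bool     // IPv4 UDP with destination port 67
}

// Classify walks an Ethernet frame, collecting VLAN IDs and checking
// whether the payload is DHCPv4 client-to-server traffic.
func Classify(frame []byte) Classified {
	var c Classified
	if len(frame) < 14 {
		return c
	}
	off := 12 // EtherType sits after the two 6-byte MAC addresses
	et := binary.BigEndian.Uint16(frame[off:])
	for et == etherTypeVLAN || et == etherTypeQinQ {
		if len(frame) < off+6 {
			return c
		}
		// The low 12 bits of the tag control field are the VLAN ID.
		c.VLANs = append(c.VLANs, binary.BigEndian.Uint16(frame[off+2:])&0x0fff)
		off += 4
		et = binary.BigEndian.Uint16(frame[off:])
	}
	off += 2 // start of the layer 3 header
	if et != etherTypeIPv4 || len(frame) < off+20 {
		return c
	}
	ihl := int(frame[off]&0x0f) * 4 // IPv4 header length in bytes
	if ihl < 20 || frame[off+9] != protoUDP || len(frame) < off+ihl+8 {
		return c
	}
	dport := binary.BigEndian.Uint16(frame[off+ihl+2:])
	c.IsDHCP = dport == dhcpServerPort
	return c
}

func main() {
	// Hand-built QinQ frame: S-tag 100, C-tag 200, UDP 68 -> 67.
	frame := []byte{
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, // dst: broadcast
		0x02, 0x00, 0x00, 0xaa, 0xbb, 0xcc, // src
		0x88, 0xa8, 0x00, 0x64, // S-tag, VLAN 100
		0x81, 0x00, 0x00, 0xc8, // C-tag, VLAN 200
		0x08, 0x00, // IPv4
	}
	ip := make([]byte, 20)
	ip[0], ip[9] = 0x45, protoUDP
	udp := []byte{0, 68, 0, 67, 0, 8, 0, 0}
	frame = append(append(frame, ip...), udp...)

	fmt.Printf("%+v\n", Classify(frame))
}
```

The same bounds-check-then-read pattern is roughly what the restricted C in an eBPF program ends up looking like, which is part of why it is worth getting right in user space first.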
Now here is the thing… Control packets don’t flow as often as data packets, so we don’t have to process a million control packets a second. However, we also want to survive the DHCP floods that follow mass outages, and make sure we don’t pin the CPU at 100% by inefficiently processing control packets. No matter what we do, we will probably end up calling some dataplane level API to punt DHCP packets from the NIC to our CPU, but for now eBPF will handle this for us.
You would typically see eBPF combined with XDP (eXpress Data Path), but right now we’ll just cover the eBPF part to understand how this works. So how can I explain this eBPF functionality? Let’s take the famous postal service example: imagine you are renting a house/apartment/flat/whatever.
Someone posts a letter (a subscriber sending traffic) in a postbox. These letters are collected in batches and sent to a central processing office, which takes the addresses/destinations and sorts them into some efficient delivery window, so that a driver can deliver them to the specific area they are responsible for. If the destination is quite far away, the letter may be sent on to another postal delivery office and processed further before it ends up being picked up by a driver.
The driver who hand delivers the letter typically drops it off at a mailbox or through a letter box on a door. We can pretend this letter box is the initial entrypoint for a packet travelling up the stack in the Linux kernel. Now imagine the delivery driver pulls out a heavy photocopier, opens up your letter, takes a photocopy and puts the copy into a new envelope. They post the copied letter to you and keep the original so they can deliver it to your landlord. However, your landlord doesn’t need this letter, it’s not for them, so they will just throw it in the bin… This is what happens if we use an AF_PACKET socket.
If we want the original packet (via an actual DGRAM socket implementation), then the driver is at the door about to deliver your letter. You actually opened your door a few mins ago in preparation for the postman arriving. When he arrives you say hi, and before giving you the letter he looks at it just to double check it’s the correct house, backs off slightly to look at your door number, then leans back in to give you the letter. You grab it while he is still holding it, he finally lets go, and it’s now in your hands to do with as you wish.
eBPF allows us to interface with the kernel: we put some small C code into a restricted environment and instruct the kernel to intercept the original packet and send it directly to our application. In postal terms, you made a special arrangement with the postal service. Instead of sorting your letter into a pile of other letters, waiting for a driver to collect them, drive to your house and post them, they have an automated conveyor system that takes your letter, opens it, photocopies it, and the moment the photocopy button is pressed sends it straight to your email address.
– TO BE CONTINUED AFTER I GET BACK FROM AMERICA (fuck yeahhhh…!!!) —-