Sunday, December 26, 2021

Bell Canada vDSL demystified + GPON/FTTH

 I've been using Bell Canada's DSL lines in some shape or form for the past few decades.  The first internet connection I set up and managed myself was a Bell Sympatico DSL line.  I learned a lot back then about DSL line filters, PPPoE, and the nuances of getting it all set up on my D-Link router with Sympatico's modem.  This was at a time when modems were just that: modems.  It was before the 2wire was widely distributed as the Bell modem of choice; the 2wire and the Home Hub series that followed it are all combined modem/routers.

I've never encouraged or endorsed using a modem/router from any provider; they are all terrible.  They are the digital equivalent of the phrase "jack of all trades, master of none", and that rings true for every provider I've encountered so far.  They're not great routers, but the underlying hardware can do the modem tasks pretty damn well if you strip away everything else it's trying to do on top.  Most notably, I've found that the DHCP servers on these devices are slow and crash frequently, so your lease times out.  They are also not great at DNS: queries are slower than going to a resolver on the internet directly.  Not only do they point to Bell's DNS servers, which are frequently slower to respond than other publicly accessible DNS services, but the devices add non-trivial delay of their own.  On top of that, they are not great NAT devices.  They frequently forget NAT sessions and their session limit seems to be quite low, so if you throw more than a very minimal 2-3 clients at one, you'll get oddities where things just stop working or don't work at all.  That means restarting your modem/router constantly, which isn't great.

Very quickly: The world of today is built upon the internet.  This foundation should be infallible, reliable and consistent.  The reality is that, even globally for routing, it's a convoluted mess of policies and protocols that enable us to communicate, but the equipment providing that connection in your home should not be under constant question and scrutiny to ensure it's working as intended.  In my opinion, at least for very simple networks, the connectivity provided by the network should not be the thing you're constantly trying to fix.  It's something that's astonishingly easy to get right, yet so many companies do it wrong.  There's a litany of reasons it can go wrong.  I'll refrain from commenting further because my opinions on how a network should operate, regardless of scope (e.g. home/business/enterprise/provider), are a whole post in and of themselves.

There's a good number of reasons to put Bell's equipment into bridged mode (operating as a modem only) or removing it entirely.  Both of these can increase the reliability of your network, whether at home or at work.  The only exception is if they're providing you with something better than the Home Hub series.  There are a few instances that I have seen that Bell has provided Cisco or Juniper (or similar) class of equipment.  I believe this is reserved for very specific business use-cases, from medium business up through enterprise connectivity.  Setting that aside for the moment, since those solutions are good and work consistently, I want to talk about the vDSL and GPON that is provided for home-based and small-business use cases.  This often involves a home hub.

The information that Bell won't tell you is enormous.  Their usual line is to use the provided gear and that's the end of the discussion, it can be quite frustrating as a technology enthusiast or networker looking to get something a bit more robust to run your network.  It seems to me that Bell's intention is that clients in the consumer and SMB space will use their gateway as the default gateway, never ask questions and just deal with how horrible the device is.

Let me make this perfectly clear: Bell has a strong, reliable and robust network.... until you get to the gateway that they provide for you.  I've used a lot of Bell's networks for the purposes of connecting to the internet, both as a consumer and as support for businesses trying to navigate Bell's messaging.  When something is wrong, 90% of the time the problem is the provided equipment.  If you're having a hard time with Bell, that is very likely the culprit.  Honorable mention to those in rural areas where the copper lines are horrible; but once you get past the modem/router and through the copper lines to the node, it's smooth sailing out to the internet.  Why they condemn their clients to the horrible products they put out, I'll never know.  I feel that knowledgeable clients should be able to buy their own gear to use on Bell's network, and that Bell should supply options that serve those people's needs specifically.  They do not.

To be VERY CLEAR: Bell, please give us devices that are strictly modems.  There's a non-trivial number of users who would benefit from this, and this message, as far as I'm concerned, has been shouted from the rooftops for years.  When it comes to fiber, give us an ONT that works, and let us figure out the rest.

To be fair to Bell, 80% of the clients they service are home-based users that don't know networking well enough to actually do what's required to get things working, and that's fair.  I don't think Grandma Smith down the street cares that her internet isn't super reliable, as long as she can play her Facebook games most of the time; but Bell EXCLUSIVELY caters to those who have zero networking knowledge or expertise, and that's what I think should change.


Moving on to more important topics, vDSL on the Bell network is fairly simple and straightforward, at least for anyone who can get their 50/10 "high speed" packages.  These connections are handled by VDSL/VDSL2 (ITU G.993.1 or ITU G.993.2), usually topping out around Profile 17a, though evidence suggests they may be moving to Profile 30a in the near future.  There's remarkably little information about the available DSL profiles if you examine the provided routers (Home Hubs 1, 2, and 3; the 4 doesn't have a DSL port).  However, some information can be gained from looking at wholesale customers like Teksavvy or Start.ca.  There's a ton of other wholesale clients for Bell's services, but I'm going to focus on Teksavvy since it seems to be the most popular in my area.  For vDSL 50 Mbps service, the unit they offer is the SmartRG 516AC.  This is from a line of SmartRG modems which includes everything from the RG501 through the RG516AC and beyond.  They all have similar or the same chipsets for DSL, with varying features (the 501 only has a single ethernet port, as an example, while the 516 has full modem/router + wifi capabilities).  Looking at the spec sheets for the 516AC, they support Annex A, L and M up to Profile 17a.

So breaking it down, the basic specs are VDSL2 using Profile 17a, on either Annex A, L or M; hardware that supports those should be sufficient.  From my own research, the link runs in PTM (Packet Transfer Mode), with traffic tagged on VLAN 35.

I worked this out recently after picking up a Cisco EHWIC-VA-DSL-M (Annex M, supporting Profile 17a).  It's entirely possible you could get everything working using the Annex A version of the same card (EHWIC-VA-DSL-A); however, I have not tested this.  I have every suspicion it will work, but I have no evidence.  I'm subscribed to a wholesale line via Start.ca, who have been very good to me.  I installed the EHWIC into a Cisco ISR G2 1921, which comes with its own caveats.

Relating to Cisco vs DSL: you do not need the EHWIC-VA-DSL module's ATM port, and you can disable it with the shutdown command.  The ADSL and vDSL modes are discrete interfaces and controllers in IOS, so the ATM features and functions used for ADSL are not required at all.  If you're following in my footsteps, you may want to look up the firmware for the card; however, there isn't an easy way to find it.  This module is the same as the built-in module on the 800 series routers, and the firmware is actually listed on the Cisco website under those routers.  One of the firmware options there explicitly says it's compatible with the EHWIC-VA-DSL modules, so select that.  If you don't have a service contract with Cisco for the unit, you may be out of luck for downloading the firmware from Cisco.  My only suggestion here is that if you manage to acquire it by other means, compare it against the MD5/SHA512 hash listed with the official download to confirm it is correct and has not been tampered with.
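For the hash check, here is a minimal Python sketch of how I'd go about it.  The function names and the placeholder file name in the usage note are my own inventions for illustration, not anything from Cisco; the only real input is the hash string you copy off the official download page.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="sha512"):
    """Compare a file's digest with the hash string copied from the vendor page.

    Whitespace and letter case in the pasted string are ignored.
    """
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

Usage would be something like matches_published("firmware.bin", "<hash from the download page>"), or the same call with algorithm="md5" if that's the only hash you have.  If it returns False, don't flash the file.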

vDSL will automatically try to connect without additional configuration; this is an L1 link and the defaults will work.  If you wish, you can go into the controller settings (the command is: (config)# controller vdsl <slot/subslot/port>, which for me was 0/1/0 but could easily be 0/0/0 depending on your specific configuration) and use the command ' operating mode vdsl2 ' so it skips auto-detecting the mode.  Since it will discover the same mode every time, this could save a bit of time when getting connected.  There's some merit to setting the SRA command here too for Seamless Rate Adaptation, though it's not strictly required.  After that, you may note an Ethernet interface popping up under the same slot/subslot/port number, in my case Ethernet 0/1/0.  Get into the configuration mode for this interface and perform a no shutdown.  That's all that's needed there.  Next you want to create a subinterface for VLAN 35.  I selected ethernet 0/1/0.35 for the purpose, though the subinterface number could be anything; set ' encapsulation dot1Q 35 ' to put VLAN ID 35 on the interface.  This subinterface is where you set your dialer interface to dial from (pppoe enable // pppoe-client dial-pool-number #).  That in turn requires a dialer interface configured with your username and password, as well as several other options that have been covered at length in other posts/blogs/kb articles.

One issue I kept running into was that my 1921 refused to connect and closed out the vDSL connection immediately after it was established.  I tracked this to a debug log entry that said it "failed to add pppoe switching subblock".  This appears to be a Cisco bug, and I believe what finally fixed it was the inclusion of the keyword "callin" in the ' ppp authentication ' command (the full resulting configuration was: ' ppp authentication pap chap callin ', which appears to do the trick).  Once all that is set, you should have a functioning connection.  All the usual work of setting up NAT and routing needs to be done as well before you have a useful connection, but it does indeed work.
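To pull the pieces above into one place, here is a small Python sketch that prints an IOS config skeleton for them.  Treat it as a starting point under assumptions, not my exact running config: the controller, subinterface, pppoe and ' ppp authentication ' lines come from the steps described above, while the dialer lines (ip address negotiated, encapsulation ppp, dialer pool, and the chap/pap credential commands) are the commonly used ones I'm assuming you'd want, and the slot, pool number and credentials are placeholders.  NAT, routing and MTU settings are deliberately left out.

```python
def render_vdsl_pppoe_config(slot="0/1/0", vlan=35, pool=1,
                             username="USERNAME", password="PASSWORD"):
    """Build an IOS config skeleton for PPPoE over a VLAN-tagged vDSL link.

    All arguments are placeholders; substitute your own slot numbering,
    dialer pool and ISP credentials.
    """
    lines = [
        # The ATM side of the card is only needed for ADSL.
        f"interface ATM{slot}",
        " shutdown",
        "!",
        # Optional: pin the mode instead of auto-detecting it.
        f"controller VDSL {slot}",
        " operating mode vdsl2",
        "!",
        f"interface Ethernet{slot}",
        " no shutdown",
        "!",
        # Tagged subinterface that the dialer uses to reach the ISP.
        f"interface Ethernet{slot}.{vlan}",
        f" encapsulation dot1Q {vlan}",
        " pppoe enable",
        f" pppoe-client dial-pool-number {pool}",
        "!",
        # Assumed minimal dialer; extend with NAT, routing, MTU as needed.
        f"interface Dialer{pool}",
        " ip address negotiated",
        " encapsulation ppp",
        f" dialer pool {pool}",
        " ppp authentication pap chap callin",
        f" ppp chap hostname {username}",
        f" ppp chap password {password}",
        f" ppp pap sent-username {username} password {password}",
        "!",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_vdsl_pppoe_config())
```

Run it, swap in your own values, and review every line against your IOS version before pasting anything into a router.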


GPON/FTTH/Fibe:  This was an interesting journey down a rabbit hole for me.  Bell is using GPON very similarly to PTM over DSL, on VLAN 35.  With their recent release of the HH4k, they now have units in the field that also support XGS-PON.  So, starting from the beginning, they use GPON (ITU G.984) with 2.488 Gbit/s downstream and at least 1.244 Gbit/s of upstream.  This technology uses a form of wavelength division multiplexing (WDM) to mux 1310nm light for upstream traffic and 1490nm light for downstream (optionally video at 1550nm).  These are split using what is essentially a prism, so tx and rx are independent, resulting in full-duplex operation.  The addition of XGS-PON is logical, since it can co-exist with GPON.  XGS-PON (ITU G.9807.1), as far as I know, uses the same wavelengths as XG-PON (ITU G.987), but with increased bandwidth on upload (9.953 Gbit/s, so nearing 10 Gbit/s).  Hence XGS-PON: X (for 10), G (gigabit), S (symmetrical), PON (Passive Optical Network).  To my understanding this bandwidth is shared, and Bell will only give you a 'cut' of the bandwidth available.  It is likely they are planning to roll out, or have rolled out, XGS-PON in high-demand areas to avoid having to install more GPON line terminals to handle the user load, and more lines/splitters to divide customers up across more ports on the OLT.  At the head end, they can simply splice off the XG-PON wavelengths and install an XGS-PON line terminal to provide the required bandwidth while continuing to serve slower committed-rate clients with GPON.  This is an economical solution and demonstrates Bell's ingenuity when it comes to their client-handling equipment.
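To put some rough numbers on the "shared bandwidth" point, here is a back-of-the-envelope Python sketch.  The line rates are the nominal figures quoted above; the split ratios of 32 and 64 subscribers per port are my assumption for illustration, since I don't know what Bell actually uses, and framing overhead is ignored.

```python
# Nominal line rates in Gbit/s, as quoted in the text.
GPON_DOWN_GBPS = 2.488
GPON_UP_GBPS = 1.244
XGSPON_GBPS = 9.953  # symmetrical


def fair_share_mbps(line_rate_gbps, subscribers):
    """Even split of one PON port's line rate, in Mbit/s per subscriber.

    This is the worst case where every subscriber on the port is
    saturating the link at the same moment.
    """
    if subscribers < 1:
        raise ValueError("need at least one subscriber")
    return line_rate_gbps * 1000 / subscribers


if __name__ == "__main__":
    for split in (32, 64):  # assumed split ratios
        print(f"1:{split} split")
        print(f"  GPON down:  {fair_share_mbps(GPON_DOWN_GBPS, split):8.1f} Mbit/s")
        print(f"  GPON up:    {fair_share_mbps(GPON_UP_GBPS, split):8.1f} Mbit/s")
        print(f"  XGS-PON:    {fair_share_mbps(XGSPON_GBPS, split):8.1f} Mbit/s")
```

In practice not everyone pulls full rate at once, which is why a provider can sell plans well above the even split, but it shows why moving heavy users onto XGS-PON frees up headroom on the GPON side.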

There's a catch with GPON: the transceiver needs to be authorized with the OLT.  So Bell can authorize or de-authorize whatever they want on their network, which is a significant challenge for anyone trying to remove, eliminate or otherwise bypass the homehub equipment.  With the early releases of GPON, this was a fairly trivial matter, as Bell included a G-010S-A SFP GPON fiber module with the HH3k, which provided the crossover from GPON to ethernet inside of their homehub.  You could remove this module, connect it to whatever you wanted, and get service.  This has been eliminated with the Homehub 4000, since it has a built-in GPON and XGS-PON transceiver array which cannot be removed or changed and must be used to connect, as alternatives are not authorized to connect to the OLT.  There are three possible factors for authorization.  First is the module's MAC address, which is very commonly a filter that ISPs will use to classify equipment as authorized or not.  Next is the ONT S/N, which is broken into two parts: the MFR ID, which is the first four letters, and the G984 serial number, which is an eight character hexadecimal code.  These are printed on the HH4k or the G-010S-A modules and can be readily accessed.  The last possible factor is the SLID, or Subscriber Location ID, which is not printed on the unit nor accessible through the firmware on the homehub.  Luckily, with a bit of wizardry, I was able to obtain this value from a G-010S-A, and it turned out to be a string of zeros.  It appears Bell isn't using this factor, but may in the future.  We simply do not know.
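Since the serial number format trips people up, here is a small Python sketch that splits an ONT S/N into the two parts described above and sanity-checks them.  The names are mine, and the serial in the usage note is made up; it only illustrates the "four letters + eight hex characters" layout, not any particular tool's input format.

```python
import re
from typing import NamedTuple

# Four-letter manufacturer ID followed by eight hexadecimal characters.
_SERIAL_PATTERN = re.compile(r"^([A-Z]{4})([0-9A-F]{8})$")


class OntSerial(NamedTuple):
    vendor_id: str       # MFR ID, e.g. the first four letters on the label
    vendor_serial: str   # eight hex characters

    def as_bytes(self) -> bytes:
        """8-byte form: 4 ASCII bytes of vendor ID + 4 bytes of serial."""
        return self.vendor_id.encode("ascii") + bytes.fromhex(self.vendor_serial)


def parse_ont_serial(text: str) -> OntSerial:
    """Split a printed ONT serial number into MFR ID and hex serial."""
    match = _SERIAL_PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a 4-letter + 8-hex-digit serial: {text!r}")
    return OntSerial(match.group(1), match.group(2))
```

For example, parse_ont_serial("ALCL1234ABCD") (a made-up serial) gives a vendor ID of ALCL and a serial of 1234ABCD, while anything the wrong length or with non-hex characters in the second half raises ValueError before you try to program it into a module.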

So if you are pursuing a bypass of the HH3k or HH4k for GPON (the HH4k will tell you if it's in GPON mode on the WAN mode page), you can simply replace the HH4k with a Nokia G-010S-A module, model 3FE46541AA (or the same from Alcatel/Huawei), which can have its MAC, SN, and SLID programmed.  Reprogram it with the MAC/SN/SLID from your Bell unit and use that instead.  All PPPoE needs to be done over VLAN 35, and everything should just work from there.

There is a git repository on the subject, so you shouldn't have any trouble getting access to the module for reprogramming, or finding the reprogramming commands.

This, of course, is informational, I offer no guarantee any of this will be valid tomorrow, or work for anyone else. If you choose to pursue removing the Bell branded equipment for your own, then do so at your own risk.  I am posting all this information because I have been consistently frustrated by Bell's lack of transparency, and rather than have anyone else go through the process of figuring it out, I wanted to put it out there for anyone seeking to do the same, so you can learn from my mistakes (of which there are many) and reach your goals faster, with less effort.  I am certain that Bell will not appreciate using alternative devices, modules, or connection methods to their network, and I am entirely positive that they will refuse to help anyone who has something set up in an "unsupported configuration".  So beware of issues.  It is handy to have the homehub given to you as part of your subscription in case of any issues.  First thing to do when experiencing a problem is to revert back to Bell's equipment and test to see if things are working with that before calling them to complain that anything isn't working.  IMO, they won't even talk to you about it until you do.

But I will say that I've moved over from using the homehubs and ISP provided equipment and my internet is quicker (lower latency), and more reliable than ever before.  Bandwidth is still limited, of course, but I can get what I need to get done, that much faster because I'm not waiting on their systems to figure out what to do next. I have control over the hardware, and I can troubleshoot very intelligently before needing to revert back to the provider-approved and supplied gear to determine if my equipment is at fault, or if their network is at fault.  Simply put, now that I've replaced the garbage modem/router they provided, I haven't had to deal with customer support for internet issues in many years.  Outages still happen, but I can determine the cause and wait it out before having to call them.

Bear in mind, that I do this on a professional level, so troubleshooting network connections is part of my DNA.  If that's not you, then maybe consider something a bit more conservative and hang onto that homehub.... just put it into bridged mode and call it a day.