Telecoms networks are all spectacular digital cathedrals

The telecoms industry is hardly new. Only this week I was standing in the Science Museum in London, showing my daughters one of the earliest commercial telegraph transmitters from the 1830s. Yet despite its age, telecoms remains a surprisingly immature industry.

What we have today is akin to a “medieval” digital society, with a pre-industrial structure of production. Legions of craftspeople work in guilds, producing bespoke networks at high cost. Their knowledge is often tacit, handed down by word of mouth and expensive experience.

Telecoms networks are all spectacular digital cathedrals, each competing to be more extreme in scale than the last. There is no telecoms equivalent of the qualified structural engineer or quantity surveyor: these ‘cathedrals’ can (and do) collapse under load, burying their investors’ capital.

The construction industry became industrialised long ago: we can now build skyscrapers using repeatable and predictable methodologies. A similar industrial revolution is required in telecoms, one that hugely reduces cost and increases productivity. To achieve this transformation we must collapse complexity in network design and deployment.

Networking is a complex business

Why are telecoms networks so expensive, fragile and inflexible? The basic problem is that there are many interacting components that all need to be orchestrated to work together. There are lots of ways of reaching failure, but only a few of achieving success. Worse, the number of types of network element keeps growing, which multiplies both the number of possible interactions and the unanticipated forms of failure. On top of this, user demand becomes more diverse and difficult to satisfy over time.
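
To put rough numbers on that multiplication, here is a minimal sketch that counts the potential interactions among n types of network element. It assumes, purely for illustration, that any pair (or any larger group) of element types might interact; the function names are hypothetical and not taken from any real tool.

```python
from math import comb


def pairwise_interactions(n: int) -> int:
    """Distinct pairs among n element types: n * (n - 1) / 2."""
    return comb(n, 2)


def group_interactions(n: int) -> int:
    """Groups of two or more element types: 2**n - n - 1."""
    return 2**n - n - 1


# Illustrative counts only; real networks will not exercise every combination.
for n in (5, 10, 20):
    print(n, pairwise_interactions(n), group_interactions(n))
```

Under these assumptions, going from 10 to 20 element types roughly quadruples the number of pairs (45 to 190) and multiplies the number of possible groupings about a thousandfold.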

Network operators therefore commonly face overwhelming complexity. It exists at many levels, and it impedes their ability to make rational decisions and to make progress.

This complexity can be found in three basic areas: how networks are designed, configured and operated. The design is about creating a supply capability; the configuration creates policies that allocate that supply to the demand load being applied; and the operation is about execution of those policies, and the resulting downstream business processes like billing.

Abstraction collapses complexity

The way we deal with complexity in any domain is through abstraction. An appropriate abstraction reduces complexity, because it extracts commonality and hides variation. For example, once you know the meaning of the colours used in traffic lights, you can ignore the variations in technology used to make the lights work. We didn’t need to retrain the whole driving population just because we switched to LED illumination!
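
The traffic-light example can be sketched in code. This is only an illustration of the principle: the class names, the bulb positions and the RGB values are all assumptions made for the sketch. The point is that the "driver" depends only on the meaning of the colours, never on how the light is built.

```python
from enum import Enum
from typing import Protocol


class Colour(Enum):
    RED = "red"
    AMBER = "amber"
    GREEN = "green"


class TrafficLight(Protocol):
    """The abstraction: all a driver needs is the current colour."""

    def colour(self) -> Colour: ...


class IncandescentLight:
    """Hypothetical implementation: one of three bulbs is lit."""

    def __init__(self, lit_bulb: int) -> None:
        self.lit_bulb = lit_bulb  # 0 = top, 1 = middle, 2 = bottom

    def colour(self) -> Colour:
        return [Colour.RED, Colour.AMBER, Colour.GREEN][self.lit_bulb]


class LedLight:
    """Hypothetical implementation: an LED array driven by an RGB value."""

    PALETTE = {
        (255, 0, 0): Colour.RED,
        (255, 191, 0): Colour.AMBER,
        (0, 255, 0): Colour.GREEN,
    }

    def __init__(self, rgb: tuple[int, int, int]) -> None:
        self.rgb = rgb

    def colour(self) -> Colour:
        return self.PALETTE[self.rgb]


def driver_action(light: TrafficLight) -> str:
    """The driver's rules refer to colours only, not to the technology."""
    actions = {
        Colour.RED: "stop",
        Colour.AMBER: "prepare to stop",
        Colour.GREEN: "go",
    }
    return actions[light.colour()]


print(driver_action(IncandescentLight(0)))  # old technology
print(driver_action(LedLight((255, 0, 0))))  # new technology, same behaviour
```

Swapping `IncandescentLight` for `LedLight` requires no change to `driver_action`, which is the sense in which the abstraction hides variation.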

In telecoms, we use a variety of abstractions to simplify our world. For instance, organisations like 3GPP and the TM Forum create design and operational standards that abstract away network implementations. However, their ambition is currently limited to compatibility and conformity, not to truly collapsing complexity. That is a higher-order kind of problem.

Why does so much complexity remain, despite their valiant efforts? The reason is that whilst the components may have compatible interfaces, the industry as a whole has great difficulty predicting cause and effect: network science is an emerging discipline. We don’t really understand how to compose the sub-parts to achieve our goals, and can’t abstract and automate that engineering process. Indeed, the very idea that such compositionality is desirable is (in my experience) often seen as controversial!

To bring this all to life, let me tell the story of my first full-time job, to contrast telecoms with other more mature industrial technology businesses.

Making paper-making machines

Back in the early 1990s, I was a Lisp programmer on an EU-funded research project called “Machine Cell Operator’s Expert System” (MCOES). It’s in the pre-Web memory hole, but is documented on dead trees.

The project was about modelling complex machinery, such as that found in paper-making factories. These giant plants are the size of Boeing 747s and cost about as much, and the paper emerges at about a third of the take-off speed of a jumbo jet! The roller bearings and their housings are complex multi-million dollar precision-engineered objects. Furthermore, every one of them is built to order and is thus unique.

MCOES automated the whole process of turning a bearing design into running CNC code to manufacture it. You put in your parameters (load, paper type, etc.) and out came everything you needed to make the machine.

“Software-defined manufacturing” was already a well-established discipline two decades ago. We have nothing like that in telecoms today. You can’t parameterise the network’s business goals, press a button, and automatically produce everything needed to make it integrate and operate.

The telecoms abstraction gap

Let’s go back to the three areas of complexity concern: design, configuration and operation. Telecoms networks are costly and unreliable because we have a deficit in the abstractions we are using:

  • In design, there are few (if any) abstractions to automate network integration. This is only good news if you are selling systems integration consultancy services.
  • In configuration, we are using the wrong abstractions (“bandwidth”) for packet data, which leads us to high costs and bad user experiences. For instance, capacity planning is sometimes done by fiddling until the system works, and then doubling the capacity to provide a safety buffer. This is alchemy, not science.
  • In operation, we are making a valiant effort with Software-Defined Networking (SDN), but the abstraction remains a weak one, and has yet to see widespread deployment. Linkages between business processes are brittle, resulting in the legendary slowness and inflexibility of telecommunications providers.

SDN is a particularly instructive example of the “abstraction gap” in telecoms. One major benefit of SDN is to manage the connectivity complexity of an earlier inappropriate abstraction: that of a global IP address space. It does this by re-inserting appropriate routing and connectivity separation. To see how a robust abstraction for connectivity should work, check out Recursive InterNetwork Architecture (RINA).

Our nemesis: compounding complexity

The end result of these missing or inappropriate abstractions is an industry that is trapped in a paradigm from which it sees no escape. Network optimisations and middleboxes spawn complexity in all directions. Meanwhile, vendors hawk new cost-reducing technologies like network virtualisation, which fail to address the fundamental issues of complexity, whilst introducing new failure modes and complexity of their own.

If we stay on our current path we face unmanaged complexity and growing diseconomies of scale. Operational network management processes will dominate the budget, and starve us of resources for capex to renew the network. Network fragility will mean customers find bad experiences become more frequent, as mistakes become more common. Financial planning will become even harder: CFOs will become increasingly uncertain about the value to the business of spending on network assets.

The alternative: industrialised telecoms

What we want is managed complexity and economies of scale. We want the kind of predictability that the construction business has, where the risks are well understood, and the interrelationships of customer experience, cost and component performance are quantified, predictable and managed. Construction engineers know precisely how to configure concrete, glass and girders to deliver an outcome that is fit-for-purpose in the eyes of the customer.

So how do we go from increasing (super-linear) to decreasing (sub-linear) complexity? The change required is from being a craft industry to a fully industrialised and automated one. We do that by finding the appropriate invariants for network design, configuration and operation. These invariants abstract out “truths” we want the system to always exhibit.

Delivering a network service is then an act of maintaining the right design, configuration and operational invariants. When these all hold and align, the system will support the functions the business desires. Knowing what is meant to be “always true”, we can then understand what is normal variation, and only focus costly human remedial attention on abnormal variation.
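
As a minimal sketch of what "maintaining invariants" might look like operationally, the snippet below checks measurements against a small set of declared invariants and reports only the violations. Everything specific here is an assumption for illustration: the metric names (`loss_pct`, `delay_ms`), the thresholds and the sample values are hypothetical, not drawn from any real network or standard.

```python
from dataclasses import dataclass
from typing import Callable

Measurement = dict[str, float]


@dataclass(frozen=True)
class Invariant:
    """A named statement that should always be true of the running system."""

    name: str
    holds: Callable[[Measurement], bool]


# Hypothetical invariants; real ones would come from the business goals.
INVARIANTS = [
    Invariant("loss stays under 1%", lambda m: m["loss_pct"] < 1.0),
    Invariant("delay stays under 150 ms", lambda m: m["delay_ms"] < 150.0),
]


def violated(measurement: Measurement) -> list[str]:
    """Return the names of the invariants this measurement breaks."""
    return [inv.name for inv in INVARIANTS if not inv.holds(measurement)]


# Made-up samples: the first two vary but stay within the invariants.
samples = [
    {"loss_pct": 0.2, "delay_ms": 40.0},
    {"loss_pct": 0.6, "delay_ms": 95.0},
    {"loss_pct": 2.5, "delay_ms": 80.0},
]

for sample in samples:
    broken = violated(sample)
    if broken:
        status = "needs attention: " + ", ".join(broken)
    else:
        status = "normal variation"
    print(sample, status)
```

The first two samples differ from each other yet satisfy every invariant, so they count as normal variation; only the third would be escalated for human attention.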

The way forward

This process of telecoms industrialisation requires three parallel capability revolutions:

  • For network design, we need operational integration standards. These would do for the “systems glue” what MCOES did for automating the running of CNC lathes.
  • For network configuration, we need a performance engineering calculus to describe and manipulate the trading space and allocate supply to demand.
  • For network operation, we need to look beyond even the current emerging SDN standards to a richer language of control for complex distributed systems.

The final task is to align all of these. We must establish verifiable standards for interworking of common telecoms business processes, like service provisioning, at all three stages. What we intended in the network and systems design must be capable of being configured to meet the business goals, and then run to actually meet those goals. This is easy to say, but hard to do!
