Oops! How did we miss something so basic?

We have been trying to build packet networks without a basic and essential mathematical concept. This matters. A lot.

When I worked at Sprint many years ago, I soon discovered there is one Britishism that is unfamiliar to Americans. “Oops-a-daisy” is what we say to little kids when they fall over, a hurt tear forms, and the corners of the sad mouth turn down. It’s really quite a useful phrase when a customer complains that their streaming video keeps buffering.

The telecoms industry as a whole has a serious case of “oops-a-daisy”, but doesn’t know it yet. We’ve missed out a basic building block necessary for packet network engineering and digital supply chain quality management. This is, to say the least, rather unfortunate.

Whether your memories of doing mathematics at school are fond or foul, if you’re reading this you likely have a vague recollection of studying imaginary numbers (i² = -1 and all that). Of course, there’s nothing particularly unreal about imaginary numbers, or real about non-imaginary numbers. Mathematics is all just a big game we play in our heads, one that sometimes turns out to be useful.

If we hadn’t cracked imaginary numbers then we wouldn’t be able to do complex analysis. That would be spectacularly inconvenient: we would no longer be able to express and solve the equations for electromagnetism. So no models of how antennas work, which equals no signal. Say goodbye to every modern mobile phone network, in that case.

Now it turns out we’ve collectively made a bit of a blooper, because we’re missing the “imaginary numbers for probability”. Let me explain.

When we measure a packet network, there are two phenomena that we care about: packet delay, and packet loss. Delay is a continuous thing (it’s a time interval). Loss is a discrete thing (it either happened or it didn’t). We can use probability to model these things. The world of physical objects provides us with a wealth of mathematical theory to draw upon.

The nature of any probability function is that it maps an event to an outcome. You pick a card from a deck, and there is a probability it is the ace of spades, for example. A stacked deck or a loaded die might deliver an uneven or clumpy probability function. The basic pattern is “some event happens” and “some value is associated with that event”. Got it?

There is an “obvious” way to model delay and loss: as two distinct types of observable event, each with its own probability function. You stick a packet into one end of the network. The loss function tells you how likely it is to arrive (one kind of event, with a discrete outcome); the delay function tells you what latency it experienced if it did (another kind of event, with a continuous outcome).
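To make that concrete, here is a minimal sketch (in Python) of the two-function view: a coin flip for loss, and a separate distribution for delay. The 1% loss rate and lognormal delay shape are invented figures for illustration, nothing more.

```python
import random

# The "obvious" model: two separate probability functions.
LOSS_PROB = 0.01  # illustrative assumption: 1% of packets never arrive

def send_packet():
    """Return a delay in milliseconds, or None if the packet was lost."""
    if random.random() < LOSS_PROB:
        return None                            # discrete outcome: lost
    return random.lognormvariate(3.0, 0.5)     # continuous outcome: delay (ms)

samples = [send_packet() for _ in range(10_000)]
losses = sum(s is None for s in samples)
print(f"lost {losses} of {len(samples)} packets")
```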

The telecoms textbooks and industry standards are chock full of examples of applications, each with different needs in terms of loss and delay. This is bread and butter stuff. Sadly, it is also a really unhelpful way to go about expressing network quality and performance.

Here’s the issue: what we want to do with any kind of engineered structure is to reason about the relationship of the demand load to the available supply. This could be a bridge, skyscraper, aircraft, or any one of thousands of kinds of load-carrying system. Any load will have a static component (e.g. the weight of the steel and concrete) and a dynamic one (e.g. wind and earthquakes).

The basic task of any structural engineer is to tell you whether the thing can carry the offered load, and do it with a known and predictable safety margin. Ideally you also have graceful failure in overload. These, regrettably, are not possible in mainstream packet networking. We’ve borked up our metrics by having too many, and of the wrong kind.

When we choose two different types of events, we create a nasty problem. The demand is now expressed in two different probability variables. Even worse, these are coupled, as a bigger packet buffer adds delay to reduce loss.
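You can see the coupling with a textbook queueing model. The sketch below uses an idealised M/M/1/K queue (my choice purely for illustration, not a claim about how real routers behave): as the buffer K grows, loss falls and delay rises, so the two “separate” variables move in lockstep.

```python
# Coupling of loss and delay in a single FIFO buffer, illustrated with a
# textbook M/M/1/K queue. Growing the buffer K trades loss for delay.

def mm1k(rho: float, K: int):
    """Loss probability and mean sojourn time (in service-time units)."""
    norm = (1 - rho) / (1 - rho ** (K + 1))    # P(n in system) = norm * rho**n
    loss = norm * rho ** K                     # arrivals blocked when full
    mean_n = sum(n * norm * rho ** n for n in range(K + 1))
    return loss, mean_n / (rho * (1 - loss))   # Little's law: W = L / lambda_eff

for K in (2, 8, 32, 128):
    loss, delay = mm1k(rho=0.95, K=K)
    print(f"buffer={K:4d}  loss={loss:.4f}  mean delay={delay:7.2f}")
```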

So we somehow need to relate the two variables to one another, and capture their mysterious coupling. This can’t be done. Zut alors! But that’s not all. When we try to “add up” the load from multiple demands (say a streaming video and a file store backup), it isn’t a meaningful operation. Merde!

That means the most basic requirement for doing “proper engineering” of network performance is missing. We can’t properly quantify demand or supply, or compare them correctly. The maths of physical objects is insufficient to capture the properties of virtual ones. It’s a wonder we’ve made it this far.

Given this embarrassing situation, what can we do? Well, this is where our “imaginary numbers for probability” come in. What if we reconsider the essential nature of packet loss? Rather than think of it as another type of “observable event”, it can instead be thought of as an “observable non-event”.

That sounds a bit Zen, and, well, yes. It is. The great thing is that we can now extend our basic probability function for the event of “arrival with a delay”. We do this by “bolting on” the loss as a “non-event” that is a kind of “infinite delay”. A single mathematical object can then encompass both loss and delay. After all, “lots and lots of delay” is functionally equivalent to loss.

Thankfully there is a ready-made bit of maths to make this probability glooping magic happen: “improper” random variables. It’s not that they come from broken homes with terrible manners. They simply allow for the possibility that the event didn’t happen at all, so their total probability can add up to less than one.

It’s like tossing a coin and someone steals it before it lands. The “improperness” of the “observable non-event” extends the basic “observable event”. It is a little bit like how an imaginary number extends a real one to make a complex number, like 3 + 4i.
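In code terms, an improper random variable is just a delay distribution whose total probability falls short of one. Here is a minimal sketch (the exponential shape and the figures are invented for illustration): the cumulative distribution plateaus at the probability of arrival, and the shortfall from one is precisely the loss.

```python
import numpy as np

# An improper delay distribution: its CDF never reaches 1.
# The shortfall (1 - F(infinity)) is the loss probability: "infinite delay".
t = np.linspace(0.0, 200.0, 2001)            # delay axis in ms
ARRIVE_PROB = 0.98                           # tangible mass: P(packet arrives)
cond_cdf = 1 - np.exp(-t / 20.0)             # delay CDF given arrival (assumed)
improper_cdf = ARRIVE_PROB * cond_cdf        # one object holds loss AND delay

print(f"F(200ms) = {improper_cdf[-1]:.4f}")  # ≈ 0.98; the missing 0.02 is loss
```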

Et voilà! We now have a single metric space which can be used to describe both demand and supply, and to compare them. Even better, when expressed in the right way, it “adds up”, so forms an algebra. This is the monoid you are looking for.
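Here is a sketch of that “adds up” property, under the usual independence assumption. Composing two hops in series convolves their improper delay densities, and the arrival probabilities simply multiply; the shapes and figures below are invented.

```python
import numpy as np

# Sequential composition of two hops: delays add, so the improper densities
# convolve; the tangible masses (probabilities of arriving at all) multiply.

dt = 0.01
t = np.arange(0.0, 100.0, dt) + dt / 2       # midpoint grid, in ms

def hop(arrive_prob, mean_ms):
    """Improper delay density: total mass = arrive_prob, not 1."""
    return arrive_prob * np.exp(-t / mean_ms) / mean_ms

a, b = hop(0.99, 5.0), hop(0.97, 8.0)
ab = np.convolve(a, b)[: len(t)] * dt        # end-to-end improper density

print(a.sum() * dt, b.sum() * dt)            # ≈ 0.99 and 0.97
print(ab.sum() * dt)                         # ≈ 0.99 * 0.97 = 0.9603
```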

Such an algebra is the basis of the ∆Q calculus (i.e. equations about supply and demand that you can actually solve). Without these ∆Q metrics, algebra and calculus, it simply isn’t possible to adequately express the demand for performance or its supply.
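As a final sketch of what “solving” looks like: once demand and supply both live in this one metric space, “can this network carry this application?” reduces to a pointwise comparison of improper CDFs. The stepwise requirement below (half the packets within 20ms, 95% within 60ms, loss under 2%) is invented for illustration.

```python
import numpy as np

# Demand and supply as improper CDFs: the supply is adequate if and only if
# its curve sits on or above the demand requirement everywhere.
t = np.linspace(0.0, 200.0, 2001)                   # delay axis in ms
supply = 0.985 * (1 - np.exp(-t / 12.0))            # measured ∆Q (assumed shape)

# Requirement: 50% within 20ms, 95% within 60ms, 98% within 150ms.
demand = np.select([t < 20, t < 60, t < 150], [0.0, 0.50, 0.95], default=0.98)

print("supply meets demand:", bool(np.all(supply >= demand)))
```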

This basic construct is mandatory to properly model packet network performance. Its absence has widespread economic and customer experience impacts. We build infeasible products and only find out when we deploy them; we mis-size our services for the specific customer need; we fail to correctly plan capacity, wasting vast sums of capital; and we cannot isolate performance issues reliably.

At some point, we are going to have to ‘fess up to the public: we’ve built a whole Internet without figuring out the units for load! We can’t even do basic calculations about supply and demand, which means the economic model is insane. The good news is we can reassure the world that this is now a solved science problem, based on a new branch of applied mathematics.

These new “imaginary numbers for probability”, my dear reader friends, are the key to the magic kingdom of predictable performance engineering and rational resource economics. You know, the kind we take for granted in every other technical domain, where “oops-a-daisy” isn’t an acceptable excuse for failure.

The quality attenuation framework and ∆Q calculus were developed by Dr Neil Davies of Predictable Network Solutions Ltd in collaboration with associates. To learn more, visit qualityattenuation.science.

If you fear being a datagram dunce, I offer educational webinars and training workshops in network performance science. Hit ‘reply’ to request details. 


For the latest fresh thinking on telecommunications, please sign up for the free Geddes newsletter.