Can there be competing QoS standards?

The old saying “the great thing about standards is there are so many to choose from” particularly applies to network QoS. How many do we really need?

Following on from a past article on the need for a “quality revolution” in telecoms, a reader wrote in to ask:

I was struck by the thought in the post below that suggests standards bodies need to make a choice [of QoS metrics]. … Are there choices and alternatives? Could there be competing solutions for QoS that sit side by side and enable one player (or another) to win the “QoS race”? From a technical perspective, are you saying that multiple QoS standards can’t operate side by side over the same backbone infrastructure?

These are good questions, and to answer them we have to return to some basic principles. The key one is that there is a single objective reality to the world (as far as we know!), so there is only one way of interpreting the physical fact of how long it took for a specific web page to load or whether a particular streamed video buffered; and there is likewise only one way of interpreting whether a packet was lost, or how long it was delayed.

What then matters is how we capture what is relevant and important about that objective reality. Any “QoS standard” has to say something about the effect of the network on the packets that are sent through it, and do it in a way that is “useful” to a network user or designer. That means the QoS standard has to be relatable in some way to the value delivered as performance of user applications.

Whilst there may be only one way of interpreting the effect of the network, there are many possible ways of expressing that effect. How many ways are good ones? With electricity we only have one “volt”: a single concept of electrical potential difference with a single scalar expression. This can be contrasted with, say, length: a single concept, but with many scalar expressions such as “foot” and “furlong”. Going further, an idea like “customer satisfaction” can be seen as many separate concepts, with many different expressions (e.g. net promoter score, counts of positive or negative words in reviews).

My proposition is that there is only one “ideal” theoretical way of interpreting the quality a network offers. However, there may be multiple practical expressions of that for different contexts and uses, rather like how you can express metres as a number or a bar chart. Before we think about what an “ideal” QoS metric looks like, let’s first reflect on what we have today, which is less than ideal.

Our present mental model of network quality is that you get a supply of bandwidth, and that bandwidth in turn has a quality property. That property is described through network measures like average packet loss, delay and jitter. (This is unfortunate, as there is no quality in averages.) These are then related to application performance measures, like PESQ score for voice, or time to first frame for video.

This approach has several critical problems:

  • The same network QoS measures can result in widely varying application performance outcomes; they are often poor proxies for quality of experience (QoE), as the sketch after this list illustrates.
  • Turning an application performance metric into a network one is as much art as it is science, and is easy to get wrong.
  • When different parties in a digital supply chain come together, there is no means to take an end-to-end measure and turn it into requirements for those sub-elements.
  • Conversely, when you take a set of sub-elements, each one can meet its technical network performance metrics, yet their end-to-end interaction still fails to deliver the desired application performance.
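
To make the first of these concrete, here is a minimal toy sketch of my own (the numbers and the 150 ms deadline are invented for illustration): two links with identical average delay, one steady and one bursty, produce very different application outcomes.

    # Two links with the same mean delay (50 ms) but different tails.
    # Illustrative only: distributions and the 150 ms deadline are assumptions.
    import random

    random.seed(1)

    def link_a():
        # Steady link: delay tightly clustered around 50 ms.
        return random.gauss(50.0, 2.0)

    def link_b():
        # Bursty link: usually 30 ms, but one packet in ten queues for ~230 ms.
        return 30.0 if random.random() < 0.9 else 230.0

    def stats(sample_fn, deadline_ms=150.0, n=100_000):
        delays = [max(sample_fn(), 0.0) for _ in range(n)]
        mean = sum(delays) / n
        late = sum(d > deadline_ms for d in delays) / n
        return mean, late

    for name, fn in [("A (steady)", link_a), ("B (bursty)", link_b)]:
        mean, late = stats(fn)
        print(f"Link {name}: mean delay = {mean:.1f} ms, "
              f"share missing a 150 ms deadline = {late:.1%}")

The averages are indistinguishable; the tails, and hence the user experience, are not.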

Contrast this with our “ideal” approach. In this case, the network measures and application performance would be readily related. It would be easy to compose and decompose network quality measures, and know what the resulting application performance would be. We would have a robust scientific model of cause and effect, allowing the safe engineering of the “safety margin” essential in a competitive market that drives you to run networks “hot”.
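
As a hedged sketch of what “composable” could mean in practice (the hop distributions and the 100 ms budget below are invented, and a fuller treatment would also carry packet loss as missing probability mass): if each hop’s effect is captured as a delay distribution, the end-to-end effect is their convolution, and an application-facing bound can be read straight off the result.

    # Composing per-hop delay distributions into an end-to-end one.
    # All numbers are illustrative assumptions, not measured data.
    from itertools import product

    def convolve(dist_a, dist_b):
        """Compose two independent delay distributions {delay_ms: probability}."""
        out = {}
        for (da, pa), (db, pb) in product(dist_a.items(), dist_b.items()):
            out[da + db] = out.get(da + db, 0.0) + pa * pb
        return out

    # Per-hop delay distributions (delay in ms -> probability).
    access  = {5: 0.7, 15: 0.2, 40: 0.1}
    core    = {2: 0.9, 10: 0.1}
    peering = {8: 0.6, 20: 0.3, 60: 0.1}

    end_to_end = convolve(convolve(access, core), peering)

    budget_ms = 100  # an assumed application delay budget
    p_ok = sum(p for delay, p in end_to_end.items() if delay <= budget_ms)
    print(f"P(end-to-end delay <= {budget_ms} ms) = {p_ok:.3f}")

Decomposition runs the same machinery in reverse: an end-to-end requirement can be budgeted out across the sub-elements of the supply chain.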

The good news (and no surprise to long-time readers) is that this ideal exists: the science of quality attenuation, quantified by the probabilistic mathematics of ∆Q. And rather like the science of electromagnetism and the mathematics of complex numbers, there’s really only one of them on offer, to the best of our knowledge.

As Richard Feynman once said: “You’ll have to accept it. It’s the way nature works. If you want to know how nature works, we looked at it, carefully. Looking at it, that’s the way it looks. You don’t like it? Go somewhere else, to another universe where the rules are simpler, philosophically more pleasing, more psychologically easy. I can’t help it, okay?”

So, that’s the way it is, and there’s really only one way of going about modelling the world in a suitably “scientific” way. Now, for specific problems you might want to use alternative approaches to quantifying the quality attenuation, so as to focus attention on different aspects of the problem. For instance…

  • We may place a different emphasis on the [G]eographic, [S]erialisation and [V]ariable contention effects in different problem domains (a sketch of one such decomposition follows this list).
  • The “stationarity” (i.e. statistical stability) of the system is with respect to a timescale. Different timescales (seconds vs days) may need to be expressed differently. All those clever “adaptive” algorithms require some level of stationarity, and we’ve not begun to define or denote it.
  • There may be “higher order” and more complex “couplings” and “moments” that we must take account of in specific circumstances.
  • We are always concerned with the “tail” of probability distributions (where there is a risk of under-delivery and failure), and there are many “lenses” we can use to classify and quantify the structure of those tails.
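
As promised above, here is a rough sketch of one way the [G], [S] and [V] parts of per-packet delay are often separated (the figures are invented; treat the method as an illustration, not the definitive procedure): fit the delay floor across packet sizes to estimate the fixed [G] and size-dependent [S] parts, and treat whatever sits above that floor as the variable contention [V].

    # Separating per-packet delay into G (fixed), S (size-dependent) and V (contention).
    # The (packet_size_bytes, one_way_delay_ms) samples below are invented.
    samples = [
        (100, 10.3), (100, 12.9), (100, 10.1),
        (500, 10.9), (500, 14.2), (500, 11.0),
        (1500, 12.6), (1500, 12.8), (1500, 17.5),
    ]

    # Delay floor per packet size: the least-delayed packet saw (almost) no contention.
    floor = {}
    for size, delay in samples:
        floor[size] = min(delay, floor.get(size, float("inf")))

    # Least-squares line through the floor points: delay_floor ~ G + S * size.
    sizes = list(floor)
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(floor.values()) / n
    S = sum((x - mean_x) * (floor[x] - mean_y) for x in sizes) / \
        sum((x - mean_x) ** 2 for x in sizes)
    G = mean_y - S * mean_x

    print(f"G (fixed, distance-like) ~ {G:.2f} ms")
    print(f"S (serialisation) ~ {S * 1000:.2f} ms per 1000 bytes")
    # Residual above the fitted floor is attributed to variable contention V.
    # Tiny negative values on floor packets are just noise from the fit.
    for size, delay in samples:
        v = delay - (G + S * size)
        print(f"size={size:5d}B delay={delay:5.1f}ms -> V (contention) ~ {v:+.2f} ms")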

So whilst there is scope for a basic single universal standard metric, rather like length in metres or time in seconds, there are also these exceptions and extensions. Many are likely to be niche cases: it’s like wondering how gravity works near black holes, and its effect on time and length measurement. Not something the everyday user would have to consider. For example, existing standards for things like strong internal timing in telecoms networks would remain unaffected by a public packet QoS standard.

When it comes to practically representing these abstract quality ideas, there can be many competing ways, and in several senses. There are a number of forms of mathematics that can be used to model one underlying idea or problem. For instance, in computing we have the lambda calculus, Turing machines, and type theory… and more. Likewise, there may be many different ways to operationally measure a standard metric. Exactly which test traffic patterns should be injected, how, when and where? What’s the interface to observe and collate that data? Then there can be endless ways of expressing all the above symbolically, such as different file formats.
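
To show just how ordinary that last layer of choice is, here is one hypothetical expression among many (the record and field names are my own invention, not a proposed standard): the same delay quantiles and loss figure could equally be serialised as JSON, CSV, a binary encoding, or anything else.

    # One possible way of carrying a quality measurement; names are hypothetical.
    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class DelayQuantiles:
        """Delay quantiles (ms) plus unresolved loss, for one path and time window."""
        path: str
        window_s: int
        quantiles_ms: dict      # e.g. {"p50": 12.0, "p99": 38.5}
        loss_fraction: float    # probability mass that never arrived

    record = DelayQuantiles(
        path="LHR->AMS",
        window_s=60,
        quantiles_ms={"p50": 12.0, "p90": 21.4, "p99": 38.5},
        loss_fraction=0.002,
    )

    print(json.dumps(asdict(record), indent=2))  # one serialisation; many others are possible

None of these choices touch the underlying science; they are matters of convenience and interoperability.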

Ultimately, the grand prize is interoperability of quality, both within networks (different vendors, same QoS standard) and between networks (QoS SLAs that are useful and generate revenue rewards for delivery). We saw how Internet Protocol gave us a standard abstraction of connectivity. Now how do we repeat and build upon that success for quality, security, management, and more?

The present challenge is that the QoS problem is by its nature diffused among many stakeholders, and requires coordinating many disparate initiatives, some of which (unknowingly) have a weak scientific basis. Indeed, there are already too many competing QoS standards! The real issue is how to have fewer of them, but of far higher technical merit and commercial value.

For the latest fresh thinking on telecommunications, please sign up for the free Geddes newsletter.