QoS vs SoQ: How we got network quality backwards

First steps are fateful. Because we have unhelpfully categorised quality, subsequent attempts to engineer it in packet networks have run into trouble.

The issue of network quality has become strangely public and emotive in the last few years. The obscure art of packet scheduling is now entangled with net neutrality political ideology, driven along by strong feeling and feeble science. The shrillness of commentator opinion on the matter of “quality of service” seems inversely related to the actual understanding of the speaker!

But maybe, just perhaps, there is seasonal eggnog on all our faces? Because it seems that everyone has failed to notice that our basic terminology has things backwards. The very concept of “QoS” fundamentally miscategorises the problem, meaning that there is an inherent futility to the collective search for a solution.

In the QoS paradigm, quality is implicitly an attribute of the service, and a secondary one at that, since throughput is primary. This “quantity with a quality” model focuses on “giving quality” to the flows and applications that need it most. The resulting talk is all anchored on “fast lanes” and their possible perils.

Offering “improved quality” to real-time and interactive applications is relatively easy to achieve, using a combination of two techniques: (strict) priority plus over-provisioning (i.e. idleness). Thus “delivering QoS” has been frequently dismissed as a solved problem, and not worthy of further serious discussion.
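To make that recipe concrete, here is a minimal sketch (the class and its names are my own illustration, not a standard implementation) of strict priority between two queues. It only delivers good quality to the priority class if the link also has plenty of idle headroom:

```python
from collections import deque

# A minimal sketch of the classic QoS recipe (my own illustration): strict
# priority between two classes. The scheme only works well when the link is
# over-provisioned, i.e. spends much of its time idle.

class StrictPriorityLink:
    def __init__(self):
        self.priority = deque()     # e.g. voice packets
        self.best_effort = deque()  # everything else

    def enqueue(self, packet, is_priority: bool):
        (self.priority if is_priority else self.best_effort).append(packet)

    def dequeue(self):
        # Strict priority: best effort is served only when the priority
        # queue is completely empty.
        if self.priority:
            return self.priority.popleft()
        if self.best_effort:
            return self.best_effort.popleft()
        return None  # link idle: the over-provisioning the model relies on
```

The best-effort queue starves whenever the priority queue is busy, which is precisely why the recipe quietly depends on idleness.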

This ignores three rather serious drawbacks of the current approach to QoS.

The first hindrance is the rather obvious one: the capital asset is expensive and finite, and must be run as “hot” as possible in a competitive market. An application with a tight quality need, even a low-bandwidth one, effectively consumes an enormous amount of uncosted and unrewarded idle network capacity if it is to work reliably. This is an unaccounted-for “invisible load” on the network that the textbooks don’t generally talk about.
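A back-of-envelope calculation makes that “invisible load” visible. The sketch below is my own illustration with invented numbers, using the standard M/M/1 mean queueing delay formula W = ρ/(μ(1−ρ)): even a modest delay target can force half the link to sit idle.

```python
# A back-of-envelope sketch of the "invisible load" (illustrative numbers
# of my own choosing), using the M/M/1 mean queueing delay formula:
#   W = rho / (mu * (1 - rho))
# Solving W <= target for rho gives the highest admissible utilisation.

mu = 10_000.0         # link service rate: packets per second
target_wait = 0.0001  # target mean queueing delay: 0.1 ms

x = mu * target_wait
rho_max = x / (1 + x)

print(f"max admissible utilisation: {rho_max:.0%}")
print(f"idle capacity silently consumed: {1 - rho_max:.0%}")
```

With these numbers, a 0.1 ms target means half the link must be held idle for the application to work reliably, yet nobody is billed for that headroom.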

The second problem is subtler. With QoS traffic management schemes, to avoid under-delivering quality to the “priority” class, we must constantly over-deliver. A “best of best effort” approach sets a floor on quality, but not a ceiling. This persistent over-delivery then creates a false buyer expectation that the surplus is an entitlement. Users start running video and VR in the quality class meant for voice… and store up unhappiness, complaints and churn for the day it disappears.

The final reason is the real killer for QoS. Because over-delivery to the “rich” classes must rob the “poor” classes of much-needed resources, the value of what remains is significantly impaired, or even destroyed outright, if we let the “priority” load rise. Whereas before you could run, say, WebRTC voice in the generic “data” class, now you cannot. So you can never let the “priority QoS” load rise, which renders it unprofitable.

The chart below captures the problem. As the proportion of traffic in the “high quality” class rises, the “low class” is rapidly rendered useless. If we had three classes, for the bottom class to have any value, we would always have to run the whole network at trivial loads.

[Chart: extract from the Fundamentals of Network Performance handbook.]
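We can reproduce the shape of that chart numerically. The snippet below is my own illustration (invented parameters, not the handbook’s data), using the textbook Cobham formula for mean queueing delay under non-preemptive strict priority: hold the “best effort” load fixed and watch its mean wait explode as the priority load rises.

```python
# My own numerical sketch of the effect (illustrative parameters, not the
# handbook's chart), using Cobham's M/M/1 non-preemptive priority result:
#   W_low = W0 / ((1 - rho_hi) * (1 - rho_hi - rho_lo)),  W0 = rho_total / mu

mu = 1.0       # time measured in mean packet service times
rho_lo = 0.2   # fixed load offered by the ordinary "best effort" class

print(f"{'priority load':>13} {'best-effort mean wait':>22}")
for rho_hi in [0.1, 0.3, 0.5, 0.7]:
    w0 = (rho_hi + rho_lo) / mu
    w_low = w0 / ((1 - rho_hi) * (1 - rho_hi - rho_lo))
    print(f"{rho_hi:>13.1f} {w_low:>22.2f}")
```

Raising the priority load from 10% to 70% of the link multiplies the best-effort wait by roughly sixty: exactly the “rapidly rendered useless” curve described above.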

The reason we’ve not seen differential QoS adopted more widely is that it presently forces an unwelcome choice: to be at least one of uneconomic, ineffectual or harmful. Indeed, should “fast lanes” take hold on the Internet using “QoS”, they will definitely harm the “best effort” class. This forces a vicious cycle, driving additional demand for more “fast lanes”, which then self-destruct as they become the de facto “new best effort”, requiring “even faster lanes”.

On the other hand, if you don’t build “fast lanes”, you exclude applications that need consistent performance or are sensitive to cost. Eventually you end up with a protocol arms race to “grab your unfair ration”, as we see companies like Google doing with QUIC. This leads to a systemic collapse via non-stationary load, an effect that is already observable.

So today you’re damned if you do, and damned if you don’t, when it comes to QoS-based “fast lanes”. Thankfully, there’s a way out of this awful dilemma, and it is to reframe and reverse the problem. Rather than quality being an attribute of the service, quality is the service! So instead of “quality of service” (QoS), we should instead focus on a “service of quality” (SoQ).

In the “SoQ” paradigm, we reject “packet monarchism”, and instead become metaphorical “packet Marxists”. Our focus with SoQ shifts away from ensuring lavish luxury for the “priority” class. In its place, we cap profligate resource usage, so as to prevent quality starvation for the masses. Throughput becomes secondary, as we define our objective as a sufficient “quantity of quality” for everyone.

This involves a very specific shift in engineering objective: SoQ sees quality as “slow lanes” all the way down. It’s a change in perspective that makes all the difference.

With QoS, we aim to “give” quality by “priority”. Priority lets you “cut in ahead” of other traffic when you turn out of the “network side road”. In contrast, with SoQ, we ensure there are sufficient, suitably sized and spaced “gaps” in the “main road” flow for “side road” traffic to be able to turn out and join.
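One way to mechanise those “gaps” is to cap the heavy classes rather than crown a priority class. The sketch below is my own illustration of the idea (not a published SoQ algorithm): a token bucket throttles the bulk flow so that predictable idle slots remain for latency-sensitive traffic to slip into.

```python
import time

# A minimal sketch of the SoQ "gaps" idea (my own illustration, not a
# published algorithm): cap the bulk class with a token bucket so the link
# deliberately leaves idle slots for latency-sensitive traffic, instead of
# letting a priority class consume every free moment.

class TokenBucketCap:
    """Caps a traffic class's rate, creating scheduled "gaps" in the flow."""

    def __init__(self, rate_per_s: float, burst: float):
        self.rate = rate_per_s    # sustained packets per second allowed
        self.burst = burst        # maximum short-term burst size
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True   # packet may join the "main road" now
        return False      # packet waits: a gap opens for "side road" traffic
```

The design choice is the inversion described above: quality for the sensitive traffic is produced not by letting it barge in, but by bounding everyone else.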

Here’s the really wonderful thing about SoQ: all three of the problems of QoS go away. We assure delivery for the priority traffic, engineer away over-delivery, and prevent under-delivery to the non-priority classes. You can indeed have your delicious quality assurance cake, without the fattening idleness or the pipe-clogging excess of sweet quality.

With Christmas coming up, that’s the only ‘lean and superfit’ thing we’re likely to relish for a while!

 
