The Frequentist Fallacy

It is tempting to linearly extrapolate the past of networks into the future – be it for any individual network, or for networks collectively. This is a dangerous assumption, as it is not grounded in the actuality of network behaviour. Nor is it a safe form of inference for complex systems with emergent behaviours and power-law change processes. This article gives three reasons why you should avoid such fallacious reasoning about broadband networks. Such intuitive reasoning is endemic in the telecoms industry, and misleads us into naively continuing unsustainable and unprofitable behaviours.

The philosophy of parachutists

A man jumps off a skyscraper. No, that’s not going to work – no dramatic tension – let’s try again.

Two people – a man and a woman – jump off a skyscraper. Better, but still not quite right. The story needs some more character development.

Two statisticians jump off a skyscraper. OK, now we’re getting somewhere! Let’s see what happens to them.

The first statistician – Fred Frequentist – grins ever more widely as he tumbles down the edifice. The floors whizz past…90th, 89th, 88th…onward and downward. As he passes the 50th floor, a window cleaner sees him closing in.

“Hi! How’s it going?” – “Great so far!” responds Fred, joyfully.

Not long after, Fred is dead, due to terminal concrete poisoning. A parachute remains attached to his back, unopened.

Moments later, the second statistician launches herself into the windy void. Barbara Bayesian has a different experimental methodology, and wears a furrowed frown. She notes the floors passing, and sees the same window cleaner, who still stares in shock at Fred.

“Howdy! How high?” she asks – in response to which the cleaner shouts: “PULL THE DAMNED CORD!”.

Barbara makes it down safely, swooping in to land down the street just in time.

Of skyscrapers and statisticians

Fred’s frequentist methodology was to assume the parameters of his world were fixed: “how safe is it to be passing floors on a downward descent off a skyscraper?” As he passed each floor, he collected data, which in his experimental world was the potentially varying component. Each time the data he recorded was the same: safe and secure! Thus every floor he passes safely raises his confidence in the safety of base jumping off skyscrapers. His grin grows.

Finally, he reaches a floor where the parameters are different to his prior assumptions: there is a solid interface between the levels, rather than a gaseous one.

Splat!

Barbara makes an opposite set of assumptions, based on Bayesian conditional reasoning. She assumes the data is fixed – your measurements are whatever they are – but the parameters may be varying.

That means she is wary about changes to her framing assumptions. She consults with someone who may be aware of such unplanned changes in the parameters, and responds accordingly to the feedback she gets.

Swoop!

Whether a frequentist or Bayesian approach is the right one depends on the nature of the problem you are working on and your prior assumptions. You have to pick the right inference tool for the job.

Broadband businesses are frequently frequentists

This article is likely to be read a few thousand times. Of all the people who read it, I expect fewer than half a dozen are broadband Barbaras. Everyone else is a fallacious frequentist Fred.

You can tell you are a Fred, because your assumption is that ongoing improvements in silicon, optoelectronics, radios and network equipment will keep on delivering ever more (monoservice) broadband goodness, just like it has in the past. I’ve even got the t-shirt at home for frequentist monoservice broadband base-jumping: it reads “Fat pipes, always-on, get out of the way!”

Whoo hoo! More bandwidth is always better… Splat!

I’d like to disabuse you of the notion that your assumptions are safe. The nature of broadband and the Internet is a Bayesian world, not a frequentist one. Let me explain why.

Multiplexing is the medium

Good quality of experience on a broadband network is the emergent outcome of packet multiplexing. This is a statistical process, and thus is subject to the philosophical question of ‘frequentist’ vs ‘Bayesian’ inference of results.

For example, your web pages generally arrive because enough good coincidences occurred: other people failed to inject enough packets at just the wrong time to disrupt your flows. Every so often, however, a web page mysteriously fails to load; you hit refresh, and it suddenly appears. That’s because bad coincidences happened the first time you tried, but not when you hit refresh.
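
To make those ‘coincidences’ concrete, here is a minimal sketch of statistical multiplexing. It is my own illustrative model with assumed parameter values, not anything measured on a real network: a number of flows share a link that can serve a fixed number of packets per time slot, and a ‘bad coincidence’ is a slot where more packets arrive than the link can serve at once.

```python
# A minimal, illustrative sketch of statistical multiplexing (assumed
# parameters, not real measurements): N flows share a link that serves
# `capacity` packets per slot. A "bad coincidence" is a slot where more
# packets arrive than can be served at once.
import random

def bad_coincidence_rate(n_flows, p_send, capacity, slots=20_000, seed=1):
    rng = random.Random(seed)
    bad = 0
    for _ in range(slots):
        arrivals = sum(1 for _ in range(n_flows) if rng.random() < p_send)
        if arrivals > capacity:
            bad += 1
    return bad / slots

if __name__ == "__main__":
    for n in (100, 150, 180, 200):
        rate = bad_coincidence_rate(n, p_send=0.05, capacity=10)
        print(f"{n:4d} flows -> bad-coincidence rate {rate:.3f}")
```

Doubling the number of flows does not merely double the bad-coincidence rate; once the link approaches saturation, the rate grows far faster than the load does.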

The ratio between good and bad coincidences appears to be relatively stable to most broadband users. The temptation is to be a Fred, and assume that this will continue. However, it doesn’t work that way, and we know what happened to Fred.

Splat!

How networks stop working

Packet networks aren’t pipes, whether dumb or smart. (Indeed, the people who laughed at the Internet being described as a ‘series of tubes’ often unwittingly exhibit the same fundamental pipe-like beliefs!) Networks are instead complex dynamic systems with coupling of interacting flows. They exhibit behaviours that don’t match any physical systems, although there are similarities.

For example, networks have phase changes a bit like how a substance goes from gas to liquid to solid. There are clumping and condensation effects, and these reach critical saturation levels. So just because the water is liquid at 6°C, 5°C, 4°C, 3°C, … doesn’t mean it will stay liquid as you go to 2°C, 1°C, 0°C, -1°C. Likewise in networks, you hit the point where one region of operation and behaviour transitions into another.
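
You can see how abrupt that transition is even in the most idealised textbook queueing model, the M/M/1 queue. It is far simpler than any real packet network, but it is enough to show the shape of the cliff; the numbers below are purely illustrative.

```python
# The textbook M/M/1 queue (a deliberate over-simplification of a real
# network, used only to show the shape of the transition): the mean time a
# packet spends in the system is 1 / (mu - lam) while lam < mu, and there is
# no steady state at all once arrivals reach the service rate.
def mm1_mean_delay(arrival_rate, service_rate):
    if arrival_rate >= service_rate:
        return float("inf")                      # past the "freezing point"
    return 1.0 / (service_rate - arrival_rate)   # mean time in system

if __name__ == "__main__":
    service_rate = 1.0                           # packets per unit time
    for utilisation in (0.5, 0.8, 0.9, 0.95, 0.99, 1.0):
        delay = mm1_mean_delay(utilisation * service_rate, service_rate)
        print(f"utilisation {utilisation:4.2f} -> mean delay {delay:8.2f}")
```

Each step in utilisation looks the same size; it is only the last few steps where the delay explodes.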

In this case, what happens is that as the offered load increases and contention starts to bite, the time taken to complete each unit of work increases. Eventually you exceed the timeout for the transport (be it at the TCP level, or the application retries, or a human hits ‘refresh’). Then the offered work increases further through retransmissions. This extends the period over which the ‘excessive’ contention is occurring. The condensation-clumps take longer to clear, and you have a positive feedback effect.
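
Here is a deliberately crude sketch of that feedback loop, a toy model with assumed numbers rather than a simulation of any real transport: a queue drains at a fixed rate, and whenever the queueing delay exceeds the sender’s timeout, the same work is offered again.

```python
# A crude, illustrative sketch (assumed numbers, not a real transport) of the
# retransmission feedback loop: once queueing delay exceeds the sender's
# timeout, the same work is offered again, which keeps the delay above the
# timeout. A transient burst that would otherwise drain away can lock the
# queue into permanent overload.
def simulate(load, burst, timeout, retransmit=True, ticks=300, service=1.0):
    backlog = burst                                  # initial clump of work
    for _ in range(ticks):
        delay = backlog / service                    # ticks to drain the queue
        factor = 2.0 if retransmit and delay > timeout else 1.0
        backlog = max(0.0, backlog + load * factor - service)
    return backlog / service                         # final delay, in ticks

if __name__ == "__main__":
    for retransmit in (False, True):
        d = simulate(load=0.9, burst=10.0, timeout=5.0, retransmit=retransmit)
        label = "with" if retransmit else "without"
        print(f"{label} timeout-driven retransmission: final delay {d:7.1f} ticks")
```

Without the timeout-driven retransmission, the initial clump of work drains away on its own; with it, the very same transient burst tips the queue into permanent overload.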

This clumping process is a bit like ice cubes that form for a little while, then melt, as conditions change. Over time the cubes start coalescing, and then form small icebergs. Note that there is a dramatic difference in the rate of creation versus the rate of destruction. The flow-destroying ‘ice condensation’ forms polynomially fast, but melting back to free flow is only linear!

So the growth in these undesirable effects looks easy to extrapolate, right up to the point where it suddenly becomes horribly problematic.

Splat!

The three frequentist faults

Bearing these dynamic and sudden phase-change properties in mind, there are three reasons that invalidate a frequentist approach to the future of broadband:

  1. Structural changes in broadband supply. We are introducing additional layers of multiplexing. Those layers are being created both by the suppliers (e.g. FTTC adding in extra queues in the street) as well as by the users (e.g. WiFi access points, tethering to smartphones). That means less flow aggregation and isolation. Not good news.
  2. Structural changes in broadband demand. We are increasing the number of concurrent flows, as we add more devices per access point and more simultaneously active applications per device. This results in more clumping. However, we are also placing tighter ‘quality’ demands over shorter timescales to support applications like 2-way video or femtocells. This (in a monoservice network) means we can’t support as many concurrent flows. Not good news.
  3. Diminishing returns to lower packet serialisation times (i.e. ever more bandwidth). The packet delay budget is increasingly dominated by geographic and variable contention effects, rather than serialisation time. That also means non-stationarity (i.e. variability in loss and delay) is being created by these effects, which disrupts control loops like TCP or adaptive codecs. Indeed, increasing the instantaneous peak-to-mean traffic impulse by lowering serialisation further can become counter-productive by inducing further non-stationarity. Not good news. (A rough worked example of the delay budget follows this list.)
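
As a rough worked example of point 3, using assumed illustrative numbers rather than measurements, split a one-way packet delay budget into serialisation, propagation and queueing, and watch what happens to the serialisation share as the access rate rises.

```python
# Illustrative delay-budget arithmetic (assumed numbers, not measurements):
# one-way delay = serialisation + propagation + queueing for a full-size
# 1500-byte packet. Propagation and queueing are held fixed to show how
# little is left for extra bandwidth to buy.
PACKET_BITS = 1500 * 8
PROPAGATION_MS = 20.0           # assumed geographic delay
QUEUEING_MS = 15.0              # assumed contention/queueing delay

def delay_budget(rate_mbps):
    serialisation_ms = PACKET_BITS / (rate_mbps * 1000)   # 1 Mbit/s = 1000 bits/ms
    total_ms = serialisation_ms + PROPAGATION_MS + QUEUEING_MS
    return serialisation_ms, total_ms

if __name__ == "__main__":
    for mbps in (2, 10, 50, 100, 1000):
        ser, total = delay_budget(mbps)
        print(f"{mbps:5d} Mbit/s: serialisation {ser:6.3f} ms "
              f"({100 * ser / total:4.1f}% of a {total:.1f} ms budget)")
```

Going from 2 Mbit/s to 10 Mbit/s removes several milliseconds; going from 100 Mbit/s to 1 Gbit/s removes about a tenth of a millisecond from a budget still dominated by distance and contention.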

Put these together and what do you get?

Splat!

Better to be a broadband Bayesian

The frequentist view is that the future is just like the past, with only a low rate of skew in the emergent network properties. Because of past good experiences, many simply assume endless emergent good outcomes. However, the effective flow isolation needed to enable a given outcome was never really there! What you got in the past was the impression of isolation. This is because the duration of the ‘excessive’ clumping and contention of the network resource was very short-lived, and usually below the threshold that causes a noticeable effect.

Given the changing structure of supply and demand, that frequentist inference is unsafe and unsound. The emergent properties of multiplexing increasingly dominate simple bandwidth effects. So if you are going to design, market and operate broadband networks, you need to be aware of your own prior assumptions. You don’t want to end up as a Fred Frequentist, with a nasty case of contention concussion. It’s much safer to understand the true emergent behaviours of networks, and be more like Barbara Bayesian.

Swoop!
