The silence of the landlines – the need for innovation in voice services

I’d like to discuss an aspect of telephony that in retrospect is obvious, but still ought to be pointed out.  Our assumptions are often elusive, and arise from historical roots that no longer need apply.

How is voice calling tied to its conversational context?

The earliest telephone systems were analogue, and were an evolution of the telegraph. Operators – of the human kind – connected calls and made introductions of callers to callees. A new etiquette also grew up on how to greet someone via telephone – should one say ‘ahoy’ or ‘hello’? We all know how that one worked out!

The moment we went to automated systems, and the human operator was taken out of the loop, something important changed. The caller became invisible to the callee, and vice versa. That means the caller could not signal anything other than “I wish to summon you – answer NOW!”; likewise, the callee didn’t know who was calling, their context, or their intent. The transition from “not speaking” to “speaking” to this day involves a kind of contextual step function. The call and conversation were seen as synonymous.

Over time, we made one small concession to this pattern of mutual invisibility: the introduction of caller ID. At the time, this was considered controversial and privacy-eroding, but today it is generally seen as normal and necessary. What it did was to open up a small gap between the call and the conversation. In those few moments when the caller ID is presented, but you have not answered, the conversation has begun, even if the call has not. The callee may infer intent from the identity of the caller, the timing of the call, and any pre-existing agreement of purpose.

Notwithstanding this tiny advance, the mainstream use of voice remains stuck when it comes to the addition of calling context. Phone calls are completely de-contextualised: the assumption remains that any voice service should by default reveal nothing whatsoever about the callee. There is no means to adapt the communication to your device, location, roaming status, local time of day, language preferences, (dis)ability status, and so on.
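
To make the missing context concrete, here is a minimal sketch of what a callee might choose to share before a call is placed, and how a caller's device could adapt to it. Everything in it is an assumption for illustration: the `CalleeContext` fields, the `suggest_approach` function and its rules are invented here and do not describe any existing operator or WebRTC API.

```python
from dataclasses import dataclass


@dataclass
class CalleeContext:
    """Hypothetical context a callee opts to share ahead of a call."""
    device: str                 # e.g. "mobile", "desk phone", "browser"
    roaming: bool               # True if the callee is roaming abroad
    local_hour: int             # 0-23, the callee's local time of day
    language: str               # e.g. "en", "fr"; not used by the rules below
    prefers_text: bool = False  # e.g. set by a hearing-impaired callee


def suggest_approach(ctx: CalleeContext) -> str:
    """Return a suggestion for the caller, using invented example rules."""
    if ctx.prefers_text:
        return "send a text message instead"
    if ctx.local_hour < 8 or ctx.local_hour >= 22:
        return "leave a voice message; it is night for the callee"
    if ctx.roaming:
        return "keep it short; the callee is roaming"
    return "call now"
```

The point is not these particular rules, but that none of them can be written today, because the call reveals nothing to decide on.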

I hit this issue today when thinking through a client problem: should a mobile operator terminate a call to a WebRTC device with an announcement “this call is going off-net and we are not responsible for the resulting call quality”? That would break the omertà of the telephony operators, and is thus culturally unthinkable to them.

The ability to share context is why I am ultimately bullish on WebRTC, despite the Internet being a poor transport system for reliable real-time communications. This technology deliberately severs itself from legacy patterns of signalling and conversation, and has a natural affinity to being embedded in its contextual application. Furthermore, browsers and mobile application runtimes have the potential (although not yet realised) to deliver very rich context indeed. This is not just simple time and location, but also social and transactional context drawn from other applications. As Tim Panton notes, WebRTC is also capable of transporting rich identity data from both the telco and Internet ecosystems. Context can piggy-back onto this identity data too.

Similarly, hypervoice technology is about solving this context problem: how to re-integrate voice into the (social enterprise or consumer application) context and activity stream from which it emerges? The conversation is much bigger in scope than a single call.

It is not hard to imagine radically different business models emerging as a result of addressing this context issue. A future voice service operator might get paid not for enabling an audio channel. Instead it may be about adding context: personalise the caller message, and optimally time calls to maximise the answer rate. This will upset the economics of (hyper)voice every bit as much as Google upset the model for (hyper)text.
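
As a rough illustration of "optimally timing calls", the sketch below picks the hour of day with the best historical answer rate. It is only a sketch under stated assumptions: the `best_hour_to_call` function and the shape of its `history` input (pairs of hour and whether the call was answered) are hypothetical, and a real service would draw on far richer context than this.

```python
from collections import defaultdict


def best_hour_to_call(history):
    """Pick the hour (0-23) with the highest answer rate.

    `history` is an iterable of (hour, answered) pairs, a hypothetical
    log of past call attempts. Ties go to the hour with more attempts.
    Returns None when there is no history.
    """
    attempts = defaultdict(int)
    answers = defaultdict(int)
    for hour, answered in history:
        attempts[hour] += 1
        if answered:
            answers[hour] += 1
    if not attempts:
        return None
    return max(attempts, key=lambda h: (answers[h] / attempts[h], attempts[h]))


# Example: two of two calls at 18:00 were answered, one of two at 09:00.
log = [(9, False), (9, True), (18, True), (18, True), (13, False)]
print(best_hour_to_call(log))
```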

My fear is that network operators are going to continue to do absolutely nothing whatsoever to evolve their core voice product. The result will be a repeat of the calamity we are seeing unfold with SMS. Mobile applications are going to intermediate the voice experience – as Viber does today – and all the new value will be extracted by third parties. The technology exists to intermediate landlines at the network edge too.

That lack of vision and product development is a terrible waste: we have platforms like Ameche from Voxeo Labs that make it relatively simple to extend your core voice offering, and deliver contextual calling. Products like Hullomail also easily integrate with existing voice messaging services and enablers. These richer capabilities avoid users having to go ‘over the top’ and incur the reliability and convenience penalty that often results. Voice innovation isn’t that hard!

Yet operators appear to be waiting for some standardised industry solution to turn up, or for the big vendors to supply some ready-packaged answer. This isn’t going to work — the world is moving on too fast. If you want to preserve and grow revenues, the time to act and get involved with the voice innovation community is now.

The alternative is to see your voice revenue get gobbled up by those who understand context and conversations. This will not be done in a nice or painless way.

What stops operators from innovating in voice services? My guess is a fear of new capabilities cannibalising old business models, along with deeply-held (but invalid) beliefs over what is ‘right’ and ‘wrong’ about sharing caller and callee context.

In that spirit, I’ll leave the final word on the mortality of telephony to our friends Clarice and Hannibal:

Clarice Starling: If you didn’t kill him, then who did, sir?
Hannibal Lecter: Who can say. Best thing for him, really. His therapy was going nowhere.
