In 1976, it took up to 100 minutes of mainframe time to decode 30 seconds of speech, which is up to 200 times slower than real time. The best hardware available was the PDP-10 by Digital Equipment Corporation, with 4 megs of RAM.
DARPA funded the program for five years and can claim at least partial credit for pioneering the field. This is the same agency that gave us the Internet, personal computers, and drones.
The United States military was not the only one messing with speech recognition technology in the seventies and eighties. IBM, AT&T, Stanford Research and Microsoft used progressively more powerful systems until finally, in the nineties, the vocabulary of the average commercial speech recognition system was larger than that of an average living human.
Which didn’t really result in any dramatic improvements as far as real-life applications were concerned. After all, it was all mathematics. Math and a series of mergers and acquisitions that gave us Nuance, Dragon Dictate, and ultimately Siri. Arguably the most annoying use case ever. You’re getting jumped in an alley and your Apple Watch delivers the famous “I don’t really have an answer to that” line. With an Aussie accent. To be fair, that’s actually the reverse process, text-to-speech, or speech synthesis, but you get the point.
Speech recognition was just math. Speech analytics is the real rocket science.
You can “teach” a supercomputer to play chess, but you can’t make the machine sentient.
As an outbound call center manager, you rely on your team of sales and lead gen agents getting into a rhythm over their day of calling. Your predictive dialer is set up to keep your team in a steady flow of conversations. But your agents keep getting voicemails.
Now your agents’ efficiency and morale are dropping fast.
These disruptions are all the more exasperating if your dialer is supposedly using a voicemail detection tool.
How Voiso catches issues before they escalate
Voiso records conversations on demand and then transcribes every call into searchable text. Once this data is available, you can store it indefinitely and perform analysis of any depth and complexity, limited only by the GPU computational power available. That’s a very, very high limit. Voiso AI can go through several years’ worth of phone calls in an extremely short time.
Semantic AI – what does this incident mean and what are you going to do about it?
Not every regulatory or compliance breach will involve the F-word. Human interactions in a customer service context are full of innuendo and passive-aggressive behavior. Sometimes a conversation escalates to the point where the customer is attacking your agent and the agent is talking back with near insults, yet no overtly offensive language is ever used. In other words, off-the-shelf speech recognition will be of zero help to your business objectives, much like the PDP-10 based system from the 70s.
You need to run a “real” AI that can detect these sub-optimal conversations, and help your management defuse them before they explode into massively toxic outcomes.
Voiso combines the language model with phonetic analysis to get to the real conversational semantics. Semantics is the branch of linguistics and logic concerned with meaning; Voiso will extract real meaning from the cryptic conversation, just like a human tapping the line would.
These are the ingredients:
- Identify the language of the conversation, record the call and transcribe the audio into searchable text. That’s the speech recognition part.
- Add semantics and extract the real meaning of the vocabulary in use. Consider the grammatical category of each word (noun, adjective or verb), and consider gender forms and synonyms.
- Combine the above with a phonetic model: detect emotion and compare it to past occurrences.
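To make the three ingredients a little more concrete, here is a toy Python sketch. It is not Voiso's implementation. The cue lexicon, the `arousal` field (standing in for the output of a phonetic emotion model), the `baseline_arousal` value (standing in for "past occurrences") and the threshold are all hypothetical, and the transcript is assumed to have been produced already by the speech recognition step.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean

# Hypothetical lexicon: maps surface words (synonyms included) to a
# canonical hostile "concept". A real system would use a language model.
HOSTILE_CUES = {
    "useless": "incompetence",
    "incompetent": "incompetence",
    "clueless": "incompetence",
    "whatever": "dismissal",
    "obviously": "condescension",
    "clearly": "condescension",
}


@dataclass
class Utterance:
    speaker: str    # "agent" or "customer"
    text: str       # transcript text from the speech recognition step
    arousal: float  # 0..1, assumed output of a phonetic emotion model


def semantic_hits(text: str) -> set[str]:
    """Return the hostile concepts found in one utterance."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return {HOSTILE_CUES[w] for w in words if w in HOSTILE_CUES}


def score_conversation(utterances, baseline_arousal, arousal_margin=0.2):
    """Flag a call when hostile concepts coincide with unusually high emotion."""
    concepts: set[str] = set()
    for u in utterances:
        concepts |= semantic_hits(u.text)
    avg_arousal = mean(u.arousal for u in utterances)
    # "Compare to past occurrences": emotion well above the historical baseline.
    emotional = avg_arousal > baseline_arousal + arousal_margin
    return {
        "concepts": concepts,
        "mean_arousal": avg_arousal,
        "flag": bool(concepts) and emotional,
    }


# Usage: a tense exchange that contains no profanity at all.
call = [
    Utterance("customer", "Clearly nobody there knows anything.", 0.8),
    Utterance("agent", "Whatever you say, sir.", 0.7),
]
result = score_conversation(call, baseline_arousal=0.4)
print(result["flag"], sorted(result["concepts"]))
```

The point of the sketch is the combination: neither the word list nor the emotion score raises the flag on its own, only the two together.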
Bottom line: calculating a ballistic rocket trajectory is not that complex. It’s math and weather. Speech analytics is the real rocket science. You can use this rocket science to:
- Score the conversation and identify potential compliance breaches
- Analyze agent performance and optimize your workforce
- Predict future behavior based on past performance
- Protect your organization from future incidents by identifying workplace conditions likely to trigger such behavior
- Prevent costly litigation by having a supervisor intervene before the disagreement escalates into a truly toxic exchange