The future of computing will be bifurcated. On one hand, there will be entirely new models for computing, such as voice, autonomous agents, and bots, with no traditional user interfaces. At the opposite extreme, there will be new user interfaces that augment our ‘real’ world, such as Microsoft’s holographic computing technologies and the virtual reality platforms coming to market from Google and Facebook. Bringing these trends to fruition, though, will require overcoming some key technological limitations.
It has been happening slowly for a while now: voice recognition is changing one of the key interfaces to today’s computing and applications. Apple’s Siri, Google Now, Microsoft Cortana, and the super-hot Amazon Echo, along with their smart agents, are the practical embodiments of a growing trend toward the application of machine learning to voice and data. Andrew Ng, chief scientist at Baidu, says that 99 percent accuracy is the key milestone for speech recognition. Companies like Apple, Google, and Baidu are already above 95 percent and improving. Ng estimates that 50 percent of web searches will be voice-powered by 2019.
The next natural step for this accurate voice recognition technology is the incorporation of a bot that learns all about your life and assists with your tasks, via voice recognition, of course.
These new technologies will require voice recognition access, data access, and interoperability with connected assets. These agents will continually learn, access new data sources, and provide you as a user with a significant amount of value. But these great innovations also come with many (sometimes steep) costs. They will generate ever-increasing numbers of API calls, which in turn will require vast amounts of infrastructure and new levels of scale and management.
Voice represents the new computing interface model — one that many point to as the interaction model of the future. And we are just a few percentage points away from achieving the technical prowess to make it ready for prime time. In Mary Meeker’s recently posted internet trends for 2016, she calls out an observation.
The performance of these interfaces is key to adoption, but the voice recognition system is often hosted by a third party on another network. Voice-driven applications must send data to one of these providers, get a response, and process it. Delivering results to the user in under 10 seconds? That’s quite a high service level, considering most transactions lack end-to-end visibility. Tracing each transaction, and tying the user request to the dependent systems and APIs it passes through, will be critical to meeting this response time requirement.
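To make the idea of end-to-end transaction tracing concrete, here is a minimal sketch in Python. It is not any vendor's tracing API: the `Trace` class, the hop names, and the `handle_utterance` function are all hypothetical, and the `time.sleep` calls merely stand in for real network and backend calls. The sketch tags one user request with a single ID, times each dependent system it passes through, and compares the total against the 10-second target mentioned above.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Timing record for one user request across every hop it touches."""
    budget_s: float = 10.0  # response-time target taken from the article
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: list = field(default_factory=list)  # (hop name, seconds) pairs

    @contextmanager
    def span(self, name):
        """Time one hop and record it, even if the hop raises."""
        start = time.perf_counter()
        try:
            # A real call would forward this ID to the downstream system
            # so its logs can be joined with ours.
            yield self.trace_id
        finally:
            self.spans.append((name, time.perf_counter() - start))

    def total(self):
        """Total seconds spent across all recorded hops."""
        return sum(seconds for _, seconds in self.spans)

    def slowest(self):
        """Name of the hop that took the longest."""
        return max(self.spans, key=lambda s: s[1])[0]

    def within_budget(self):
        """True if the whole request met the response-time target."""
        return self.total() <= self.budget_s


def handle_utterance(trace):
    # Hypothetical hops; the sleeps are placeholders for real calls.
    with trace.span("speech_to_text"):
        time.sleep(0.02)
    with trace.span("intent_api"):
        time.sleep(0.05)
    with trace.span("backend_lookup"):
        time.sleep(0.01)
    return trace
```

Calling `handle_utterance(Trace())` returns a trace whose `spans` list shows where the time went, and `slowest()` names the hop to investigate first when `within_budget()` comes back false. In a real deployment the third-party hops would be opaque, which is exactly why forwarding one shared ID through every call matters.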
Another example is the wild success of Amazon’s Alexa voice services. Their first device, the Echo, is #2 in electronics in Amazon’s store today (June 2016), even after 20 months. In this short amount of time, there have been over 1,000 integrations, known as “Skills.” Some of the most impressive Skills replace existing interfaces. Useful Skills cover reference lookups, news and stocks, home automation, travel, ordering goods and services, and, of course, personal and social data.
Among the most popular Skills are Capital One’s offerings. Capital One has a dedicated mini-site focused on this functionality. Capital One is one of an elite cadre of ‘traditional’ companies that recognize and embrace the imperative to innovate and leverage new technologies to the benefit of their customers. They’re paving the way with their efforts, example, and important contributions to open source technologies. At the same time, though, rolling out new interaction schemes also brings challenges when integrating existing backend systems with new API-driven functionality such as that required by Alexa. Troubleshooting and ensuring a high-quality user experience require a proper end-to-end view across multiple systems and technologies. That 10-second threshold is ambitious given the complexity of the systems involved. But as we’ve seen with the traditional web, as consumers adopt and grow comfortable with new technologies, the bar quickly goes higher, never in the opposite direction.