I’m really excited to share some recent enhancements to two of my favorite services: Amazon Connect and Amazon Lex. Amazon Connect is a self-service, cloud-based contact center service that makes it easy for any business to deliver better customer service at lower cost. Amazon Lex is a service for building conversational interfaces using voice and text. By integrating these two services you can take advantage of Lex‘s automatic speech recognition (ASR) and natural language processing/understading (NLU) capabilities to create great self-service experiences for your customers. To enable this integration the Lex team added support for 8kHz speech input – more on that later. Why should you care about this? Well, if a bot can solve the majority of your customer’s requests your customers spend less time waiting on hold and more time using your products.
If you need some more background on Amazon Connect or Lex I strongly recommend Jeff’s previous posts on these services – especially if you like LEGOs.
Let’s dive in and learn to use this new integration. We’ll take an application that we built on our Twitch channel and modify it for this blog. At the application’s core a user calls an Amazon Connect number which connects them to an Lex bot which invokes an AWS Lambda function based on an intent from Lex. So what does our little application do?
I want to finally settle the question of what the best code editor is: I like vim, it’s a spectacular editor that does one job exceptionally well – editing code (it’s the best). My colleague Jeff likes emacs, a great
operating system editor… if you were born with extra joints in your fingers. My colleague Tara loves Visual Studio and sublime. Rather than fighting over what the best editor is I thought we might let you, dear reader, vote. Don’t worry you can even vote for butterflies.
Interested in voting? Call +1 614-569-4019 and tell us which editor you’re voting for! We don’t store your number or record your voice so feel free to vote more than once for vim. Want to see the votes live? https://best-editor-ever.s3-website-us-east-1.amazonaws.com/.
Now, how do we build this little contraption? We’ll cover each component but since we’ve talked about Lex and Lambda before we’ll focus mostly on the Amazon Connect component. I’m going to assume you already have a connect instance running.
Let’s start with the Lex side of things. We’ll create a bot named
VoteEditor with two intents:
VoteEditor with a single slot called
ConnectToAgent with no slots. We’ll populate our editor slot full of different code editor names (maybe we’ll leave out emacs).
Our Lambda function will also be fairly simple. First we’ll create a Amazon DynamoDB table to store our votes. Then we’ll make a helper method to respond to Lex (build_response) – it will just wrap our message in a Lex friendly response format. Now we just have to figure out our flow logic.
Let’s make sure we understand the code. So, if we got a vote for an editor and it doesn’t exist yet then we add that editor with 1 vote. Otherwise we increase the number of votes on that editor by 1. If we get a request for an agent, we terminate the flow with a nice message. Easy. Now we just tell our Lex bot to use our Lambda function to fulfill our intents. We can test that everything is working over text in the Lex console before moving on.
Before we can use our Lex bot in a Contact Flow we have to make sure our Amazon Connect instance has access to it. We can do this by hopping over to the Amazon Connect service console, selecting our instance, and navigating to “Contact Flows”. There should be a section called Lex where you can add your bots!
Now that our Amazon Connect instance can invoke our Lex bot we can create a new Contact Flow that contains our Lex bot. We add the bot to our flow through the “Get customer input” widget from the “Interact” category.
Once we’re on the widget we have a “DTMF” tab for taking input from number keys on a phone or the “Amazon Lex” tab for taking voiceinput and passing it to the Lex service. We’ll use the Lex tab and put in some configuration.
Lots of options, but in short we add the bot we want to use (including the version of the bot), the intents we want to use from our bot, and a short prompt to introduce the bot (and maybe prompt the customer for input).
Our final contact flow looks like this:
A real world example might allow a customer to perform many transactions through a Lex bot. Then on an error or ConnectToAgent intent put the customer into a queue where they could talk to a real person. It could collect and store information about users and populate a rich interface for an agent to use so they could jump right into the conversation with all the context they need.
I want to especially highlight the advantage of 8kHz audio support in Lex. Lex originally only supported speech input that was sampled at a higher rate than the 8 kHz input from the phone. Modern digital communication appliations typically use audio signals sampled at a minimum of 16 kHz. This higher fidelity recroding makes it easier differentiate between sounds like “ess” (/s/) and “eff” (/f/) – or so the audio experts tell me. Phones, however, use a much lower quality recording. Humans, and their ears, are pretty good at using surrounding words to figure out what a voice is saying from a lower quality recording (just check the NASA apollo recordings for proof of this). Most digital phone systems are setup to use 8 kHz sampling by default – it’s a nice tradeoff in bandwidth and fidelity. That’s why your voice sometimes sounds different on the phone. On top of this fundmental sampling rate issue you also have to deal with the fact that a lot of phone call data is already lossy (can you hear me now?). There are thousands of different devices from hundreds of different manufacturers, and tons of different software implentations. So… how do you solve this recognition issue?
The Lex team decided that the best way to address this was to expand the set of models they were using for speech recognition to include an 8kHz model. Support for an 8 kHz telephony audio sampling rate provides increased speech recognition accuracy and fidelity for your contact center interactions. This was a great effort by the team that enables a lot of customers to do more with Amazon Connect.
One final note is that Amazon Connect uses the exact same PostContent endpoint that you can use as an external developer so you don’t have to be a Amazon Connect user to take advantage of this 8kHz feature in Lex.
I hope you guys enjoyed this post and as always the real details are in the docs and API Reference.
Leave a comment!