Building your first voice bot

Building a bot for voice is very similar to building one for a text interface, but there are some nuances to pay attention to. This tutorial covers the basic concepts of building for voice. More advanced details are discussed later.

Responding to an invocation

Begin by creating a new bot and connecting it to Actions on Google.

Open Bot Studio and create a new flow. Set the trigger type to start_voice_chat. This ensures the flow is triggered whenever the bot is invoked without any parameters (e.g. "Okay Google, talk to MyBot").

Copy/paste this code into the flow and click Save.

states:
    say_hi:
        component: meya.text
        properties:
            text: "Hello, World!"

In the Actions on Google simulator, try saying Talk to <APP_NAME>.

In the image above you can see that the Google Assistant understood the input and called the bot. The bot's hello_world flow was launched because we used the start_voice_chat trigger.

You can see Hello, World! printed to the screen twice. The first represents what the user will hear, while the second is the text that will be displayed. You may not always want these to be the same. By default, your bot will speak whatever text is in the text field for any component that supports the text property. You can customize the spoken phrase by setting the speech property. Let's try that now.

After adding the speech field, your code should look like this:

states:
    say_hi:
        component: meya.text
        properties:
            text: "Hello, World!"
            speech: "It's a beautiful world!"

Save your work and go back to the simulator. Try invoking the bot again.

You should hear the bot say It's a beautiful world! and print Hello, World!.

📘
More often, you'll use the speech field to clarify how the bot should pronounce words, acronyms, numbers, and dates, and to specify things like emphasis and breath pauses using Speech Synthesis Markup Language (SSML).
For more information, check out the Actions on Google SSML reference.
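
As a rough sketch of what that can look like, the state below wraps the spoken phrase in SSML. It assumes the speech property passes a complete <speak> document through to the Assistant unchanged; the pause length and emphasis level are illustrative values only.

```yaml
states:
    say_hi:
        component: meya.text
        properties:
            # Displayed on screen, unchanged from the earlier example.
            text: "Hello, World!"
            # Spoken version (assumption: SSML is passed through as-is).
            # Single quotes let the SSML attributes keep their double quotes;
            # the doubled '' is YAML's escape for an apostrophe.
            speech: '<speak>Hello,<break time="500ms"/> it''s a <emphasis level="strong">beautiful</emphasis> world!</speak>'
```

If the simulator reads the tags aloud instead of interpreting them, the assumption above does not hold for your integration and the SSML reference is the place to check.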

Handling input

Try adding a meya.input_string component. Your code should look like this:

states:
    say_hi:
        component: meya.text
        properties:
            text: "Hello, World!"
    
    get_name:
        component: meya.input_string
        properties:
            text: "What's your name, friend?"
            output: name
            scope: user
    
    say_name:
        component: meya.text
        properties:
            text: "{{ user.name }} is a great name!"

Now let's see it in action.
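
If the name is only needed while this flow is running, the same pattern can probably be written with flow scope instead of user scope. The sketch below assumes that flow is an accepted value for scope and that the stored value is read back as {{ flow.name }}, mirroring the {{ flow.value }} reference used later in this tutorial.

```yaml
states:
    get_name:
        component: meya.input_string
        properties:
            text: "What's your name, friend?"
            output: name
            # Assumption: flow scope keeps the value only for this flow.
            scope: flow

    say_name:
        component: meya.text
        properties:
            # Read from the same scope the value was written to.
            text: "{{ flow.name }} is a great name!"
```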

Triggering specific flows

Let's add another flow. Instead of using the start_voice_chat trigger, we'll specify a keyword phrase that will start a trivia game.

Create a new flow and copy/paste this code:

states:
    say_ready:
        component: meya.input_string
        properties:
            text: "Hi there! Let's play a game. Ready?"
            
    get_answer:
        component: meya.input_cms
        properties:
            text: "What's the capital of Canada?"
            space: "trivia"
            key: "capital"
            language: "en"
            error_message: "Incorrect. Try again."
            require_match: true
        transitions:
            capital: say_correct
            
    say_correct:
        component: meya.text
        properties:
            text: "Correct, {{ flow.value }} is the capital of Canada."
        return: true

Set the trigger type to keyword and enter play a game as the keyword value. Click OK.

In Bot CMS, create a new space called trivia. Set Key to capital, IO Type to Input, Language to en - English, and Value to Ottawa.

Click Save and close Bot CMS to return to the Bot Studio.

Go back to the simulator and try saying Ask <APP_NAME> to play a game.

expect_user_action

Notice that the bot waited for you to respond to the intro phrase before moving on to the question. All components have a boolean property, expect_user_action, that tells the bot to wait for the user's response before proceeding. As you might have guessed, expect_user_action is set to True by default for all input components, and False for other types of components. You can adjust this by explicitly setting expect_user_action.
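
For instance, the intro state of the trivia flow could be written as a plain text component that still waits for a reply. This is a sketch based on the description above, not a tested flow; the state name and text are taken from the trivia example.

```yaml
states:
    say_ready:
        component: meya.text
        properties:
            text: "Hi there! Let's play a game. Ready?"
            # Off by default for meya.text; true makes the bot wait for a reply
            # before moving on to the next state.
            expect_user_action: true
```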

A handy way of using this property is to have a catchall flow that's triggered when the user says something the bot doesn't understand (using the catchall trigger type). Normally, this would result in your bot saying the default message specified in your Actions on Google integration settings page. That solution, however, results in the conversation being closed, forcing the user to restart the conversation (e.g. "Okay Google, talk to <APP_NAME>"). This could result in a frustrating experience for the user.

Instead, put a single meya.text component in the catchall flow and set expect_user_action to true. Now, the bot will give the user another chance to express a valid intent.

Try adding the catchall flow now.

states:
    say_try_again:
        component: meya.text
        properties:
            text: "I didn't understand that. Try saying 'play a game'."
            expect_user_action: true

Additional features

We've developed a bot which uses a number of different features, including Bot CMS, SSML, nested flows, custom components, and more. You can get a copy here: https://github.com/meya-ai/voice-bot