Alexa Skills with Node-RED

I had an Amazon Echo Dot delivered on release day. I was looking forward to having a voice-powered butler around the flat.

Amazon advertise WeMo support, but unfortunately they only support WeMo Sockets, and I have a bunch of WeMo bulbs that I’d love to get working.

Alexa supports two sorts of skills:

  • Normal Skills
  • Home Skills

Normal Skills are triggered by a keyword prefixed by either “Tell” or “Ask”. These are meant for information-retrieval services, e.g. you can say “Alexa, ask Network Rail what time the next train to London leaves Southampton Central”, which would retrieve train timetable information. Implementing this sort of skill can be as easy as writing a simple REST HTTP endpoint that can handle the JSON format used by Alexa. You can also use AWS Lambda as an endpoint.

Home Skills are a little trickier: they only support Lambda as the endpoint. This is not so bad, as you can write Lambda functions in a bunch of languages like Java, Python and JavaScript. As well as the Lambda function, you also need to implement a website to link some sort of account for each user to their Echo device, and as part of this you need to implement OAuth 2.0 authentication and authorisation, which can be a lot more challenging.
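
To give a flavour of the Lambda side, here is a rough Node.js sketch of the shape of a Home Skill handler. It assumes the Smart Home Skill API’s Alexa.ConnectedHome.* namespaces and just sends back empty responses, so treat it as an outline rather than a finished skill; the full request and response formats are in Amazon’s documentation.

// Outline of a Home Skill Lambda handler (Node.js).
// The namespace and name values follow the Smart Home Skill API;
// see Amazon's docs for the full request/response payloads.
exports.handler = function (event, context, callback) {
    var header = event.header;

    if (header.namespace === 'Alexa.ConnectedHome.Discovery') {
        // Alexa is asking which devices this user's account has.
        // A real skill would look them up; here the list is empty.
        callback(null, {
            header: {
                namespace: 'Alexa.ConnectedHome.Discovery',
                name: 'DiscoverAppliancesResponse',
                payloadVersion: header.payloadVersion,
                messageId: header.messageId // a fresh UUID would be better
            },
            payload: { discoveredAppliances: [] }
        });
    } else if (header.namespace === 'Alexa.ConnectedHome.Control') {
        // Control directives are answered with a matching confirmation,
        // e.g. TurnOnRequest -> TurnOnConfirmation. This is where the
        // real work of talking to the devices would happen.
        callback(null, {
            header: {
                namespace: 'Alexa.ConnectedHome.Control',
                name: header.name.replace('Request', 'Confirmation'),
                payloadVersion: header.payloadVersion,
                messageId: header.messageId
            },
            payload: {}
        });
    } else {
        callback(new Error('Unsupported namespace: ' + header.namespace));
    }
};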

First Skill

The first step to create a skill is to define it in the Amazon Developer Console. Pick the Alexa tab and then the Alexa Skills Kit.

Defining a new Alexa Skill

I called my skill Jarvis and also set the keyword that will trigger it to Jarvis.

On the next tab you define the interaction model for your skill. This is where the language processing gets defined. You outline the entities (Slots in Alexa parlance) that the skill will interact with and also the sentence structure to match for each Intent. The Intents are defined in a simple JSON format e.g.

{
  "intents": [
    {
      "intent": "TVInput",
      "slots": [
        {
          "name": "room",
          "type": "LIST_OF_ROOMS"
        },
        {
          "name": "input",
          "type": "LIST_OF_INPUTS"
        }
      ]
    },
    {
      "intent": "Lights",
      "slots": [
        {
          "name": "room",
          "type": "LIST_OF_ROOMS"
        },
        {
          "name": "command",
          "type": "LIST_OF_COMMANDS"
        },
        {
          "name": "level",
          "type": "AMAZON.NUMBER"
        }
      ]
    }
  ]
}

Slots can be custom types that you define on the same page, or they can make use of some built-in types like AMAZON.NUMBER.
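
A custom type is just a name plus a list of example values entered in the console. For LIST_OF_ROOMS it would be something like this (the exact rooms obviously depend on your house, so these are only illustrative):

LIST_OF_ROOMS
bedroom
living room
kitchen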

Finally, on this page you supply the sample sentences, prefixed with the Intent name and with the Slots marked.

TVInput set {room} TV to {input}
TVInput use {input} on the {room} TV
TVInput {input} in the {room}
Lights turn {room} lights {command}
Lights {command} {room} lights {level} percent

The next page is used to set up the endpoint that will actually implement this skill. As a first pass I decided to implement an HTTP endpoint, as it should be relatively simple with Node-RED. There is a contrib package called node-red-contrib-alexa that supplies a collection of nodes to act as HTTP endpoints for Alexa skills. Unfortunately it has a really small bug that stops it working on Bluemix, so I couldn’t use it straight away. I’ve submitted a pull request that should fix things, so hopefully I’ll be able to give it a go soon.

The reason I want to run this on Bluemix is that endpoints need to have an SSL certificate, and Bluemix has a wildcard certificate for everything deployed on the .mybluemix.net domain, which makes things a lot easier. (Update: When configuring the SSL section of the skill in the Amazon Developer Console, pick “My development endpoint is a sub-domain of a domain that has a wildcard certificate from a certificate authority” when using Bluemix)

Alexa Skill Flow

The key parts of the flow are:

  • The HTTP-IN & HTTP-Response nodes to handle the HTTP POST from Alexa
  • The first switch node filters the type of request, sending IntentRequests for processing and rejecting other sorts
  • The second switch picks which of the two Intents defined earlier (TVInput or Lights) is being invoked; a typical IntentRequest looks like this:

    {
      "type": "IntentRequest",
      "requestId": "amzn1.echo-api.request.4ebea9aa-3730-4c28-9076-99c7c1555d26",
      "timestamp": "2016-10-24T19:25:06Z",
      "locale": "en-GB",
      "intent": { 
        "name": "TVInput", 
        "slots": { 
          "input": { 
            "name": "input",
            "value": "chromecast"
          }, 
          "room": {
            "name": "room",
            "value": "bedroom"
          }
        }
      }
    }
  • The Lights Function node parses out the relevant slots to get the command and device to control
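
    A rough sketch of what that function node might do (assuming msg.payload still holds the full POST body from the HTTP-IN node, so the Intent is at msg.payload.request.intent, and using a made-up topic scheme of alexa/<room>/lights for the broker):

    // Pull the Slot values out of the IntentRequest
    var intent = msg.payload.request.intent;
    var room = intent.slots.room.value;                          // e.g. "bedroom"
    var command = intent.slots.command.value;                    // e.g. "on", "off"
    var level = intent.slots.level && intent.slots.level.value;  // optional

    // Build the message for the MQTT out node; the Node-RED instance
    // at home subscribes to these topics and drives the WeMo nodes
    msg.topic = 'alexa/' + room + '/lights';
    msg.payload = {
        command: command,
        level: level
    };
    return msg;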
  • The commands are output to an MQTT out node which publishes to a shared, password-secured broker, where they are picked up by a second instance of Node-RED running on my home network and sent to the correct WeMo node to carry out the action
  • The final function node formats a response for Alexa to say (“OK”) when the Intent has been carried out

    // Build the response document Alexa expects and have it say "OK"
    var rep = {};
    rep.version = '1.0';
    rep.sessionAttributes = {};
    rep.response = {
        outputSpeech: {
            type: "PlainText",
            text: "OK"
        },
        shouldEndSession: true
    };

    msg.payload = rep;
    msg.headers = {
        "Content-Type": "application/json;charset=UTF-8"
    };
    return msg;

It could do with some error handling, and the TV Intent needs a similar MQTT output adding.
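
As a first stab at the error handling, Intents the flow doesn’t recognise could be routed to a function node that sends back an apology in the same response format as above, along these lines (the wording is just an example):

var rep = {
    version: '1.0',
    sessionAttributes: {},
    response: {
        outputSpeech: {
            type: "PlainText",
            text: "Sorry, I can't do that yet"
        },
        shouldEndSession: true
    }
};

msg.payload = rep;
msg.headers = {
    "Content-Type": "application/json;charset=UTF-8"
};
return msg;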

This works reasonably well, but having to say “Alexa, ask Jarvis to turn the bedroom light on” is a little long-winded compared to just “Alexa, turn on the bedroom light”. In order to use the shorter form you need to write a Home Skill.

I’ve started work on a Home Skill service and a matching Node-RED node to go with it. When I’ve got it working I’ll post a follow-up with details.