Amazon’s voice assistant, Alexa, has been around since late 2014, offering users the ability to integrate voice commands with a host of connected products and services, from smart lights and plugs to your personal grocery list. As a young developer, I have always been intrigued by Alexa Skills and wanted to explore them, but the prospect always felt a bit intimidating to me. I always thought, “what do I know about voice commands and natural language processing?” As it turns out, it was a lot simpler than I envisioned.
Last month I decided to finally put my doubts to rest and create my own custom Alexa Skill to control something in my apartment. I did not want one that just turns on some compatible smart plug; I wanted something completely customized from end to end. What I came out with was an IoT project that uses a custom Alexa Skill frontend, a Lambda function backend, an MQTT topic in AWS IoT Core, and an ESP32 microcontroller running an Arduino program. All of this combines to make my “dumb” Lasko tower fan into a “smart” one, allowing it to be turned on and off with my voice through an Amazon Echo Dot. In this blog I will show you how you can create an Alexa Skill that triggers a Lambda function that publishes to an MQTT topic, as well as how you can register an ESP32 to receive those messages and drive a digital output pin in order to turn a Lasko tower fan on and off.
Alexa Skill Frontend
As you can see in the project flow diagram, the first step in my project was to make the Alexa Skill frontend. I did this by navigating to the Alexa Developer Console, creating an account, and clicking Create Skill. When creating a Skill, you simply need to give it a name and choose a model to begin with. There are several models you can start from, such as the smart home or music models that provide some built-in functionality, but I chose to go with a completely custom Skill and hosted it (for now) within Alexa. From there, the two main things that need to be done to get the Skill frontend to work are creating an invocation name and adding custom intents. An invocation name is just what you say to Alexa to get her to bring up your specific Skill. For mine, I picked “the fan” as my invocation name so that I could say something like “Alexa, ask the fan to turn on” and Alexa would know that I was referring to my Skill.
Next, I navigated to the interaction model tab and created a few custom intents. Intents essentially map user-given commands to specific functionality within the Skill backend. The two intents that I created are TurnOnFanIntent and TurnOffFanIntent. Within each intent I added a few “sample utterances,” which are phrases that cause that intent to be invoked. For my TurnOnFanIntent the utterances I have are “turn on” and “turn the fan on.” This lets Alexa know that when a user addresses this Skill by its invocation name (the fan) and says “turn on”, they are trying to invoke this intent. Alexa will then trigger the backend to handle the intent. For my Skill, these two intents were pretty much all I needed, but I could have added any number of intents to the frontend to build out the functionality of my Skill.
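To make the relationship between invocation name, intents, and sample utterances concrete, here is a minimal sketch of what such an interaction model might look like, written as a Python dictionary in the shape of the JSON that the Developer Console exposes. It is an illustration under assumptions, not the exact model from this project: the TurnOffFanIntent utterances are guesses, the built-in Amazon intents are left out, and the match_intent helper is hypothetical (Alexa's real language understanding does far more than exact string matching).

```python
# Sketch of an interaction model for the Skill described above.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "the fan",
            "intents": [
                {
                    "name": "TurnOnFanIntent",
                    "slots": [],
                    # These two utterances come from the article.
                    "samples": ["turn on", "turn the fan on"],
                },
                {
                    "name": "TurnOffFanIntent",
                    "slots": [],
                    # Assumed utterances; the article does not list them.
                    "samples": ["turn off", "turn the fan off"],
                },
            ],
        }
    }
}


def match_intent(utterance, model=interaction_model):
    """Hypothetical helper: return the intent whose samples contain the utterance."""
    normalized = utterance.strip().lower()
    for intent in model["interactionModel"]["languageModel"]["intents"]:
        if normalized in intent["samples"]:
            return intent["name"]
    return None


if __name__ == "__main__":
    print(match_intent("turn the fan on"))
```

The point of the lookup is only to show the mapping: an utterance that matches a sample resolves to an intent name, and that name is what the backend receives.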
AWS IoT Core
Before getting into the backend functionality for my Alexa Skill, I needed to do a bit of setup within AWS IoT Core. IoT Core is what bridges the gap between the Lambda software and the ESP32 hardware, so to be able to fully create the backend, I first needed something for that backend to interact with. To do this, I went into the AWS IoT console and registered my ESP32 as a new thing. A thing is essentially a cloud representation of a physical device such as a Raspberry Pi, an Arduino board, or (in my case) an ESP32. To work with AWS IoT, every physical device needs to be registered as a thing.
When creating a thing for this purpose, it is very important to create a certificate and save the device certificate, private key, and Amazon Root CA 1 somewhere safe and accessible. These files are what allow my ESP32 to securely connect to AWS IoT later. After saving them, I created a policy for my device under the Secure tab. My policy is shown below in figure two, with the region and account_id replaced with the values for my account. This policy allows my “thing” to send and receive MQTT messages on AWS IoT topics. MQTT is a lightweight pub/sub messaging protocol that makes it easy for IoT devices to communicate with software backends remotely. The Lambda function publishes messages to the topic esp32/sub, and the ESP32 receives them by subscribing to that same topic.
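As a rough illustration of the kind of policy described here, the sketch below builds an AWS IoT policy document in Python. It is not the exact policy from the figure. The thing name MyESP32 and the esp32/pub topic for messages sent by the device are assumptions; only esp32/sub is named in this post. The four actions (connect, subscribe, receive, publish) and the ARN formats are the standard AWS IoT ones.

```python
import json


def build_thing_policy(region, account_id, thing_name,
                       sub_topic="esp32/sub", pub_topic="esp32/pub"):
    """Return an AWS IoT policy document as a dict.

    thing_name and pub_topic are illustrative assumptions.
    """
    prefix = f"arn:aws:iot:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Allow the device to connect using its thing name as client ID.
            {"Effect": "Allow", "Action": "iot:Connect",
             "Resource": f"{prefix}:client/{thing_name}"},
            # Allow subscribing to the topic that Lambda publishes to.
            {"Effect": "Allow", "Action": "iot:Subscribe",
             "Resource": f"{prefix}:topicfilter/{sub_topic}"},
            # Allow receiving messages delivered on that topic.
            {"Effect": "Allow", "Action": "iot:Receive",
             "Resource": f"{prefix}:topic/{sub_topic}"},
            # Allow the device to publish its own (debug) messages.
            {"Effect": "Allow", "Action": "iot:Publish",
             "Resource": f"{prefix}:topic/{pub_topic}"},
        ],
    }


if __name__ == "__main__":
    # Placeholder region and account ID; substitute your own.
    print(json.dumps(build_thing_policy("us-east-1", "123456789012", "MyESP32"), indent=2))
```

Scoping each statement to a single client ID or topic, rather than using a wildcard, keeps the device limited to exactly the messages this project needs.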
Lambda Function Backend
Next up is creating the backend for the Alexa Skill. I chose to do this using AWS Lambda running Python 3.6. The reason is that you can find some ready-made function templates in the AWS Serverless Application Repository that give a very good head start. The one that I started from is called “alexa-skills-kit-python36-factskill.” By default, this application includes the Alexa Skills Kit SDK and returns a random fact to users when triggered, but it was relatively easy to adapt to what I wanted. Once I had some starter code, another important step was to give my Lambda function permission to publish to AWS IoT so that it can interact with the ESP32 board. This can be done by creating a role within IAM that includes AWSIoTWirelessFullPublishAccess and attaching it to the Lambda function.
In terms of the function code, the function uses a SkillBuilder from the Alexa SDK, which defines an instance of your Skill. You can add any number of request handlers or exception handlers to this SkillBuilder. The handlers are where the meat of the Alexa Skill’s functionality lives: they process the requests sent by the Alexa Skill frontend, each of which carries the intent the user invoked. In my case, the two main handlers that I needed to create were a TurnOnFanHandler and a TurnOffFanHandler. These handlers are very similar, so I will just go through the TurnOnFanHandler, shown below in figure two. First, the handler checks whether it can handle the request, using an SDK function that checks if the intent name is “TurnOnFanIntent.”
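To show what that check boils down to, here is a minimal SDK-free sketch that inspects the raw request JSON Alexa sends to the backend. It is an illustration rather than the handler from this project: the sample event is trimmed to the fields that matter, and the function name is hypothetical. In the SDK itself the equivalent check is usually written with the is_intent_name helper from ask_sdk_core.utils.

```python
def can_handle_turn_on(event):
    """Return True if the Alexa request is an IntentRequest for TurnOnFanIntent."""
    request = event.get("request", {})
    return (
        request.get("type") == "IntentRequest"
        and request.get("intent", {}).get("name") == "TurnOnFanIntent"
    )


# Trimmed-down example of the request envelope Alexa sends to the backend.
sample_event = {
    "version": "1.0",
    "request": {
        "type": "IntentRequest",
        "intent": {"name": "TurnOnFanIntent", "slots": {}},
    },
}

# A LaunchRequest ("Alexa, open the fan") carries no intent at all.
launch_event = {"version": "1.0", "request": {"type": "LaunchRequest"}}

if __name__ == "__main__":
    print(can_handle_turn_on(sample_event))
    print(can_handle_turn_on(launch_event))
```

Each handler registered with the SkillBuilder performs a check like this, and the first handler that accepts the request gets to process it.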
Then the handler processes the actual request: it creates a boto3 client that connects to AWS IoT Core and uses it to publish a JSON payload with a “message” key whose value is “on”. Boto3 is the AWS SDK for Python, and it allows me to connect to other AWS services, like AWS IoT in this case. The publish call sends that message to the esp32/sub MQTT topic, where it will be received by the ESP32. The function then has Alexa respond to the user with “Ok”. All that is left to do is to go back into the Alexa developer console and add the ARN of the Lambda function as the endpoint for the Alexa Skill. This allows the Alexa frontend to trigger the Lambda backend. After that, the Alexa Skill itself is fully configured and will post to the MQTT topic when a user asks Alexa to turn the fan on.
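The publish step can be sketched roughly as follows. This is a minimal example under assumptions, not the code from the figure: the function names are hypothetical, QoS 1 is a guess, and boto3 is imported lazily so the payload-building part can be exercised without AWS credentials. The boto3 call itself, client("iot-data").publish(topic=..., qos=..., payload=...), is the standard way to publish to an AWS IoT topic.

```python
import json

TOPIC = "esp32/sub"  # topic named in the article


def build_publish_args(state):
    """Build keyword arguments for the IoT data-plane publish call."""
    if state not in ("on", "off"):
        raise ValueError(f"unsupported fan state: {state!r}")
    return {
        "topic": TOPIC,
        "qos": 1,  # assumed; the article does not state a QoS level
        "payload": json.dumps({"message": state}).encode("utf-8"),
    }


def publish_fan_state(state, client=None):
    """Publish the fan state and return the text Alexa should speak."""
    if client is None:
        import boto3  # imported lazily; available by default in AWS Lambda
        client = boto3.client("iot-data")
    client.publish(**build_publish_args(state))
    return "Ok"


class RecordingClient:
    """Stand-in for the boto3 client so the sketch runs without AWS."""

    def __init__(self):
        self.calls = []

    def publish(self, **kwargs):
        self.calls.append(kwargs)


if __name__ == "__main__":
    fake = RecordingClient()
    print(publish_fan_state("on", client=fake))
    print(fake.calls)
```

Separating the payload construction from the network call keeps the part that matters to the ESP32, the shape of the JSON message, easy to check on its own.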
Programming the ESP32 in Arduino
After the Alexa Skill is configured and can successfully pass a message from the user into the esp32/sub MQTT topic, it is time to program the ESP32 to receive those messages and take some action. For those that aren’t aware, an ESP32 is a microcontroller with built-in Wi-Fi, which is extremely useful because it allows me to hook it up to my Wi-Fi network, connect to AWS IoT, and then run continuously while powered only by a battery pack.
To upload code to my ESP32 I used the Arduino IDE, which is a desktop app that makes it super simple to write C/C++ code and upload it to compatible microcontrollers via a USB connection. The specific ESP32 that I used is the HiLetgo ESP-WROOM-32. To configure the IDE correctly, I selected tools -> boards -> board manager and picked the DOIT ESP32 DEVKIT V1. I was not able to find the exact board that I purchased, but this one seemed to be compatible with it. Then, for development purposes, I selected the correct port under tools -> port and set the baud rate to 9600. This allowed me to write logs to the serial port and view them in the Arduino IDE, making it much easier to debug.
The actual code that I wrote is adapted from a very helpful tutorial found on the AWS blogs. The main structure of any Arduino program is a setup function and a loop function. The setup function runs once when the program starts, and the loop function then runs continuously after that. In my setup function I first set the serial baud rate to 9600 (as mentioned before), then call connectAWS, and then configure pin 14 as an OUTPUT and set it high. Configuring pin 14 on the physical ESP32 as an output pin means that I can connect wires to it and drive an electrical signal through it from my code. The connectAWS function reads a “secrets.h” file (which I haven’t included here for privacy reasons) that contains a few key values, like my IoT thing name, my Wi-Fi network name and password, and the certificates that I mentioned before, and uses them to connect first to Wi-Fi and then to AWS IoT. The code also uses the WiFi and MQTTClient libraries, which can be installed in the Arduino IDE from sketch -> libraries -> manage libraries.
Once the device has connected to Wi-Fi and then to AWS IoT, the code moves into the loop function. The loop first calls a function that publishes some basic information, such as the time, to our MQTT topic. This isn’t an essential step, but it can be useful for debugging. It then calls client.loop, which lets the MQTT client process any incoming traffic; in this case that just means calling the MessageHandler function, also included in the code, for each message that arrives.
The MessageHandler receives each incoming message from the MQTT topic and first deserializes the JSON payload to extract its “message” value. This leaves us with the value that we initially sent from our Lambda function, which will be either “on” or “off” based on the input from the Alexa frontend. If it is “on”, the handler calls digitalWrite on our output pin 14, waits 5 seconds, and then calls digitalWrite again to remove the output. This simulates pushing the power button on the fan, which will become a little clearer after reading about the actual circuitry that connects the ESP32 to the fan. If it receives “off”, it does essentially the same thing to simulate pushing the power button a second time, turning off the fan.
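The handler logic can be modeled in a few lines. The sketch below is written in Python purely to illustrate the flow; it is not the Arduino code that runs on the ESP32, and every name in it is hypothetical. It assumes the pin is driven to an "active" level for the hold time and then returned to an "idle" level. Which physical level counts as active depends on the wiring, so both are parameters here.

```python
import json
import time

HIGH, LOW = 1, 0


class RecordingPin:
    """Stand-in for a GPIO pin that records every level written to it."""

    def __init__(self, number):
        self.number = number
        self.writes = []

    def write(self, level):
        self.writes.append(level)


def press_power_button(pin, hold_seconds=5, sleep=time.sleep,
                       active=HIGH, idle=LOW):
    """Simulate one button press: active level, wait, back to idle."""
    pin.write(active)
    sleep(hold_seconds)
    pin.write(idle)


def handle_message(topic, payload, pin, sleep=time.sleep):
    """Parse the JSON payload and press the button for 'on' or 'off'.

    Returns True if a press was triggered, False if the message was ignored.
    """
    try:
        document = json.loads(payload)
    except ValueError:
        return False  # not valid JSON
    command = document.get("message") if isinstance(document, dict) else None
    if command not in ("on", "off"):
        return False
    # The fan has a single power button, so both commands press it once.
    press_power_button(pin, sleep=sleep)
    return True


if __name__ == "__main__":
    pin14 = RecordingPin(14)
    # Skip the real 5 second wait when trying this out.
    handle_message("esp32/sub", '{"message": "on"}', pin14, sleep=lambda s: None)
    print(pin14.writes)
```

One consequence visible in this model is that the device only toggles the fan. It has no way of knowing whether the fan is actually running, so an "on" command sent while the fan is already on would turn it off.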
After writing this code, I connected my ESP32 to my computer via a USB cable and clicked “upload” in the Arduino IDE, and just like that my ESP32 was running my code and able to connect to the internet and AWS IoT and receive messages.
Wiring the ESP32 to the Fan
All that is left to get my project fully working is to wire my ESP32 into my fan and solder it in place. Essentially all I did in terms of wiring was connect the ground pin of the ESP32 to the negative terminal (cathode) of the diode on the input side of an optocoupler, and pin 14 of the ESP32 to a 220 ohm resistor that then connects to the positive terminal (anode) of that same diode. The other side of the optocoupler connects to two output wires that wire directly into my fan. An optocoupler is used here to provide some separation between the high voltage circuit of the fan and the low voltage circuit of my ESP32, and to act as a switching device.
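As a rough sanity check on the 220 ohm resistor, the current through the optocoupler's input diode follows from Ohm's law: the pin voltage minus the diode's forward voltage, divided by the resistance. The ESP32's GPIO pins output 3.3 V when high. The 1.2 V forward voltage below is an assumed typical value, not a measurement from this build, so check the datasheet of your optocoupler.

```python
GPIO_HIGH_VOLTS = 3.3      # ESP32 logic-high output level
LED_FORWARD_VOLTS = 1.2    # assumed typical optocoupler input diode drop
RESISTOR_OHMS = 220        # value used in the article


def led_current_ma(supply_volts=GPIO_HIGH_VOLTS,
                   forward_volts=LED_FORWARD_VOLTS,
                   resistor_ohms=RESISTOR_OHMS):
    """Return the diode current in milliamps: I = (V_supply - V_f) / R."""
    return (supply_volts - forward_volts) / resistor_ohms * 1000


if __name__ == "__main__":
    print(f"{led_current_ma():.2f} mA")
```

Under these assumptions the current comes out just under 10 mA, which is a modest load for a single GPIO pin.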
When pin 14 is engaged, it switches on the optocoupler, which connects the two output wires and closes the circuit. After setting this up, I opened my fan and found the circuit board where the power button lives. After that it was as simple as connecting the output wires of my circuit to diagonally opposite terminals of the power button and soldering it all into place. This allows the original button to keep working but also bypasses it whenever my circuit is closed (i.e. when pin 14 is engaged). After that, I closed my fan, housed my ESP32 circuit in a plastic electrical case, plugged it into a portable power bank, and mounted it to the side of my fan using some Velcro strips. The finished product can be seen in figure 5. An image of the circuit can be found in figure 6.
All in all, I found this project to be a good introduction to both Alexa and IoT. I was able to learn how to create an Alexa Skill that triggers a lambda function to post to an MQTT topic in AWS IoT, which is then read by an Arduino program on an ESP32 and used to turn on and off a Lasko tower fan.
From an educational perspective, I really enjoyed how it allowed me to learn about a bunch of different technologies and how they all fit together while not having any one part of it be overly difficult to orchestrate. In terms of robustness, this is not something that will travel too well, as the code relies upon being on my home Wi-Fi network, so to move it anywhere else, I’d have to change the code and physically re-upload it to the ESP32. If I were to do it over again, I might try to do it using Bluetooth instead, but that would have caused other issues in terms of connecting to AWS so I’m not sure how feasible that would be in practice.
Nonetheless, as far as at-home IoT projects go, I’m happy with how well it works, and it’s something that I see myself using every night when I hop into bed and every morning when I get out of bed. I will be looking to make new smart devices in my home in the future, or to upgrade this project to control the speed of the fan or to run it on a more precise timer than the fan already supports. Thanks for reading and feel free to reach out to me via email at [email protected] if you have any questions!
AWS IoT enables communication between IoT devices and AWS cloud services: in this case, from AWS Lambda to my ESP32
Serverless functions (like AWS Lambda) help link cloud services together: here, linking Amazon Alexa with AWS IoT
Intuitive voice interfaces can allow users to quickly interact with software services. Amazon Alexa is a really intuitive way for developers to do this.
General purpose microcontrollers can make any device into a ‘smart’ device (some assembly required)
Credentials and certificates allow devices to securely communicate with cloud services
Small message payloads with asynchronous delivery (pub/sub) work well for constrained devices (low memory and low power, like my ESP32). In this case MQTT was actually better than HTTP would have been due to its lightweight nature.