Raspberry Pi 3 "Echo"

Building an Amazon Echo-like device out of a Raspberry Pi 3

I recently worked as an electronics hardware developer on a new smart home system designed to use speech recognition as a way of controlling devices.

While researching software and hardware for this purpose in Silicon Valley, I also tested and reverse engineered the "Amazon Echo", an electronically very well designed device and a huge success among the in-house manufactured devices of the e-commerce and cloud computing company.

The cloud computing side also lays the groundwork for the Amazon Echo and for "Alexa", the speech recognition used in the round, tower-like gadget. With a price tag of $180 and, more importantly, no availability yet for customers outside the US, I was quite happy, back in Europe, to find a GitHub repo that allows implementing an Amazon Echo-like device, and especially its speech recognition, on cheap hardware like a Raspberry Pi.

I bought the fairly new Raspberry Pi 3, even though the GitHub repo uses a Pi 2, expecting some minor issues, which turned out to be true. Browsing the "issues" section of the repo was a big help.

In short, I skipped installing a new JDK because one already comes with the new Raspbian Jessie image. I installed the newest version of Node.js, used the onboard WiFi of the RasPi3 and tested different microphones because the one suggested in the GitHub repo has some bad reviews. That is basically all I changed from the original installation instructions, which worked out very well.

After only two hours or so everything was set up without problems. In the video below and for the first tests I used a webcam with an integrated microphone, a Logitech QuickCam Orbit AF, which I had lying around while the dedicated USB microphone was ordered but had not arrived.

Identifying the microphone chipset

The problems began when I got the new USB microphone, a "Lerox USB microphone" ordered, of course, from Amazon. In the beginning I had barely any success getting "Alexa" to recognize my commands. I heard pulsing sounds (which I hadn't before) and the speech recognition stopped before I could even finish speaking the command. The microphone identifies as a "C-Media Electronics device" with a CM108 chipset.
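To check how a USB microphone identifies itself, you can look through the output of "lsusb". The following is only a minimal sketch of filtering that output for a manufacturer name. The function name and the sample text are made up for illustration, and the IDs in the sample are placeholders, so your device may report different values.

```python
import re

# Matches the usual one-line-per-device layout printed by `lsusb`.
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d+) Device (?P<device>\d+): "
    r"ID (?P<vendor>[0-9a-f]{4}):(?P<product>[0-9a-f]{4})\s*(?P<name>.*)"
)


def find_usb_devices(lsusb_output, keyword="C-Media"):
    """Return the parsed lsusb lines whose description contains `keyword`."""
    matches = []
    for line in lsusb_output.splitlines():
        parsed = LSUSB_LINE.match(line.strip())
        if parsed and keyword.lower() in parsed.group("name").lower():
            matches.append(parsed.groupdict())
    return matches


# Illustrative sample text only (assumption), not captured from my Pi.
SAMPLE = """\
Bus 001 Device 004: ID 0d8c:013c C-Media Electronics, Inc. CM108 Audio Controller
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
"""

for dev in find_usb_devices(SAMPLE):
    print(dev["vendor"], dev["product"], dev["name"])
```

On the Pi itself you would pass the real output instead of SAMPLE, for example the text returned by running "lsusb" through Python's subprocess module.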

Three efforts led me to success:

Microphone configuration with "alsamixer"

1. I adjusted the recording settings of the microphone with "alsamixer". The best setting, at least for the microphone I used, turned out to be the highest level that is still in the "green" range.

2. I changed the USB power supply for the Raspberry Pi 3. This is where the clicking sound while recording the commands came from. It might be a bad design of the microphone rather than of the power supply, as the first PSU I used was a high-quality one.

Editing settings for the microphone

3. This might be the most important setting when fiddling with microphone problems: I adjusted the values in the Java source code (../samples/javaclient/src/main/java/com/amazon/alexa/avs/AVSApp.java) for "ENDPOINT_THRESHOLD" (the minimum audio level, below which the input is considered silence) and "ENDPOINT_SECONDS" (the amount of silence time before endpointing). The defaults were 5 and 2, respectively, which I changed to 7 and 4. After running "mvn install" to do a new build and starting the client with "mvn exec:exec", it now works almost like the original Amazon Echo.

Audio device settings

4. You might have to set your microphone as the default input source. You can do this by choosing "Menu -> Settings -> Audio Device Settings", selecting your soundcard (microphone), adding elements and making the microphone the default. This is also where you can set the gain or any additional elements like automatic gain control (AGC) if the soundcard/microphone provides them. As far as I understand, choosing and setting the microphone with "alsamixer" does the same, but I'm not sure about it.
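To make the effect of the two values from step 3 easier to picture, here is a minimal sketch of silence-based endpointing. It is not the code from the Java client; only the meaning of "ENDPOINT_THRESHOLD" and "ENDPOINT_SECONDS" is taken from it. The function name, the per-frame audio levels and the frame length of 0.5 seconds are assumptions for illustration.

```python
def find_endpoint(levels, frame_seconds, threshold=5, silence_seconds=2):
    """Return the index of the frame where recording would stop, or None.

    levels          -- one audio level per frame (arbitrary units, assumption)
    frame_seconds   -- duration of one frame in seconds
    threshold       -- levels below this count as silence (ENDPOINT_THRESHOLD)
    silence_seconds -- continuous silence needed to stop (ENDPOINT_SECONDS)
    """
    silent_time = 0.0
    for index, level in enumerate(levels):
        if level < threshold:
            silent_time += frame_seconds
            if silent_time >= silence_seconds:
                return index
        else:
            # Any frame at or above the threshold resets the silence timer.
            silent_time = 0.0
    return None


# Hypothetical recording: speech, a 2.5 s pause, more speech, then silence.
levels = [20, 18] + [3] * 5 + [19, 21] + [3] * 9

print(find_endpoint(levels, 0.5, threshold=5, silence_seconds=2))  # 5
print(find_endpoint(levels, 0.5, threshold=7, silence_seconds=4))  # 16
```

With the defaults, the pause in the middle of the command already ends the recording at frame 5. With the longer silence time the pause is tolerated and the recording only stops at frame 16, after the command is complete. This matches what I saw when the recognition stopped before I had finished speaking.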

The next thing I will implement is invocation by a spoken command like on the Amazon Echo, where you can choose between "Alexa" and "Amazon". As far as I could reverse engineer it, Amazon solves this with a bunch of Texas Instruments TLV320ADC3101 92dB SNR Low-Power Stereo ADCs, which have an integrated miniDSP. I guess this is where they put the algorithms (aka "magic") for recognizing the invocation command, after which the rest is streamed to their cloud servers. You can find a lot of technical details about the Amazon Echo in this awesome iFixit Amazon Echo teardown.

EDIT 4-10-2016: I applied the instructions from Elton "Eddie" Hartman's fork to the installation on my Raspberry Pi 3, and it's now possible to start voice commands either by clicking the button in the Java GUI or by pressing a switch connected to the GPIOs of the Raspberry Pi.

EDIT 4-25-2016: If you want to use Bluetooth speakers, follow this awesome tutorial from David Roberts. Unfortunately, I haven't yet been able to connect the microphone embedded in my Bluetooth speaker, a BoomStar BT NFC X.
