Voice to Robotics: Driving a Sphero RVR

Driving a Sphero RVR with your voice
Nova 2 Sonic · Bedrock · IoT Core · AppSync · Amplify Gen 2 · CDK
Chiwai Chan — AWS User Group Wellington meet-up 26 May 2026
Who am I?
Tinkerer · Cloud · IoT · Robotics · Generative AI
https://chiwaichan.co.nz https://x.com/chiwaichanconz https://github.com/chiwaichan
What are we building?
- Speak into a browser microphone
- Amazon Nova 2 Sonic turns speech into a structured tool call
- Tool call is published to AWS IoT Core as an MQTT command
- A Sphero RVR receives it and drives
- Live telemetry streams back into the browser in real time
Voice → Cloud → Robot → Telemetry → Voice loop
Why a Sphero RVR?
- Off-the-shelf programmable robot with a well-documented UART protocol
- I ported the Sphero RVR SDK into the ESP32-S3 firmware myself
- Onboard locator (X/Y in metres) and IMU (yaw in degrees)

The three repos behind the demo
- ESP32-S3 firmware: the `sphero_rvr` device group, extracted into a dedicated public repo: `platformio-aws-iot-seeed-studio-esp32s3-sphero-rvr`
- `cdk-iot-sphero-rvr-streaming`: CDK stack: IoT Rule → Lambda → AppSync
- `amplify-react-nova-sonic-voice-chat-sphero-rvr`: React + Amplify Gen 2 frontend
One device group. One streaming pipeline. One serverless web app.
Voice in: Nova 2 Sonic as a robot controller
Amazon Nova 2 Sonic on Amazon Bedrock — bidirectional stream straight from the browser.
- Not a chatbot — a tool dispatcher
- `toolChoice: { any: {} }` forces a tool call on every utterance
- 5 tools, one per active firmware verb
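To make the "tool dispatcher" idea concrete, here is a minimal sketch of the tool section that could be sent when the session's prompt starts. The five tool names, `toolChoice: { any: {} }`, and the `distance_m` and `degrees` parameter names come from these slides. The descriptions, the `direction` parameter for `drive`, the `tool` helper, and the exact field layout are assumptions for illustration and should be checked against the current Bedrock documentation.

```typescript
// Sketch only: tool names are from the talk, schemas and descriptions are assumed.
interface ToolSpec {
  name: string;
  description: string;
  inputSchema: { json: string }; // JSON Schema serialised as a string (assumed layout)
}

// Hypothetical helper: wraps a name and its parameters into a toolSpec entry,
// marking every listed parameter as required.
function tool(
  name: string,
  description: string,
  properties: { [param: string]: object }
): { toolSpec: ToolSpec } {
  const schema = { type: "object", properties, required: Object.keys(properties) };
  return { toolSpec: { name, description, inputSchema: { json: JSON.stringify(schema) } } };
}

const toolConfiguration = {
  tools: [
    // "direction" and its values are hypothetical; the slides only say "D-pad tank".
    tool("drive", "Start driving until told otherwise", {
      direction: { type: "string", enum: ["forward", "backward", "left", "right"] },
    }),
    tool("forward", "Drive forward a given distance, then stop", {
      distance_m: { type: "number" },
    }),
    tool("reverse", "Drive backward a given distance, then stop", {
      distance_m: { type: "number" },
    }),
    tool("turn", "Rotate in place; positive degrees = clockwise", {
      degrees: { type: "number" },
    }),
    tool("cancel", "Abort the running maneuver", {}),
  ],
  // Forces the model to answer every utterance with one of the tools above.
  toolChoice: { any: {} },
};
```

Keeping the tool list this small and one-to-one with the firmware verbs means the model never has to compose commands; it only picks a verb and fills in its arguments.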
The 5 tools (mirror the firmware MQTT commands)
| Tool | What it does |
|---|---|
| `drive` | D-pad tank: immediate, runs until told otherwise |
| `forward` | Drive forward a given distance, then auto-stop |
| `reverse` | Drive backward a given distance, then auto-stop |
| `turn` | Rotate in place by a given angle, then auto-stop |
| `cancel` | Abort the running maneuver |

"go forward 1 metre" → `forward(distance_m: 1.0)` → MQTT publish.
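The last hop of that chain, turning a tool call into an MQTT message, could look roughly like the sketch below. The topic is the one used elsewhere in this talk. The payload shape (the verb under an `action` key with the arguments alongside it) and the `toMqttMessage` helper are assumptions, not the firmware's documented format.

```typescript
// Command topic as shown in the talk.
const ACTION_TOPIC = "the-project/sphero-rvr/XIAOSpheroRVR/action";

// The five tool names, mirroring the firmware verbs.
const TOOL_NAMES: string[] = ["drive", "forward", "reverse", "turn", "cancel"];

interface MqttMessage {
  topic: string;
  payload: string; // JSON text, ready to hand to an MQTT or IoT publish call
}

// Hypothetical mapping: rejects unknown tools, then serialises the call.
function toMqttMessage(toolName: string, args: { [key: string]: unknown }): MqttMessage {
  if (TOOL_NAMES.indexOf(toolName) === -1) {
    throw new Error("Unknown tool: " + toolName);
  }
  // Assumed payload shape: { "action": "<verb>", ...arguments }
  const body: { [key: string]: unknown } = { action: toolName };
  for (const key of Object.keys(args)) {
    body[key] = args[key];
  }
  return { topic: ACTION_TOPIC, payload: JSON.stringify(body) };
}

// "go forward 1 metre" after the model has chosen a tool:
const message = toMqttMessage("forward", { distance_m: 1.0 });
```

Rejecting unknown tool names in the browser keeps a misbehaving model response from ever reaching the robot's command topic.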

Browser → Bedrock direct (no backend)
- React + AWS Amplify Gen 2 in the browser
- Amazon Cognito User Pool + Identity Pool gives temporary AWS credentials
- IAM policy scoped to `amazon.nova-2-sonic-v1:0` and the RVR MQTT topic
- `InvokeModelWithBidirectionalStreamCommand` over HTTPS / TLS 1.2+
- No EC2, no containers, no Lambda in the voice path
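A policy scoped this tightly might look like the sketch below, written as a plain TypeScript object so it can be dropped into a CDK or Amplify backend definition. The model ID and topic come from the slides. The `browserPolicy` helper, the region and account placeholders, and the exact statement layout are assumptions, and the action names should be verified against the IAM reference for Bedrock and IoT Core.

```typescript
interface PolicyStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string[];
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Hypothetical helper: least-privilege policy for the authenticated Cognito role.
function browserPolicy(region: string, accountId: string): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        // Voice path: only the one Nova 2 Sonic model, only the streaming call.
        Effect: "Allow",
        Action: ["bedrock:InvokeModelWithBidirectionalStream"],
        Resource: [
          "arn:aws:bedrock:" + region + "::foundation-model/amazon.nova-2-sonic-v1:0",
        ],
      },
      {
        // Command path: publish to the RVR action topic and nothing else.
        Effect: "Allow",
        Action: ["iot:Publish"],
        Resource: [
          "arn:aws:iot:" + region + ":" + accountId +
            ":topic/the-project/sphero-rvr/XIAOSpheroRVR/action",
        ],
      },
    ],
  };
}

// Placeholder region and account ID, for illustration only.
const policy = browserPolicy("us-east-1", "123456789012");
```

With no backend in the voice path, this policy is the only thing standing between a signed-in browser and the rest of the account, so every resource is spelled out rather than wildcarded.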
Browser → AWS IoT Core (commands down)
- Same temporary creds are also scoped for `iot:Publish`
- Frontend publishes via `IoTDataPlaneClient` to: `the-project/sphero-rvr/XIAOSpheroRVR/action`
- Cognito identity gets an AWS IoT Core policy attached at first publish
- Decouples the browser from the robot — robot can be anywhere on the internet


The edge: XIAO ESP32-S3
- Seeed Studio XIAO ESP32-S3 on the Expansion Board
- UART bridge between WiFi/MQTT and the RVR's SDK protocol
- Subscribes to the RVR's command topic over MQTT/TLS
- Publishes telemetry to `the-project/sphero-rvr/XIAOSpheroRVR/state`
- OLED display for "is it alive?" at-a-glance status
The Sphero RVR — what's on board
- Two internal processors (Nordic + ST) talking over UART
- IMU (pitch / roll / yaw), accelerometer, gyroscope
- Wheel encoders, locator (X / Y in metres), velocity, speed
- Color sensor, ambient light, compass / magnetometer
- Motor thermal protection, stall detection, gyro-max collision events
- Battery state with change notifications
forward / reverse — drive an exact distance
```cpp
// On start: snapshot start_xy + lock current heading
LocatorData loc = rvr.getLocator();
_start_x = loc.x; _start_y = loc.y;
rvr.driveWithHeading(_speed, heading, reverseFlag);
// Every tick: stop when traveled distance >= target
float dx = loc.x - _start_x, dy = loc.y - _start_y;
_traveled = sqrtf(dx*dx + dy*dy);
if (_traveled >= _distance_m) { rvr.driveStop(); return DONE; }
```
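The same stop condition can be exercised off-device against recorded locator samples. The sketch below is a TypeScript port for illustration, not the firmware itself; `Point`, `traveled`, and `stopIndex` are hypothetical names, and the samples are made up.

```typescript
interface Point {
  x: number; // metres, as reported by the RVR locator
  y: number;
}

// Straight-line distance between the start snapshot and the current reading.
function traveled(start: Point, now: Point): number {
  const dx = now.x - start.x;
  const dy = now.y - start.y;
  return Math.sqrt(dx * dx + dy * dy);
}

// Index of the first sample at which the move would stop, or -1 if it never
// reaches the target. The first sample plays the role of the start snapshot.
function stopIndex(samples: Point[], distanceM: number): number {
  let start: Point | undefined;
  let index = 0;
  for (const sample of samples) {
    if (start === undefined) start = sample;
    if (traveled(start, sample) >= distanceM) return index;
    index++;
  }
  return -1;
}

// Made-up locator readings for a 1 m forward move.
const ticks: Point[] = [
  { x: 0, y: 0 },
  { x: 0.4, y: 0 },
  { x: 0.8, y: 0.1 },
  { x: 1.0, y: 0.1 },
];
const stoppedAt = stopIndex(ticks, 1.0);
```

Note that the check measures displacement from the start point rather than path length, so a robot that drifts sideways still stops once it is the target distance away from where it began.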
"Drive forward 1 metre" actually means 1 metre, not "1 second-ish".
turn — rotate an exact number of degrees
```cpp
// Accumulate signed yaw delta, normalised across the ±180° wrap
float delta = yaw - _lastYaw;
while (delta > 180.0f) delta -= 360.0f;
while (delta < -180.0f) delta += 360.0f;
_accumulated += (_degrees > 0) ? delta : -delta;
if (_accumulated >= _target) { driveStop(); return DONE; }
```
- Signed degrees: positive = CW, negative = CCW
- Multi-revolution allowed (`degrees: 720` = two spins right)
- "u-turn" / "spin around" handled by the model, not the firmware
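The wrap handling is the subtle part, so here is a small worked example of the same accumulation in TypeScript. It is an illustrative port, not the firmware; `wrapDelta` and `turnProgress` are hypothetical names, and it assumes the firmware's `_target` is the absolute value of the requested degrees.

```typescript
// Normalise a yaw difference into the range [-180, 180].
function wrapDelta(delta: number): number {
  while (delta > 180) delta -= 360;
  while (delta < -180) delta += 360;
  return delta;
}

// Replays a series of yaw readings (degrees) against a requested signed turn.
// Returns the accumulated rotation and the sample index where the turn would
// stop, or -1 if the target was not reached.
function turnProgress(
  yawSamples: number[],
  degrees: number
): { accumulated: number; doneAt: number } {
  const target = Math.abs(degrees); // assumption: target is |degrees|
  let accumulated = 0;
  let lastYaw: number | undefined;
  let index = 0;
  for (const yaw of yawSamples) {
    if (lastYaw !== undefined) {
      const delta = wrapDelta(yaw - lastYaw);
      // Same sign handling as the firmware snippet: progress is always positive
      // in the requested direction.
      accumulated += degrees > 0 ? delta : -delta;
      if (accumulated >= target) return { accumulated: accumulated, doneAt: index };
    }
    lastYaw = yaw;
    index++;
  }
  return { accumulated: accumulated, doneAt: -1 };
}

// Crossing the ±180° seam: 170 → -175 is a 15° step, not a -345° one.
const acrossSeam = turnProgress([170, -175, -160], 30);
```

Because progress is accumulated step by step rather than compared against an absolute heading, a request such as `degrees: 720` simply keeps summing until two full revolutions have gone by.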
Telemetry up: IoT Rule → Lambda → AppSync
- RVR publishes nested JSON to `the-project/sphero-rvr/+/state`
- AWS IoT Rule SQL: `SELECT *, topic(3) AS device_name FROM ...`
- AWS Lambda flattens it for GraphQL:
  - `imu.pitch` → `imuPitch`
  - `motors.left_temp_c` → `motorLeftTempC`
- AWS AppSync runs the `createRVRState` mutation
- AppSync auto-persists to Amazon DynamoDB and pushes via subscription
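The flattening step is mostly mechanical, as the sketch below suggests. It is not the project's Lambda: `flatten`, `camelKey`, `topicSegment`, and the sample payload are hypothetical, and the `motors` → `motor` alias is inferred from the single `motorLeftTempC` example above. `topicSegment` mimics the 1-based `topic(3)` lookup from the rule SQL.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type Flat = { [key: string]: Json };

// Assumption: top-level "motors" is singularised, as in motors.left_temp_c → motorLeftTempC.
const PREFIX_ALIASES: { [key: string]: string } = { motors: "motor" };

// Joins path segments and snake_case words into one camelCase key.
function camelKey(path: string[]): string {
  const words: string[] = [];
  for (const part of path) {
    for (const word of part.split("_")) {
      if (word.length > 0) words.push(word);
    }
  }
  return words
    .map((w, i) => (i === 0 ? w : w.charAt(0).toUpperCase() + w.slice(1)))
    .join("");
}

// Recursively walks nested objects; leaves (including arrays) are copied as-is.
function flatten(value: Json, path: string[] = [], out: Flat = {}): Flat {
  if (value !== null && typeof value === "object" && !Array.isArray(value)) {
    for (const key of Object.keys(value)) {
      const alias = Object.prototype.hasOwnProperty.call(PREFIX_ALIASES, key)
        ? PREFIX_ALIASES[key]
        : undefined;
      const name = path.length === 0 && alias !== undefined ? alias : key;
      const child = value[key];
      if (child !== undefined) flatten(child, path.concat(name), out);
    }
  } else if (path.length > 0) {
    out[camelKey(path)] = value;
  }
  return out;
}

// 1-based topic segment lookup, like topic(n) in AWS IoT rule SQL.
function topicSegment(topic: string, n: number): string | undefined {
  return topic.split("/")[n - 1];
}

// Made-up telemetry frame; only the two key paths shown above come from the talk.
const flat = flatten({
  imu: { pitch: 1.5 },
  motors: { left_temp_c: 31 },
  device_name: topicSegment("the-project/sphero-rvr/XIAOSpheroRVR/state", 3) || null,
});
```

Flattening in Lambda keeps the GraphQL schema to a single level of scalar fields, which avoids having to model nested types for every sensor group.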
Live telemetry in the browser
- Same React app uses AWS Amplify `generateClient()`
- Subscribes to `onCreateRVRState` GraphQL subscription
- Every MQTT telemetry frame from the RVR appears in the UI within ~hundreds of ms
- Battery · IMU · motors · position · compass · light · color — all live
Voice in, MQTT out, GraphQL back — all on one Cognito identity.

Live demo
🎤 → ☁️ → 🤖 → 📊
(Fingers crossed.)
Thank you! — Questions?
Related write-ups: Sphero RVR + AWS IoT Core on XIAO ESP32-S3 · Real-time voice chat with Nova Sonic + Amplify
