
Voice to Robotics: Driving a Sphero RVR

Chiwai Chan


Driving a Sphero RVR with your voice

Nova 2 Sonic · Bedrock · IoT Core · AppSync · Amplify Gen 2 · CDK

Chiwai Chan — AWS User Group Wellington meet-up 26 May 2026


Who am I?

Tinkerer · Cloud · IoT · Robotics · Generative AI

 

https://chiwaichan.co.nz https://x.com/chiwaichanconz https://github.com/chiwaichan


What are we building?

  • Speak into a browser microphone
  • Amazon Nova 2 Sonic turns speech into a structured tool call
  • Tool call is published to AWS IoT Core as an MQTT command
  • A Sphero RVR receives it and drives
  • Live telemetry streams back into the browser in real time

Voice → Cloud → Robot → Telemetry → Voice loop


Why a Sphero RVR?

  • Off-the-shelf programmable robot with a well-documented UART protocol
  • I ported the Sphero RVR SDK into the ESP32-S3 firmware myself
  • Onboard locator (X/Y in metres) and IMU (yaw in degrees)



The three repos behind the demo

  • ESP32-S3 firmware — the sphero_rvr device group, extracted into a dedicated public repo: platformio-aws-iot-seeed-studio-esp32s3-sphero-rvr
  • cdk-iot-sphero-rvr-streaming — CDK stack: IoT Rule → Lambda → AppSync
  • amplify-react-nova-sonic-voice-chat-sphero-rvr — React + Amplify Gen 2 frontend

One device group. One streaming pipeline. One serverless web app.


Voice in: Nova 2 Sonic as a robot controller

Amazon Nova 2 Sonic on Amazon Bedrock — bidirectional stream straight from the browser.

  • Not a chatbot — a tool dispatcher
  • toolChoice: { any: {} } forces a tool call on every utterance
  • 5 tools, one per active firmware verb

The 5 tools (mirror the firmware MQTT commands)

Tool      What it does
drive     D-pad tank — immediate, runs until told otherwise
forward   Drive forward a given distance, then auto-stop
reverse   Drive backward a given distance, then auto-stop
turn      Rotate in place by a given angle, then auto-stop
cancel    Abort the running maneuver

"go forward 1 metre" → forward(distance_m: 1.0) → MQTT publish.
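On the receiving side, each tool name has to resolve to exactly one firmware verb. A minimal sketch of that mapping follows. It assumes the verb arrives as a plain string; the actual MQTT payload schema is not shown in this deck, and `Verb`, `parseVerb` and `autoStops` are hypothetical names, not taken from the firmware.

```cpp
#include <string>

// Hypothetical enum mirroring the five tools listed above.
enum class Verb { Drive, Forward, Reverse, Turn, Cancel, Unknown };

// Map a verb string to the enum. Anything unrecognised becomes Unknown so
// the caller can ignore it instead of moving the robot.
Verb parseVerb(const std::string& name) {
    if (name == "drive")   return Verb::Drive;
    if (name == "forward") return Verb::Forward;
    if (name == "reverse") return Verb::Reverse;
    if (name == "turn")    return Verb::Turn;
    if (name == "cancel")  return Verb::Cancel;
    return Verb::Unknown;
}

// Per the tool list: forward, reverse and turn stop on their own;
// drive runs until told otherwise and cancel is itself the stop.
bool autoStops(Verb v) {
    return v == Verb::Forward || v == Verb::Reverse || v == Verb::Turn;
}
```

Treating unknown verbs as a no-op is a design choice of this sketch: a misheard utterance should never translate into motion.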




Browser → Bedrock direct (no backend)

  • React + AWS Amplify Gen 2 in the browser
  • Amazon Cognito User Pool + Identity Pool gives temporary AWS credentials
  • IAM policy scoped to amazon.nova-2-sonic-v1:0 and the RVR MQTT topic
  • InvokeModelWithBidirectionalStreamCommand over HTTPS / TLS 1.2+
  • No EC2, no containers, no Lambda in the voice path

Browser → AWS IoT Core (commands down)

  • Same temporary creds are also scoped for iot:Publish
  • Frontend publishes via IoTDataPlaneClient to: the-project/sphero-rvr/XIAOSpheroRVR/action
  • Cognito identity gets an AWS IoT Core policy attached at first publish
  • Decouples the browser from the robot — robot can be anywhere on the internet




The edge: XIAO ESP32-S3

  • Seeed Studio XIAO ESP32-S3 on the Expansion Board
  • UART bridge between WiFi/MQTT and the RVR's SDK protocol
  • Subscribes to the RVR's command topic over MQTT/TLS
  • Publishes telemetry to the-project/sphero-rvr/XIAOSpheroRVR/state
  • OLED display for "is it alive?" at-a-glance status
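The two topics differ only in their last segment, so they can be derived from the device name. A small sketch, assuming the command topic the device subscribes to is the same `.../action` topic the frontend publishes to; `kTopicPrefix`, `actionTopic` and `stateTopic` are hypothetical helper names.

```cpp
#include <string>

// Topic prefix as it appears in this deck.
const std::string kTopicPrefix = "the-project/sphero-rvr/";

// Commands come down on <prefix><device>/action.
std::string actionTopic(const std::string& device) {
    return kTopicPrefix + device + "/action";
}

// Telemetry goes up on <prefix><device>/state.
std::string stateTopic(const std::string& device) {
    return kTopicPrefix + device + "/state";
}
```

With the device name used in this deck, `actionTopic("XIAOSpheroRVR")` yields the topic the browser publishes to, and `stateTopic("XIAOSpheroRVR")` the one the telemetry rule listens on.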

The Sphero RVR — what's on board

  • Two internal processors (Nordic + ST) talking over UART
  • IMU (pitch / roll / yaw), accelerometer, gyroscope
  • Wheel encoders, locator (X / Y in metres), velocity, speed
  • Color sensor, ambient light, compass / magnetometer
  • Motor thermal protection, stall detection, gyro-max collision events
  • Battery state with change notifications

forward / reverse — drive an exact distance

// On start: snapshot start_xy + lock current heading
LocatorData loc = rvr.getLocator();
_start_x = loc.x; _start_y = loc.y;
rvr.driveWithHeading(_speed, heading, reverseFlag);

// Every tick: re-read the locator, stop when traveled distance >= target
loc = rvr.getLocator();
float dx = loc.x - _start_x, dy = loc.y - _start_y;
_traveled = sqrtf(dx*dx + dy*dy);
if (_traveled >= _distance_m) { rvr.driveStop(); return DONE; }

"Drive forward 1 metre" actually means 1 metre, not "1 second-ish".
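The distance check can be isolated from the robot and tested on its own. The sketch below is a standalone version of the same idea, not the firmware's code: `Position` stands in for the locator reading, and `DistanceTracker` is a hypothetical name.

```cpp
#include <cmath>

// Hypothetical stand-in for a locator reading, in metres.
struct Position { float x; float y; };

// Tracks straight-line distance from a start snapshot.
class DistanceTracker {
public:
    DistanceTracker(Position start, float targetMetres)
        : start_(start), target_(targetMetres) {}

    // Euclidean distance between the start snapshot and the current reading.
    float traveled(Position now) const {
        const float dx = now.x - start_.x;
        const float dy = now.y - start_.y;
        return std::sqrt(dx * dx + dy * dy);
    }

    // True once at least the target distance has been covered.
    bool done(Position now) const { return traveled(now) >= target_; }

private:
    Position start_;
    float target_;
};
```

Because the distance is measured from the start point regardless of direction, the same check serves both forward and reverse.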


turn — rotate an exact number of degrees

// Accumulate signed yaw delta, normalised across the ±180° wrap
float delta = yaw - _lastYaw;
while (delta > 180.0f) delta -= 360.0f;
while (delta < -180.0f) delta += 360.0f;
_lastYaw = yaw;  // remember this sample so the next delta is incremental
_accumulated += (_degrees > 0) ? delta : -delta;
if (_accumulated >= _target) { driveStop(); return DONE; }

  • Signed degrees: positive = CW, negative = CCW
  • Multi-revolution allowed (degrees: 720 = two spins right)
  • "u-turn" / "spin around" handled by the model, not the firmware
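The wrap handling is the subtle part, so here is a standalone sketch that can be run off the robot. It assumes, like the snippet above, that yaw readings lie in the ±180° range and increase in the positive turn direction, and that the target is the absolute value of the requested degrees. `wrapDelta` and `TurnTracker` are hypothetical names.

```cpp
#include <cmath>

// Wrap an angle difference into [-180, 180] so that crossing the ±180° seam
// does not look like a near-full revolution in the opposite direction.
float wrapDelta(float delta) {
    while (delta > 180.0f)  delta -= 360.0f;
    while (delta < -180.0f) delta += 360.0f;
    return delta;
}

class TurnTracker {
public:
    TurnTracker(float startYaw, float degrees)
        : lastYaw_(startYaw), degrees_(degrees), target_(std::fabs(degrees)) {}

    // Feed one yaw sample; returns true once the requested rotation is reached.
    bool update(float yaw) {
        const float delta = wrapDelta(yaw - lastYaw_);
        lastYaw_ = yaw;
        // Progress in the requested direction counts as positive.
        accumulated_ += (degrees_ > 0) ? delta : -delta;
        return accumulated_ >= target_;
    }

    float accumulated() const { return accumulated_; }

private:
    float lastYaw_;
    float degrees_;
    float target_;
    float accumulated_ = 0.0f;
};
```

Accumulating per-sample deltas rather than comparing against a fixed end heading is what makes multi-revolution turns possible: a 720° request simply keeps summing past the seam twice.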

Telemetry up: IoT Rule → Lambda → AppSync

  • RVR publishes nested JSON to the-project/sphero-rvr/+/state
  • AWS IoT Rule SQL: SELECT *, topic(3) AS device_name FROM ...
  • AWS Lambda flattens it for GraphQL:
    • imu.pitch → imuPitch
    • motors.left_temp_c → motorLeftTempC
  • AWS AppSync runs the createRVRState mutation
  • AppSync auto-persists to Amazon DynamoDB and pushes via subscription
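In AWS IoT rule SQL, `topic(3)` returns the third slash-separated segment of the topic, counting from 1. For the state topic in this deck that segment is the device name. The sketch below reproduces that behaviour in plain C++ for illustration only; `topicSegment` is a hypothetical helper, not an AWS API.

```cpp
#include <string>

// Return the n-th '/'-separated segment of a topic (1-based, like topic(n)
// in AWS IoT rule SQL), or an empty string if there is no such segment.
std::string topicSegment(const std::string& topic, int n) {
    if (n < 1) return "";
    std::size_t start = 0;
    for (int i = 1; ; ++i) {
        const std::size_t end = topic.find('/', start);
        if (i == n) {
            // Last segment has no trailing '/', so take the rest of the string.
            const std::size_t len =
                (end == std::string::npos) ? std::string::npos : end - start;
            return topic.substr(start, len);
        }
        if (end == std::string::npos) return "";  // ran out of segments
        start = end + 1;
    }
}
```

Deriving the device name from the topic rather than from the payload means the device does not have to include its own name in every telemetry message.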

Live telemetry in the browser

  • Same React app uses AWS Amplify generateClient()
  • Subscribes to onCreateRVRState GraphQL subscription
  • Every MQTT telemetry frame from the RVR appears in the UI within a few hundred milliseconds
  • Battery · IMU · motors · position · compass · light · color — all live

Voice in, MQTT out, GraphQL back — all on one Cognito identity.




Live demo

🎤 → ☁️ → 🤖 → 📊

(Fingers crossed.)


Thank you! — Questions?

Related write-ups: Sphero RVR + AWS IoT Core on XIAO ESP32-S3 · Real-time voice chat with Nova Sonic + Amplify

https://chiwaichan.co.nz

https://x.com/chiwaichanconz

https://github.com/chiwaichan

