Demo

Overview

This project is a real-time voice-to-sign-language translation system that takes spoken words from a browser microphone and translates them into physical ASL fingerspelling on a Pollen Robotics Amazing Hand. The system spans three repositories covering the frontend, cloud infrastructure, and edge AI agent.
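The core translation step described above — spoken words become a sequence of letter poses — can be sketched in a few lines. This is a hypothetical illustration, not code from the repos: transcribed speech is normalised and filtered down to the A-Z characters that have fingerspelling poses.

```python
# Hypothetical sketch of the speech-to-fingerspelling step: the transcript
# returned by the voice model is reduced to the ordered A-Z letters the
# hand can fingerspell. Function name and behaviour are illustrative.
def to_fingerspelling(text: str) -> list[str]:
    """Reduce a transcript to the sequence of ASL letter poses to execute."""
    return [ch for ch in text.upper() if "A" <= ch <= "Z"]

print(to_fingerspelling("Hello, world!"))
```

Punctuation, digits, and whitespace are dropped rather than spelled, since the system's pose set covers only the 26 alphabet letters.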

Technologies Used

  • Frontend: React 19, Vite 7, TypeScript, Three.js, AWS Amplify Gen 2
  • Voice AI: Amazon Nova 2 Sonic (speech-to-speech, bidirectional streaming)
  • Edge AI: Strands Agents framework, Amazon Nova 2 Lite (tool-use reasoning)
  • Cloud: AWS IoT Core, AWS AppSync, AWS Lambda, Amazon DynamoDB, Amazon S3
  • Hardware: NVIDIA Jetson, Pollen Robotics Amazing Hand, Feetech SCS0009 servos
  • Infrastructure: AWS CDK, GitHub Actions CI/CD
  • Protocols: MQTT, GraphQL subscriptions, Serial (1M baud)
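To make the MQTT leg of the protocol stack concrete, here is a hedged sketch of what a per-letter command payload might look like on its way from the cloud to the Jetson edge agent. The topic name and JSON field names are assumptions for illustration, not the repos' actual schema; a real client would publish this over AWS IoT Core (e.g. with `paho-mqtt` or the AWS IoT Device SDK).

```python
import json

# Assumed topic and payload schema -- the actual repos may differ.
TOPIC = "hand/commands"

def letter_command(letter: str, hold_ms: int = 300) -> str:
    """Serialise one fingerspelling pose command as an MQTT JSON payload."""
    return json.dumps({"pose": letter.upper(), "hold_ms": hold_ms})

payload = letter_command("a")
# e.g. client.publish(TOPIC, payload, qos=1) with a connected MQTT client
print(payload)
```

On the edge side, the agent would decode the payload and drive the Feetech servos over the 1M-baud serial link.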

Key Features

  • Direct browser-to-Bedrock bidirectional streaming with Nova 2 Sonic
  • Forced tool use (send_text) to relay cleaned speech as text on every utterance
  • ASL fingerspelling covering all 26 alphabet letters (A-Z)
  • Real-time 3D hand visualisation synchronised with the physical hand via GraphQL subscriptions
  • Video recording with H.264 encoding, S3 upload, and presigned URLs
  • Fully serverless frontend with Cognito authentication
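The "forced tool use" feature above relies on constraining the model so it must call `send_text` on every utterance. A hedged sketch of such a tool configuration, following the Bedrock Converse-style tool schema (the exact event format of the Sonic bidirectional stream may differ):

```python
# Illustrative tool config forcing the send_text tool on every turn.
# The toolSpec/toolChoice shape mirrors the Bedrock Converse tool schema;
# field contents here are assumptions, not the project's actual config.
tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "send_text",
            "description": "Relay the cleaned transcript of the user's utterance.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"],
            }},
        }
    }],
    # Forcing a specific tool guarantees every utterance yields a
    # send_text call, which downstream becomes fingerspelling commands.
    "toolChoice": {"tool": {"name": "send_text"}},
}
```

Forcing the tool (rather than letting the model decide) is what guarantees a text relay for every utterance, so no speech is silently dropped from the translation pipeline.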