🎲 Reachy DM — a whimsical AI tabletop dungeon master you play out loud

Track: Thousand Token Wood (whimsical). Reachy DM is an AI Game Master for a Fallout tabletop RPG (Modiphius 2d20). You talk to it; it narrates vivid scenes, voices the NPCs in distinct designed voices, rolls the dice, reads your physical character sheet through a camera, sets the mood with your room's smart lights, and remembers your party — all embodied by a Reachy Mini robot. Every model in the stack is open-weights and under the 32B cap.

Code: github.com/olaservo/reachy_mini_conversation_app (branch cascade-integration)

The full experience is a local hardware rig (robot · smart lights · companion screen), so the demo video above is the showcase. This page explains how it works.

How it works — an all-Qwen cascade

A voice loop (Whisper STT · Silero VAD) drives a Qwen brain that orchestrates tools and speaks back in designed character voices, reading the table with vision when you show it something.

Stage Model Where
Speech-to-text Whisper OpenAI API
Voice activity Silero VAD local (CPU)
DM brain (logic + tool-calling) Qwen3-30B-A3B-Instruct-2507 (FP8) Modal (1×H100)
Character voices (11 designed) Qwen3-TTS-12Hz-1.7B Modal (L4)
Vision ("read the table") Qwen3-VL-8B-Instruct Modal (L40S)

The brain calls tools over MCP: dice / character sheet / player choice render as live Pip-Boy widgets (an MCP-Apps custom UI), plus per-character voices (speak_as), durable memory, the camera, robot motion/expression, and Home Assistant smart lighting.

Best Use of Modal

The three GPU models — brain (H100), character-voice TTS (L4), vision (L40S) — are all served on Modal (serverless vLLM + a custom FastAPI TTS server). Modal is the runtime compute backbone of the whole experience.

Badges

  • Best Agent — a genuine multi-step agent choosing among dice, sheets, choices, voices, vision, memory, lighting, and robot motion, all under the 32B cap.
  • Best Demo — a robot Game Master that voices NPCs, rolls dice on screen, dims your lights for combat, and reads your printed character sheet with a camera.
  • Off Brand — tool calls render as a custom Pip-Boy UI well past the default Gradio look.

Models & constraints

All-Qwen, open-weights, each model < 32B total (30B brain · 8B vision · 1.7B TTS), plus Whisper/Silero for speech. Built on a fork of the Pollen Robotics Reachy Mini conversation app.

Disclaimer

Reachy DM is a personal, non-commercial creative project built for the Build Small hackathon. It runs Fallout: The Roleplaying Game (Modiphius 2d20 system) using rules and content from a copy I own, as a fan would at their own table. It is not affiliated with, endorsed by, or sponsored by Bethesda Softworks or Modiphius Entertainment. Fallout is a trademark of Bethesda Softworks; Fallout: The Roleplaying Game is published by Modiphius Entertainment. All rights to those properties belong to their respective owners.