Monolith Manual
Icebox Monolith — Operator & User Reference · v6
System Overview
The Icebox Monolith is a self-hosted, multi-service studio and event management platform running on an Orange Pi 5 Plus (RK3588 · 8 GB RAM). It operates as a Docker Swarm stack with 22 services behind an nginx reverse proxy, accessible via Tailscale Funnel at orangepi5-plus.tail8c906d.ts.net.
Architecture at a Glance
- ›Portal — Next.js 14 front-end, port 3000. All browser-facing pages.
- ›19 specialised AI agents — FastAPI Python services, each specialising in a domain.
- ›Nginx — Routes /api/*, /admin, /repair, and all portal paths.
- ›Gatekeeper (007) — Security audit agent; also manages API tokens.
- ›Forge — Hardware sentinel: temperature, memory, disk telemetry.
- ›Yabo — Host OS agent (systemd). Manages the machine outside Docker.
Admin Override PIN
PIN XXXXXX bypasses authentication on every protected portal page (Runner, Tenant, Admin). Type it into the password/key field and submit — no username required.
API Tokens
Tokens beginning with ice_ can be entered in any protected login field as an alternative to a staff key. See the API Tokens section for how to issue them.
Memory & Performance
The Orange Pi 5 Plus has 8 GB of RAM. The full Monolith stack uses approximately 3–4 GB under normal operation. Several optimisations are in place to prevent memory exhaustion.
XTTS Voice Synthesis
XTTSv2 (neural TTS) is disabled by default (XTTS_ENABLED=false in the IVR environment). Loading XTTSv2 requires 4–6 GB of RAM, enough to crash the system on an 8 GB board. When disabled, the IVR falls back to SignalWire's built-in TTS. To enable: set XTTS_ENABLED=true in /home/ice/.env.
Whisper Speech Recognition
The Whisper model (~700 MB) is unloaded from memory immediately after each transcription. gc.collect() and torch.cuda.empty_cache() are called after every use. This keeps the IVR's steady-state RAM at under 100 MB between calls.
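The load, use, and unload cycle described above can be sketched roughly as follows. This is a minimal illustration, not the IVR's actual code: run_once, loader, and job are hypothetical names, and the CUDA cache call only runs if PyTorch has already been imported by the process.

```python
import gc
import sys
from typing import Callable, TypeVar

M = TypeVar("M")
R = TypeVar("R")


def run_once(loader: Callable[[], M], job: Callable[[M], R]) -> R:
    """Load a model, run one job with it, then release the model's memory."""
    model = loader()
    try:
        return job(model)
    finally:
        # Drop the only reference held here, then force a collection pass
        # so the weights are released now rather than at some later point.
        del model
        gc.collect()
        # Assumption: only touch the CUDA cache when torch is already loaded.
        torch = sys.modules.get("torch")
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
```

The design point is that no reference to the model survives the call, so each transcription pays the load cost again in exchange for a low idle footprint.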
Service Memory Limits
- ›IVR — 1.5 GB limit (XTTS disabled) / 256 MB reservation.
- ›Echo — 3 GB limit / 512 MB reservation.
- ›Siren — 2 GB limit / 256 MB reservation.
- ›Beats — 512 MB limit (music gen disabled) / 128 MB reservation.
- ›All other services — 256–512 MB limits.
Forge Autonomous Healing
When available RAM drops below 10 %, Forge automatically drops the Linux page cache and prunes stopped Docker containers. See the Forge Sentinel section for full details.
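As an illustration of the 10 % trigger, the sketch below computes the available-memory percentage from /proc/meminfo text. It assumes "available RAM" corresponds to the kernel's MemAvailable field; Forge's real check is not shown in this manual, and the function names are hypothetical.

```python
def available_percent(meminfo_text: str) -> float:
    """Return MemAvailable as a percentage of MemTotal from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # values are reported in kB
    return 100.0 * fields["MemAvailable"] / fields["MemTotal"]


def needs_healing(meminfo_text: str, threshold_percent: float = 10.0) -> bool:
    """True when available memory has dropped below the healing threshold."""
    return available_percent(meminfo_text) < threshold_percent
```

On the host this could be fed with the contents of /proc/meminfo, for example `needs_healing(open("/proc/meminfo").read())`.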
Master Image
A master image snapshot is kept at /mnt/master_vault/models/MASTER_IMAGE/. Only the most recent image is retained — old images are deleted after each build. Run sudo monolith-master-image to rebuild. Current image: monolith_master_20260620_112423.
Live Portal
The home page at / is the public-facing live stream hub. It shows the HLS video player, a scrolling banner carousel, and quick links to all sub-portals.
First Visit — Terms Modal
First-time visitors see a Terms of Access modal. Accepting stores a flag in localStorage — the modal will not appear again on the same browser.
Venue Assist
The Venue Assist button places an automated call via the IVR to the studio. The caller ID is the Monolith's SignalWire number. Use it for on-the-floor support requests.
Access Portals Panel
- ›Fan Zone — Live cam and merch drop.
- ›Studio Access — Tenant dashboard for booked clients.
- ›Staff Operations — Runner task board for staff.
Fan Zone
Public page at /fan. Requires no login. Shows the live stream, real-time viewer count, session info, and the merch drop panel.
Live Stream
Powered by the Optic agent (port 8010). If the stream is offline the player shows a placeholder. Optic handles HLS encoding via OBS/FFmpeg running on the host.
Intro Music Player
The ▶ PLAY INTRO button (bottom-right, all pages) plays a random track from the 17-song intro playlist. When a song ends the player automatically picks a different random track and continues — no restart needed. Clicking ⏸ INTRO pauses playback; clicking again resumes from the same position. Playback state is saved to localStorage and restored on page reload within 5 minutes.
Bimodal Toggle
The toggle in the top-right of the Fan Zone switches the audio engine between two playback modes managed by the Echo and Vibe agents. Mode A is standard stereo; Mode B routes through spatial/DSP processing.
Merch Drop
Items shown are pulled from the Fan agent (port 8003). Inventory is managed via the admin dashboard. Shop links route to the configured e-commerce endpoint.
Booking
Public booking inquiry form at /booking. Submits to the Booking agent (port 8005), which logs the inquiry and triggers a follow-up workflow.
Event Types Supported
- ›Recording Session
- ›Video Studio Session
- ›Live Streaming
- ›Music Video Production
- ›Podcast / Interview
- ›Corporate Event · Private Event · Album Release Party
IVR Deep Links
When callers request an emailed link from the IVR, they receive a URL with the event type pre-selected so the form opens ready to fill:
- ›Music Studio callers → /booking?type=Recording+Session
- ›Video Studio callers → /booking?type=Video+Studio+Session
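The deep links above are ordinary query strings with the event type form-encoded. A minimal sketch of building one, assuming a hypothetical helper named booking_link (the IVR's own link builder is not shown here):

```python
from urllib.parse import urlencode


def booking_link(event_type: str, base: str = "/booking") -> str:
    """Build a booking URL with the event type pre-selected."""
    # urlencode form-encodes the value, so spaces become "+".
    return f"{base}?{urlencode({'type': event_type})}"
```

Calling `booking_link("Recording Session")` yields the Music Studio link shown above.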
After Submission
The team is notified and will respond within 24 hours. If the form fails, call the studio directly.
Confirmation
A green confirmation screen replaces the form on success. The "Submit Another" button resets the form for a second inquiry.
Studio Access (Tenant Portal)
Protected page at /tenant. Clients with a booked session use this to view their schedule, session time remaining, and billing summary.
Logging In
- ›Enter your Tenant ID (e.g. TENANT-001) and the 6-digit access key delivered via IVR.
- ›Admin PIN XXXXXX — type in the Key field, Tenant ID field can be anything.
- ›ice_ tokens — paste your issued API token into the Key field.
Dashboard Panels
- ›Studio Status — Real-time availability from the Tenant agent (port 8002).
- ›Hours Remaining — Countdown against your booked block.
- ›Upcoming Bookings — Next two confirmed sessions with room and time.
- ›Account — Current balance, last payment, and hourly rate.
Getting Your Access Key
Call XXX-XXX-XXXX. The IVR will ask for your Tenant ID and deliver a one-time 6-digit key.
Staff Operations (Runner)
Staff-only page at /runner. Displays the live task board for runners, dispatchers, and logistics personnel.
Logging In
- ›Staff key issued by management — enter in the password field and submit.
- ›Admin PIN XXXXXX grants immediate access.
- ›ice_ tokens — paste your issued API token.
Task Board
Tasks are pulled from the Runner agent (port 8004) every 30 seconds. Filter by status using the tab bar at the top.
Task Statuses
- ›Active — In progress right now.
- ›Pending — Queued, not yet started.
- ›Done — Completed this session.
- ›Blocked — Cannot proceed; requires attention.
Priority Levels
- ›High — Time-critical; handle immediately.
- ›Normal — Standard priority.
- ›Low — When capacity allows.
Map
The Venue Map panel (powered by Leaflet) shows location pins for active tasks within the facility. Tap a pin for task details.
Control Center (Admin)
Internal dashboard at /admin. Requires the admin PIN or an ice_ token. This is the operations command center — hardware telemetry, agent grid, terminal feed, and the G-Code generator.
System Health Bar
- ›CPU Temp — RK3588 core temperature via Forge. Alert threshold: 75 °C.
- ›RAM — Used / total memory. Polled every 15 s.
- ›Disk — Root filesystem utilisation.
- ›Agents Online — Count of agents responding to health checks.
Agent Grid
Shows all 19 agents with live online/offline status badges. Click any agent tile to open its dedicated page or API explorer.
Terminal Feed
Real-time WebSocket event log from the Admin service (port 8014). Shows service status changes, drag-and-drop transfer events, and command outputs.
Wake-Word Status
Shows whether the Echo agent's wake-word listener is armed. When armed, saying the trigger phrase activates voice command mode on the host.
G-Code Generator
See the G-Code Generator section below.
Studio Binaries
The Studio Binaries panel lists seven host-level studio and telephony tools with their installation status and file sizes:
- ›jackd — JACK Audio Connection Kit daemon
- ›ardour — Professional DAW
- ›lmms — Linux MultiMedia Studio
- ›obs — OBS Studio (streaming/recording)
- ›kdenlive — Video editor
- ›shotcut — Video editor
- ›asterisk — PBX / telephony engine
Why binaries showed "missing"
The Admin service runs inside a Docker container — it has its own isolated filesystem and cannot see the host's /usr/bin, /usr/local/bin, or /usr/sbin by default. Path lookups like /usr/bin/obs would fail even when obs is installed on the host, causing every binary to report as missing.
The fix was adding read-only bind mounts in docker-compose.yml so host directories are exposed inside the container at a /host/ prefix:
- ›/usr/bin → /host/usr/bin
- ›/usr/local/bin → /host/usr/local/bin
- ›/usr/sbin → /host/usr/sbin
The binary paths in admin/main.py were updated to match (e.g. /host/usr/bin/obs). All seven binaries now report present.
Note: jackd and ardour are symlinks pointing to /mnt/master_vault/usr/bin/. Their reported size shows 0.0 MB because the size is read from the symlink itself rather than from its target (in Python, Path.lstat() reports the link, while Path.stat() follows it to the target). This is cosmetic; the binaries are present and functional.
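The /host/ prefix lookup and the symlink size behaviour can be sketched as below. This is an illustrative reconstruction, not the contents of admin/main.py: container_path and binary_status are hypothetical names. It relies only on documented pathlib behaviour, where stat() follows symlinks and lstat() does not.

```python
from pathlib import Path

HOST_ROOT = Path("/host")  # bind-mount prefix described above


def container_path(host_path: str, host_root: Path = HOST_ROOT) -> Path:
    """Map a host path such as /usr/bin/obs to its location inside the container."""
    return host_root / host_path.lstrip("/")


def binary_status(host_path: str, host_root: Path = HOST_ROOT) -> dict:
    """Report presence and size for one host binary as seen from the container."""
    path = container_path(host_path, host_root)
    # exists() follows symlinks, so a link whose target is not visible
    # inside the container counts as missing.
    if not path.exists():
        return {"present": False, "size_mb": 0.0, "link_bytes": 0}
    return {
        "present": True,
        # stat() follows the link and measures the real binary.
        "size_mb": round(path.stat().st_size / 1_000_000, 1),
        # lstat() measures the link itself, typically a few bytes.
        "link_bytes": path.lstat().st_size,
    }
```

Reporting the stat() size rather than the lstat() size is what would make a symlinked binary show its real size instead of 0.0 MB.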
Security Gatekeeper (007)
Internal dashboard at /007. The central security and integrity console for the Monolith — webhook validation, bearer token management, rate limiting, audit logging, settlement ledger integrity, and root-level override authority.
Vault Bridge
Shows whether the integrity bridge between all services is live. Click Run Check to test connections. If the bridge is severed, a Restore button appears — use it to re-establish the link.
God Mode
Root-level override that bypasses all authentication restrictions for a timed window. Activate via IVR passphrase or DTMF code XXXX. Status shows method, activation time, and expiry. God Mode self-expires — it cannot be left on indefinitely.
Admin Session
Tracks the open admin session (call SID and opened time) and voice biometric enrollment status. Enrollment is triggered through the Admin dashboard's Enrolment Status panel or via the IVR voice-enrollment prompt.
Settlement Ledger
A hash-chained JSON-LD ledger that records every financial settlement event. Click Verify Chain to check end-to-end integrity. Any break in the chain is reported immediately with the affected entry count.
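To show what a hash-chain check involves, here is a minimal sketch. The hashing scheme (SHA-256 over sorted-key JSON), the field names, and the genesis value are all assumptions made for illustration; the Gatekeeper's actual ledger format is not documented here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(payload: dict, prev_hash: str) -> str:
    """Hash an entry's payload together with the previous entry's hash."""
    body = json.dumps({"payload": payload, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(ledger: list, payload: dict) -> None:
    """Append a settlement event, linking it to the current chain tip."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append(
        {"payload": payload, "prev_hash": prev_hash, "hash": entry_hash(payload, prev_hash)}
    )


def affected_entries(ledger: list) -> int:
    """Return how many entries sit at or after the first break (0 if intact)."""
    prev_hash = GENESIS
    for index, entry in enumerate(ledger):
        expected = entry_hash(entry["payload"], prev_hash)
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return len(ledger) - index
        prev_hash = entry["hash"]
    return 0
```

Because each hash covers the previous one, editing any entry invalidates it and everything after it, which is why a break can be reported as an affected entry count.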
Audit Log
Scrolling security event log — every authentication attempt, token validation, and webhook check is recorded with IP address, success/failure status, and notes. Auto-refreshes every 30 seconds. Click Refresh to pull immediately.
API Token Management
The Gatekeeper issues and validates ice_ API tokens. See the API Tokens section for full details on issuing, using, and revoking tokens.
Agent System
The Monolith runs 19 specialised AI agents. Each is a FastAPI Python service.
Core orchestration and KDS order routing.
Telephony: call handling, voice enrolment, key delivery.
Client session management and studio access control.
Fan portal data: viewer stats, merch inventory.
Task dispatch and runner board management.
Event booking intake, calendar sync, follow-ups.
Routing specialist — call transfer and trunk logic.
Commerce and logistics coordination.
Security audit, API token issuance and validation.
Transaction escrow and payment flow management.
Visual verification — stream health, camera feeds.
Audio engine: wake-word, voice commands, DSP.
Fleet coordinator — peripheral and device management.
Hardware sentinel: thermal, memory, disk telemetry.
Admin dashboard backend, WebSocket event bus.
Alert engine — threshold triggers and notifications.
Repair swarm — remote machine diagnostics over WS.
Number Hub — virtual number pool, voicemail, customer mini-sites.
Threat detection and incident escalation.
Health Endpoints
Every agent exposes GET /health returning {"status":"ok"}. The Admin dashboard polls these every 15 seconds via WebSocket heartbeat.
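A health poll against these endpoints might look like the sketch below. It is not the Admin service's code: count_online and the injected fetch callable are hypothetical, and the only contract taken from this manual is the {"status":"ok"} response body.

```python
import json
from typing import Callable, Dict


def is_healthy(body: bytes) -> bool:
    """True only for a JSON object whose status field is "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def count_online(agents: Dict[str, str], fetch: Callable[[str], bytes]) -> int:
    """Count agents whose /health endpoint answers with status ok."""
    online = 0
    for base_url in agents.values():
        try:
            if is_healthy(fetch(f"{base_url}/health")):
                online += 1
        except OSError:
            # Connection refused, timeout, DNS failure: treat as offline.
            pass
    return online
```

In practice fetch could be something like `lambda url: urllib.request.urlopen(url, timeout=2).read()`; passing it in keeps the counting logic testable without a network.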
Yabo — Host OS Agent
Yabo runs as a systemd service on the host (outside Docker) at /etc/systemd/system/yabo.service. It manages OS-level tasks, monitors the host, and has full vector coding capabilities. If Yabo is down: sudo systemctl restart yabo.
G-Code Generator
Located on the Control Center page (/admin). Converts plain-English CNC descriptions into G-code programs using the AI agent.
How to Use
- ›Type a description of the operation in the text area (e.g. "drill a 10mm hole 5mm deep at X50 Y50").
- ›Press Generate G-Code or use Ctrl + Enter.
- ›The generated code appears in the green monospace output panel below.
- ›Click Copy to copy to clipboard.
What It Can Generate
- ›Drilling cycles (G81/G83) with depth, feed, and retract parameters.
- ›Milling profiles — pockets, contours, facing passes.
- ›Tool change sequences with correct M6 / T commands.
- ›Coordinate system setup: G54–G59 work offsets, G17/G18/G19 planes.
- ›Lathe turning operations with CSS (constant surface speed).
Tips
- ›Specify units ("metric" or "imperial") in your description for correct G20/G21.
- ›Mention material for appropriate feed and speed suggestions.
- ›Always verify generated G-code against your machine's controller before running.
Repair Swarm
The Repair Swarm is an AI-powered remote diagnostics service running at port 8017. A host agent manages incoming client connections; worker clones spawn per connection, run diagnostics, apply fixes, and self-terminate on completion.
Client Connector
The repair page at /repair auto-detects your OS and shows the correct command. Windows uses PowerShell. macOS and Linux use a curl command piped to python3. Android uses a Termux command that installs Python and curl automatically. A Download .py File button is available on every tab as a fallback.
Connection Flow
- ›Client connects via WebSocket to /repair/connect.
- ›Host agent spawns a dedicated worker clone for the session.
- ›Worker runs OS triage: disk, RAM, services, network, CPU, temperature, logs, dependencies, firewall, filesystem, updates, malware scan.
- ›Findings and fixes are streamed back to the client in real time.
- ›On disconnect or completion the worker shreds its own state and terminates.
Supported Platforms
- ›Linux — Full tool suite: df, free, systemctl, journalctl, netstat, apt.
- ›macOS (darwin) — launchctl, diskutil, dscacheutil, brew, pmset.
- ›Windows — PowerShell: Invoke-WebRequest, Get-Service, wmic, chkdsk, ipconfig.
- ›Android — Termux: getprop, dumpsys, pm, adb-compatible commands.
OS Detection Note
The macOS platform string darwin contains the substring win, so a naive substring check for win would classify macOS as Windows. The OS detection logic checks for darwin before windows to avoid this misidentification. This applies to Repair service v4 and later.
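The ordering issue is easy to see in code. The sketch below is an illustration rather than the Repair service's own detector: detect_os is a hypothetical name, and checking android before linux is an added assumption (Android platform strings often mention Linux).

```python
def detect_os(platform_string: str) -> str:
    """Classify a platform string, testing darwin before the win substring."""
    p = platform_string.lower()
    if "android" in p:
        return "android"
    if "darwin" in p:
        return "macos"
    if "win" in p:  # safe only because darwin was handled above
        return "windows"
    if "linux" in p:
        return "linux"
    return "unknown"
```

Swapping the darwin and win checks would send every macOS client down the Windows branch, since `"win" in "darwin"` is true.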
Concurrent Sessions
The swarm supports up to 10 simultaneous client repairs. Each worker is isolated — one client's session cannot observe another's.
Forge Sentinel
Forge is the hardware sentinel for the Orange Pi 5 Plus. It runs at port 8013 and polls the board every 30 seconds. It is the only service with autonomous call authority — it will call the owner directly when the system needs attention.
Hardware Monitoring
- ›All thermal zones: CPU (A76 big cores), CPU (A55 little cores), GPU, NPU, board sensors.
- ›CPU frequencies on all 8 cores; GPU, NPU, and DDR memory bus governors.
- ›RAM and swap usage; disk utilisation on every mounted filesystem.
- ›Kernel dmesg scan for hardware errors on every poll cycle.
Autonomous Alerts (calls owner)
- ›Any thermal zone exceeds 90 °C → immediate call.
- ›RAM or disk usage exceeds 90 % → immediate call.
- ›New master image built → call with a summary of which services changed versions.
- ›15-minute cooldown per event type to prevent repeated calls.
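The thresholds and the per-event cooldown listed above can be sketched as follows. The event names, function signatures, and the Cooldown class are hypothetical; only the 90 °C, 90 %, and 15-minute figures come from this manual.

```python
import time
from typing import Callable, Dict, List


def breached_events(
    thermal_c: Dict[str, float],
    ram_pct: float,
    disk_pct: float,
    thermal_limit_c: float = 90.0,
    usage_limit_pct: float = 90.0,
) -> List[str]:
    """Return the event types whose thresholds are currently exceeded."""
    events = []
    if any(temp > thermal_limit_c for temp in thermal_c.values()):
        events.append("thermal")
    if ram_pct > usage_limit_pct:
        events.append("ram")
    if disk_pct > usage_limit_pct:
        events.append("disk")
    return events


class Cooldown:
    """Allow each event type at most once per cooldown window."""

    def __init__(self, seconds: float, clock: Callable[[], float] = time.monotonic):
        self.seconds = seconds
        self.clock = clock
        self._last: Dict[str, float] = {}

    def allow(self, event_type: str) -> bool:
        now = self.clock()
        last = self._last.get(event_type)
        if last is not None and now - last < self.seconds:
            return False  # still cooling down for this event type
        self._last[event_type] = now
        return True
```

A poll loop would call `breached_events(...)` and place a call only for events where `Cooldown(15 * 60).allow(event)` returns true, so a sustained thermal fault does not suppress a new disk alert.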
Autonomous Healing
- ›High memory: drops Linux page cache, prunes stopped Docker containers.
- ›High disk: prunes unused Docker images and stopped containers.
- ›Beats music generator restart skipped when MUSIC_GEN_ENABLED=false.
- ›5-minute cooldown between healing attempts for the same issue.
Active Controls (on command)
- ›Set CPU governor: performance / powersave / ondemand / schedutil.
- ›Trigger thermal emergency — drops board to powersave mode immediately.
- ›Rebuild and redeploy any service in the Docker stack.
- ›Warm up the Whisper speech-to-text model.
- ›Build a new master image snapshot of all running service versions.
- ›Place an immediate outbound call with any custom message.
Disaster Recovery
- ›Run full backup to the 1 TB external drive via monolith-backup.
- ›Check DR backup status and integrity.
- ›Full triage report: hardware + DR status combined.
NVMe Management
Monitors the 256 GB Micron NVMe drive at /mnt/nvme. The NVMe handles /tmp bind, 8 GB swap, and service scratch volumes. Forge can run a full automated repair sequence if the drive degrades.
AI Chat Interface
Forge has a built-in chat at /forge powered by Claude. It can read live hardware state and take corrective actions based on natural language instructions. Available tools: thermal zones, memory, storage, CPU governor, thermal emergency.
API
- ›GET /forge/thermal — all thermal zone readings.
- ›GET /forge/memory — RAM and swap snapshot.
- ›GET /forge/hardware/summary — full board snapshot.
- ›GET /forge/triage — hardware + disaster recovery report.
- ›POST /forge/cpu/governor — set CPU scaling governor.
- ›POST /forge/master-image — trigger master image build.
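As a hedged example of calling one of these routes, the sketch below builds a governor change request. The JSON field name governor, the Content-Type header, and the base_url argument are assumptions, since the manual lists only the route and the four governor names.

```python
import json
from urllib.request import Request

GOVERNORS = {"performance", "powersave", "ondemand", "schedutil"}


def governor_request(governor: str, base_url: str) -> Request:
    """Build (but do not send) a POST /forge/cpu/governor request."""
    if governor not in GOVERNORS:
        raise ValueError(f"unsupported governor: {governor!r}")
    # Assumption: the endpoint accepts a JSON body with a "governor" field.
    body = json.dumps({"governor": governor}).encode("utf-8")
    return Request(
        f"{base_url}/forge/cpu/governor",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; validating the governor name first avoids a round trip for a value the board cannot accept.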
Python Coding Class 101
An interactive beginner Python course available through the IVR and the web portal. Students work one-on-one with Coach Py, an AI teacher, through eight structured lessons.
Curriculum
- ›Lesson 1 — Variables and data types.
- ›Lesson 2 — Strings and string methods.
- ›Lesson 3 — Lists and indexing.
- ›Lesson 4 — Dictionaries and key-value storage.
- ›Lesson 5 — Loops (for and while).
- ›Lesson 6 — Functions and scope.
- ›Lesson 7 — Classes and objects.
- ›Lesson 8 — Error handling and exceptions.
How to Access
- ›Web: visit /python — dedicated Python Class 101 portal with curriculum and pricing.
- ›IVR: call XXX-XXX-XXXX, press 8 for Jiko, then press 3 for Python Coding Class 101.
- ›The IVR reads a full course description, then captures your email and sends you the direct link.
Pricing
- ›Per session — single class access.
- ›Daily pass — unlimited classes for one day.
- ›Monthly subscription — full curriculum access.
- ›Yearly subscription — best value, full access for 12 months.
Features
- ›Live code examples runnable directly in the browser — no setup required.
- ›Real-time feedback and explanations from Coach Py.
- ›Progress tracked per session.
API Tokens
Time-limited access tokens with the prefix ice_ can be issued to clients, partners, or staff for portal access without sharing a permanent staff key.
Issuing a Token
Send a POST request to the Gatekeeper agent with your master key:
POST http://gatekeeper:8008/keys/issue
X-Master-Key: <SECURITY_TOKEN_SECRET>
{
  "label": "Studio Client A",
  "duration_days": 30
}
The response includes the ice_ token string. Copy it and give it to the client.
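The same request can be assembled from Python. This is a sketch under stated assumptions: issue_token_request is a hypothetical helper, the Content-Type header is assumed, and the response is left unparsed because its field names are not documented here.

```python
import json
from urllib.request import Request


def issue_token_request(
    label: str,
    duration_days: int,
    master_key: str,
    base_url: str = "http://gatekeeper:8008",
) -> Request:
    """Build (but do not send) the POST /keys/issue request shown above."""
    if duration_days <= 0:
        raise ValueError("duration_days must be a positive number of days")
    body = json.dumps({"label": label, "duration_days": duration_days}).encode("utf-8")
    return Request(
        f"{base_url}/keys/issue",
        data=body,
        headers={"X-Master-Key": master_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(issue_token_request("Studio Client A", 30, secret))` only works from a host that can resolve the gatekeeper service name, such as another container in the stack.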
Using a Token
- ›Navigate to any protected page (Runner, Tenant).
- ›Paste the full ice_... token into the password / key field.
- ›Submit — the portal validates it against Gatekeeper and grants access.
Token Durations
- ›10-day trial — short studio visit or event coverage.
- ›20-day access — two-week production block.
- ›30-day access — monthly client retainer.
- ›Custom duration — set any number of days in the duration_days field.
Managing Tokens
# List all tokens
GET /keys
X-Master-Key: <secret>
# Revoke a token
DELETE /keys/<token>
X-Master-Key: <secret>
Expiry
Expired tokens return a 401 at validation time. The client sees "Invalid or expired API token." Tokens cannot be extended — issue a new one and revoke the old.
Telephony & IVR
The Monolith's telephony stack uses SignalWire with webhook mode (required due to CGNAT). Inbound calls hit the Funnel URL and are routed to the IVR agent (port 8001).
Studio Numbers
- ›Main line: XXX-XXX-XXXX
- ›Both numbers configured in SignalWire pointing to the same Funnel endpoint.
IVR Main Menu
- ›Press 1 — Fan Zone: live events, streaming, merch.
- ›Press 2 — Studio access: recording studio key delivery.
- ›Press 3 — Music studio: book sessions, production services.
- ›Press 4 — Video studio: book sessions, post-production.
- ›Press 5 — Number Hub: browse and claim phone numbers.
- ›Press 6 — Tech support and remote repair connector.
- ›Press 7 — Leave a message (concerns, suggestions, other).
- ›Press 8 — Jiko Editor and Python Coding Classes.
- ›Press 9 — Staff and admin options.
Press 7 — Leave a Message Sub-Menu
Callers select a category, then hear a set of guiding questions before recording up to 3 minutes. Press # to finish.
- ›Press 1 — Concern — Prompts: nature of concern, service/staff involved, when it occurred and whether recurring, desired resolution.
- ›Press 2 — Suggestion — Prompts: what to change or add, which area it applies to, how it improves the experience, specific implementation ideas.
- ›Press 3 — Other — Prompts: topic and reason, relevant dates/services/people, what response is needed.
Press 8 — Jiko Sub-Menu
- ›Press 1 — G-Code Generator description and email link.
- ›Press 2 — HueForge mode description and email link.
- ›Press 3 — Python Coding Class 101 description and email link.
- ›Press 4 — Python Coding Class 201 (intermediate) email link.
- ›Press 5 — Python Coding Class 301 (advanced) email link.
- ›Press 0 — Return to main menu.
Voice Enrolment
Staff can enrol a voice print for wake-word and voice-command authentication. Enrolment is triggered via the Admin dashboard's Enrolment Status panel or by calling the studio and following the IVR voice-enrolment prompt.
Asterisk PBX
Internal extension 200 at 100.97.30.90 (Tailscale IP). Password: XXXXXX. Trunk configuration files are at /etc/asterisk/ on the host.
Troubleshooting
- ›Calls not routing → check IVR health at http://ivr:8001/health from Admin terminal.
- ›Voice commands not triggering → check Echo agent wake-word status on the Admin page.
- ›IVR session stuck → sessions auto-expire after 10 minutes of inactivity.
Number Hub (Telco)
The Number Hub is a standalone virtual phone number service at /telco. You stock a pool of SignalWire toll-free numbers; customers pick one, set up their account, and get an IVR-backed voicemail box plus a 2-page mini-website — all completely independent of your other Monolith pages and services.
Admin — Managing the Number Pool
Go to /telco/admin and enter your master key.
- ›Add a number — paste the SignalWire number in E.164 format (e.g. +18005551234) and click Add. It appears as available immediately.
- ›Remove a number — only available (unassigned) numbers can be removed. Click Remove next to the number.
- ›Deactivate an account — releases the number back to the pool and disables the customer's site.
- ›Accounts tab — lists every customer with their number, total voicemail count, and unread count.
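Since numbers must be entered in E.164 format, a quick pre-check can catch formatting mistakes before a number is added to the pool. The sketch below follows the E.164 shape (a plus sign, then up to 15 digits with no leading zero); is_e164 is a hypothetical helper, not part of the Telco agent.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """True when the string is a plausibly formatted E.164 number."""
    return E164_PATTERN.fullmatch(number) is not None
```

This checks the shape only; it does not confirm that the number exists or belongs to your SignalWire account.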
Customer Onboarding Flow
- ›Customer visits /telco and sees the available number grid.
- ›They click a number → taken to /telco/onboard with that number pre-selected.
- ›They fill in: business name, a URL handle (e.g. mikes-plumbing), email, password, page headline, tagline, about text, and accent color.
- ›On submit their account is created, the number is marked assigned, and they land on their public page.
Page 1 — Public Business Card
URL: /telco/[handle] — no login required. This is the page the customer shares with their own clients.
- ›Shows their business name, tagline, and about text.
- ›Large clickable phone number with a Call Now button.
- ›Accent color customised per account (set during onboarding, changeable later).
- ›Links to their messages inbox via the My Messages button.
Page 2 — Voicemail Inbox
URL: /telco/[handle]/messages — password protected.
- ›Customer enters their account password to access the inbox.
- ›New (unlistened) messages are highlighted with a colored left border and a NEW badge.
- ›Each entry shows: caller number, duration, timestamp, and transcript (when available).
- ›Controls per message: ▶ play (opens recording URL), ✓ mark read, ✕ delete.
- ›Refresh button pulls the latest messages from the server.
Inbound Call Flow
- ›Caller dials the customer's 800 number.
- ›SignalWire fires a webhook to /api/telco/webhook/call.
- ›The Telco agent looks up which account owns that number.
- ›Plays the customer's custom greeting (or a default if none is set).
- ›Records up to 2 minutes of voicemail and stores it against the account.
- ›Recording metadata is saved immediately; the customer sees it in their inbox on next refresh.
SignalWire Configuration
In your SignalWire dashboard, set the inbound webhook for each pooled number to:
POST https://orangepi5-plus.tail8c906d.ts.net/api/telco/webhook/call
Use the same URL for all numbers in the pool — the Telco agent identifies the correct account from the To field in the webhook payload.
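The lookup by To field can be illustrated as below. It assumes the webhook body arrives form-encoded and that accounts are keyed by their E.164 number; account_for_call and the mapping are hypothetical, not the Telco agent's implementation.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs


def account_for_call(raw_body: str, accounts_by_number: Dict[str, str]) -> Optional[str]:
    """Return the account handle that owns the dialled number, if any."""
    form = parse_qs(raw_body)
    # parse_qs returns a list per key; take the first To value.
    to_number = form.get("To", [""])[0]
    return accounts_by_number.get(to_number)
```

Because the dialled number alone selects the account, every pooled number can share one webhook URL.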
Custom Voicemail Greeting
Each account's greeting defaults to "You have reached [Name]. Please leave a message after the tone." Customers can update it via the API:
PUT /api/telco/accounts/[handle]
X-Access-Key: <their-password>
{ "voicemail_greeting": "Thanks for calling Acme. Leave your name and number." }
Telco Agent
The backend service runs internally at telco:8018. It is not exposed directly through nginx — all calls route through the portal's /api/telco/* proxy. The voicemail database is stored at /mnt/master_vault/telco.db and is included in all master image snapshots.
Troubleshooting
- ›Numbers not showing → check Telco agent health via Admin terminal: curl http://telco:8018/health
- ›Webhook not firing → confirm the SignalWire number's webhook URL is set to the Funnel address above.
- ›Account handle taken → handles are globally unique; customer must choose a different one.
- ›Voicemails not appearing → verify recording webhook URL is reachable; check SignalWire recording callback logs.
Music Generation (Beats)
AI-powered music production agent at /beats/. Converts a text or voice description into a fully mastered audio track using your studio's sound libraries — SF2 soundfonts, LMMS samples, and ZynAddSubFX banks.
Current Status
Music generation is currently disabled by default (MUSIC_GEN_ENABLED=false in the environment). The MusicGen model requires 4–6 GB of RAM on load; disabling it keeps that memory available for other services. To re-enable: set MUSIC_GEN_ENABLED=true in /home/ice/.env and redeploy the stack.
Generating an Instrumental
POST to /beats/generate with a JSON body:
POST /beats/generate
{ "prompt": "dark trap beat, 140 BPM, C minor, heavy 808 bass, hi-hat rolls, melancholy piano melody" }
Returns a job_id. Poll /beats/jobs/{job_id} for status.
Voice Description
Upload an audio file describing the track to /beats/generate/voice. Echo's STT transcribes it, then the track is generated automatically.
Full Song with Vocals
Submit a text prompt and a vocal audio file together for a complete production:
POST /beats/generate/with-vocals (multipart/form-data)
prompt= "soulful R&B, 75 BPM, D minor, warm Rhodes chords, smooth bass"
vocals= <audio file>
Vocals are processed (noise gate, compression, reverb), mixed at proper levels against the instrumental, and the result is mastered to broadcast standard (-14 LUFS).
Add Vocals to an Existing Job
POST /beats/mix-vocals/{job_id} (multipart/form-data)
vocals= <audio file>
Download the Track
GET /beats/download/{job_id} → 320 kbps MP3
Sound Libraries Used
- ›FluidR3_GM / TimGM6mb — Full GM soundfonts for all melodic instruments.
- ›LMMS samples — Drums, beats, basses, instruments, waveforms.
- ›ZynAddSubFX banks — Synthesizer patches: pads, leads, brass, choir, and more.
Job Statuses
- ›queued — Waiting to start.
- ›parsing — Claude is generating the arrangement.
- ›building_midi — Constructing MIDI tracks.
- ›rendering — FluidSynth rendering MIDI to audio.
- ›mixing_drums — Layering drum samples via FFmpeg.
- ›mixing_vocals — Processing and mixing vocals.
- ›mastering — Applying mastering chain (-14 LUFS).
- ›done — Track ready to download.
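A client waiting on a job could poll through the statuses above along these lines. This is a sketch: wait_for_job is hypothetical, get_status stands in for whatever reads the status out of the /beats/jobs/{job_id} response (its field names are not documented here), and the interval and timeout values are arbitrary.

```python
import time
from typing import Callable

IN_PROGRESS = {
    "queued", "parsing", "building_midi", "rendering",
    "mixing_drums", "mixing_vocals", "mastering",
}


def wait_for_job(
    job_id: str,
    get_status: Callable[[str], str],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job reports done; fail on unknown statuses or timeout."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status == "done":
            return status
        if status not in IN_PROGRESS:
            # The manual lists no failure status, so anything else is unexpected.
            raise RuntimeError(f"unexpected job status: {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout} s")
        sleep(interval)
```

Once this returns, the track can be fetched from /beats/download/{job_id}.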
Icebox Monolith Manual · v6 · 2026