Spatius does not stream a video of the Avatar. It takes the audio the Avatar should speak, turns that audio into motion data, and lets AvatarKit render the Avatar on the user’s device. The short version:
TTS audio -> Motion Server -> motion data -> AvatarKit -> Avatar moves on screen
The audio is usually TTS output from your agent response. It is not the user’s microphone audio unless your product intentionally makes the Avatar repeat or relay that audio.
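
To make the "data, not video" point concrete, here is a toy Python sketch of the pipeline above. It is not how Motion Server works: `MotionFrame`, `generate_motion`, the 16 kHz sample rate, and the loudness-based mouth value are all assumptions made up for illustration. The only thing it aims to show is that speech audio goes in and a small stream of timed motion values comes out.

```python
from dataclasses import dataclass


@dataclass
class MotionFrame:
    """Hypothetical motion sample; the real motion data format is not shown here."""
    time_ms: int       # offset from the start of the audio
    mouth_open: float  # 0.0 (closed) to 1.0 (fully open)


def generate_motion(pcm, sample_rate, frame_ms=40):
    """Toy stand-in for Motion Server: one frame per audio window, driven by peak loudness."""
    window = sample_rate * frame_ms // 1000  # samples covered by one motion frame
    frames = []
    for start in range(0, len(pcm), window):
        chunk = pcm[start:start + window]
        peak = max(abs(sample) for sample in chunk)
        frames.append(MotionFrame(start * 1000 // sample_rate, min(peak / 32767, 1.0)))
    return frames


# One second of 16-bit audio at an assumed 16 kHz: silence, then a loud half second.
audio = [0] * 8000 + [16000] * 8000
frames = generate_motion(audio, sample_rate=16000)
print(len(frames), "motion frames for", len(audio), "audio samples")
```

In this sketch the renderer would only need the audio plus a few dozen small frames per second, which is the shape of the hand-off to AvatarKit described above.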

What Happens

One Avatar response follows this sequence:
  1. Your app chooses an Avatar and mounts an AvatarView.
  2. Your app brings the connection online for the integration mode you use.
  3. The audio the Avatar should say reaches Motion Server.
  4. Motion Server generates motion data for that audio.
  5. AvatarKit plays the audio and renders the matching Avatar movement locally.
  6. State and error callbacks tell your app when the Avatar is idle, playing, paused, failed, or ready to retry.
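
Step 6 is where most app-side logic lives, so a small sketch may help. The Python below models the four states named above as a state machine that a UI layer could drive from callbacks. The class names, the `on_state` method, and the transition table are hypothetical and are not the AvatarKit callback API; in particular, treating "ready to retry" as a move from failed back to idle is an assumption.

```python
from enum import Enum


class AvatarState(Enum):
    IDLE = "idle"
    PLAYING = "playing"
    PAUSED = "paused"
    FAILED = "failed"


# Assumed transition table; the real SDK may allow a different set.
ALLOWED = {
    AvatarState.IDLE: {AvatarState.PLAYING, AvatarState.FAILED},
    AvatarState.PLAYING: {AvatarState.PAUSED, AvatarState.IDLE, AvatarState.FAILED},
    AvatarState.PAUSED: {AvatarState.PLAYING, AvatarState.IDLE, AvatarState.FAILED},
    AvatarState.FAILED: {AvatarState.IDLE},  # a successful retry returns to idle
}


class AvatarUi:
    """Tracks the current state and rejects transitions the table does not allow."""

    def __init__(self):
        self.state = AvatarState.IDLE
        self.history = [AvatarState.IDLE]

    def on_state(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"unexpected transition {self.state.value} -> {new_state.value}")
        self.state = new_state
        self.history.append(new_state)


ui = AvatarUi()
for state in (AvatarState.PLAYING, AvatarState.PAUSED, AvatarState.PLAYING, AvatarState.IDLE):
    ui.on_state(state)
print([s.value for s in ui.history])
```

Rejecting unexpected transitions loudly, rather than ignoring them, tends to surface integration mistakes such as a response that never returns to idle.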

What Each Component Does

| Component | Role |
| --- | --- |
| Your app | Chooses the Avatar, provides avatar speech audio, controls lifecycle, interrupts responses, and handles recovery. |
| Motion Server | Receives avatar speech audio and generates motion data. |
| AvatarKit | Loads avatar assets, receives audio and motion data, plays audio, and renders the Avatar locally. |

This boundary is the most important idea on the page: Motion Server sends data, not video. AvatarKit renders on the device.

Where Modes Differ

All modes use the same idea: audio in, motion data out, AvatarKit renders locally. The difference is who connects to Motion Server and how the output reaches AvatarKit.

| Mode | Motion Server connection | How AvatarKit gets data |
| --- | --- | --- |
| Basic Mode | AvatarKit connects from the client. | AvatarKit receives audio and motion data directly. |
| Custom Mode | Your backend connects through the Spatius Server SDK. | Your backend forwards encoded audio and motion messages to AvatarKit. |
| LiveKit Plugin | The agent worker starts the plugin. | Motion Server publishes audio and motion data into the LiveKit room; AvatarKit receives room data. |

Use the mode pages for implementation details. This page is only the mental model.
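
As an illustration of the Custom Mode row, the Python sketch below shows the forwarding role a backend plays: it receives already-encoded audio and motion messages and passes them to the client without decoding them. `Message`, `relay`, and the `"audio"` / `"motion"` labels are hypothetical names, not the Spatius Server SDK, and forwarding in arrival order is an assumption.

```python
from typing import Callable, Iterable, NamedTuple


class Message(NamedTuple):
    kind: str       # "audio" or "motion" (hypothetical labels)
    payload: bytes  # opaque, already-encoded bytes


def relay(messages: Iterable[Message], send_to_client: Callable[[Message], None]) -> int:
    """Forward every message unchanged, in arrival order, and return how many were sent."""
    count = 0
    for message in messages:
        send_to_client(message)
        count += 1
    return count


# Stand-ins for the Motion Server output and the client connection.
incoming = [Message("audio", b"\x01\x02"), Message("motion", b"\x10"), Message("audio", b"\x03")]
delivered = []
sent = relay(incoming, delivered.append)
print(sent, "messages forwarded")
```

The backend in this sketch never inspects a payload, which matches the mental model above: the client-side renderer is the only place where audio and motion data are interpreted.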

What Your App Still Owns

Spatius handles audio-to-motion and local rendering. Your product still decides:
  • Which Avatar ID to load.
  • Where avatar speech audio comes from.
  • When to initialize, connect, interrupt, retry, and clean up.
  • How to refresh tokens and recover from failed connections.
  • What UI to show for loading, idle, playing, paused, and failed states.
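
The token-refresh and recovery items above can be sketched as a single retry loop. This Python example assumes two hypothetical error types, `TokenExpired` and `ConnectionFailed`, and caller-supplied `connect` and `fetch_token` functions; none of these names come from the Spatius SDKs, and the backoff values are arbitrary.

```python
import time


class TokenExpired(Exception):
    """Hypothetical error: the session token is no longer valid."""


class ConnectionFailed(Exception):
    """Hypothetical error: the connection could not be established."""


def connect_with_recovery(connect, fetch_token, max_attempts=4, base_delay_s=0.5, sleep=time.sleep):
    """Try to connect, refreshing the token on expiry and backing off on failures."""
    token = fetch_token()
    for attempt in range(max_attempts):
        try:
            return connect(token)
        except TokenExpired:
            token = fetch_token()  # refresh, then retry without waiting
        except ConnectionFailed:
            if attempt < max_attempts - 1:
                sleep(base_delay_s * 2 ** attempt)  # exponential backoff
    raise ConnectionFailed(f"gave up after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. When the final attempt fails, the function raises, which is the point at which an app would show its failed state and offer a retry.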

Where to Go Next

| If you are thinking about… | Read |
| --- | --- |
| Which digital human appears on screen | Avatars |
| When to initialize, connect, and clean up | Sessions & Lifecycle |
| Audio format, motion data, response end, or interruption | Audio and Motion Data |
| UI state, errors, token expiry, or reconnects | State & Events |

Common Failure Paths

| Symptom | Likely cause | Read |
| --- | --- | --- |
| Avatar does not appear | Avatar ID, assets, or view mount problem | Avatars |
| Connection fails or token expires | Session Token or connection recovery problem | State & Events |
| Audio is silent, distorted, or out of sync | Wrong audio format or sample rate | Audio and Motion Data |
| Audio plays but the Avatar does not move | Motion data is not reaching AvatarKit | Audio and Motion Data |
| Playback never returns to idle | The response end was not marked correctly | Audio and Motion Data |
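
The last failure path is easier to reason about with a small model. The Python sketch below assumes that audio arrives in chunks and that the final chunk carries an end-of-response flag; `ResponsePlayback` and `is_last` are hypothetical names, and the actual way to mark a response end is covered in Audio and Motion Data. It shows why playback can stay in the playing state after every chunk has finished.

```python
class ResponsePlayback:
    """Toy playback tracker: returns to idle only after the end marker has been seen."""

    def __init__(self):
        self.pending_chunks = 0
        self.end_marked = False
        self.state = "idle"

    def on_chunk(self, is_last=False):
        self.pending_chunks += 1
        self.end_marked = self.end_marked or is_last
        self.state = "playing"

    def on_chunk_played(self):
        self.pending_chunks -= 1
        if self.pending_chunks == 0 and self.end_marked:
            self.state = "idle"
            self.end_marked = False  # ready for the next response


# Without an end marker, the tracker cannot tell a pause in the stream from the end.
stuck = ResponsePlayback()
stuck.on_chunk()
stuck.on_chunk_played()

# With the final chunk flagged, playback returns to idle once the queue drains.
done = ResponsePlayback()
done.on_chunk()
done.on_chunk(is_last=True)
done.on_chunk_played()
done.on_chunk_played()
print(stuck.state, done.state)
```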