How Spatius Works

Spatius does not stream a video of the Avatar. It takes the audio the Avatar should speak, turns that audio into motion data, and lets AvatarKit render the Avatar on the user’s device. The short version:

TTS audio -> Motion Server -> motion data -> AvatarKit -> Avatar moves on screen

The audio is usually TTS output from your agent response. It is not the user’s microphone audio unless your product intentionally makes the Avatar repeat or relay that audio.

What Happens

One Avatar response follows this sequence:

Your app chooses an Avatar and mounts an AvatarView.
Your app brings the connection online for the integration mode you use.
The audio the Avatar should say reaches Motion Server.
Motion Server generates motion data for that audio.
AvatarKit plays the audio and renders the matching Avatar movement locally.
State and error callbacks tell your app when the Avatar is idle, playing, paused, failed, or ready to retry.

What Each Component Does

Component	Role
Your app	Chooses the Avatar, provides avatar speech audio, controls lifecycle, interrupts responses, and handles recovery.
Motion Server	Receives avatar speech audio and generates motion data.
AvatarKit	Loads avatar assets, receives audio and motion data, plays audio, and renders the Avatar locally.

This boundary is the most important idea on the page: Motion Server sends data, not video. AvatarKit renders on the device.

Where Modes Differ

All modes use the same idea: audio in, motion data out, AvatarKit renders locally. The difference is who connects to Motion Server and how the output reaches AvatarKit.

Mode	Motion Server connection	How AvatarKit gets data
Basic Mode	AvatarKit connects from the client.	AvatarKit receives audio and motion data directly.
Custom Mode	Your backend connects through the Spatius Server SDK.	Your backend forwards encoded audio and motion messages to AvatarKit.
LiveKit Plugin	The agent worker starts the plugin.	Motion Server publishes audio and motion data into the LiveKit room; AvatarKit receives room data.

Use the mode pages for implementation details. This page is only the mental model.

What Your App Still Owns

Spatius handles audio-to-motion and local rendering. Your product still decides:

Which Avatar ID to load.
Where avatar speech audio comes from.
When to initialize, connect, interrupt, retry, and clean up.
How to refresh tokens and recover from failed connections.
What UI to show for loading, idle, playing, paused, and failed states.

Where to Go Next

If you are thinking about…	Read
Which digital human appears on screen	Avatars
When to initialize, connect, and clean up	Sessions & Lifecycle
Audio format, motion data, response end, or interruption	Audio and Motion Data
UI state, errors, token expiry, or reconnects	State & Events

Common Failure Paths

Symptom	Likely cause	Read
Avatar does not appear	Avatar ID, assets, or view mount problem	Avatars
Connection fails or token expires	Session Token or connection recovery problem	State & Events
Audio is silent, distorted, or out of sync	Wrong audio format or sample rate	Audio and Motion Data
Audio plays but the Avatar does not move	Motion data is not reaching AvatarKit	Audio and Motion Data
Playback never returns to idle	The response end was not marked correctly	Audio and Motion Data

Getting Started

Concepts

LiveKit Plugin

Basic Mode

Custom Mode

Resources

How Spatius Works

What Happens

What Each Component Does

Where Modes Differ

What Your App Still Owns

Where to Go Next

Common Failure Paths

Getting Started

Concepts

LiveKit Plugin

Basic Mode

Custom Mode

Resources

Documentation Index

​What Happens

​What Each Component Does

​Where Modes Differ

​What Your App Still Owns

​Where to Go Next

​Common Failure Paths

What Happens

What Each Component Does

Where Modes Differ

What Your App Still Owns

Where to Go Next

Common Failure Paths