
What is Backend Mode Integration?

Backend Mode Integration is a Standalone Integration centered on the Server SDK: your backend connects to Motion Server, sends avatar speech audio, receives encoded audio payloads and motion data payloads, then delivers those payloads to AvatarKit clients. Use this path when your backend owns ASR, LLM, TTS, turn-taking, provider keys, and the connection to Motion Server.

At a glance

| Dimension | Backend Mode Integration |
| --- | --- |
| Dev effort | 🔴 High |
| Latency profile | ⚙️ Low and tunable; your backend controls chunking, buffering, and downstream delivery. |
| You build | 🖥️ Server SDK session, 🎙️ audio pipeline, 🔀 transport, 🧩 client message feed, recovery, and observability. |
| You do not build | 🚫 Avatar rendering or motion generation. AvatarKit renders locally; Motion Server generates motion data. |
| Best first demo | 🧪 Backend Mode demos after you understand the Server SDK role. |
| Client support | 🌐 Web / 🍎 iOS / 🤖 Android / 📱 Flutter |
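
Because your backend controls chunking, one of the first latency knobs is how much audio you hand off at a time. The sketch below slices raw PCM into fixed-duration chunks. It is illustrative only: the 16 kHz, 16-bit mono format and the 100 ms chunk size are assumptions, not values documented for the Server SDK, and `chunk_pcm` is a hypothetical helper rather than an SDK function.

```python
from typing import Iterator


def chunk_pcm(pcm: bytes, sample_rate: int, chunk_ms: int,
              bytes_per_sample: int = 2, channels: int = 1) -> Iterator[bytes]:
    """Yield fixed-duration slices of raw PCM audio.

    The final chunk may be shorter than the others.
    """
    frame_bytes = bytes_per_sample * channels
    chunk_bytes = sample_rate * chunk_ms // 1000 * frame_bytes
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silent 16 kHz, 16-bit mono audio (an assumed format).
one_second = bytes(16000 * 2)
chunks = list(chunk_pcm(one_second, sample_rate=16000, chunk_ms=100))
```

Smaller chunks tend to lower time-to-first-motion at the cost of more messages; larger chunks do the opposite. Measure both ends before settling on a value.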

Core backend flow

The Server SDK is the required piece in this path. The client does not connect to Motion Server directly, and it usually does not hold a Spatius Session Token. Instead, the client receives encoded audio payloads and motion data payloads from your backend and feeds them into AvatarKit with yieldAudioData() and yieldFramesData().
Do not confuse a Backend Mode server with a Direct Mode token endpoint. A Direct Mode token endpoint only mints short-lived Session Tokens. A Backend Mode server is part of the runtime: it owns the audio pipeline, connects to Motion Server with the Server SDK, and transports encoded audio payloads and motion data payloads to clients.
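
The client has to know which of the two payload types it just received so it can call the right feed method. One way to do that, sketched below, is a small binary envelope with a type tag and a length prefix. Everything here is an assumption made for illustration: the tag values, the header layout, and the `AvatarSink` interface are hypothetical, and the Python method names only mirror `yieldAudioData()` and `yieldFramesData()`, whose real signatures are defined by the client SDK.

```python
import struct
from typing import List, Protocol, Tuple

AUDIO = 1   # hypothetical tag for an encoded audio payload
MOTION = 2  # hypothetical tag for a motion data payload

_HEADER = struct.Struct("!BI")  # 1-byte kind, 4-byte big-endian payload length


def encode_message(kind: int, payload: bytes) -> bytes:
    """Backend side: wrap a payload before sending it downstream."""
    if kind not in (AUDIO, MOTION):
        raise ValueError(f"unknown kind: {kind}")
    return _HEADER.pack(kind, len(payload)) + payload


def decode_message(data: bytes) -> Tuple[int, bytes]:
    """Client side: split a received message into kind and payload."""
    kind, length = _HEADER.unpack_from(data)
    payload = data[_HEADER.size:_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return kind, payload


class AvatarSink(Protocol):
    """Hypothetical stand-in for the client SDK's feed methods."""
    def yield_audio_data(self, payload: bytes) -> None: ...
    def yield_frames_data(self, payload: bytes) -> None: ...


def route(data: bytes, sink: AvatarSink) -> None:
    """Dispatch one received message to the matching feed method."""
    kind, payload = decode_message(data)
    if kind == AUDIO:
        sink.yield_audio_data(payload)
    elif kind == MOTION:
        sink.yield_frames_data(payload)
    else:
        raise ValueError(f"unknown kind: {kind}")


class RecordingSink:
    """Test double that records which feed method was called."""
    def __init__(self) -> None:
        self.calls: List[Tuple[str, bytes]] = []

    def yield_audio_data(self, payload: bytes) -> None:
        self.calls.append(("audio", payload))

    def yield_frames_data(self, payload: bytes) -> None:
        self.calls.append(("motion", payload))


sink = RecordingSink()
route(encode_message(AUDIO, b"aud"), sink)
route(encode_message(MOTION, b"mot"), sink)
```

In a real client the routing function would live in the Web, iOS, Android, or Flutter app; Python is used here only to keep the framing logic easy to read and test.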

SDK roles

| SDK surface | Runs in | What it does |
| --- | --- | --- |
| Server SDK | Your backend | Authenticates with Spatius, connects to Motion Server, sends avatar speech audio, and receives encoded audio payloads and motion data payloads. |
| Client SDK | Web, iOS, Android, or Flutter client | Receives encoded audio payloads and motion data payloads from your backend and renders them locally. |

Requirements

| Requirement | Why it matters |
| --- | --- |
| Server SDK | Required. Your backend uses it to authenticate, connect to Motion Server, and provide encoded audio payloads and motion data payloads for clients. |
| API Key | Stored only on your backend. Used by the Server SDK, never by client apps. |
| App ID and Avatar ID | Identify the app and avatar your clients render. |
| Downstream transport | Your backend must deliver encoded audio payloads and motion data payloads to clients. This can be your own WebSocket, LiveKit, or another supported transport. |
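
The split between backend-only and client-visible values can be enforced in code. The following is a minimal sketch, assuming the three values arrive as environment variables; the variable names, the `BackendConfig` class, and the `appId`/`avatarId` keys are hypothetical and not part of any Spatius SDK.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping

# Hypothetical environment variable names; use whatever your deployment defines.
REQUIRED_VARS = ("SPATIUS_API_KEY", "SPATIUS_APP_ID", "SPATIUS_AVATAR_ID")


@dataclass(frozen=True)
class BackendConfig:
    api_key: str = field(repr=False)  # keep the secret out of logs and reprs
    app_id: str
    avatar_id: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "BackendConfig":
        """Load all required values, failing fast if any are missing."""
        missing = [name for name in REQUIRED_VARS if not env.get(name)]
        if missing:
            raise RuntimeError(f"missing configuration: {', '.join(missing)}")
        return cls(
            api_key=env["SPATIUS_API_KEY"],
            app_id=env["SPATIUS_APP_ID"],
            avatar_id=env["SPATIUS_AVATAR_ID"],
        )

    def client_bootstrap(self) -> Dict[str, str]:
        """Values that are safe to send to a client. The API key is excluded."""
        return {"appId": self.app_id, "avatarId": self.avatar_id}


config = BackendConfig.from_env({
    "SPATIUS_API_KEY": "secret-value",
    "SPATIUS_APP_ID": "app-123",
    "SPATIUS_AVATAR_ID": "avatar-456",
})
bootstrap = config.client_bootstrap()
```

Building the client payload from an explicit allow-list, rather than serializing the whole config, makes it harder to leak the API Key by accident.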

Transport options

Transport is a second decision inside Backend Mode Integration. LiveKit is optional.
| Transport | Use when | Guide |
| --- | --- | --- |
| Your own WebSocket | You want the simplest Backend Mode shape: your backend relays encoded audio payloads and motion data payloads directly to clients. This is the reference demo path. | Your own transport |
| LiveKit as downstream transport | You already want LiveKit between your backend and clients, but you are not using LiveKit Agents as the agent framework. | Backend Mode with LiveKit |
| Agora as downstream transport | You already want Agora between your backend and clients, but you are not using Agora Convo AI as the agent platform. | Python SDK Agora Egress Mode |
Backend Mode does not require LiveKit or Agora. The required integration point is the Server SDK on your backend. RTC providers are downstream transport options after your backend receives data from Motion Server.

Architecture variants

Your own transport

Your backend receives encoded audio payloads and motion data payloads from the Server SDK and relays them to clients over your own WebSocket or network layer.
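
A relay of this kind usually needs a policy for clients that cannot keep up. The sketch below shows one possible shape using only `asyncio`: each client gets a bounded queue, and a client whose queue fills up is disconnected rather than allowed to stall the others. The `Relay` class and its drop policy are assumptions for illustration; a real deployment would drain each queue into an actual WebSocket connection.

```python
import asyncio
from typing import Dict, List


class Relay:
    """Fan out payloads from one upstream session to many client queues."""

    def __init__(self, max_pending: int = 64) -> None:
        self._max_pending = max_pending
        self._clients: Dict[str, asyncio.Queue] = {}

    def connect(self, client_id: str) -> asyncio.Queue:
        """Register a client and return the queue its sender task drains."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_pending)
        self._clients[client_id] = queue
        return queue

    def disconnect(self, client_id: str) -> None:
        self._clients.pop(client_id, None)

    def publish(self, message: bytes) -> List[str]:
        """Queue a message for every client; return ids dropped as too slow."""
        dropped: List[str] = []
        for client_id, queue in list(self._clients.items()):
            try:
                queue.put_nowait(message)
            except asyncio.QueueFull:
                dropped.append(client_id)
        for client_id in dropped:
            self.disconnect(client_id)
        return dropped


async def demo() -> dict:
    relay = Relay(max_pending=2)
    fast = relay.connect("fast")
    relay.connect("slow")  # this client never drains its queue
    dropped: List[str] = []
    for i in range(3):
        dropped += relay.publish(bytes([i]))
        await fast.get()  # the fast client consumes each message promptly
    return {"dropped": dropped, "fast_empty": fast.empty()}


result = asyncio.run(demo())
```

Dropping a slow client is only one option. For realtime playback, stale audio and motion data are rarely useful, so disconnecting and letting the client reconnect is often simpler than unbounded buffering.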

Third-party realtime transport

Your backend still uses the Server SDK. The RTC provider only carries the encoded audio payloads and motion data payloads downstream, between your backend and your clients.
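
Since only the downstream leg changes between the two variants, it can help to hide that leg behind a small interface. The sketch below is one way to express this, with all names hypothetical: `DownstreamTransport`, `RecordingTransport`, and the `("audio" | "motion", bytes)` payload shape are illustrative and do not come from the Server SDK, LiveKit, or Agora.

```python
from typing import Iterable, List, Protocol, Tuple

Payload = Tuple[str, bytes]  # ("audio" | "motion", raw bytes); a hypothetical shape


class DownstreamTransport(Protocol):
    """Anything that can carry a payload from the backend to clients."""
    def send(self, kind: str, payload: bytes) -> None: ...


class RecordingTransport:
    """Stand-in for a WebSocket or RTC data channel; records what was sent."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.sent: List[Payload] = []

    def send(self, kind: str, payload: bytes) -> None:
        self.sent.append((kind, payload))


def forward(payloads: Iterable[Payload], transport: DownstreamTransport) -> int:
    """Forward session output in order; return how many payloads were sent."""
    count = 0
    for kind, payload in payloads:
        transport.send(kind, payload)
        count += 1
    return count


# The same upstream output is forwarded unchanged over either transport.
session_output = [("audio", b"a0"), ("motion", b"m0"), ("audio", b"a1")]
websocket_like = RecordingTransport("websocket")
rtc_like = RecordingTransport("rtc")
forward(session_output, websocket_like)
forward(session_output, rtc_like)
```

With this separation, moving from your own WebSocket to an RTC provider means writing a new transport class while the code that talks to Motion Server stays the same.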

Next steps

Server SDK role

Backend SDK responsibility and language references.

Client SDK role

Client-side rendering responsibility.

Browse all demos

Run reference implementations for Web, iOS, Android, and Flutter.

Your own transport

Direct WebSocket relay between your backend and clients.