A session goes through five stages, from SDK initialization to connection and interaction, followed by Cleanup when the view is destroyed. Each stage corresponds to an AvatarKit class or a connection prerequisite.

Stage 1: Initialize

Call AvatarSDK.initialize(appId, configuration) once when your app starts. This sets process-level configuration such as App ID, region, audio format, driving mode, and log level. AvatarKit must be initialized before you load an Avatar or create a view. Reference: Web | iOS | Android | Flutter
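One way to enforce the "call once" rule is to wrap the initialization call in a small guard. The sketch below is a generic helper, not part of AvatarKit; the name `once` and the stand-in initializer in the usage comment are assumptions made for illustration.

```typescript
// Hypothetical helper: wraps a function so its body runs at most once.
// Later calls return the result of the first successful call.
// If the first call throws, nothing is stored and the next call retries.
function once<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R {
  let state: { value: R } | null = null;
  return (...args: A): R => {
    if (state === null) {
      state = { value: fn(...args) };
    }
    return state.value;
  };
}

// Usage sketch (not executed here): wrap the real call at app startup, e.g.
//   const initAvatarKit = once((appId, configuration) =>
//     AvatarSDK.initialize(appId, configuration));
```

Calling the wrapped function from several entry points is then safe: only the first call reaches the SDK.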

Stage 2: Load Avatar

AvatarManager.load(id, onProgress?) downloads avatar assets by avatar-id and returns an Avatar instance. In-flight loads for the same ID are reused, and successful loads are cached.
| Method | Purpose |
| --- | --- |
| load(id, onProgress?) | Downloads and returns an Avatar instance. |
| clear(id) | Deletes the cache for a specific Avatar. |
| clearAll() | Deletes all Avatar caches. |
The onProgress callback reports one of three states: downloading, completed, or failed. The progress value ranges from 0 to 1. Reference: Web | iOS | Android | Flutter
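The reuse and caching behavior described above can be pictured with a generic loader. This is an illustrative sketch, not AvatarManager's implementation: `DedupingLoader`, `Fetcher`, and the `ProgressInfo` shape are hypothetical names, and the real callback signature may differ.

```typescript
// Hypothetical progress shape; only the three state names and the 0..1 range
// come from the documentation above.
type ProgressState = "downloading" | "completed" | "failed";
interface ProgressInfo {
  state: ProgressState;
  progress: number; // 0 to 1
}
type OnProgress = (info: ProgressInfo) => void;
type Fetcher<T> = (id: string, onProgress?: OnProgress) => Promise<T>;

class DedupingLoader<T> {
  private cache = new Map<string, T>();
  private inFlight = new Map<string, Promise<T>>();
  private fetcher: Fetcher<T>;

  constructor(fetcher: Fetcher<T>) {
    this.fetcher = fetcher;
  }

  load(id: string, onProgress?: OnProgress): Promise<T> {
    // 1. A successful earlier load is served from the cache.
    if (this.cache.has(id)) {
      return Promise.resolve(this.cache.get(id) as T);
    }
    // 2. A load already in progress for this ID is shared.
    //    In this sketch, the later caller's onProgress is not attached.
    const pending = this.inFlight.get(id);
    if (pending !== undefined) {
      return pending;
    }
    // 3. Otherwise start a new download and track it until it settles.
    const promise = Promise.resolve()
      .then(() => this.fetcher(id, onProgress))
      .then(
        (value) => {
          this.inFlight.delete(id);
          this.cache.set(id, value); // only successful loads are cached
          return value;
        },
        (err) => {
          this.inFlight.delete(id); // failures are not cached
          throw err;
        }
      );
    this.inFlight.set(id, promise);
    return promise;
  }

  clear(id: string): void {
    this.cache.delete(id);
  }

  clearAll(): void {
    this.cache.clear();
  }
}
```

With this shape, two screens that request the same avatar at the same time trigger a single download, and a failed download can simply be retried by calling load again.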

Stage 3: Render

Mount the loaded Avatar into an AvatarView. The view creates the render surface and automatically creates the associated AvatarController. Do not construct the controller directly. After mounting:
  • The Avatar can play idle animation.
  • Your app can access the AvatarController.
  • Once the connection path is online and motion data arrives, playback can start.
| Callback | When it fires |
| --- | --- |
| onFirstRendering | The first frame has been rendered. |
Reference: Web | iOS | Android | Flutter

Stage 4: Authentication

Authentication differs by integration path. Only Direct Mode requires the client to hold a Spatius Session Token; the other paths authenticate inside their own transport.
| Integration | What the client needs before connecting | Where credentials live |
| --- | --- | --- |
| Direct Mode | A valid Session Token set via AvatarSDK.setSessionToken() before controller.start(). Your backend exchanges your API Key for the Session Token via the Spatius Console API, then returns it to the client. | API Key on your backend; Session Token issued per session. |
| LiveKit Agents Integration | A LiveKit room token to join the room. No Spatius Session Token in the client. | API Key + App ID in the agent worker (livekit-plugins-spatius). |
| Agora Convo AI Integration | An Agora RTC token to join the channel. No Spatius Session Token in the client. | API Key + App ID in the Convo AI avatar provider block, or in the TEN graph when using spatius_avatar_python. |
| Backend Mode | Whatever the backend’s transport requires (e.g. its own auth token). No Spatius Session Token in the client. | API Key + App ID on the backend (used by the Server SDK to talk to Motion Server). |
The API Key must only ever live on your backend.
Direct Mode reconnect: if a sessionTokenExpired / sessionTokenInvalid error fires during reconnect, fetch a fresh Session Token from your backend and call AvatarSDK.setSessionToken() again before restarting the connection. RTC / Platform Integration / Backend Mode reconnects don’t go through this flow — the Spatius client never sees a Motion Server token in those paths.
For Direct Mode session-token issuance details, see Server API: Session Token Auth Flow.
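The Direct Mode reconnect flow can be sketched as a small handler. Only the two error names come from this page; `recoverFromTokenError`, `ReconnectDeps`, and the assumption that error codes arrive as strings are hypothetical, and the three callbacks are placeholders you would wire to your backend, AvatarSDK.setSessionToken(), and controller.start().

```typescript
// Error names taken from the documentation above.
const TOKEN_ERROR_CODES = new Set<string>(["sessionTokenExpired", "sessionTokenInvalid"]);

// Hypothetical dependency bundle, injected so the flow stays testable.
interface ReconnectDeps {
  // Calls YOUR backend, which holds the API Key and returns a fresh Session Token.
  fetchSessionToken: () => Promise<string>;
  // Expected to wrap AvatarSDK.setSessionToken(token).
  setSessionToken: (token: string) => void;
  // Expected to wrap controller.start().
  restart: () => void | Promise<void>;
}

// Returns true if the error was a token error and the connection was restarted,
// false if the error is unrelated and should be handled elsewhere.
async function recoverFromTokenError(errorCode: string, deps: ReconnectDeps): Promise<boolean> {
  if (!TOKEN_ERROR_CODES.has(errorCode)) {
    return false;
  }
  const token = await deps.fetchSessionToken(); // the API Key never leaves the backend
  deps.setSessionToken(token); // must happen before restarting
  await deps.restart();
  return true;
}
```

Keeping the token fetch behind a callback means the client only ever handles the short-lived Session Token.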

Stage 5: Connect & Interact

Use AvatarController to start connections, send and receive data, and control playback. Connection ownership differs by mode in this stage:
| Integration | Connection owner | What your app does |
| --- | --- | --- |
| Direct Mode | Client-side AvatarKit connects directly to Motion Server. | Set the Session Token, call controller.start(), then call send(audio, end) to send audio. |
| LiveKit Agents Integration | The agent worker starts livekit-plugins-spatius, and Motion Server pushes data into the LiveKit room. | The client joins the room, and AvatarKit automatically renders from the room. |
| Agora Convo AI Integration | The Convo AI avatar provider or TEN extension starts the Motion Server session, and Motion Server pushes data into the Agora channel. | The client joins the channel, and AvatarKit automatically renders from the channel. |
| Backend Mode | Backend Server SDK connects to Motion Server. | The backend starts the session and feeds encoded audio payloads and motion data payloads to the client through your custom transport. |
Do not treat controller.start() as a universal connection step. It is only the Direct Mode connection path. In Backend Mode and Platform Integrations, the Motion Server connection is established outside the client.
| Method | Purpose |
| --- | --- |
| start() | Starts the Motion Server connection in Direct Mode integrations. |
| send(audio, end) | Sends avatar speech audio in Direct Mode integrations. |
| yieldAudioData() / yieldFramesData() | Feeds encoded audio payloads and motion data payloads in Backend Mode integrations. |
| interrupt() | Interrupts the current response and clears buffers. |
| pause() / resume() | Pauses or resumes while keeping buffers. |
| close() | Closes the connection. |
For audio source and timing guidance, see Audio. For state observation, see State & Events. Reference: Web | iOS | Android | Flutter
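For Direct Mode, a common pattern is to send audio in chunks and flag the last one. The sketch below assumes that `end` marks the final chunk of an utterance and that audio is passed as bytes; both are assumptions, as are the names `AudioSender` and `sendInChunks`. Check the platform reference for the actual audio type and the meaning of `end`.

```typescript
// Hypothetical minimal view of the controller: only send(audio, end) is used.
interface AudioSender {
  send(audio: Uint8Array, end: boolean): void;
}

// Splits a buffer into fixed-size chunks and sends them in order.
// The last chunk is sent with end = true (assumed to mean "end of utterance").
// Returns the number of chunks sent; an empty buffer sends nothing.
function sendInChunks(sender: AudioSender, audio: Uint8Array, chunkBytes: number): number {
  if (!Number.isInteger(chunkBytes) || chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be a positive integer");
  }
  let sent = 0;
  for (let offset = 0; offset < audio.length; offset += chunkBytes) {
    const stop = Math.min(offset + chunkBytes, audio.length);
    const isLast = stop === audio.length;
    // subarray creates a view over the same memory, so no bytes are copied.
    sender.send(audio.subarray(offset, stop), isLast);
    sent++;
  }
  return sent;
}
```

In Backend Mode and the platform integrations this helper does not apply, because the client does not send audio to Motion Server itself.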

Cleanup

Release resources when the Avatar is no longer used.
avatarView.dispose()
Web and Android must explicitly call dispose(). iOS cleans up automatically when the view is released.
Reference: Web AvatarView | iOS AvatarView | Android AvatarView | Flutter