
LiveKit quickstart

Build a voice AI agent with Speechmatics STT and TTS using LiveKit Agents.

LiveKit Agents is a framework for building voice AI applications using WebRTC. With the Speechmatics plugin, you get accurate speech recognition and natural text-to-speech for your voice agents.

Features

  • Real-time transcription — Low-latency speech-to-text as users speak
  • Speaker diarization — Identify and track multiple speakers
  • Smart turn detection — Know when the user has finished speaking
  • Natural TTS voices — Choose from multiple voice options
  • Noise robustness — Accurate recognition in challenging audio environments
  • Global language support — Works with diverse accents and dialects
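
Of these features, turn detection is worth a closer look before writing any code. A naive baseline simply ends the turn after a fixed stretch of silence, which misfires on mid-sentence pauses; smart turn detection exists to avoid exactly that. The sketch below is an illustration of the naive baseline only, not the plugin's implementation, and the per-frame VAD flags and timeout values are hypothetical:

```python
# Illustrative only: a naive end-of-turn detector that ends the turn
# after a fixed stretch of silence. A pause longer than the timeout
# ends the turn even mid-sentence, the failure mode smart turn
# detection is designed to avoid.

def detect_turn_end(is_speech_frames, frame_ms=20, silence_timeout_ms=500):
    """Return the index of the frame where the turn ends, or None.

    is_speech_frames: per-frame VAD flags (True = speech detected).
    """
    needed = silence_timeout_ms // frame_ms  # consecutive silent frames required
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(is_speech_frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return i
    return None
```

With 20 ms frames and a 500 ms timeout, ten frames of speech followed by silence ends the turn 25 silent frames later, whether or not the speaker was actually finished.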

Prerequisites

  • A LiveKit Cloud account (or a self-hosted LiveKit server)
  • A Speechmatics API key
  • An OpenAI API key
  • Python and the uv package manager installed

Setup

This guide assumes LiveKit Cloud. If you want to self-host LiveKit instead, follow LiveKit's self-hosting guide and configure LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET for your deployment: https://docs.livekit.io/transport/self-hosting/

1. Create project

mkdir voice-agent && cd voice-agent

2. Install dependencies

uv init
uv add "livekit-agents[speechmatics,openai,silero]==1.4.2" python-dotenv

3. Install and authenticate the LiveKit CLI

Install the LiveKit CLI. For additional installation options, see the LiveKit CLI setup guide: https://docs.livekit.io/home/cli/cli-setup/

macOS:

brew install livekit-cli

Linux:

curl -sSL https://get.livekit.io/cli | bash

Windows:

winget install LiveKit.LiveKitCLI

Authenticate and link your LiveKit Cloud project:

lk cloud auth

4. Configure environment

Run the LiveKit CLI to write your LiveKit Cloud credentials to a .env.local file:

lk app env -w

Then add your Speechmatics and OpenAI API keys to the same file:

.env.local
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
SPEECHMATICS_API_KEY=your_speechmatics_key
OPENAI_API_KEY=your_openai_key
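
The agent reads these variables at startup, but a missing key typically surfaces only as a confusing authentication error at connect time. A small stdlib check can fail fast instead; the helper below is illustrative and not part of the LiveKit SDK, though the variable names match the file above:

```python
import os

REQUIRED_ENV = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "SPEECHMATICS_API_KEY",
    "OPENAI_API_KEY",
)

def missing_env_vars(required=REQUIRED_ENV):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]
```

Calling missing_env_vars() before starting the agent and aborting on a non-empty result gives a clearer error than a failed WebSocket handshake.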

5. Create your agent

Create a main.py file:

main.py
from dotenv import load_dotenv
from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import openai, silero, speechmatics
from livekit.plugins.speechmatics import TurnDetectionMode

load_dotenv(".env.local")


class VoiceAssistant(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant. Be concise and friendly."
        )


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    # Speech-to-Text: Speechmatics
    stt = speechmatics.STT(
        turn_detection_mode=TurnDetectionMode.SMART_TURN,
    )

    # Language Model: OpenAI
    llm = openai.LLM(model="gpt-4o-mini")

    # Text-to-Speech: Speechmatics
    tts = speechmatics.TTS()

    # Voice Activity Detection: Silero
    vad = silero.VAD.load()

    # Create and start session
    session = AgentSession(
        stt=stt,
        llm=llm,
        tts=tts,
        vad=vad,
    )

    await session.start(
        room=ctx.room,
        agent=VoiceAssistant(),
        room_input_options=RoomInputOptions(),
    )

    await session.generate_reply(
        instructions="Say a short hello and ask how you can help."
    )


if __name__ == "__main__":
    agents.cli.run_app(
        agents.WorkerOptions(entrypoint_fnc=entrypoint),
    )
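
Conceptually, each user turn flows through the session as STT, then LLM, then TTS, with VAD and turn detection deciding where a turn starts and ends. The toy sketch below shows only that order of hops; the plain functions stand in for the real components, which are asynchronous and stream partial results between stages:

```python
def run_turn(audio, stt, llm, tts):
    """One conversational turn: transcribe, think, speak.

    The real AgentSession streams partial results between stages;
    this sketch shows only the order of the hops.
    """
    transcript = stt(audio)       # speech-to-text: audio -> text
    reply_text = llm(transcript)  # language model: text -> text
    return tts(reply_text)        # text-to-speech: text -> audio

# Stand-in components for the sketch:
fake_stt = lambda audio: "hello agent"
fake_llm = lambda text: f"You said: {text}"
fake_tts = lambda text: f"<audio:{text}>"
```

Swapping any stage (a different LLM, a different TTS voice) changes only that constructor in main.py; the session wiring stays the same.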

6. Run your agent

Run your agent in dev mode to connect it to LiveKit and make it available to clients anywhere:

uv run python main.py dev

Open the LiveKit Agents Playground to test your agent.

Alternatively, run your agent in console mode to talk to it locally in your terminal:

uv run python main.py console

Next steps