Generating speech

AgentMark generates speech audio with prompts that declare speech_config in frontmatter. The text to speak goes in a <SpeechPrompt> tag.

Example configuration

example.prompt.mdx

---
name: speech
speech_config:
  model_name: tts-1-hd
  voice: "nova"
  speed: 1.0
  output_format: "mp3"
---

<System>
Please read this text aloud.
</System>

<SpeechPrompt>
This is a test for the speech prompt to be spoken aloud.
</SpeechPrompt>

Tags

Tag	Description
`<SpeechPrompt>`	The text to convert to speech. AgentMark reads the tag's contents at compile time and sends them to the TTS model.
`<System>`	Optional system-level instructions passed to models that support them.

Available configuration

Property	Type	Description	Required
`model_name`	`string`	The name of the model to use for speech generation.	Yes
`voice`	`string`	Voice identifier (provider-specific; e.g. for OpenAI TTS: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`).	No
`output_format`	`string`	Audio output format (e.g., `mp3`, `opus`, `aac`, `flac`).	No
`instructions`	`string`	Additional instructions for speech generation (provider-specific).	No
`speed`	`number`	Playback speed multiplier.	No
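
The table above can be read as a small schema: one required string and four optional fields. The following is a minimal sketch of a check against that schema. The helper name `validateSpeechConfig` is hypothetical and is not part of AgentMark. The sketch only checks presence and basic types; it does not check provider-specific values such as valid voice names.

```typescript
// Hypothetical helper for illustration only; not an AgentMark API.
// Returns a list of problems. An empty list means the object matches the table.
function validateSpeechConfig(raw: Record<string, unknown>): string[] {
  const errors: string[] = [];

  // model_name is the only required property.
  const modelName = raw["model_name"];
  if (typeof modelName !== "string" || modelName.length === 0) {
    errors.push("model_name is required and must be a non-empty string");
  }

  // voice, output_format and instructions are optional strings.
  for (const key of ["voice", "output_format", "instructions"]) {
    const value = raw[key];
    if (value !== undefined && typeof value !== "string") {
      errors.push(`${key} must be a string`);
    }
  }

  // speed is an optional number.
  const speed = raw["speed"];
  if (speed !== undefined && (typeof speed !== "number" || Number.isNaN(speed))) {
    errors.push("speed must be a number");
  }

  return errors;
}

// The frontmatter from the example configuration, written as a plain object.
const exampleConfig = {
  model_name: "tts-1-hd",
  voice: "nova",
  speed: 1.0,
  output_format: "mp3",
};
console.log(validateSpeechConfig(exampleConfig)); // prints []
```

A check like this could run before a prompt is executed, so that a missing `model_name` is reported as a configuration problem and not as a provider error.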

Running a speech prompt

See Running prompts → Speech generation for the SDK code pattern, which uses the Vercel AI SDK’s experimental_generateSpeech. The experimental_ prefix comes from the upstream SDK, so the API may change.
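
As a rough illustration of how the frontmatter fields line up with the call options of `experimental_generateSpeech`, the sketch below maps a `speech_config` object and the `<SpeechPrompt>` text onto the option names `text`, `voice`, `outputFormat`, `instructions`, and `speed`. The `SpeechConfig` type and the `toSpeechCallOptions` helper are hypothetical and are not AgentMark exports. The linked page remains the reference for the supported pattern. The `model` option is left out on purpose, because resolving `model_name` to a provider model depends on the adapter in use.

```typescript
// Hypothetical shape mirroring the speech_config table; not an AgentMark export.
type SpeechConfig = {
  model_name: string;
  voice?: string;
  output_format?: string;
  instructions?: string;
  speed?: number;
};

// Subset of the options passed to experimental_generateSpeech (model omitted).
type SpeechCallOptions = {
  text: string;
  voice?: string;
  outputFormat?: string;
  instructions?: string;
  speed?: number;
};

// Hypothetical mapping helper. Unset optional fields are left out entirely,
// so the provider's own defaults apply.
function toSpeechCallOptions(config: SpeechConfig, text: string): SpeechCallOptions {
  // Assumption: surrounding whitespace from the tag body is not meaningful.
  const options: SpeechCallOptions = { text: text.trim() };
  if (config.voice !== undefined) options.voice = config.voice;
  if (config.output_format !== undefined) options.outputFormat = config.output_format;
  if (config.instructions !== undefined) options.instructions = config.instructions;
  if (config.speed !== undefined) options.speed = config.speed;
  return options;
}

const callOptions = toSpeechCallOptions(
  { model_name: "tts-1-hd", voice: "nova", speed: 1.0, output_format: "mp3" },
  "This is a test for the speech prompt to be spoken aloud.",
);
console.log(callOptions);
```

With the AI SDK and a provider package installed, the result could be spread into the call, for example `generateSpeech({ model: openai.speech("tts-1-hd"), ...callOptions })`, assuming the OpenAI provider is the one in use.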

Have Questions?

We’re here to help! Choose the best way to reach us:

Email us at hello@agentmark.co for support
Schedule an Enterprise Demo to learn about our business solutions
