@capawesome-team/capacitor-speech-synthesis¶

Capacitor plugin for synthesizing speech from text (also known as text-to-speech) with advanced features like voice selection, pitch, and rate control.


Features¶

We are proud to offer one of the most complete and feature-rich Capacitor plugins for speech synthesis. Here are some of the key features:

🖥️ Cross-platform: Supports Android, iOS and Web.
🌐 Multiple Languages: Supports many different languages.
🗣️ Multiple Voices: Supports multiple voices for each language.
🎚️ Customization: Customize the pitch, rate, volume and voice of the speech.
🎧 Background Audio: Synthesize speech from text while your application runs in the background.
📜 Queue Strategy: Add an utterance to the end of the queue, or flush the queue first.
🔊 Events: Listen for events like boundary, end, error and start.
🤝 Compatibility: Compatible with the Audio Recorder, Speech Recognition and Native Audio plugins.
⚔️ Battle-Tested: Used in more than 50 projects.
📦 SPM: Supports Swift Package Manager for iOS.
🔁 Up-to-date: Always supports the latest Capacitor version.
⭐️ Support: Priority support from the Capawesome Team.

Missing a feature? Just open an issue and we'll take a look!

Compatibility¶

Plugin Version	Capacitor Version	Status
7.x.x	>=7.x.x	Active support
6.x.x	6.x.x	Deprecated

Installation¶

This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:

npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>

Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.

Next, install the package:

npm install @capawesome-team/capacitor-speech-synthesis
npx cap sync

Android¶

Proguard¶

If you are using Proguard, you need to add the following rules to your proguard-rules.pro file:

-keep class io.capawesome.capacitorjs.plugins.** { *; }

Configuration¶

No configuration required for this plugin.

Usage¶

import { SpeechSynthesis, AudioSessionCategory, QueueStrategy } from '@capawesome-team/capacitor-speech-synthesis';

const activateAudioSession = async () => {
  await SpeechSynthesis.activateAudioSession({ category: AudioSessionCategory.Ambient });
};

const cancel = async () => {
  await SpeechSynthesis.cancel();
};

const deactivateAudioSession = async () => {
  await SpeechSynthesis.deactivateAudioSession();
};

const getLanguages = async () => {
  const result = await SpeechSynthesis.getLanguages();
  return result.languages;
};

const getVoices = async () => {
  const result = await SpeechSynthesis.getVoices();
  return result.voices;
};

const initialize = async () => {
  await SpeechSynthesis.initialize();
};

const isAvailable = async () => {
  const result = await SpeechSynthesis.isAvailable();
  return result.isAvailable;
};

const isSpeaking = async () => {
  const result = await SpeechSynthesis.isSpeaking();
  return result.isSpeaking;
};

const isLanguageAvailable = async () => {
  const result = await SpeechSynthesis.isLanguageAvailable({ language: 'en-US' });
  return result.isAvailable;
};

const isVoiceAvailable = async () => {
  const result = await SpeechSynthesis.isVoiceAvailable({ voiceId: 'com.apple.ttsbundle.Samantha-compact' });
  return result.isAvailable;
};

const speak = async () => {
  // Add an utterance to the utterance queue to be spoken
  const { utteranceId } = await SpeechSynthesis.speak({
    language: 'en-US',
    pitch: 1.0,
    queueStrategy: QueueStrategy.Add,
    rate: 1.0,
    text: 'Hello, World!',
    voiceId: 'com.apple.ttsbundle.Samantha-compact',
    volume: 1.0,
  });
  // Wait for the utterance to finish, then remove the listener again
  await new Promise<void>(resolve => {
    const handle = SpeechSynthesis.addListener('end', event => {
      if (event.utteranceId === utteranceId) {
        void handle.then(listener => listener.remove());
        resolve();
      }
    });
  });
};

const synthesizeToFile = async () => {
  // Add an utterance to the utterance queue to be synthesized to a file
  const { path, utteranceId } = await SpeechSynthesis.synthesizeToFile({
    language: 'en-US',
    pitch: 1.0,
    queueStrategy: QueueStrategy.Add,
    rate: 1.0,
    text: 'Hello, World!',
    voiceId: 'com.apple.ttsbundle.Samantha-compact',
    volume: 1.0,
  });
  // Wait for the utterance to finish, then remove the listener again
  await new Promise<void>(resolve => {
    const handle = SpeechSynthesis.addListener('end', event => {
      if (event.utteranceId === utteranceId) {
        void handle.then(listener => listener.remove());
        resolve();
      }
    });
  });
  // Return the path to the synthesized audio file
  return path;
};


const addListeners = () => {
  SpeechSynthesis.addListener('boundary', (event) => {
    console.log('boundary', event);
  });

  SpeechSynthesis.addListener('end', (event) => {
    console.log('end', event);
  });

  SpeechSynthesis.addListener('error', (event) => {
    console.log('error', event);
  });

  SpeechSynthesis.addListener('start', (event) => {
    console.log('start', event);
  });
};

const removeAllListeners = async () => {
  await SpeechSynthesis.removeAllListeners();
};

API¶

activateAudioSession(...)
cancel()
deactivateAudioSession()
getLanguages()
getVoices()
initialize()
isAvailable()
isSpeaking()
isLanguageAvailable(...)
isVoiceAvailable(...)
speak(...)
synthesizeToFile(...)
addListener('boundary', ...)
addListener('end', ...)
addListener('error', ...)
addListener('start', ...)
removeAllListeners()
Interfaces
Type Aliases
Enums

activateAudioSession(...)¶

activateAudioSession(options: ActivateAudioSessionOptions) => Promise<void>

Activate the audio session. Calling this method is optional; use it to set the audio session category before speaking.

Only available on iOS.

Param	Type
`options`	`ActivateAudioSessionOptions`

Since: 6.0.0

cancel()¶

cancel() => Promise<void>

Remove all utterances from the utterance queue.

Since: 6.0.0

deactivateAudioSession()¶

deactivateAudioSession() => Promise<void>

Deactivate the audio session.

Only available on iOS.

Since: 6.0.0

getLanguages()¶

getLanguages() => Promise<GetLanguagesResult>

Get the available languages for speech synthesis.

Returns: Promise<GetLanguagesResult>

Since: 6.0.0

getVoices()¶

getVoices() => Promise<GetVoicesResult>

Get the available voices for speech synthesis.

Returns: Promise<GetVoicesResult>

Since: 6.0.0

initialize()¶

initialize() => Promise<void>

Initialize the plugin. This method must be called before any other method.

Only available on Android and iOS.

Since: 6.0.0
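
Because `initialize()` has to run before any other method, it can help to funnel every call through a single guard. The following is a minimal sketch, not part of the plugin: `Initializable` and `createInitializer` are hypothetical names, and the guard accepts any object with an `initialize()` method so that the real `SpeechSynthesis` object can be passed in. Platform handling (the method is documented for Android and iOS only) is left to the caller.

```typescript
// Minimal shape needed by the guard; in an app, pass the real `SpeechSynthesis` object.
interface Initializable {
  initialize(): Promise<void>;
}

// Hypothetical helper: returns a function that calls initialize() at most once
// and hands the same promise to every caller.
function createInitializer(plugin: Initializable): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (pending === undefined) {
      pending = plugin.initialize().catch((error: unknown) => {
        // Forget the failed attempt so that a later call can retry.
        pending = undefined;
        throw error;
      });
    }
    return pending;
  };
}
```

A caller would then `await ensureInitialized()` (the function returned by `createInitializer`) before calling `speak()` or `getVoices()`.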

isAvailable()¶

isAvailable() => Promise<IsAvailableResult>

Check if speech synthesis is available on the current device.

Returns: Promise<IsAvailableResult>

Since: 6.0.0

isSpeaking()¶

isSpeaking() => Promise<IsSpeakingResult>

Check if speech synthesis is currently speaking.

Returns: Promise<IsSpeakingResult>

Since: 6.0.0

isLanguageAvailable(...)¶

isLanguageAvailable(options: IsLanguageAvailableOption) => Promise<IsLanguageAvailableResult>

Check if a language is available for speech synthesis.

Param	Type
`options`	`IsLanguageAvailableOption`

Returns: Promise<IsLanguageAvailableResult>

Since: 6.0.0

isVoiceAvailable(...)¶

isVoiceAvailable(options: IsVoiceAvailableOption) => Promise<IsVoiceAvailableResult>

Check if a voice is available for speech synthesis.

Param	Type
`options`	`IsVoiceAvailableOption`

Returns: Promise<IsVoiceAvailableResult>

Since: 6.0.0

speak(...)¶

speak(options: SpeakOptions) => Promise<SpeakResult>

Add an utterance to the utterance queue to be spoken.

The end event will be emitted when the utterance has finished.

Param	Type
`options`	`SpeakOptions`

Returns: Promise<SpeakResult>

Since: 6.0.0
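
Since `speak()` returns an `utteranceId` and the `start`, `end` and `error` events carry the same identifier, an app can track each utterance separately. The sketch below is one possible way to do that. `UtteranceTracker` and `UtteranceEventLike` are hypothetical names, not part of the plugin, and how `end` and `error` relate to each other for the same utterance is an assumption.

```typescript
// Local stand-in for the documented start, end and error event payloads.
interface UtteranceEventLike {
  utteranceId: string;
}

type UtteranceState = 'queued' | 'speaking' | 'finished' | 'failed';

// Hypothetical helper that records the last known state per utterance.
class UtteranceTracker {
  private readonly states = new Map<string, UtteranceState>();

  // Call with the utteranceId returned by speak().
  markQueued(utteranceId: string): void {
    // Keep a state that an early event may already have recorded.
    if (!this.states.has(utteranceId)) {
      this.states.set(utteranceId, 'queued');
    }
  }

  // Wire the next three methods to the 'start', 'end' and 'error' listeners.
  onStart(event: UtteranceEventLike): void {
    this.states.set(event.utteranceId, 'speaking');
  }

  onEnd(event: UtteranceEventLike): void {
    // Assumption: a failure should stay visible even if 'end' follows 'error'.
    if (this.states.get(event.utteranceId) !== 'failed') {
      this.states.set(event.utteranceId, 'finished');
    }
  }

  onError(event: UtteranceEventLike): void {
    this.states.set(event.utteranceId, 'failed');
  }

  stateOf(utteranceId: string): UtteranceState | undefined {
    return this.states.get(utteranceId);
  }

  // Number of utterances that are queued or currently being spoken.
  pendingCount(): number {
    let count = 0;
    for (const state of this.states.values()) {
      if (state === 'queued' || state === 'speaking') {
        count += 1;
      }
    }
    return count;
  }
}
```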

synthesizeToFile(...)¶

synthesizeToFile(options: SynthesizeToFileOptions) => Promise<SynthesizeToFileResult>

Add an utterance to the utterance queue to be synthesized to a file.

The end event will be emitted when the utterance has finished.

Only available on Android and iOS.

Param	Type
`options`	`SynthesizeToFileOptions`

Returns: Promise<SynthesizeToFileResult>

Since: 7.1.0

addListener('boundary', ...)¶

addListener(eventName: 'boundary', listenerFunc: (event: BoundaryEvent) => void) => Promise<PluginListenerHandle>

Called when the spoken utterance reaches a word boundary.

Param	Type
`eventName`	`'boundary'`
`listenerFunc`	`(event: BoundaryEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('end', ...)¶

addListener(eventName: 'end', listenerFunc: (event: EndEvent) => void) => Promise<PluginListenerHandle>

Called when the spoken utterance has finished.

Param	Type
`eventName`	`'end'`
`listenerFunc`	`(event: EndEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('error', ...)¶

addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>

Called when an error occurs during speech synthesis.

Param	Type
`eventName`	`'error'`
`listenerFunc`	`(event: ErrorEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('start', ...)¶

addListener(eventName: 'start', listenerFunc: (event: StartEvent) => void) => Promise<PluginListenerHandle>

Called when the spoken utterance has started.

Param	Type
`eventName`	`'start'`
`listenerFunc`	`(event: StartEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

removeAllListeners()¶

removeAllListeners() => Promise<void>

Remove all listeners for the plugin.

Since: 6.0.0

Interfaces¶

ActivateAudioSessionOptions¶

Prop	Type	Description	Since
`category`	`AudioSessionCategory`	The audio session category to set.	6.0.0

GetLanguagesResult¶

Prop	Type	Description	Since
`languages`	`string[]`	The available languages as BCP 47 language tags.	6.0.0
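
The reported tags usually have to be matched against a user's locale. The following sketch shows one simple matching rule: an exact, case-insensitive match first, then a fallback to the first tag with the same primary language subtag. `findBestLanguage` is a hypothetical helper, not part of the plugin, and the sample tags are made up.

```typescript
// Hypothetical helper; `available` is expected to be the `languages` array
// returned by getLanguages().
function findBestLanguage(available: string[], wanted: string): string | undefined {
  const normalize = (tag: string): string => tag.toLowerCase();
  const target = normalize(wanted);

  // 1. Exact match, ignoring case ("en-us" matches "en-US").
  const exact = available.find(tag => normalize(tag) === target);
  if (exact !== undefined) {
    return exact;
  }

  // 2. Fall back to the first tag that shares the primary language subtag.
  const primary = target.split('-')[0];
  return available.find(tag => normalize(tag).split('-')[0] === primary);
}

// Example with made-up data:
const bestMatch = findBestLanguage(['de-DE', 'en-GB', 'en-US'], 'en-AU');
console.log(bestMatch); // en-GB
```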

GetVoicesResult¶

Prop	Type	Description	Since
`voices`	`Voice[]`	The available voices.	6.0.0

Voice¶

Prop	Type	Description	Since
`default`	`boolean`	Whether or not the voice is the default voice. Only available on Web.	6.0.0
`gender`	`'female' \| 'male'`	The gender of the voice. Only available on iOS.	6.0.0
`id`	`string`	The identifier of the voice.	6.0.0
`isNetworkConnectionRequired`	`boolean`	Whether or not the voice requires a network connection, i.e. whether it is provided by a remote service rather than a local one.	6.0.0
`language`	`string`	The BCP 47 language tag for the language of the voice.	6.0.0
`name`	`string`	The name of the voice.	6.0.0
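
The properties above are enough to choose a voice for a given language. The following is a sketch under stated assumptions: `VoiceLike` is a local stand-in for the documented `Voice` interface (which properties are optional is assumed, since `default` and `gender` are platform-specific), and `pickVoice` is a hypothetical helper, not part of the plugin.

```typescript
// Local stand-in mirroring the documented `Voice` interface.
interface VoiceLike {
  default?: boolean;
  gender?: 'female' | 'male';
  id: string;
  isNetworkConnectionRequired?: boolean;
  language: string;
  name: string;
}

// Hypothetical helper: pick a voice for `language`, optionally skipping
// voices that need a network connection.
function pickVoice(voices: VoiceLike[], language: string, offlineOnly = false): VoiceLike | undefined {
  const wanted = language.toLowerCase();
  const candidates = voices.filter(
    voice =>
      voice.language.toLowerCase() === wanted &&
      (!offlineOnly || voice.isNetworkConnectionRequired !== true),
  );
  // Prefer a voice flagged as default (reported on Web only), else the first candidate.
  return candidates.find(voice => voice.default === true) ?? candidates[0];
}
```

The `id` of the returned voice can then be passed as `voiceId` to `speak()`.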

IsAvailableResult¶

Prop	Type	Description	Since
`isAvailable`	`boolean`	Whether or not speech synthesis is available on the current device.	6.0.0

IsSpeakingResult¶

Prop	Type	Description	Since
`isSpeaking`	`boolean`	Whether or not an utterance is currently being spoken.	6.0.0

IsLanguageAvailableResult¶

Prop	Type	Description	Since
`isAvailable`	`boolean`	Whether or not the language is available for speech synthesis.	6.0.0

IsLanguageAvailableOption¶

Prop	Type	Description	Since
`language`	`string`	The BCP 47 language tag for the language to check.	6.0.0

IsVoiceAvailableResult¶

Prop	Type	Description	Since
`isAvailable`	`boolean`	Whether or not the voice is available for speech synthesis.	6.0.0

IsVoiceAvailableOption¶

Prop	Type	Description	Since
`voiceId`	`string`	The identifier of the voice to check.	6.0.0

SpeakResult¶

Prop	Type	Description	Since
`utteranceId`	`string`	The identifier of the utterance that is being spoken.	6.0.0

SpeakOptions¶

Prop	Type	Description	Default	Since
`language`	`string`	The BCP 47 language tag for the language to use for speech synthesis. On iOS, this option is only used when the `voiceId` option is not provided.		6.0.0
`pitch`	`number`	The pitch that the utterance will be spoken at.	`1.0`	6.0.0
`queueStrategy`	`QueueStrategy`	The queue strategy to use for the utterance.	`QueueStrategy.Add`	6.0.0
`rate`	`number`	The speed at which the utterance will be spoken.	`1.0`	6.0.0
`text`	`string`	The text that will be synthesized when the utterance is spoken.		6.0.0
`voiceId`	`string`	The identifier of the voice to use for speech synthesis.		6.0.0
`volume`	`number`	The volume that the utterance will be spoken at.	`1.0`	6.0.0
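
To make the defaults in the table explicit, the sketch below fills in every option that has a documented default. `SpeakOptionsLike` is a local stand-in for `SpeakOptions`, `withDocumentedDefaults` is a hypothetical helper, and the numeric constants mirror the documented `QueueStrategy` values. Which properties are optional is an assumption based on the Default column.

```typescript
// Local stand-ins mirroring the documented `QueueStrategy` values.
const QUEUE_ADD = 0;
const QUEUE_FLUSH = 1;

// Local stand-in mirroring the documented `SpeakOptions` interface.
interface SpeakOptionsLike {
  language?: string;
  pitch?: number;
  queueStrategy?: typeof QUEUE_ADD | typeof QUEUE_FLUSH;
  rate?: number;
  text: string;
  voiceId?: string;
  volume?: number;
}

// Hypothetical helper: returns a copy with the documented defaults applied.
function withDocumentedDefaults(options: SpeakOptionsLike) {
  return {
    ...options,
    pitch: options.pitch ?? 1.0,
    queueStrategy: options.queueStrategy ?? QUEUE_ADD,
    rate: options.rate ?? 1.0,
    volume: options.volume ?? 1.0,
  };
}
```

Calling `withDocumentedDefaults({ text: 'Hello', rate: 0.5 })` keeps the custom rate and sets pitch and volume to `1.0` and the queue strategy to `0` (`QueueStrategy.Add`).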

SynthesizeToFileResult¶

Prop	Type	Description	Since
`path`	`string`	The path to which the synthesized audio file will be saved. The file is available as soon as the `end` event is emitted. Only available on Android and iOS.	7.1.0

PluginListenerHandle¶

Prop	Type
`remove`	`() => Promise<void>`
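
Unlike `removeAllListeners()`, a handle removes only the listener it belongs to. This is useful when several parts of an app listen independently. The sketch below collects handles and removes them together. `ListenerHandleLike` is a local stand-in for `PluginListenerHandle`, and `ListenerBag` is a hypothetical helper, not part of the plugin.

```typescript
// Local stand-in mirroring the documented `PluginListenerHandle`.
interface ListenerHandleLike {
  remove: () => Promise<void>;
}

// Hypothetical helper that owns a group of listener handles.
class ListenerBag {
  private handles: ListenerHandleLike[] = [];

  // Store a handle, e.g. the awaited result of addListener(...).
  add(handle: ListenerHandleLike): void {
    this.handles.push(handle);
  }

  get size(): number {
    return this.handles.length;
  }

  // Remove every stored listener and empty the bag.
  async removeAll(): Promise<void> {
    const toRemove = this.handles;
    this.handles = [];
    await Promise.all(toRemove.map(handle => handle.remove()));
  }
}
```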

BoundaryEvent¶

Prop	Type	Description	Since
`endIndex`	`number`	The index of the last character in the word.	6.0.0
`startIndex`	`number`	The index of the first character in the word.	6.0.0
`utteranceId`	`string`	The identifier of the utterance that is being spoken.	6.0.0
`word`	`string`	The word that was spoken.	6.0.0
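
A common use of the `boundary` event is highlighting the word that is currently being spoken. The sketch below marks the word with square brackets. `BoundaryEventLike` is a local stand-in for `BoundaryEvent` and `highlightWord` is a hypothetical helper. It assumes that `endIndex` is inclusive, as the description above suggests; if a platform reports an exclusive index, the `+ 1` has to be dropped.

```typescript
// Local stand-in mirroring the documented `BoundaryEvent`.
interface BoundaryEventLike {
  endIndex: number;
  startIndex: number;
  utteranceId: string;
  word: string;
}

// Hypothetical helper: wrap the current word in square brackets.
function highlightWord(text: string, event: BoundaryEventLike): string {
  // Assumption: endIndex points at the last character of the word (inclusive).
  const before = text.slice(0, event.startIndex);
  const word = text.slice(event.startIndex, event.endIndex + 1);
  const after = text.slice(event.endIndex + 1);
  return `${before}[${word}]${after}`;
}

// Example with a made-up event:
console.log(
  highlightWord('Hello, World!', { startIndex: 7, endIndex: 11, utteranceId: 'u1', word: 'World' }),
); // Hello, [World]!
```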

EndEvent¶

Prop	Type	Description	Since
`utteranceId`	`string`	The identifier of the utterance that has finished.	6.0.0

ErrorEvent¶

Prop	Type	Description	Since
`message`	`string`	The error message.	6.0.0
`utteranceId`	`string`	The identifier of the utterance that caused the error.	6.0.0

StartEvent¶

Prop	Type	Description	Since
`utteranceId`	`string`	The identifier of the utterance that has started.	6.0.0

Type Aliases¶

SynthesizeToFileOptions¶

SpeakOptions

Enums¶

AudioSessionCategory¶

Members	Value	Description	Since
`Ambient`	`'AMBIENT'`	The audio session category for ambient sound. Audio from other apps mixes with your audio. Screen locking and the Silent switch silence your audio.	6.0.0
`Playback`	`'PLAYBACK'`	The audio session category for playback. App audio continues with the Silent switch set to silent or when the screen locks.	6.0.0
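
The choice between the two categories comes down to one question: should speech keep playing when the Silent switch is on or the screen is locked? A minimal sketch of that decision, using a hypothetical helper `chooseCategory` and a local string type that mirrors the documented enum values:

```typescript
// Local stand-in for the documented `AudioSessionCategory` values.
type AudioSessionCategoryValue = 'AMBIENT' | 'PLAYBACK';

// Hypothetical helper: pick a category from a single requirement.
function chooseCategory(mustPlayWhenSilencedOrLocked: boolean): AudioSessionCategoryValue {
  // Playback keeps audio going with the Silent switch on or the screen locked;
  // Ambient is silenced in both cases.
  return mustPlayWhenSilencedOrLocked ? 'PLAYBACK' : 'AMBIENT';
}
```

The result corresponds to `AudioSessionCategory.Playback` or `AudioSessionCategory.Ambient` when calling `activateAudioSession(...)`.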

QueueStrategy¶

Members	Value	Description	Since
`Add`	`0`	Add the utterance to the end of the queue.	6.0.0
`Flush`	`1`	Flush the queue and add the utterance to the beginning of the queue.	6.0.0
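
The difference between the two strategies can be modelled with a plain array. This is only an illustration of the queue semantics described above, not the plugin's implementation. `QueueStrategyValues` and `enqueueUtterance` are hypothetical names, and the model does not say whether an utterance that is already being spoken gets interrupted by `Flush`.

```typescript
// Local stand-in mirroring the documented `QueueStrategy` values.
const QueueStrategyValues = { Add: 0, Flush: 1 } as const;
type QueueStrategyValue = (typeof QueueStrategyValues)[keyof typeof QueueStrategyValues];

// Hypothetical model: returns the queue contents after adding `text`.
function enqueueUtterance(queue: readonly string[], text: string, strategy: QueueStrategyValue): string[] {
  if (strategy === QueueStrategyValues.Flush) {
    // Flush: drop everything that was waiting, the new utterance is first.
    return [text];
  }
  // Add: keep the waiting utterances and append the new one.
  return [...queue, text];
}

const afterAdd = enqueueUtterance(['first'], 'second', QueueStrategyValues.Add);
console.log(afterAdd); // [ 'first', 'second' ]
const afterFlush = enqueueUtterance(afterAdd, 'urgent', QueueStrategyValues.Flush);
console.log(afterFlush); // [ 'urgent' ]
```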

Changelog¶

See CHANGELOG.md.

Breaking Changes¶

See BREAKING.md.

License¶

See LICENSE.