@capawesome-team/capacitor-speech-synthesis

Capacitor plugin for synthesizing speech from text.

Features

  • 🖥️ Cross-platform: Supports Android, iOS and Web.
  • 🌐 Multiple Languages: Supports many different languages.
  • 🗣️ Multiple Voices: Supports multiple voices for each language.
  • 🎚️ Customization: Customize the pitch, rate, volume and voice of the speech.
  • 🎧 Background Audio: Synthesize speech from text while your application runs in the background.
  • 📜 Queue Strategy: Append each utterance to the queue or flush the queue first.
  • 🎤 Events: Listen for events like boundary, end, error and start.
  • 🔁 Up-to-date: Always supports the latest Capacitor version.
  • ⭐️ Support: First-class support from the Capawesome Team.

Installation

This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:

npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>

Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.

Next, install the package:

npm install @capawesome-team/capacitor-speech-synthesis
npx cap sync

Configuration

Usage

import { SpeechSynthesis, AudioSessionCategory, QueueStrategy } from '@capawesome-team/capacitor-speech-synthesis';

const activateAudioSession = async () => {
  await SpeechSynthesis.activateAudioSession({ category: AudioSessionCategory.Ambient });
};

const cancel = async () => {
  await SpeechSynthesis.cancel();
};

const deactivateAudioSession = async () => {
  await SpeechSynthesis.deactivateAudioSession();
};

const getLanguages = async () => {
  const result = await SpeechSynthesis.getLanguages();
  return result.languages;
};

const getVoices = async () => {
  const result = await SpeechSynthesis.getVoices();
  return result.voices;
};

const initialize = async () => {
  await SpeechSynthesis.initialize();
};

const isAvailable = async () => {
  const result = await SpeechSynthesis.isAvailable();
  return result.isAvailable;
};

const isSpeaking = async () => {
  const result = await SpeechSynthesis.isSpeaking();
  return result.isSpeaking;
};

const isLanguageAvailable = async () => {
  const result = await SpeechSynthesis.isLanguageAvailable({ language: 'en-US' });
  return result.isAvailable;
};

const isVoiceAvailable = async () => {
  const result = await SpeechSynthesis.isVoiceAvailable({ voiceId: 'com.apple.ttsbundle.Samantha-compact' });
  return result.isAvailable;
};

const speak = async () => {
  await SpeechSynthesis.speak({
    language: 'en-US',
    pitch: 1.0,
    queueStrategy: QueueStrategy.Add,
    rate: 1.0,
    text: 'Hello, World!',
    voiceId: 'com.apple.ttsbundle.Samantha-compact',
    volume: 1.0,
  });
};

const addListeners = () => {
  SpeechSynthesis.addListener('boundary', (event) => {
    console.log('boundary', event);
  });

  SpeechSynthesis.addListener('end', (event) => {
    console.log('end', event);
  });

  SpeechSynthesis.addListener('error', (event) => {
    console.log('error', event);
  });

  SpeechSynthesis.addListener('start', (event) => {
    console.log('start', event);
  });
};

const removeAllListeners = async () => {
  await SpeechSynthesis.removeAllListeners();
};

API

activateAudioSession(...)

activateAudioSession(options: ActivateAudioSessionOptions) => Promise<void>

Activate the audio session. Calling this method is optional; use it to set the audio session category before speaking.

Only available on iOS.

Param Type
options ActivateAudioSessionOptions

Since: 6.0.0
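
As a minimal sketch of how the session methods might be paired, the helper below activates a session before a task and always deactivates it afterwards. The names withAudioSession and AudioSessionPlugin are hypothetical and not part of the plugin. The plugin object and the platform name are passed in as arguments, so the sketch only assumes the two method signatures documented in this section.

```typescript
// Hypothetical structural type: only the two documented audio session
// methods, generic over the category type so the real enum can be used.
interface AudioSessionPlugin<C> {
  activateAudioSession(options: { category: C }): Promise<void>;
  deactivateAudioSession(): Promise<void>;
}

// Runs `task` inside an audio session on iOS and always deactivates the
// session afterwards. On other platforms the task runs unchanged.
async function withAudioSession<C, T>(
  plugin: AudioSessionPlugin<C>,
  platform: string,
  category: C,
  task: () => Promise<T>,
): Promise<T> {
  const isIos = platform === 'ios';
  if (isIos) {
    await plugin.activateAudioSession({ category });
  }
  try {
    return await task();
  } finally {
    if (isIos) {
      await plugin.deactivateAudioSession();
    }
  }
}
```

With the real plugin this could presumably be called as withAudioSession(SpeechSynthesis, Capacitor.getPlatform(), AudioSessionCategory.Playback, task), where Capacitor comes from @capacitor/core. The task should resolve only once the utterance has finished (for example after the end event has been received), otherwise the session may be deactivated while speech is still playing.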


cancel()

cancel() => Promise<void>

Remove all utterances from the utterance queue.

Since: 6.0.0


deactivateAudioSession()

deactivateAudioSession() => Promise<void>

Deactivate the audio session.

Only available on iOS.

Since: 6.0.0


getLanguages()

getLanguages() => Promise<GetLanguagesResult>

Get the available languages for speech synthesis.

Returns: Promise<GetLanguagesResult>

Since: 6.0.0


getVoices()

getVoices() => Promise<GetVoicesResult>

Get the available voices for speech synthesis.

Returns: Promise<GetVoicesResult>

Since: 6.0.0


initialize()

initialize() => Promise<void>

Initialize the plugin. This method must be called before any other method.

Only available on Android and iOS.

Since: 6.0.0
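
Because initialize() has to run before any other method, it can help to funnel every call site through one guard. The following is a sketch only: createInitializer and Initializable are hypothetical names, and the plugin is passed in as an argument so that nothing beyond the documented initialize() signature is assumed.

```typescript
// Hypothetical structural type covering only the documented method used here.
interface Initializable {
  initialize(): Promise<void>;
}

// Returns a function that calls initialize() at most once and hands the
// same promise to every later caller.
function createInitializer(plugin: Initializable): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (pending === undefined) {
      pending = plugin.initialize().catch((error) => {
        // Reset so that a later call can retry after a failure.
        pending = undefined;
        throw error;
      });
    }
    return pending;
  };
}
```

A call site would then await the returned function before using any other plugin method, for example const ensureInitialized = createInitializer(SpeechSynthesis).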


isAvailable()

isAvailable() => Promise<IsAvailableResult>

Check if speech synthesis is available on the current device.

Returns: Promise<IsAvailableResult>

Since: 6.0.0


isSpeaking()

isSpeaking() => Promise<IsSpeakingResult>

Check if an utterance is currently being spoken.

Returns: Promise<IsSpeakingResult>

Since: 6.0.0


isLanguageAvailable(...)

isLanguageAvailable(options: IsLanguageAvailableOption) => Promise<IsLanguageAvailableResult>

Check if a language is available for speech synthesis.

Param Type
options IsLanguageAvailableOption

Returns: Promise<IsLanguageAvailableResult>

Since: 6.0.0
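
One possible use of this check is to walk a list of preferred languages and take the first one the device supports. The sketch below assumes only the documented isLanguageAvailable signature. The names firstAvailableLanguage and LanguageChecker are hypothetical, and the plugin is passed in as an argument.

```typescript
// Hypothetical structural type covering only the documented method used here.
interface LanguageChecker {
  isLanguageAvailable(options: { language: string }): Promise<{ isAvailable: boolean }>;
}

// Returns the first candidate language that the plugin reports as available,
// or undefined if none of them is.
async function firstAvailableLanguage(
  plugin: LanguageChecker,
  candidates: string[],
): Promise<string | undefined> {
  for (const language of candidates) {
    const { isAvailable } = await plugin.isLanguageAvailable({ language });
    if (isAvailable) {
      return language;
    }
  }
  return undefined;
}
```

For example, firstAvailableLanguage(SpeechSynthesis, ['de-CH', 'de-DE', 'en-US']) would check the tags in that order.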


isVoiceAvailable(...)

isVoiceAvailable(options: IsVoiceAvailableOption) => Promise<IsVoiceAvailableResult>

Check if a voice is available for speech synthesis.

Param Type
options IsVoiceAvailableOption

Returns: Promise<IsVoiceAvailableResult>

Since: 6.0.0


speak(...)

speak(options: SpeakOptions) => Promise<SpeakResult>

Add an utterance to the utterance queue to be spoken.

The end event will be emitted when the utterance has finished.

Param Type
options SpeakOptions

Returns: Promise<SpeakResult>

Since: 6.0.0
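
Since completion is signalled through the end event rather than through the promise returned by speak(), a caller that wants to wait for the utterance to finish has to match events against the returned utteranceId. The following is a sketch under stated assumptions, not the plugin's own approach. The names speakAndWait and SpeechPlugin are hypothetical, and only the documented speak and addListener signatures are assumed. Listeners are registered before speak() is called, and events are buffered in case they arrive before the identifier is known.

```typescript
// Hypothetical structural types mirroring the documented signatures.
interface ListenerHandle {
  remove: () => Promise<void>;
}
interface SpeechPlugin {
  speak(options: { text: string }): Promise<{ utteranceId: string }>;
  addListener(
    eventName: 'end',
    listenerFunc: (event: { utteranceId: string }) => void,
  ): Promise<ListenerHandle>;
  addListener(
    eventName: 'error',
    listenerFunc: (event: { utteranceId: string; message: string }) => void,
  ): Promise<ListenerHandle>;
}

// Speaks `text` and resolves once the matching end event arrives,
// or rejects if an error event is reported for the same utterance.
async function speakAndWait(plugin: SpeechPlugin, text: string): Promise<void> {
  const ended = new Set<string>();
  const failed = new Map<string, string>();
  let check = () => {};

  const endHandle = await plugin.addListener('end', (event) => {
    ended.add(event.utteranceId);
    check();
  });
  const errorHandle = await plugin.addListener('error', (event) => {
    failed.set(event.utteranceId, event.message);
    check();
  });

  try {
    const { utteranceId } = await plugin.speak({ text });
    await new Promise<void>((resolve, reject) => {
      check = () => {
        const message = failed.get(utteranceId);
        if (message !== undefined) {
          reject(new Error(message));
        } else if (ended.has(utteranceId)) {
          resolve();
        }
      };
      // Handle events that were buffered before speak() resolved.
      check();
    });
  } finally {
    await endHandle.remove();
    await errorHandle.remove();
  }
}
```

With the real plugin this would presumably be called as await speakAndWait(SpeechSynthesis, 'Hello, World!').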


addListener('boundary', ...)

addListener(eventName: 'boundary', listenerFunc: (event: BoundaryEvent) => void) => Promise<PluginListenerHandle>

Called when the spoken utterance reaches a word boundary.

Param Type
eventName 'boundary'
listenerFunc (event: BoundaryEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0
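
A typical use of the boundary event is to mark the word that is currently being spoken. The sketch below is a pure helper with hypothetical names (markCurrentWord, BoundaryLike). It assumes that startIndex and endIndex are indices into the original text string and that endIndex is inclusive, as the BoundaryEvent property descriptions suggest. If a platform reports an exclusive end index, the + 1 would need to be dropped.

```typescript
// Hypothetical subset of the documented BoundaryEvent properties.
interface BoundaryLike {
  startIndex: number;
  endIndex: number;
}

// Wraps the word referenced by the event in square brackets.
// Assumption: endIndex is the index of the word's last character (inclusive).
function markCurrentWord(text: string, event: BoundaryLike): string {
  const start = Math.max(0, Math.min(event.startIndex, text.length));
  const end = Math.max(start, Math.min(event.endIndex + 1, text.length));
  return `${text.slice(0, start)}[${text.slice(start, end)}]${text.slice(end)}`;
}
```

In a listener, the result could be written to the UI, for example markCurrentWord(spokenText, event) inside the 'boundary' callback, where spokenText is the text that was passed to speak().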


addListener('end', ...)

addListener(eventName: 'end', listenerFunc: (event: EndEvent) => void) => Promise<PluginListenerHandle>

Called when the spoken utterance has finished.

Param Type
eventName 'end'
listenerFunc (event: EndEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('error', ...)

addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>

Called when an error occurs during speech synthesis.

Param Type
eventName 'error'
listenerFunc (event: ErrorEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('start', ...)

addListener(eventName: 'start', listenerFunc: (event: StartEvent) => void) => Promise<PluginListenerHandle>

Called when the spoken utterance has started.

Param Type
eventName 'start'
listenerFunc (event: StartEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


removeAllListeners()

removeAllListeners() => Promise<void>

Remove all listeners for the plugin.

Since: 6.0.0


Interfaces

ActivateAudioSessionOptions

Prop Type Description Since
category AudioSessionCategory The audio session category to set. 6.0.0

GetLanguagesResult

Prop Type Description Since
languages string[] The available languages as BCP 47 language tags. 6.0.0

GetVoicesResult

Prop Type Description Since
voices Voice[] The available voices. 6.0.0

Voice

Prop Type Description Since
default boolean Whether or not the voice is the default voice. Only available on Web. 6.0.0
gender 'female' | 'male' The gender of the voice. Only available on iOS. 6.0.0
id string The identifier of the voice. 6.0.0
isNetworkConnectionRequired boolean Whether or not the voice requires a network connection, that is, whether it is provided by a remote service rather than a local one. 6.0.0
language string The BCP 47 language tag for the language of the voice. 6.0.0
name string The name of the voice. 6.0.0
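
As an illustration of how these properties might be combined, the sketch below picks a voice for a given language from the voices array returned by getVoices() and prefers one that works without a network connection. The names pickVoice and VoiceLike are hypothetical. The language comparison is a plain case-insensitive match on the tag, which is an assumption and may be too strict for some platforms.

```typescript
// Hypothetical subset of the documented Voice properties.
interface VoiceLike {
  id: string;
  language: string;
  isNetworkConnectionRequired?: boolean;
}

// Returns a voice for `language`, preferring voices that do not need a
// network connection. Returns undefined if no voice matches the language.
function pickVoice<V extends VoiceLike>(voices: V[], language: string): V | undefined {
  const wanted = language.toLowerCase();
  const matches = voices.filter((voice) => voice.language.toLowerCase() === wanted);
  return matches.find((voice) => !voice.isNetworkConnectionRequired) ?? matches[0];
}
```

The id of the returned voice can then be passed as the voiceId option of speak().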

IsAvailableResult

Prop Type Description Since
isAvailable boolean Whether or not speech synthesis is available on the current device. 6.0.0

IsSpeakingResult

Prop Type Description Since
isSpeaking boolean Whether or not an utterance is currently being spoken. 6.0.0

IsLanguageAvailableResult

Prop Type Description Since
isAvailable boolean Whether or not the language is available for speech synthesis. 6.0.0

IsLanguageAvailableOption

Prop Type Description Since
language string The BCP 47 language tag for the language to check. 6.0.0

IsVoiceAvailableResult

Prop Type Description Since
isAvailable boolean Whether or not the voice is available for speech synthesis. 6.0.0

IsVoiceAvailableOption

Prop Type Description Since
voiceId string The identifier of the voice to check. 6.0.0

SpeakResult

Prop Type Description Since
utteranceId string The identifier of the utterance that is being spoken. 6.0.0

SpeakOptions

Prop Type Description Default Since
language string The BCP 47 language tag for the language to use for speech synthesis. On iOS, this option is only used when the voiceId option is not provided. 6.0.0
pitch number The pitch that the utterance will be spoken at. 1.0 6.0.0
queueStrategy QueueStrategy The queue strategy to use for the utterance. QueueStrategy.Add 6.0.0
rate number The speed at which the utterance will be spoken. 1.0 6.0.0
text string The text that will be synthesized when the utterance is spoken. 6.0.0
voiceId string The identifier of the voice to use for speech synthesis. 6.0.0
volume number The volume that the utterance will be spoken at. 1.0 6.0.0

PluginListenerHandle

Prop Type
remove () => Promise<void>

BoundaryEvent

Prop Type Description Since
endIndex number The index of the last character in the word. 6.0.0
startIndex number The index of the first character in the word. 6.0.0
utteranceId string The identifier of the utterance that is being spoken. 6.0.0
word string The word that was spoken. 6.0.0

EndEvent

Prop Type Description Since
utteranceId string The identifier of the utterance that has finished. 6.0.0

ErrorEvent

Prop Type Description Since
message string The error message. 6.0.0
utteranceId string The identifier of the utterance that caused the error. 6.0.0

StartEvent

Prop Type Description Since
utteranceId string The identifier of the utterance that has started. 6.0.0

Enums

AudioSessionCategory

Members Value Description Since
Ambient 'AMBIENT' The audio session category for ambient sound. Audio from other apps mixes with your audio. Screen locking and the Silent switch silence your audio. 6.0.0
Playback 'PLAYBACK' The audio session category for playback. App audio continues with the Silent switch set to silent or when the screen locks. 6.0.0

QueueStrategy

Members Value Description Since
Add 0 Add the utterance to the end of the queue. 6.0.0
Flush 1 Flush the queue and add the utterance to the beginning of the queue. 6.0.0
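
The difference between the two strategies can be illustrated with a toy model of the pending queue. This is not the plugin's implementation: the enum below is a local copy of the documented values, enqueue is a hypothetical function, and the model says nothing about what happens to an utterance that is already being spoken.

```typescript
// Local mirror of the documented enum values, for illustration only.
enum QueueStrategy {
  Add = 0,
  Flush = 1,
}

// Returns the pending queue after applying the given strategy.
function enqueue(queue: readonly string[], text: string, strategy: QueueStrategy): string[] {
  if (strategy === QueueStrategy.Flush) {
    // Flush: drop everything that is pending, the new utterance is first.
    return [text];
  }
  // Add: keep pending utterances and append the new one at the end.
  return [...queue, text];
}
```

For a pending queue of ['first', 'second'], adding 'third' with QueueStrategy.Add yields ['first', 'second', 'third'], while QueueStrategy.Flush yields ['third'].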

Changelog

See CHANGELOG.md.

License

See LICENSE.