@capawesome-team/capacitor-speech-synthesis¶
Capacitor plugin for synthesizing speech from text (also known as text-to-speech).
Features¶
- 🖥️ Cross-platform: Supports Android, iOS and Web.
- 🌐 Multiple Languages: Supports many different languages.
- 🗣️ Multiple Voices: Supports multiple voices for each language.
- 🎚️ Customization: Customize the pitch, rate, volume and voice of the speech.
- 🎧 Background Audio: Synthesize speech from text while your application runs in the background.
- 📜 Queue Strategy: Add or flush the utterance to the queue.
- 🎤 Events: Listen for events like boundary, end, error and start.
- 🤝 Compatibility: Compatible with the Speech Recognition and Native Audio plugins.
- 📦 SPM: Supports Swift Package Manager for iOS.
- 🔁 Up-to-date: Always supports the latest Capacitor version.
- ⭐️ Support: First-class support from the Capawesome Team.
Compatibility¶
Plugin Version | Capacitor Version | Status |
---|---|---|
6.x.x | 6.x.x | Deprecated |
7.x.x | >=7.x.x | Active support |
Installation¶
This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:
npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>
Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.
Next, install the package:
Android¶
Proguard¶
If you are using Proguard, you need to add the following rules to your proguard-rules.pro file:
Configuration¶
No configuration required for this plugin.
Usage¶
import { SpeechSynthesis, AudioSessionCategory, QueueStrategy } from '@capawesome-team/capacitor-speech-synthesis';

const activateAudioSession = async () => {
  await SpeechSynthesis.activateAudioSession({ category: AudioSessionCategory.Ambient });
};

const cancel = async () => {
  await SpeechSynthesis.cancel();
};

const deactivateAudioSession = async () => {
  await SpeechSynthesis.deactivateAudioSession();
};

const getLanguages = async () => {
  const result = await SpeechSynthesis.getLanguages();
  return result.languages;
};

const getVoices = async () => {
  const result = await SpeechSynthesis.getVoices();
  return result.voices;
};

const initialize = async () => {
  await SpeechSynthesis.initialize();
};

const isAvailable = async () => {
  const result = await SpeechSynthesis.isAvailable();
  return result.isAvailable;
};

const isSpeaking = async () => {
  const result = await SpeechSynthesis.isSpeaking();
  return result.isSpeaking;
};

const isLanguageAvailable = async () => {
  const result = await SpeechSynthesis.isLanguageAvailable({ language: 'en-US' });
  return result.isAvailable;
};

const isVoiceAvailable = async () => {
  const result = await SpeechSynthesis.isVoiceAvailable({ voiceId: 'com.apple.ttsbundle.Samantha-compact' });
  return result.isAvailable;
};

const speak = async () => {
  // Add an utterance to the utterance queue to be spoken
  const { utteranceId } = await SpeechSynthesis.speak({
    language: 'en-US',
    pitch: 1.0,
    queueStrategy: QueueStrategy.Add,
    rate: 1.0,
    text: 'Hello, World!',
    voiceId: 'com.apple.ttsbundle.Samantha-compact',
    volume: 1.0,
  });
  // Wait for the utterance to finish (typed as Promise<void> so resolve() takes no argument)
  await new Promise<void>(resolve => {
    void SpeechSynthesis.addListener('end', event => {
      if (event.utteranceId === utteranceId) {
        resolve();
      }
    });
  });
};

const synthesizeToFile = async () => {
  // Add an utterance to the utterance queue to be synthesized to a file
  const { path, utteranceId } = await SpeechSynthesis.synthesizeToFile({
    language: 'en-US',
    pitch: 1.0,
    queueStrategy: QueueStrategy.Add,
    rate: 1.0,
    text: 'Hello, World!',
    voiceId: 'com.apple.ttsbundle.Samantha-compact',
    volume: 1.0,
  });
  // Wait for the utterance to finish (typed as Promise<void> so resolve() takes no argument)
  await new Promise<void>(resolve => {
    void SpeechSynthesis.addListener('end', event => {
      if (event.utteranceId === utteranceId) {
        resolve();
      }
    });
  });
  // Return the path to the synthesized audio file
  return path;
};

const addListeners = () => {
  SpeechSynthesis.addListener('boundary', (event) => {
    console.log('boundary', event);
  });
  SpeechSynthesis.addListener('end', (event) => {
    console.log('end', event);
  });
  SpeechSynthesis.addListener('error', (event) => {
    console.log('error', event);
  });
  SpeechSynthesis.addListener('start', (event) => {
    console.log('start', event);
  });
};

const removeAllListeners = async () => {
  await SpeechSynthesis.removeAllListeners();
};
API¶
- activateAudioSession(...)
- cancel()
- deactivateAudioSession()
- getLanguages()
- getVoices()
- initialize()
- isAvailable()
- isSpeaking()
- isLanguageAvailable(...)
- isVoiceAvailable(...)
- speak(...)
- synthesizeToFile(...)
- addListener('boundary', ...)
- addListener('end', ...)
- addListener('error', ...)
- addListener('start', ...)
- removeAllListeners()
- Interfaces
- Type Aliases
- Enums
activateAudioSession(...)¶
Activate the audio session. This method is not mandatory. It can be used to set the audio session category before speaking.
Only available on iOS.
Param | Type |
---|---|
options | ActivateAudioSessionOptions |
Since: 6.0.0
cancel()¶
Remove all utterances from the utterance queue.
Since: 6.0.0
deactivateAudioSession()¶
Deactivate the audio session.
Only available on iOS.
Since: 6.0.0
getLanguages()¶
Get the available languages for speech synthesis.
Returns: Promise<GetLanguagesResult>
Since: 6.0.0
getVoices()¶
Get the available voices for speech synthesis.
Returns: Promise<GetVoicesResult>
Since: 6.0.0
initialize()¶
Initialize the plugin. This method must be called before any other method.
Only available on Android and iOS.
Since: 6.0.0
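Because initialize() has to run before any other method, an app with several entry points may want to make sure it is only triggered once. The following is a minimal sketch of such a guard; `once` and `ensureInitialized` are hypothetical names and not part of the plugin.

```typescript
// Hypothetical helper: wraps an async initializer so that it runs at most once.
// Every caller receives the same promise, so concurrent callers wait for the same run.
// Note that a rejected initialization is cached as well in this simple version.
function once(init: () => Promise<void>): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (!pending) {
      pending = init();
    }
    return pending;
  };
}

// Assumed usage with the plugin (not executed here):
// const ensureInitialized = once(() => SpeechSynthesis.initialize());
// await ensureInitialized();
```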
isAvailable()¶
Check if speech synthesis is available on the current device.
Returns: Promise<IsAvailableResult>
Since: 6.0.0
isSpeaking()¶
Check if speech synthesis is currently speaking.
Returns: Promise<IsSpeakingResult>
Since: 6.0.0
isLanguageAvailable(...)¶
Check if a language is available for speech synthesis.
Param | Type |
---|---|
options | IsLanguageAvailableOption |
Returns: Promise<IsLanguageAvailableResult>
Since: 6.0.0
isVoiceAvailable(...)¶
Check if a voice is available for speech synthesis.
Param | Type |
---|---|
options | IsVoiceAvailableOption |
Returns: Promise<IsVoiceAvailableResult>
Since: 6.0.0
speak(...)¶
Add an utterance to the utterance queue to be spoken.
The end event will be emitted when the utterance has finished.
Param | Type |
---|---|
options | SpeakOptions |
Returns: Promise<SpeakResult>
Since: 6.0.0
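Waiting for the end event of one specific utterance can be wrapped in a small helper that also removes its listener afterwards. This is a sketch under assumptions: `waitForEnd` and `EndEventSource` are hypothetical names, and the interfaces below only mirror the addListener('end', ...) signature and the PluginListenerHandle shape documented on this page, so the helper can be tried without the plugin.

```typescript
// Local mirrors of the documented shapes (assumptions, not imported from the plugin).
interface ListenerHandle {
  remove: () => Promise<void>;
}
interface EndEvent {
  utteranceId: string;
}
interface EndEventSource {
  addListener(eventName: 'end', listenerFunc: (event: EndEvent) => void): Promise<ListenerHandle>;
}

// Hypothetical helper: resolves once the 'end' event for the given utterance arrives,
// then removes the listener it registered.
async function waitForEnd(source: EndEventSource, utteranceId: string): Promise<void> {
  let resolveEnd: () => void = () => {};
  const ended = new Promise<void>(resolve => {
    resolveEnd = resolve;
  });
  const handle = await source.addListener('end', event => {
    if (event.utteranceId === utteranceId) {
      resolveEnd();
    }
  });
  try {
    await ended;
  } finally {
    await handle.remove();
  }
}
```

With the plugin, the call would presumably look like `await waitForEnd(SpeechSynthesis, utteranceId)` right after speak(...) returns.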
synthesizeToFile(...)¶
Add an utterance to the utterance queue to be synthesized to a file.
The end event will be emitted when the utterance has finished.
Only available on Android and iOS.
Param | Type |
---|---|
options | SpeakOptions |
Returns: Promise<SynthesizeToFileResult>
Since: 7.1.0
addListener('boundary', ...)¶
addListener(eventName: 'boundary', listenerFunc: (event: BoundaryEvent) => void) => Promise<PluginListenerHandle>
Called when the spoken utterance reaches a word boundary.
Param | Type |
---|---|
eventName | 'boundary' |
listenerFunc | (event: BoundaryEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('end', ...)¶
addListener(eventName: 'end', listenerFunc: (event: EndEvent) => void) => Promise<PluginListenerHandle>
Called when the spoken utterance has finished.
Param | Type |
---|---|
eventName | 'end' |
listenerFunc | (event: EndEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('error', ...)¶
addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>
Called when an error occurs during speech synthesis.
Param | Type |
---|---|
eventName | 'error' |
listenerFunc | (event: ErrorEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('start', ...)¶
addListener(eventName: 'start', listenerFunc: (event: StartEvent) => void) => Promise<PluginListenerHandle>
Called when the spoken utterance has started.
Param | Type |
---|---|
eventName | 'start' |
listenerFunc | (event: StartEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
removeAllListeners()¶
Remove all listeners for the plugin.
Since: 6.0.0
Interfaces¶
ActivateAudioSessionOptions¶
Prop | Type | Description | Since |
---|---|---|---|
category | AudioSessionCategory | The audio session category to set. | 6.0.0 |
GetLanguagesResult¶
Prop | Type | Description | Since |
---|---|---|---|
languages | string[] | The available languages as BCP 47 language tags. | 6.0.0 |
GetVoicesResult¶
Prop | Type | Description | Since |
---|---|---|---|
voices | Voice[] | The available voices. | 6.0.0 |
Voice¶
Prop | Type | Description | Since |
---|---|---|---|
default | boolean | Whether or not the voice is the default voice. Only available on Web. | 6.0.0 |
gender | 'female' \| 'male' | The gender of the voice. Only available on iOS. | 6.0.0 |
id | string | The identifier of the voice. | 6.0.0 |
isNetworkConnectionRequired | boolean | Whether or not the voice is available via a local or remote service. | 6.0.0 |
language | string | The BCP 47 language tag for the language of the voice. | 6.0.0 |
name | string | The name of the voice. | 6.0.0 |
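The Voice properties can be used to choose a voiceId for a given language from the result of getVoices(). The sketch below assumes that language tags may differ in case and that a voice which does not require a network connection is preferable; `pickVoice` is a hypothetical helper, and the local `Voice` interface only mirrors the table above (marking `default` and `gender` as optional is an assumption based on their platform notes).

```typescript
// Local mirror of the documented Voice shape (assumption, not imported from the plugin).
interface Voice {
  default?: boolean;
  gender?: 'female' | 'male';
  id: string;
  isNetworkConnectionRequired: boolean;
  language: string;
  name: string;
}

// Hypothetical helper: returns a voice for the given BCP 47 language tag,
// preferring voices that do not require a network connection.
function pickVoice(voices: Voice[], language: string): Voice | undefined {
  const wanted = language.toLowerCase();
  const matching = voices.filter(voice => voice.language.toLowerCase() === wanted);
  return matching.find(voice => !voice.isNetworkConnectionRequired) ?? matching[0];
}
```

The `id` of the returned voice could then be passed as `voiceId` to speak(...).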
IsAvailableResult¶
Prop | Type | Description | Since |
---|---|---|---|
isAvailable | boolean | Whether or not speech synthesis is available on the current device. | 6.0.0 |
IsSpeakingResult¶
Prop | Type | Description | Since |
---|---|---|---|
isSpeaking | boolean | Whether or not an utterance is currently being spoken. | 6.0.0 |
IsLanguageAvailableResult¶
Prop | Type | Description | Since |
---|---|---|---|
isAvailable | boolean | Whether or not the language is available for speech synthesis. | 6.0.0 |
IsLanguageAvailableOption¶
Prop | Type | Description | Since |
---|---|---|---|
language | string | The BCP 47 language tag for the language to check. | 6.0.0 |
IsVoiceAvailableResult¶
Prop | Type | Description | Since |
---|---|---|---|
isAvailable | boolean | Whether or not the voice is available for speech synthesis. | 6.0.0 |
IsVoiceAvailableOption¶
Prop | Type | Description | Since |
---|---|---|---|
voiceId | string | The identifier of the voice to check. | 6.0.0 |
SpeakResult¶
Prop | Type | Description | Since |
---|---|---|---|
utteranceId | string | The identifier of the utterance that is being spoken. | 6.0.0 |
SpeakOptions¶
Prop | Type | Description | Default | Since |
---|---|---|---|---|
language | string | The BCP 47 language tag for the language to use for speech synthesis. On iOS, this option is only used when the voiceId option is not provided. | | 6.0.0 |
pitch | number | The pitch that the utterance will be spoken at. | 1.0 | 6.0.0 |
queueStrategy | QueueStrategy | The queue strategy to use for the utterance. | QueueStrategy.Add | 6.0.0 |
rate | number | The speed at which the utterance will be spoken. | 1.0 | 6.0.0 |
text | string | The text that will be synthesized when the utterance is spoken. | | 6.0.0 |
voiceId | string | The identifier of the voice to use for speech synthesis. | | 6.0.0 |
volume | number | The volume that the utterance will be spoken at. | 1.0 | 6.0.0 |
SynthesizeToFileResult¶
Prop | Type | Description | Since |
---|---|---|---|
path | string | The path to which the synthesized audio file will be saved. The file is available as soon as the end event is emitted. Only available on Android and iOS. | 7.1.0 |
PluginListenerHandle¶
Prop | Type |
---|---|
remove | () => Promise<void> |
BoundaryEvent¶
Prop | Type | Description | Since |
---|---|---|---|
endIndex | number | The index of the last character in the word. | 6.0.0 |
startIndex | number | The index of the first character in the word. | 6.0.0 |
utteranceId | string | The identifier of the utterance that is being spoken. | 6.0.0 |
word | string | The word that was spoken. | 6.0.0 |
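A typical use of the boundary event is highlighting the word that is currently being spoken. The sketch below assumes that startIndex and endIndex refer to character positions in the text passed to speak(...) and that endIndex is inclusive, as the description "index of the last character" suggests; `highlightWord` is a hypothetical helper.

```typescript
// Local mirror of the indices documented for BoundaryEvent (assumption, not imported from the plugin).
interface WordBoundary {
  startIndex: number;
  endIndex: number;
}

// Hypothetical helper: wraps the current word in square brackets.
// Assumes endIndex is inclusive, so the slice ends at endIndex + 1.
function highlightWord(text: string, boundary: WordBoundary): string {
  const before = text.slice(0, boundary.startIndex);
  const word = text.slice(boundary.startIndex, boundary.endIndex + 1);
  const after = text.slice(boundary.endIndex + 1);
  return `${before}[${word}]${after}`;
}

// highlightWord('Hello, World!', { startIndex: 7, endIndex: 11 }) returns 'Hello, [World]!'
```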
EndEvent¶
Prop | Type | Description | Since |
---|---|---|---|
utteranceId | string | The identifier of the utterance that has finished. | 6.0.0 |
ErrorEvent¶
Prop | Type | Description | Since |
---|---|---|---|
message | string | The error message. | 6.0.0 |
utteranceId | string | The identifier of the utterance that caused the error. | 6.0.0 |
StartEvent¶
Prop | Type | Description | Since |
---|---|---|---|
utteranceId | string | The identifier of the utterance that has started. | 6.0.0 |
Type Aliases¶
SynthesizeToFileOptions¶
Enums¶
AudioSessionCategory¶
Members | Value | Description | Since |
---|---|---|---|
Ambient | 'AMBIENT' | The audio session category for ambient sound. Audio from other apps mixes with your audio. Screen locking and the Silent switch silence your audio. | 6.0.0 |
Playback | 'PLAYBACK' | The audio session category for playback. App audio continues with the Silent switch set to silent or when the screen locks. | 6.0.0 |
QueueStrategy¶
Members | Value | Description | Since |
---|---|---|---|
Add | 0 | Add the utterance to the end of the queue. | 6.0.0 |
Flush | 1 | Flush the queue and add the utterance to the beginning of the queue. | 6.0.0 |
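The difference between the two strategies can be illustrated with a plain array standing in for the utterance queue. This is only a model of the behavior described in the table, not the plugin's implementation; `enqueue` is a hypothetical function, and the local `QueueStrategy` object is a stand-in that mirrors the documented member values.

```typescript
// Local stand-in mirroring the documented enum values (assumption, not imported from the plugin).
const QueueStrategy = { Add: 0, Flush: 1 } as const;
type QueueStrategyValue = (typeof QueueStrategy)[keyof typeof QueueStrategy];

// Hypothetical model: returns the queue contents after adding one utterance text.
function enqueue(queue: string[], text: string, strategy: QueueStrategyValue): string[] {
  if (strategy === QueueStrategy.Flush) {
    // Flush: pending utterances are dropped and the new one becomes the first entry.
    return [text];
  }
  // Add: the new utterance is appended after the pending ones.
  return [...queue, text];
}

// enqueue(['first', 'second'], 'third', QueueStrategy.Add)   returns ['first', 'second', 'third']
// enqueue(['first', 'second'], 'third', QueueStrategy.Flush) returns ['third']
```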
Changelog¶
See CHANGELOG.md.
License¶
See LICENSE.