@capawesome-team/capacitor-speech-recognition¶
Capacitor plugin to transcribe speech into text (also known as speech-to-text).
Features¶
- 🖥️ Cross-platform: Supports Android, iOS and Web.
- 🌐 Multiple Languages: Supports many different languages.
- 🛠 Permissions: Check and request permissions for recording audio.
- 🎙 Events: Listen for events like start, end, speechStart, speechEnd, error, partialResult, and result.
- 🔇 Silence Detection: Automatically detects silence to stop the recording.
- 📊 Silence Threshold: Define what's considered "silence" for your recordings.
- 💬 Contextual Strings: Provide an array of phrases that should be recognized, even if they are not in the system vocabulary.
- 🤝 Compatibility: Compatible with the Speech Synthesis and Native Audio plugins.
- 📦 SPM: Supports Swift Package Manager for iOS.
- 🔁 Up-to-date: Always supports the latest Capacitor version.
- ⭐️ Support: First-class support from the Capawesome Team.
Compatibility¶
Plugin Version | Capacitor Version | Status |
---|---|---|
6.x.x | 6.x.x | Deprecated |
7.x.x | >=7.x.x | Active support |
Installation¶
This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:
npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>
Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.
Next, install the package:
npm install @capawesome-team/capacitor-speech-recognition
npx cap sync
Android¶
Proguard¶
If you are using Proguard, you need to add the following rules to your proguard-rules.pro file:
iOS¶
Privacy Descriptions¶
Add the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription keys to the ios/App/App/Info.plist file. They tell the user why your app is requesting access to speech recognition and the microphone:
<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to transcribe speech into text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone is used to record audio for speech recognition.</string>
Configuration¶
No configuration required for this plugin.
Usage¶
import { SpeechRecognition } from '@capawesome-team/capacitor-speech-recognition';
// Start listening; recognition ends after 2000 ms of silence.
const startListening = async () => {
  await SpeechRecognition.startListening({
    language: 'en-US',
    silenceThreshold: 2000,
  });
};
const stopListening = async () => {
  await SpeechRecognition.stopListening();
};
const checkPermissions = async () => {
  const { audioRecording, speechRecognition } = await SpeechRecognition.checkPermissions();
  return { audioRecording, speechRecognition };
};
const requestPermissions = async () => {
  const { audioRecording, speechRecognition } = await SpeechRecognition.requestPermissions({
    permissions: ['audioRecording', 'speechRecognition'],
  });
  return { audioRecording, speechRecognition };
};
const isAvailable = async () => {
  // The result property is named isAvailable (see IsAvailableResult).
  const result = await SpeechRecognition.isAvailable();
  return result.isAvailable;
};
const isListening = async () => {
  // The result property is named isListening (see IsListeningResult).
  const result = await SpeechRecognition.isListening();
  return result.isListening;
};
const getLanguages = async () => {
  // On some Android devices this promise never resolves (see getLanguages()).
  const { languages } = await SpeechRecognition.getLanguages();
  return languages;
};
const addListeners = () => {
  SpeechRecognition.addListener('start', () => {
    console.log('Speech recognition started');
  });
  SpeechRecognition.addListener('end', () => {
    console.log('Speech recognition ended');
  });
  SpeechRecognition.addListener('error', (event) => {
    console.error('Speech recognition error:', event.message);
  });
  SpeechRecognition.addListener('partialResult', (event) => {
    console.log('Partial result:', event.result);
  });
  SpeechRecognition.addListener('result', (event) => {
    console.log('Final result:', event.result);
  });
  // speechStart and speechEnd are only available on Android and Web.
  SpeechRecognition.addListener('speechStart', () => {
    console.log('User started speaking');
  });
  SpeechRecognition.addListener('speechEnd', () => {
    console.log('User stopped speaking');
  });
};
const removeAllListeners = async () => {
  await SpeechRecognition.removeAllListeners();
};
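The listeners above can also be combined into a single promise for a one-shot recognition. The following is a minimal sketch, not part of the plugin: `OneShotRecognizer`, `RecognitionEvent`, `ListenerHandle`, and `recognizeOnce` are hypothetical names, and the interfaces only mirror the `addListener` and `startListening` shapes documented in the API section so that the example compiles on its own.

```typescript
// Hypothetical structural types mirroring only the plugin members used here.
interface ListenerHandle {
  remove: () => Promise<void>;
}
interface RecognitionEvent {
  result?: string;
  message?: string;
}
interface OneShotRecognizer {
  addListener(
    eventName: 'result' | 'error',
    listenerFunc: (event: RecognitionEvent) => void,
  ): Promise<ListenerHandle>;
  startListening(options?: { language?: string }): Promise<void>;
}

// Resolves with the final result, rejects on an error event,
// and always removes the two listeners it registered.
async function recognizeOnce(recognizer: OneShotRecognizer, language: string): Promise<string> {
  let resolveResult!: (text: string) => void;
  let rejectResult!: (error: Error) => void;
  const outcome = new Promise<string>((resolve, reject) => {
    resolveResult = resolve;
    rejectResult = reject;
  });
  // Avoid an unhandled rejection if startListening() itself throws.
  outcome.catch(() => undefined);

  const handles = [
    await recognizer.addListener('result', (event) => resolveResult(event.result ?? '')),
    await recognizer.addListener('error', (event) =>
      rejectResult(new Error(event.message ?? 'Unknown speech recognition error')),
    ),
  ];
  try {
    await recognizer.startListening({ language });
    return await outcome;
  } finally {
    await Promise.all(handles.map((handle) => handle.remove()));
  }
}
```

In an app, the plugin's `SpeechRecognition` object would be passed as `recognizer`. If its typings are not structurally compatible with the sketch's interface, a thin adapter object would be needed.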
API¶
- getLanguages()
- isAvailable()
- isListening()
- startListening(...)
- stopListening(...)
- checkPermissions()
- requestPermissions(...)
- addListener('end', ...)
- addListener('error', ...)
- addListener('partialResult', ...)
- addListener('result', ...)
- addListener('speechEnd', ...)
- addListener('speechStart', ...)
- addListener('start', ...)
- removeAllListeners()
- Interfaces
- Type Aliases
- Enums
getLanguages()¶
Get the available languages for speech recognition.
Attention: On Android, this method is unfortunately not supported by all devices. If the method is not supported, the promise will never resolve. It's recommended to set a timeout for the promise.
Only available on Android and iOS.
Returns: Promise<GetLanguagesResult>
Since: 6.0.0
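The recommended timeout can be sketched with `Promise.race`. This is one possible approach under stated assumptions, not the plugin's own mechanism: `LanguageProvider` and `getLanguagesWithTimeout` are hypothetical names, and the interface only mirrors the documented `getLanguages()` return shape.

```typescript
// Hypothetical structural type: any object exposing getLanguages() as documented.
interface LanguageProvider {
  getLanguages(): Promise<{ languages: string[] }>;
}

// Rejects if getLanguages() has not settled within timeoutMs milliseconds.
async function getLanguagesWithTimeout(
  provider: LanguageProvider,
  timeoutMs: number,
): Promise<string[]> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`getLanguages() timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  try {
    const { languages } = await Promise.race([provider.getLanguages(), timeout]);
    return languages;
  } finally {
    // Clear the timer so it does not fire after the race has been decided.
    clearTimeout(timer);
  }
}
```

Passing the plugin's `SpeechRecognition` object as `provider` with a timeout of a few seconds lets the caller fall back to a fixed language list on devices where the promise never resolves.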
isAvailable()¶
Check if the speech recognizer is available on the device.
Returns: Promise<IsAvailableResult>
Since: 6.0.0
isListening()¶
Check if the speech recognizer is currently listening.
Returns: Promise<IsListeningResult>
Since: 6.0.0
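A common pattern is to check `isAvailable()` and `isListening()` before starting a session. The sketch below assumes only the documented result shapes. `RecognizerState`, `StartOutcome`, and `startIfIdle` are hypothetical names introduced for illustration.

```typescript
// Hypothetical structural type mirroring the documented method shapes.
interface RecognizerState {
  isAvailable(): Promise<{ isAvailable: boolean }>;
  isListening(): Promise<{ isListening: boolean }>;
  startListening(options?: { language?: string }): Promise<void>;
}

type StartOutcome = 'started' | 'unavailable' | 'already-listening';

// Starts a session only if the recognizer is available and currently idle.
async function startIfIdle(
  recognizer: RecognizerState,
  language: string,
): Promise<StartOutcome> {
  const { isAvailable } = await recognizer.isAvailable();
  if (!isAvailable) {
    return 'unavailable';
  }
  const { isListening } = await recognizer.isListening();
  if (isListening) {
    return 'already-listening';
  }
  await recognizer.startListening({ language });
  return 'started';
}
```

Returning a string outcome instead of throwing keeps the UI logic simple: the caller can show a hint for `'unavailable'` and ignore `'already-listening'`.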
startListening(...)¶
Start listening for speech.
Param | Type |
---|---|
options | StartListeningOptions |
Since: 6.0.0
stopListening(...)¶
Stop listening for speech.
Param | Type |
---|---|
options | StopListeningOptions |
Since: 6.0.0
checkPermissions()¶
Check permissions for the plugin.
Returns: Promise<PermissionStatus>
Since: 6.0.0
requestPermissions(...)¶
requestPermissions(permissions?: SpeechRecognitionPluginPermission | undefined) => Promise<PermissionStatus>
Request permissions for the plugin.
Param | Type |
---|---|
permissions | SpeechRecognitionPluginPermission |
Returns: Promise<PermissionStatus>
Since: 6.0.0
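To avoid prompting for permissions that are already decided, the result of `checkPermissions()` can be filtered before calling `requestPermissions(...)`. The helper below is a sketch under assumptions: `PermissionSnapshot` and `permissionsToRequest` are hypothetical names, the type aliases are local copies of the ones documented under Type Aliases, and treating `'denied'` as not worth re-requesting is a design choice of this example, not documented plugin behavior.

```typescript
// Local copies of the documented type aliases so the sketch is self-contained.
type PermissionState = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';
type SpeechRecognitionPermissionType = 'audioRecording' | 'speechRecognition';

// Hypothetical subset of PermissionStatus; speechRecognition is optional
// because it is documented as iOS-only.
interface PermissionSnapshot {
  audioRecording: PermissionState;
  speechRecognition?: PermissionState;
}

// Returns the permission types that are still in a "prompt" state.
function permissionsToRequest(status: PermissionSnapshot): SpeechRecognitionPermissionType[] {
  const candidates: SpeechRecognitionPermissionType[] = ['audioRecording', 'speechRecognition'];
  return candidates.filter((name) => {
    const state = status[name];
    return state === 'prompt' || state === 'prompt-with-rationale';
  });
}
```

If the returned array is not empty, it can be passed as the `permissions` property of the `requestPermissions(...)` argument.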
addListener('end', ...)¶
Called when the speech recognizer has stopped listening.
Param | Type |
---|---|
eventName | 'end' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('error', ...)¶
addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>
Called when an error occurs.
Param | Type |
---|---|
eventName | 'error' |
listenerFunc | (event: ErrorEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('partialResult', ...)¶
addListener(eventName: 'partialResult', listenerFunc: (event: PartialResultEvent) => void) => Promise<PluginListenerHandle>
Called when a partial result is available.
Param | Type |
---|---|
eventName | 'partialResult' |
listenerFunc | (event: PartialResultEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('result', ...)¶
addListener(eventName: 'result', listenerFunc: (event: ResultEvent) => void) => Promise<PluginListenerHandle>
Called when the final results are available.
Param | Type |
---|---|
eventName | 'result' |
listenerFunc | (event: ResultEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
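Partial and final results are often combined into one live transcript. The class below is a minimal sketch and relies on an assumption the documentation does not state: each `partialResult` event replaces the previous partial text, and each `result` event finalizes the current phrase. `TranscriptBuffer` is a hypothetical name.

```typescript
// Collects finalized phrases plus the latest partial phrase.
// Assumption: partial results supersede each other until a final result arrives.
class TranscriptBuffer {
  private finalized: string[] = [];
  private pending = '';

  // Intended as the 'partialResult' listener.
  onPartialResult(event: { result: string }): void {
    this.pending = event.result;
  }

  // Intended as the 'result' listener.
  onResult(event: { result: string }): void {
    this.finalized.push(event.result);
    this.pending = '';
  }

  // Finalized phrases followed by the current partial phrase, if any.
  get text(): string {
    return [...this.finalized, this.pending]
      .filter((part) => part.length > 0)
      .join(' ');
  }
}
```

The two methods can be registered with `addListener('partialResult', ...)` and `addListener('result', ...)`, for example as `(event) => buffer.onPartialResult(event)`.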
addListener('speechEnd', ...)¶
Called when the user has stopped speaking.
Only available on Android and Web.
Param | Type |
---|---|
eventName | 'speechEnd' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('speechStart', ...)¶
Called when the user has started to speak.
Only available on Android and Web.
Param | Type |
---|---|
eventName | 'speechStart' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener('start', ...)¶
Called when the speech recognizer has started listening.
Param | Type |
---|---|
eventName | 'start' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
removeAllListeners()¶
Remove all listeners for this plugin.
Since: 6.0.0
Interfaces¶
GetLanguagesResult¶
Prop | Type | Description | Since |
---|---|---|---|
languages | string[] | The supported languages for speech recognition as BCP-47 language tags. | 6.0.0 |
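Because the list contains BCP-47 tags, a preferred tag can be matched against it before calling `startListening(...)`. The following sketch uses a simple strategy chosen for this example (exact match, then same primary language subtag, then a fallback). `pickLanguage` and its `fallback` parameter are hypothetical, and the matching is deliberately simpler than full BCP-47 lookup.

```typescript
// Picks the best supported tag for a preferred BCP-47 tag.
// Order: exact match (case-insensitive), same primary language, fallback.
function pickLanguage(preferred: string, supported: string[], fallback = 'en-US'): string {
  const normalize = (tag: string): string => tag.toLowerCase();
  const wanted = normalize(preferred);

  const exact = supported.find((tag) => normalize(tag) === wanted);
  if (exact !== undefined) {
    return exact;
  }

  // Compare only the primary language subtag, e.g. "de" for "de-AT".
  const primary = wanted.split('-')[0];
  const sameLanguage = supported.find((tag) => normalize(tag).split('-')[0] === primary);
  return sameLanguage ?? fallback;
}
```

For example, with `['en-US', 'de-DE']` as the supported list, a preferred tag of `'de-AT'` selects `'de-DE'`, while `'fr-FR'` falls back to `'en-US'`.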
IsAvailableResult¶
Prop | Type | Description | Since |
---|---|---|---|
isAvailable | boolean | Whether or not the speech recognizer is available on the device. | 6.0.0 |
IsListeningResult¶
Prop | Type | Description | Since |
---|---|---|---|
isListening | boolean | Whether or not the speech recognizer is currently listening. | 6.0.0 |
StartListeningOptions¶
Prop | Type | Description | Default | Since |
---|---|---|---|---|
audioSessionCategory | AudioSessionCategory | The audio session category to use for speech recognition. Only available on iOS. | AudioSessionCategory.Record | 7.2.0 |
contextualStrings | string[] | An array of phrases that should be recognized, even if they are not in the system vocabulary. Only available on Android (SDK 33+) and iOS. | | 7.3.0 |
deactivateAudioSessionOnStop | boolean | Whether or not to deactivate your app's audio session on stop. Only available on iOS. | true | 7.2.0 |
language | string | The BCP-47 language tag for the language to use for speech recognition. | | 6.0.0 |
silenceThreshold | number | The number of milliseconds of silence before the speech recognition ends. Only available on Android (SDK 33+) and iOS. | 2000 | 6.0.0 |
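Since several options are platform-specific, one way to keep call sites tidy is to build the options object per platform. This is a sketch under assumptions: `Platform`, `StartOptionsSketch`, and `buildStartOptions` are hypothetical names, the interface is a local subset of `StartListeningOptions`, and the sketch does not check the Android SDK level. How the plugin treats options passed on unsupported platforms is not covered here; the sketch simply omits them.

```typescript
type Platform = 'android' | 'ios' | 'web';

// Local subset of StartListeningOptions so the sketch is self-contained.
interface StartOptionsSketch {
  language: string;
  silenceThreshold?: number;
  contextualStrings?: string[];
  deactivateAudioSessionOnStop?: boolean;
}

// Builds options that only include properties documented for the platform.
function buildStartOptions(
  platform: Platform,
  language: string,
  phrases: string[] = [],
  keepAudioSession = false,
): StartOptionsSketch {
  const options: StartOptionsSketch = { language };
  if (platform === 'web') {
    return options;
  }
  // Documented for Android (SDK 33+) and iOS; 2000 ms is the documented default.
  options.silenceThreshold = 2000;
  if (phrases.length > 0) {
    options.contextualStrings = phrases;
  }
  // Documented as iOS-only; only set when deviating from the default (true).
  if (platform === 'ios' && keepAudioSession) {
    options.deactivateAudioSessionOnStop = false;
  }
  return options;
}
```

The platform value could come from `Capacitor.getPlatform()` in `@capacitor/core`, and the result can be passed to `startListening(...)`.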
StopListeningOptions¶
Prop | Type | Description | Default | Since |
---|---|---|---|---|
deactivateAudioSession | boolean | Whether or not to deactivate your app's audio session. Only available on iOS. | true | 7.2.0 |
PermissionStatus¶
Prop | Type | Description | Since |
---|---|---|---|
audioRecording | PermissionState | Permission state for recording audio. | 7.1.0 |
recordAudio | PermissionState | Permission state for speech recognition. | 6.0.0 |
speechRecognition | PermissionState | Permission state for speech recognition. Only available on iOS. | 7.1.0 |
SpeechRecognitionPluginPermission¶
Prop | Type |
---|---|
permissions | SpeechRecognitionPermissionType[] |
PluginListenerHandle¶
Prop | Type |
---|---|
remove | () => Promise<void> |
ErrorEvent¶
Prop | Type | Description | Since |
---|---|---|---|
message | string | The error message. | 6.0.0 |
PartialResultEvent¶
Prop | Type | Description | Since |
---|---|---|---|
result | string | The partial result of the speech recognition. | 6.0.0 |
ResultEvent¶
Prop | Type | Description | Since |
---|---|---|---|
result | string | The final result of the speech recognition. | 6.0.0 |
Type Aliases¶
PermissionState¶
'prompt' | 'prompt-with-rationale' | 'granted' | 'denied'
SpeechRecognitionPermissionType¶
'audioRecording' | 'speechRecognition'
Enums¶
AudioSessionCategory¶
Members | Value | Description | Since |
---|---|---|---|
Record | 'RECORD' | The category for recording audio while also silencing playback audio. | 7.2.0 |
PlayAndRecord | 'PLAY_AND_RECORD' | The category for recording (input) and playback (output) of audio. | 7.2.0 |
Changelog¶
See CHANGELOG.md.
License¶
See LICENSE.