@capawesome-team/capacitor-speech-recognition¶

Capacitor plugin to transcribe speech into text.

Features¶

🖥️ Cross-platform: Supports Android, iOS and Web.
🌐 Multiple Languages: Supports many different languages.
🛠 Permissions: Check and request permissions for recording audio.
🎙 Events: Listen for events like start, end, speechStart, speechEnd, error, partialResult, and result.
🔇 Silence Detection: Automatically detects silence to stop the recording.
📊 Silence Threshold: Define what's considered "silence" for your recordings.
🔁 Up-to-date: Always supports the latest Capacitor version.
⭐️ Support: First-class support from the Capawesome Team.

Installation¶

This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:

npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>

Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.

Next, install the package:

npm install @capawesome-team/capacitor-speech-recognition
npx cap sync

iOS¶

Privacy Descriptions¶

Add the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription keys to the ios/App/App/Info.plist file. These strings tell the user why your app is requesting access to speech recognition and the microphone:

<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to transcribe speech into text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone is used to record audio for speech recognition.</string>

Configuration¶

No configuration required for this plugin.

Usage¶

import { SpeechRecognition } from '@capawesome-team/capacitor-speech-recognition';

const startListening = async () => {
  await SpeechRecognition.startListening({
    language: 'en-US',
    silenceThreshold: 2000,
  });
};

const stopListening = async () => {
  await SpeechRecognition.stopListening();
};

const checkPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.checkPermissions();
  return recordAudio;
};

const requestPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.requestPermissions();
  return recordAudio;
};

const isAvailable = async () => {
  // The result property is named `isAvailable` (see IsAvailableResult).
  const result = await SpeechRecognition.isAvailable();
  return result.isAvailable;
};

const isListening = async () => {
  // The result property is named `isListening` (see IsListeningResult).
  const result = await SpeechRecognition.isListening();
  return result.isListening;
};

const getLanguages = async () => {
  const { languages } = await SpeechRecognition.getLanguages();
  return languages;
};

const addListeners = async () => {
  // addListener() returns a promise, so await each registration
  // to make sure the listener is attached before listening starts.
  await SpeechRecognition.addListener('start', () => {
    console.log('Speech recognition started');
  });
  await SpeechRecognition.addListener('end', () => {
    console.log('Speech recognition ended');
  });
  await SpeechRecognition.addListener('error', (event) => {
    console.error('Speech recognition error:', event.message);
  });
  await SpeechRecognition.addListener('partialResult', (event) => {
    console.log('Partial result:', event.result);
  });
  await SpeechRecognition.addListener('result', (event) => {
    console.log('Final result:', event.result);
  });
  await SpeechRecognition.addListener('speechStart', () => {
    console.log('User started speaking');
  });
  await SpeechRecognition.addListener('speechEnd', () => {
    console.log('User stopped speaking');
  });
};

const removeAllListeners = async () => {
  await SpeechRecognition.removeAllListeners();
};

API¶

getLanguages()
isAvailable()
isListening()
startListening(...)
stopListening()
checkPermissions()
requestPermissions()
addListener('end', ...)
addListener('error', ...)
addListener('partialResult', ...)
addListener('result', ...)
addListener('speechEnd', ...)
addListener('speechStart', ...)
addListener('start', ...)
removeAllListeners()
Interfaces
Type Aliases

getLanguages()¶

getLanguages() => Promise<GetLanguagesResult>

Get the available languages for speech recognition.

Attention: On Android, this method is unfortunately not supported by all devices. If the method is not supported, the promise will never resolve. It's recommended to set a timeout for the promise.

Only available on Android and iOS.

Returns: Promise<GetLanguagesResult>

Since: 6.0.0
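
The timeout recommended above could be sketched as follows. This is only an illustration: `withTimeout`, `getLanguagesOrEmpty`, the `LanguageSource` interface and the 5000 ms default are hypothetical names and values, not part of the plugin. The interface mirrors only the documented `getLanguages()` signature, so the snippet stays self-contained.

```typescript
// Hypothetical helper (not part of the plugin): rejects if `promise`
// has not settled after `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms,
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Minimal shape of the documented getLanguages() call (assumption:
// the real plugin object is structurally compatible with it).
interface LanguageSource {
  getLanguages(): Promise<{ languages: string[] }>;
}

// Returns the supported languages, or an empty list if the call
// fails or the device never answers within `timeoutMs`.
async function getLanguagesOrEmpty(
  source: LanguageSource,
  timeoutMs = 5000,
): Promise<string[]> {
  try {
    const { languages } = await withTimeout(source.getLanguages(), timeoutMs);
    return languages;
  } catch {
    return [];
  }
}
```

In an app, the plugin object itself would presumably be passed in, for example `getLanguagesOrEmpty(SpeechRecognition)`. Returning an empty list is one possible fallback; rethrowing the timeout error is just as reasonable if the caller needs to tell the two cases apart.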

isAvailable()¶

isAvailable() => Promise<IsAvailableResult>

Check if the speech recognizer is available on the device.

Returns: Promise<IsAvailableResult>

Since: 6.0.0

isListening()¶

isListening() => Promise<IsListeningResult>

Check if the speech recognizer is currently listening.

Returns: Promise<IsListeningResult>

Since: 6.0.0

startListening(...)¶

startListening(options?: StartListeningOptions | undefined) => Promise<void>

Start listening for speech.

Param	Type
`options`	`StartListeningOptions`

Since: 6.0.0

stopListening()¶

stopListening() => Promise<void>

Stop listening for speech.

Since: 6.0.0

checkPermissions()¶

checkPermissions() => Promise<PermissionStatus>

Check permissions for the plugin.

Returns: Promise<PermissionStatus>

Since: 6.0.0

requestPermissions()¶

requestPermissions() => Promise<PermissionStatus>

Request permissions for the plugin.

Returns: Promise<PermissionStatus>

Since: 6.0.0
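
A common pattern is to call checkPermissions() first and only fall back to requestPermissions() when recording is not yet granted. The sketch below shows one way to do that. `ensureRecordAudioPermission`, `PermissionApi`, `RecordAudioState` and `AudioPermissionStatus` are hypothetical names that mirror the documented `PermissionStatus` and `PermissionState` shapes; they are not exported by the plugin.

```typescript
// Local copies of the documented shapes, renamed to avoid clashing
// with the browser's built-in PermissionState/PermissionStatus types.
type RecordAudioState = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';

interface AudioPermissionStatus {
  recordAudio: RecordAudioState;
}

// Assumption: the plugin object is structurally compatible with this.
interface PermissionApi {
  checkPermissions(): Promise<AudioPermissionStatus>;
  requestPermissions(): Promise<AudioPermissionStatus>;
}

// Hypothetical helper: resolves to true only if audio recording
// ends up granted, asking the user at most once.
async function ensureRecordAudioPermission(api: PermissionApi): Promise<boolean> {
  const current = await api.checkPermissions();
  if (current.recordAudio === 'granted') {
    return true; // Nothing to ask for.
  }
  const requested = await api.requestPermissions();
  return requested.recordAudio === 'granted';
}
```

A caller might run `await ensureRecordAudioPermission(SpeechRecognition)` before `startListening(...)` and show its own explanation to the user when the helper resolves to `false`.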

addListener('end', ...)¶

addListener(eventName: 'end', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the speech recognizer has stopped listening.

Param	Type
`eventName`	`'end'`
`listenerFunc`	`() => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('error', ...)¶

addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>

Called when an error occurs.

Param	Type
`eventName`	`'error'`
`listenerFunc`	`(event: ErrorEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('partialResult', ...)¶

addListener(eventName: 'partialResult', listenerFunc: (event: PartialResultEvent) => void) => Promise<PluginListenerHandle>

Called when a partial result is available.

Param	Type
`eventName`	`'partialResult'`
`listenerFunc`	`(event: PartialResultEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('result', ...)¶

addListener(eventName: 'result', listenerFunc: (event: ResultEvent) => void) => Promise<PluginListenerHandle>

Called when the final result is available.

Param	Type
`eventName`	`'result'`
`listenerFunc`	`(event: ResultEvent) => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0
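
The 'result' and 'error' events can be combined into a single promise that resolves with the final transcript. The following is a minimal sketch, assuming one final 'result' event per listening session. `recognizeOnce`, `RecognizerApi` and `ListenerHandle` are hypothetical names that mirror only the signatures documented here.

```typescript
interface ListenerHandle {
  remove(): Promise<void>;
}

// Subset of the documented API used by this sketch (assumption:
// the plugin object is structurally compatible with it).
interface RecognizerApi {
  startListening(options?: { language?: string; silenceThreshold?: number }): Promise<void>;
  addListener(
    eventName: 'result',
    listenerFunc: (event: { result: string }) => void,
  ): Promise<ListenerHandle>;
  addListener(
    eventName: 'error',
    listenerFunc: (event: { message: string }) => void,
  ): Promise<ListenerHandle>;
}

// Hypothetical helper: starts listening and resolves with the final
// result, or rejects on the first 'error' event. The listeners it
// registered are removed again in both cases.
async function recognizeOnce(api: RecognizerApi, language: string): Promise<string> {
  const handles: ListenerHandle[] = [];
  try {
    return await new Promise<string>((resolve, reject) => {
      const register = async () => {
        // Attach both listeners before starting, so no event is missed.
        handles.push(await api.addListener('result', (event) => resolve(event.result)));
        handles.push(
          await api.addListener('error', (event) => reject(new Error(event.message))),
        );
        await api.startListening({ language });
      };
      register().catch(reject);
    });
  } finally {
    // Remove only the handles created here, not every plugin listener.
    await Promise.all(handles.map((handle) => handle.remove()));
  }
}
```

Removing the individual handles instead of calling removeAllListeners() leaves listeners registered elsewhere in the app untouched.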

addListener('speechEnd', ...)¶

addListener(eventName: 'speechEnd', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the user has stopped speaking.

Only available on Android and Web.

Param	Type
`eventName`	`'speechEnd'`
`listenerFunc`	`() => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('speechStart', ...)¶

addListener(eventName: 'speechStart', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the user has started to speak.

Only available on Android and Web.

Param	Type
`eventName`	`'speechStart'`
`listenerFunc`	`() => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

addListener('start', ...)¶

addListener(eventName: 'start', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the speech recognizer has started listening.

Param	Type
`eventName`	`'start'`
`listenerFunc`	`() => void`

Returns: Promise<PluginListenerHandle>

Since: 6.0.0

removeAllListeners()¶

removeAllListeners() => Promise<void>

Remove all listeners for this plugin.

Since: 6.0.0

Interfaces¶

GetLanguagesResult¶

Prop	Type	Description	Since
`languages`	`string[]`	The supported languages for speech recognition as BCP-47 language tags.	6.0.0
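
Because `languages` holds BCP-47 tags, an app may want to match the user's preferred tag against the list before passing a `language` to startListening(...). The helper below is a hypothetical sketch of such a lookup, not something the plugin provides. It compares tags case-insensitively and falls back to any tag with the same primary language subtag.

```typescript
// Hypothetical helper: picks a supported tag for a preferred BCP-47 tag,
// or returns undefined when nothing suitable is supported.
function pickLanguage(supported: string[], preferred: string): string | undefined {
  const wanted = preferred.toLowerCase();

  // 1. Exact match, ignoring case (e.g. "EN-us" matches "en-US").
  const exact = supported.find((tag) => tag.toLowerCase() === wanted);
  if (exact !== undefined) {
    return exact;
  }

  // 2. Same primary language subtag (e.g. "en-GB" for "en-US").
  const primary = wanted.split('-')[0];
  return supported.find((tag) => tag.toLowerCase().split('-')[0] === primary);
}
```

For example, with a supported list of `['de-DE', 'en-GB']`, a preferred tag of `'en-US'` would select `'en-GB'`, while `'fr-FR'` would yield `undefined`.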

IsAvailableResult¶

Prop	Type	Description	Since
`isAvailable`	`boolean`	Whether or not the speech recognizer is available on the device.	6.0.0

IsListeningResult¶

Prop	Type	Description	Since
`isListening`	`boolean`	Whether or not the speech recognizer is currently listening.	6.0.0

StartListeningOptions¶

Prop	Type	Description	Default	Since
`language`	`string`	The BCP-47 language tag for the language to use for speech recognition.		6.0.0
`silenceThreshold`	`number`	The number of milliseconds of silence before the speech recognition ends. Only available on Android (SDK 33+) and iOS.	`2000`	6.0.0

PermissionStatus¶

Prop	Type	Description	Since
`recordAudio`	`PermissionState`	Permission state for recording audio.	6.0.0

PluginListenerHandle¶

Prop	Type
`remove`	`() => Promise<void>`

ErrorEvent¶

Prop	Type	Description	Since
`message`	`string`	The error message.	6.0.0

PartialResultEvent¶

Prop	Type	Description	Since
`result`	`string`	The partial result of the speech recognition.	6.0.0

ResultEvent¶

Prop	Type	Description	Since
`result`	`string`	The final result of the speech recognition.	6.0.0

Type Aliases¶

PermissionState¶

'prompt' | 'prompt-with-rationale' | 'granted' | 'denied'

Changelog¶

See CHANGELOG.md.

License¶

See LICENSE.