@capawesome-team/capacitor-speech-recognition

Capacitor plugin to transcribe speech into text.

Features

  • 🖥️ Cross-platform: Supports Android, iOS and Web.
  • 🌐 Multiple languages: Supports many different languages.
  • 🛠 Permissions: Check and request permissions for recording audio.
  • 🎧 Listening: Check if the speech recognizer is available and currently listening.
  • 🎙 Events: Listen for events like beginningOfSpeech, endOfSpeech, error, listeningState, partialResults, readyForSpeech, and results.
  • 🔇 Silence Detection: Automatically detects silence to stop the recording.
  • 🔁 Up-to-date: Always supports the latest Capacitor version.

Installation

This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:

npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>

Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.

Next, install the package:

npm install @capawesome-team/capacitor-speech-recognition
npx cap sync

iOS

Privacy Descriptions

Add the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription keys to the ios/App/App/Info.plist file. Their values tell the user why your app is requesting access to speech recognition and the microphone:

<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to transcribe speech into text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone is used to record audio for speech recognition.</string>

Configuration

No configuration required for this plugin.

Usage

import { SpeechRecognition } from '@capawesome-team/capacitor-speech-recognition';

const startListening = async () => {
  await SpeechRecognition.startListening({
    language: 'en-US',
    maxResults: 5,
    shouldReturnPartialResults: true,
  });
};

const stopListening = async () => {
  await SpeechRecognition.stopListening();
};

const checkPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.checkPermissions();
  return recordAudio;
};

const requestPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.requestPermissions();
  return recordAudio;
};

const isAvailable = async () => {
  const { available } = await SpeechRecognition.isAvailable();
  return available;
};

const isListening = async () => {
  const { listening } = await SpeechRecognition.isListening();
  return listening;
};

const getSupportedLanguages = async () => {
  const { languages } = await SpeechRecognition.getSupportedLanguages();
  return languages;
};

const addListeners = () => {
  SpeechRecognition.addListener('beginningOfSpeech', () => {
    console.log('User has started to speak');
  });

  SpeechRecognition.addListener('endOfSpeech', () => {
    console.log('User has stopped speaking');
  });

  SpeechRecognition.addListener('error', (event) => {
    console.error(event.message);
  });

  SpeechRecognition.addListener('partialResults', (event) => {
    console.log('Partial results:', event.results);
  });

  SpeechRecognition.addListener('readyForSpeech', () => {
    console.log('Speech recognizer is listening');
  });

  SpeechRecognition.addListener('results', (event) => {
    console.log('Final results:', event.results);
  });
};

const removeAllListeners = async () => {
  await SpeechRecognition.removeAllListeners();
};
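
The individual calls above are usually combined before a recording starts. The following is a minimal sketch of one possible flow, assuming you want to check availability first and request the permission only when it has not been granted yet. The names `SpeechRecognizerLike`, `RecordAudioPermission`, `StartOutcome` and `startWhenAllowed` are hypothetical and not part of the plugin. The interface only mirrors the method signatures documented in this README, so the helper can be exercised with a fake object.

```typescript
// Hypothetical structural type: only the plugin calls used below, with the
// result shapes documented in this README.
type RecordAudioPermission = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';

interface SpeechRecognizerLike {
  isAvailable(): Promise<{ available: boolean }>;
  checkPermissions(): Promise<{ recordAudio: RecordAudioPermission }>;
  requestPermissions(): Promise<{ recordAudio: RecordAudioPermission }>;
  startListening(options?: {
    language?: string;
    maxResults?: number;
    shouldReturnPartialResults?: boolean;
  }): Promise<void>;
}

type StartOutcome = 'started' | 'unavailable' | 'permission-denied';

// Hypothetical helper: start listening only if the recognizer is available
// and the record-audio permission is, or becomes, granted.
async function startWhenAllowed(
  recognizer: SpeechRecognizerLike,
  language: string,
): Promise<StartOutcome> {
  const { available } = await recognizer.isAvailable();
  if (!available) return 'unavailable';

  let { recordAudio } = await recognizer.checkPermissions();
  if (recordAudio !== 'granted') {
    // Ask the user once; the result replaces the previously checked state.
    ({ recordAudio } = await recognizer.requestPermissions());
  }
  if (recordAudio !== 'granted') return 'permission-denied';

  await recognizer.startListening({ language });
  return 'started';
}
```

In an app, the imported SpeechRecognition object would be passed as the first argument, for example `await startWhenAllowed(SpeechRecognition, 'en-US')`, provided its types are compatible with the sketched interface.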

API

getSupportedLanguages()

getSupportedLanguages() => Promise<GetSupportedLanguagesResult>

Get the supported languages for speech recognition.

Only available on Android and iOS.

Returns: Promise<GetSupportedLanguagesResult>

Since: 6.0.0
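
Because the result is a list of BCP-47 tags, it can be used to pick a value for the `language` option. The sketch below is one possible matching strategy, not something the plugin provides: `pickLanguage` is a hypothetical helper that prefers an exact, case-insensitive match and otherwise falls back to any tag with the same primary language subtag.

```typescript
// Hypothetical helper: choose a recognition language from the tags returned
// by getSupportedLanguages(). Returns undefined when nothing matches.
function pickLanguage(supported: string[], preferred: string): string | undefined {
  const wanted = preferred.toLowerCase();

  // 1. Exact match, ignoring case ("en-us" matches "en-US").
  const exact = supported.find((tag) => tag.toLowerCase() === wanted);
  if (exact !== undefined) return exact;

  // 2. Same primary subtag ("en-US" falls back to "en-GB").
  const primary = wanted.split('-')[0];
  return supported.find((tag) => tag.toLowerCase().split('-')[0] === primary);
}
```

Since getSupportedLanguages() is documented as available only on Android and iOS, a caller on the Web would need another source for the list of tags.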


isAvailable()

isAvailable() => Promise<IsAvailableResult>

Check if the speech recognizer is available on the device.

Returns: Promise<IsAvailableResult>

Since: 6.0.0


isListening()

isListening() => Promise<IsListeningResult>

Check if the speech recognizer is currently listening.

Returns: Promise<IsListeningResult>

Since: 6.0.0


startListening(...)

startListening(options?: StartListeningOptions | undefined) => Promise<void>

Start listening for speech.

Param Type
options StartListeningOptions

Since: 6.0.0


stopListening()

stopListening() => Promise<void>

Stop listening for speech.

Since: 6.0.0


checkPermissions()

checkPermissions() => Promise<PermissionStatus>

Check permissions for the plugin.

Returns: Promise<PermissionStatus>

Since: 6.0.0


requestPermissions()

requestPermissions() => Promise<PermissionStatus>

Request permissions for the plugin.

Returns: Promise<PermissionStatus>

Since: 6.0.0


addListener('beginningOfSpeech', ...)

addListener(eventName: 'beginningOfSpeech', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the user has started to speak.

Only available on Android and Web.

Param Type
eventName 'beginningOfSpeech'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('endOfSpeech', ...)

addListener(eventName: 'endOfSpeech', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the user has stopped speaking.

Only available on Android and Web.

Param Type
eventName 'endOfSpeech'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0
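
Since beginningOfSpeech and endOfSpeech are documented as unavailable on iOS, an app may want to guard UI that depends on them. The following sketch encodes only the platform notes from this README; the `Platform` type and both function names are hypothetical. The platform string could come from `Capacitor.getPlatform()` in `@capacitor/core`, which is assumed here to return 'android', 'ios' or 'web'.

```typescript
type Platform = 'android' | 'ios' | 'web';

// beginningOfSpeech / endOfSpeech: documented for Android and Web only.
function supportsSpeechBoundaryEvents(platform: Platform): boolean {
  return platform === 'android' || platform === 'web';
}

// getSupportedLanguages(): documented for Android and iOS only.
function supportsLanguageListing(platform: Platform): boolean {
  return platform === 'android' || platform === 'ios';
}
```

On a platform without these events, the listeningState event described in this API section is a possible alternative signal for updating the UI.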


addListener('error', ...)

addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>

Called when an error occurs.

Param Type
eventName 'error'
listenerFunc (event: ErrorEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('listeningState', ...)

addListener(eventName: 'listeningState', listenerFunc: (event: ListeningStateEvent) => void) => Promise<PluginListenerHandle>

Called when the listening state changes.

Param Type
eventName 'listeningState'
listenerFunc (event: ListeningStateEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('partialResults', ...)

addListener(eventName: 'partialResults', listenerFunc: (event: PartialResultsEvent) => void) => Promise<PluginListenerHandle>

Called when a partial result is available.

Param Type
eventName 'partialResults'
listenerFunc (event: PartialResultsEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('readyForSpeech', ...)

addListener(eventName: 'readyForSpeech', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the speech recognizer is listening for speech.

Param Type
eventName 'readyForSpeech'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('results', ...)

addListener(eventName: 'results', listenerFunc: (event: ResultsEvent) => void) => Promise<PluginListenerHandle>

Called when the final results are available.

Param Type
eventName 'results'
listenerFunc (event: ResultsEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0
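
The partialResults and results events both deliver a `results` string array, so a transcript view needs a rule for combining them. The sketch below is one way to do that with plain state updates. `TranscriptState`, `applyPartial` and `applyFinal` are hypothetical names, and the code assumes that the first array entry is the one to display, which this README does not state.

```typescript
interface TranscriptState {
  committed: string[]; // one entry per finished utterance
  pending: string; // latest partial text, replaced on every partial update
}

const emptyTranscript: TranscriptState = { committed: [], pending: '' };

// Assumption: results[0] is the entry to show; ordering is not documented here.
function applyPartial(state: TranscriptState, results: string[]): TranscriptState {
  return { ...state, pending: results[0] ?? '' };
}

// A final result replaces the pending text and is appended to the transcript.
// If the final array is empty, the last partial text is kept instead.
function applyFinal(state: TranscriptState, results: string[]): TranscriptState {
  const text = results[0] ?? state.pending;
  return {
    committed: text ? [...state.committed, text] : state.committed,
    pending: '',
  };
}
```

The two functions would be called from the partialResults and results listeners with `event.results`, and the returned state rendered by the app.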


removeAllListeners()

removeAllListeners() => Promise<void>

Remove all listeners for this plugin.

Since: 6.0.0


Interfaces

GetSupportedLanguagesResult

Prop Type Description Since
languages string[] The supported languages for speech recognition as BCP-47 language tags. 6.0.0

IsAvailableResult

Prop Type Description Since
available boolean Whether or not the speech recognizer is available on the device. 6.0.0

IsListeningResult

Prop Type Description Since
listening boolean Whether or not the speech recognizer is currently listening. 6.0.0

StartListeningOptions

Prop Type Description Default Since
language string The language to use for speech recognition. 6.0.0
maxResults number The maximum number of results to return. 5 6.0.0
shouldReturnPartialResults boolean Whether or not to receive partial results. true 6.0.0

PermissionStatus

Prop Type Description Since
recordAudio PermissionState Permission state for recording audio. 6.0.0

PluginListenerHandle

Prop Type
remove () => Promise<void>
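
Each addListener call resolves to a handle whose remove() detaches only that listener, which is useful when removeAllListeners() would be too broad. The sketch below shows one way to collect handles and release them together. `ListenerHandleLike` and `removeHandles` are hypothetical names, and the interface mirrors only the `remove` property listed above.

```typescript
// Hypothetical structural type matching the documented handle shape.
interface ListenerHandleLike {
  remove(): Promise<void>;
}

// Hypothetical helper: wait for every pending addListener() promise, then
// remove the listeners. Resolves to the number of removed listeners.
async function removeHandles(pending: Promise<ListenerHandleLike>[]): Promise<number> {
  const handles = await Promise.all(pending);
  await Promise.all(handles.map((handle) => handle.remove()));
  return handles.length;
}
```

A component could push the promise returned by each `SpeechRecognition.addListener(...)` call into an array and pass that array to `removeHandles` when it is torn down.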

ErrorEvent

Prop Type Description Since
message string The error message. 6.0.0

ListeningStateEvent

Prop Type Description Since
listening boolean Whether or not the speech recognizer is listening. 6.0.0

PartialResultsEvent

Prop Type Description Since
results string[] The partial results of the speech recognition. 6.0.0

ResultsEvent

Prop Type Description Since
results string[] The final results of the speech recognition. 6.0.0

Type Aliases

PermissionState

'prompt' | 'prompt-with-rationale' | 'granted' | 'denied'
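
The four states typically call for different reactions in the UI. The mapping below is a sketch of one reasonable policy, not behavior defined by the plugin. `RecordAudioPermission`, `NextStep` and `nextStepFor` are hypothetical names, and treating 'denied' as "send the user to the system settings" is an assumption about how the app wants to recover.

```typescript
type RecordAudioPermission = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';
type NextStep = 'start' | 'request' | 'explain-then-request' | 'open-settings';

// Hypothetical policy: decide what the app does for each permission state.
function nextStepFor(state: RecordAudioPermission): NextStep {
  switch (state) {
    case 'granted':
      return 'start'; // safe to call startListening()
    case 'prompt':
      return 'request'; // call requestPermissions()
    case 'prompt-with-rationale':
      return 'explain-then-request'; // show an explanation, then request
    case 'denied':
      return 'open-settings'; // assumption: requesting again will not help
  }
}
```

The input would be the `recordAudio` value returned by checkPermissions() or requestPermissions().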

Changelog

See CHANGELOG.md.

License

See LICENSE.