Package 'carelesswhisper' reference manual

Title:	Automatic Speech Recognition using Whisper.cpp
Description:	Wrapper for whisper.cpp to perform automatic speech recognition.
Authors:	mikefc
Maintainer:	mikefc <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.1
Built:	2025-01-14 05:37:14 UTC
Source:	https://github.com/coolbutuseless/carelesswhisper

Audio sample for testing

Description

Audio sample for testing

Usage

jfk

Format

An object of class audioSample of length 176000.
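
As a rough illustration of what this sample holds, the sketch below works out its duration and then transcribes it. It assumes the sample is stored at 16 kHz (the rate `whisper()` expects) and that the `audioSample` object can be passed directly as `snd`; neither is stated in this manual. The package calls are guarded so the block still runs where the package is not installed.

```r
# Assumption: the bundled sample uses the 16 kHz rate that whisper() expects.
sample_rate <- 16000
n_samples   <- 176000          # length reported in the Format section
duration_s  <- n_samples / sample_rate
duration_s                     # duration in seconds under that assumption

if (requireNamespace("carelesswhisper", quietly = TRUE)) {
  library(carelesswhisper)
  ctx <- whisper_init()        # load the bundled tiny model
  # Assumption: the audioSample object is accepted as-is for `snd`.
  print(whisper(ctx, jfk))
}
```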

Record audio from the default input device

Description

Record audio from the default input device

Usage

record_audio(seconds)

Arguments

seconds

Recording length in seconds.

Value

Numeric vector of mono sound data sampled at 16 kHz.

Perform automatic speech recognition of the given sound sample

Description

Perform automatic speech recognition of the given sound sample

Usage

whisper(ctx, snd, params = list(), verbose = FALSE, details = FALSE)

Arguments

`ctx`	Whisper context, previously created with `whisper_init()`.
`snd`	Sound data: 16 kHz mono audio in a numeric vector with all values in the range [-1, 1]. This package includes the function `record_audio()`, which records audio in this format. You could also use `audio::record()` or any other audio package available to you.
`params`	Parameters for whisper. Usually you would create a default set of parameters by calling `whisper_default_params()` and then modify it.
`verbose`	Logical. Be verbose? Default: FALSE.
`details`	Logical. Return a detailed breakdown as a data.frame? Default: FALSE.

Value

Character string containing the recognised text. When `details = TRUE`, a data.frame with a detailed breakdown.

Examples

## Not run: 
  ctx <- whisper_init()  # Initialise the model
  snd <- record_audio(2) # record 2 seconds of audio 
  whisper(ctx, snd)      # perform speech recognition

## End(Not run)
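
Audio that comes from another source may need adjusting before it matches the format described for `snd`. The following is a minimal sketch of one way to do that. `prepare_snd()` is a hypothetical helper, not part of this package; it assumes a matrix holds one channel per column, and it does not resample, so the input must already be at 16 kHz.

```r
# Hypothetical helper (not part of carelesswhisper): coerce raw audio into a
# mono numeric vector with all values in [-1, 1], as described for `snd`.
prepare_snd <- function(x) {
  # Assumption: a matrix holds one channel per column; average to mono.
  if (is.matrix(x)) x <- rowMeans(x)
  x <- as.numeric(x)
  stopifnot(length(x) > 0, all(is.finite(x)))
  # Rescale by the peak only when the signal exceeds the expected range.
  peak <- max(abs(x))
  if (peak > 1) x <- x / peak
  x
}

pcm <- c(0L, 16384L, -32768L, 8192L)  # made-up 16-bit integer samples
snd <- prepare_snd(pcm)
range(snd)
```

Peak scaling is only one option; dividing 16-bit samples by 32768 would preserve the original loudness relative to full scale.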

Fetch a copy of the default whisper parameters

Description

Fetch a copy of the default whisper parameters. The returned list includes:

n_threads: Number of threads to use when processing. Default: 4
translate: Translate from the source language into English? Default: FALSE
language: Language spoken in the audio. Use 'auto' to detect the language automatically. Default: 'en'
max_len: Maximum segment length in characters. Default: 0, meaning no limit. Set to 1 to get one word per segment.

Usage

whisper_default_params()

Value

Named list of default parameters
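
The usual pattern is to fetch the defaults and override only the entries you need. The sketch below shows this with `utils::modifyList()`. The `defaults` list is a stand-in built from the values documented above, and the real list may hold further elements. The guarded part assumes the bundled `jfk` sample can be passed as `snd`.

```r
# Stand-in for whisper_default_params(), using the documented defaults only.
defaults <- list(n_threads = 4, translate = FALSE, language = "en", max_len = 0)

# Override selected entries and leave the rest untouched.
params <- utils::modifyList(defaults, list(language = "auto", max_len = 1))
str(params)

if (requireNamespace("carelesswhisper", quietly = TRUE)) {
  library(carelesswhisper)
  ctx         <- whisper_init()
  real_params <- utils::modifyList(whisper_default_params(),
                                   list(language = "auto", max_len = 1))
  # details = TRUE requests the data.frame breakdown instead of plain text.
  print(whisper(ctx, jfk, params = real_params, details = TRUE))
}
```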

Initialise whisper by loading a model

Description

Initialise whisper by loading a model

Usage

whisper_init(
  model_path = system.file("ggml-tiny.bin", package = "carelesswhisper", mustWork = TRUE),
  verbose = FALSE
)

Arguments

`model_path`	Path to a whisper.cpp model. By default this uses the "ggml-tiny.bin" file included with this package installation, which is a tiny multi-language model. See the README for this package, or the original whisper.cpp documentation, for how to download other models.
`verbose`	Be verbose about model initialisation? Logical. Default: FALSE

Value

whisper context (ctx)
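
To use a model other than the bundled one, pass its location as `model_path`. A minimal sketch follows. The path is hypothetical and stands for wherever you saved a separately downloaded model file. The call is skipped when the file or the package is missing.

```r
# Hypothetical location of a separately downloaded whisper.cpp model file.
model_path <- file.path("models", "ggml-base.bin")

if (requireNamespace("carelesswhisper", quietly = TRUE) && file.exists(model_path)) {
  ctx <- carelesswhisper::whisper_init(model_path = model_path, verbose = TRUE)
} else {
  message("Model file or package not available; skipping initialisation.")
}
```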

Named list of two-letter language codes to use as `language` parameter

Description

Named list of two-letter language codes to use as language parameter

Usage

whisper_lang_codes

Format

An object of class list of length 99.
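
Before setting the `language` parameter it can help to confirm that a code is in this list. The sketch below is hedged: the manual does not say whether the list names are language names or codes, so the hypothetical helper `is_valid_code()` checks both the names and the values. A small stand-in list is used when the package is not installed.

```r
if (requireNamespace("carelesswhisper", quietly = TRUE)) {
  codes <- carelesswhisper::whisper_lang_codes
} else {
  # Stand-in with an assumed shape: language name -> two-letter code.
  codes <- list(english = "en", german = "de", french = "fr")
}

# Hypothetical helper: TRUE if `code` appears among the names or the values.
is_valid_code <- function(code, codes) {
  code %in% c(names(codes), unlist(codes, use.names = FALSE))
}

head(names(codes))          # inspect how the list is keyed
is_valid_code("en", codes)
```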
