Class: LLM::OpenAI::Audio

Inherits: Object

Defined in: lib/llm/providers/openai/audio.rb

Overview

The LLM::OpenAI::Audio class provides an audio object for interacting with OpenAI’s audio API.

Examples:

llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_speech(input: "A dog on a rocket to the moon")
File.binwrite("rocket.mp3", res.audio.string)

Instance Method Summary

  • #create_speech ⇒ LLM::Response::Audio
  • #create_transcription ⇒ LLM::Response::AudioTranscription
  • #create_translation ⇒ LLM::Response::AudioTranslation

Constructor Details

#initialize(provider) ⇒ LLM::OpenAI::Audio

Returns a new Audio object

Parameters:

  • provider (LLM::Provider)

    The provider instance
# File 'lib/llm/providers/openai/audio.rb', line 18

def initialize(provider)
  @provider = provider
end
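
Note: as the examples in this document show, an Audio instance is normally obtained through the provider rather than constructed directly. A minimal sketch:

llm = LLM.openai(ENV["KEY"])
audio = llm.audio # an LLM::OpenAI::Audio instance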

Instance Method Details

#create_speech(input:, voice: "alloy", model: "gpt-4o-mini-tts", response_format: "mp3", **params) ⇒ LLM::Response::Audio

Create an audio track from text input (text-to-speech)

Examples:

llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_speech(input: "A dog on a rocket to the moon")
File.binwrite("rocket.mp3", res.audio.string)

Parameters:

  • input (String)

    The text input

  • voice (String) (defaults to: "alloy")

    The voice to use

  • model (String) (defaults to: "gpt-4o-mini-tts")

    The model to use

  • response_format (String) (defaults to: "mp3")

    The response format

  • params (Hash)

    Other parameters (see OpenAI docs)

Returns:

  • (LLM::Response::Audio)

Raises:

See Also:

  • OpenAI docs: https://platform.openai.com/docs/api-reference/audio
# File 'lib/llm/providers/openai/audio.rb', line 36

def create_speech(input:, voice: "alloy", model: "gpt-4o-mini-tts", response_format: "mp3", **params)
  req = Net::HTTP::Post.new("/v1/audio/speech", headers)
  req.body = JSON.dump({input:, voice:, model:, response_format:}.merge!(params))
  io = StringIO.new("".b)
  res = request(http, req) { _1.read_body { |chunk| io << chunk } }
  LLM::Response::Audio.new(res).tap { _1.audio = io }
end
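
Because extra keyword arguments are forwarded to the API through params, other options from the OpenAI docs can be passed as well. A sketch, assuming OpenAI's "nova" voice, WAV output, and speed option (none of which appear elsewhere in this document):

llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_speech(
  input: "A dog on a rocket to the moon",
  voice: "nova",          # an alternative voice (see OpenAI docs)
  response_format: "wav", # request WAV instead of the default MP3
  speed: 1.25             # forwarded through params (see OpenAI docs)
)
File.binwrite("rocket.wav", res.audio.string)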

#create_transcription(file:, model: "whisper-1", **params) ⇒ LLM::Response::AudioTranscription

Create an audio transcription

Examples:

llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_transcription(file: LLM::File("/rocket.mp3"))
res.text # => "A dog on a rocket to the moon"

Parameters:

  • file (LLM::File)

    The input audio

  • model (String) (defaults to: "whisper-1")

    The model to use

  • params (Hash)

    Other parameters (see OpenAI docs)

Returns:

  • (LLM::Response::AudioTranscription)

Raises:

See Also:

  • OpenAI docs: https://platform.openai.com/docs/api-reference/audio
# File 'lib/llm/providers/openai/audio.rb', line 56

def create_transcription(file:, model: "whisper-1", **params)
  multi = LLM::Multipart.new(params.merge!(file:, model:))
  req = Net::HTTP::Post.new("/v1/audio/transcriptions", headers)
  req["content-type"] = multi.content_type
  req.body_stream = multi.body
  res = request(http, req)
  LLM::Response::AudioTranscription.new(res).tap { _1.text = _1.body["text"] }
end
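
As with create_speech, additional options from the OpenAI docs pass through params. A sketch, assuming OpenAI's language and temperature options (not part of this method's signature):

llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_transcription(
  file: LLM::File("/rocket.mp3"),
  language: "en",  # ISO-639-1 hint, forwarded through params (see OpenAI docs)
  temperature: 0.0 # forwarded through params (see OpenAI docs)
)
res.text # => "A dog on a rocket to the moon"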

#create_translation(file:, model: "whisper-1", **params) ⇒ LLM::Response::AudioTranslation

Create an audio translation (into English)

Examples:

# Arabic => English
llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_translation(file: LLM::File("/bismillah.mp3"))
res.text # => "In the name of Allah, the Beneficent, the Merciful."

Parameters:

  • file (LLM::File)

    The input audio

  • model (String) (defaults to: "whisper-1")

    The model to use

  • params (Hash)

    Other parameters (see OpenAI docs)

Returns:

  • (LLM::Response::AudioTranslation)

Raises:

See Also:

  • OpenAI docs: https://platform.openai.com/docs/api-reference/audio
# File 'lib/llm/providers/openai/audio.rb', line 78

def create_translation(file:, model: "whisper-1", **params)
  multi = LLM::Multipart.new(params.merge!(file:, model:))
  req = Net::HTTP::Post.new("/v1/audio/translations", headers)
  req["content-type"] = multi.content_type
  req.body_stream = multi.body
  res = request(http, req)
  LLM::Response::AudioTranslation.new(res).tap { _1.text = _1.body["text"] }
end
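
Here too params forwards extra options. A sketch, assuming OpenAI's prompt option, which can guide the translation's style (an assumption based on the OpenAI docs, not this library's documentation):

# Arabic => English
llm = LLM.openai(ENV["KEY"])
res = llm.audio.create_translation(
  file: LLM::File("/bismillah.mp3"),
  prompt: "A Quranic recitation" # forwarded through params (see OpenAI docs)
)
res.text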