Class: LLM::Gemini::Audio

Inherits:
Object
Defined in:
lib/llm/providers/gemini/audio.rb

Overview

The LLM::Gemini::Audio class provides an audio object for transcribing and translating audio files with Gemini. Speech generation is not implemented for this provider.

Examples:

#!/usr/bin/env ruby
require "llm"

llm = LLM.gemini(key: ENV["KEY"])
res = llm.audio.create_transcription(file: "/audio/rocket.mp3")
res.text # => "A dog on a rocket to the moon"

Constructor Details

#initialize(provider) ⇒ LLM::Gemini::Audio

Returns a new Audio object

Parameters:

  • provider

    The provider used to send requests



# File 'lib/llm/providers/gemini/audio.rb', line 19

def initialize(provider)
  @provider = provider
end

Instance Method Details

#create_speech ⇒ Object

Raises:

  • (NotImplementedError)

    This method is not implemented by Gemini



# File 'lib/llm/providers/gemini/audio.rb', line 26

def create_speech
  raise NotImplementedError
end
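
Callers that work with several providers may want to detect the missing speech support instead of crashing. The following is a minimal sketch, not part of the library: AudioStandIn is a hypothetical class that only mirrors the create_speech body shown above, and speech_supported? is an illustrative helper name. One Ruby detail matters here: NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue does not catch it.

```ruby
# Hypothetical stand-in that mirrors the create_speech body shown above.
# It is not the real LLM::Gemini::Audio class.
class AudioStandIn
  def create_speech
    raise NotImplementedError
  end
end

# Illustrative helper: true when the audio object can generate speech.
def speech_supported?(audio)
  audio.create_speech
  true
rescue NotImplementedError
  # Named explicitly on purpose: NotImplementedError is a ScriptError,
  # not a StandardError, so a bare `rescue` would let it propagate.
  false
end

speech_supported?(AudioStandIn.new) # => false
```

Probing by calling the method is only reasonable here because the stand-in raises before doing any work.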

#create_transcription(file:, model: "gemini-1.5-flash", **params) ⇒ LLM::Response

Create an audio transcription

Examples:

llm = LLM.gemini(key: ENV["KEY"])
res = llm.audio.create_transcription(file: "/audio/rocket.mp3")
res.text # => "A dog on a rocket to the moon"

Parameters:

  • file (String, LLM::File, LLM::Response)

    The input audio

  • model (String) (defaults to: "gemini-1.5-flash")

    The model to use

  • params (Hash)

    Other parameters (see Gemini docs)

Returns:

  • (LLM::Response)

    The response, extended with a #text method that returns the transcription




# File 'lib/llm/providers/gemini/audio.rb', line 42

def create_transcription(file:, model: "gemini-1.5-flash", **params)
  res = @provider.complete [
    "Your task is to transcribe the contents of an audio file",
    "Your response should include the transcription, and nothing else",
    LLM.File(file)
  ], params.merge(role: :user, model:)
  res.tap { _1.define_singleton_method(:text) { choices[0].content } }
end
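
The last line of the method attaches #text to the returned response object only, rather than to its class. The sketch below illustrates that pattern in isolation, assuming Ruby 2.7 or later for the _1 block parameter. StubChoice and StubResponse are hypothetical stand-ins for the library's response objects, which are not shown in this page.

```ruby
# Hypothetical stand-ins for the library's response objects.
StubChoice = Struct.new(:content)
StubResponse = Struct.new(:choices)

res = StubResponse.new([StubChoice.new("A dog on a rocket to the moon")])

# Same pattern as the method body: define #text on this one object.
# Inside the block, self is the response, so `choices` resolves to res.choices.
res.tap { _1.define_singleton_method(:text) { choices[0].content } }

res.text # => "A dog on a rocket to the moon"

# Other responses of the same class are unaffected.
other = StubResponse.new([StubChoice.new("unrelated")])
other.respond_to?(:text) # => false
```

Because the method is defined per object, responses produced by other calls on the same provider do not gain a #text method.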

#create_translation(file:, model: "gemini-1.5-flash", **params) ⇒ LLM::Response

Create an English translation of an audio file

Examples:

# Arabic => English
llm = LLM.gemini(key: ENV["KEY"])
res = llm.audio.create_translation(file: "/audio/bismillah.mp3")
res.text # => "In the name of Allah, the Beneficent, the Merciful."

Parameters:

  • file (String, LLM::File, LLM::Response)

    The input audio

  • model (String) (defaults to: "gemini-1.5-flash")

    The model to use

  • params (Hash)

    Other parameters (see Gemini docs)

Returns:

  • (LLM::Response)

    The response, extended with a #text method that returns the translation




# File 'lib/llm/providers/gemini/audio.rb', line 64

def create_translation(file:, model: "gemini-1.5-flash", **params)
  res = @provider.complete [
    "Your task is to translate the contents of an audio file into English",
    "Your response should include the translation, and nothing else",
    LLM.File(file)
  ], params.merge(role: :user, model:)
  res.tap { _1.define_singleton_method(:text) { choices[0].content } }
end
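
Both create_transcription and create_translation build their request options with params.merge(role: :user, model:). Since Hash#merge lets the argument's keys win, a role passed through params is replaced with :user. The sketch below shows that behavior on its own; build_options is a hypothetical helper, temperature is only an illustrative key, and model: model is written out in full (equivalent to the model: shorthand) so the sketch also runs on Ruby versions before 3.1.

```ruby
# Hypothetical helper that reproduces only the option-building step
# of the two methods above. It does not call Gemini.
def build_options(model: "gemini-1.5-flash", **params)
  # Keys given to merge take precedence over the same keys in params.
  params.merge(role: :user, model: model)
end

# `temperature` is an illustrative extra parameter, not a documented one.
opts = build_options(temperature: 0.2, role: :system)

opts[:role]        # => :user (the caller's :system was overridden)
opts[:model]       # => "gemini-1.5-flash"
opts[:temperature] # => 0.2
```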