Class: LLM::Gemini::Images

Inherits:
Object
Defined in:
lib/llm/providers/gemini/images.rb

Overview

The LLM::Gemini::Images class provides an images object for interacting with Gemini’s images API. Unlike OpenAI, which can return either URLs or base64-encoded strings, Gemini’s images API always returns an image as a base64-encoded string that can be decoded into binary.

Examples:

#!/usr/bin/env ruby
require "llm"

llm = LLM.gemini(ENV["KEY"])
res = llm.images.create prompt: "A dog on a rocket to the moon"
File.binwrite "rocket.png", res.images[0].binary
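
The overview states that Gemini always returns an image as a base64-encoded string. As a minimal sketch of what that decoding step involves, the snippet below decodes a hand-made payload with core Ruby only. The `decode_image` helper and the sample bytes are hypothetical and not part of the library; judging by the example above, `res.images[0].binary` already hands back decoded bytes, so this is for illustration rather than something callers need to do.

```ruby
# Hypothetical helper: decode a base64-encoded image payload into raw bytes.
# "m0" is the strict base64 directive; it raises ArgumentError on malformed input.
def decode_image(encoded)
  encoded.unpack1("m0")
end

# The 8-byte PNG signature, used here to stand in for real image data.
PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b

encoded = [PNG_SIGNATURE].pack("m0") # what an API payload might look like
bytes = decode_image(encoded)

puts bytes.encoding                    # decoded data is a binary string
puts bytes.start_with?(PNG_SIGNATURE)  # the original bytes round-trip intact
```

Because the decoded string is binary, it should be written with `File.binwrite` (as in the example above) rather than `File.write`, which may apply newline or encoding conversion on some platforms.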

Instance Method Summary

Constructor Details

#initialize(provider) ⇒ LLM::Gemini::Images

Returns a new Images object

Parameters:



# File 'lib/llm/providers/gemini/images.rb', line 24

def initialize(provider)
  @provider = provider
end

Instance Method Details

#create(prompt:, model: "gemini-2.0-flash-exp-image-generation", **params) ⇒ LLM::Response::Image

Note:

The prompt should make it clear you want to generate an image, or you might unexpectedly receive a purely textual response. This is due to how Gemini implements image generation under the hood.

Create an image

Examples:

llm = LLM.gemini(ENV["KEY"])
res = llm.images.create prompt: "A dog on a rocket to the moon"
File.binwrite "rocket.png", res.images[0].binary
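
Since the note above warns that a vague prompt can yield a purely textual response, a caller may want to confirm that an image actually came back before writing it to disk. The following is a rough sketch under one assumption the documentation does not confirm: that `res.images` is empty (or nil) when no image was generated. `first_image!`, `FakeImage`, and `FakeResponse` are hypothetical names used only so the sketch runs without network access.

```ruby
# Stand-ins for the real response objects (hypothetical, for illustration only).
FakeImage = Struct.new(:binary)
FakeResponse = Struct.new(:images)

# Hypothetical helper: return the first image, or raise if the response has none.
# Assumes `images` is empty or nil when Gemini answers with text only.
def first_image!(res)
  image = Array(res.images).first
  if image.nil?
    raise ArgumentError, "no image in response; make the prompt explicitly ask for an image"
  end
  image
end

with_image = FakeResponse.new([FakeImage.new("raw-bytes".b)])
puts first_image!(with_image).binary.bytesize

begin
  first_image!(FakeResponse.new([]))
rescue ArgumentError => e
  puts e.message
end
```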

Parameters:

  • prompt (String)

    The prompt

  • params (Hash)

    Other parameters (see Gemini docs)

Returns:

  • (LLM::Response::Image)

Raises:

See Also:



# File 'lib/llm/providers/gemini/images.rb', line 43

def create(prompt:, model: "gemini-2.0-flash-exp-image-generation", **params)
  req  = Net::HTTP::Post.new("/v1beta/models/#{model}:generateContent?key=#{secret}", headers)
  body = JSON.dump({
    contents: [{parts: {text: prompt}}],
    generationConfig: {responseModalities: ["TEXT", "IMAGE"]}
  }.merge!(params))
  req.body = body
  res = request(http, req)
  LLM::Response::Image.new(res).extend(response_parser)
end
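
The source above folds `params` into the request body with `Hash#merge!`, which is a shallow merge. A minimal sketch of the consequence follows, using a hypothetical `build_body` helper that mirrors only the hash construction (no HTTP involved): a caller-supplied `generationConfig` replaces the default one wholesale, so `responseModalities` has to be restated if it is still wanted. The `temperature` key is used purely as an illustrative parameter.

```ruby
require "json"

# Hypothetical helper mirroring how #create assembles its JSON body.
def build_body(prompt, **params)
  JSON.dump({
    contents: [{parts: {text: prompt}}],
    generationConfig: {responseModalities: ["TEXT", "IMAGE"]}
  }.merge!(params))
end

# Shallow merge: the caller's generationConfig replaces the default entirely,
# so responseModalities is no longer present in the request.
replaced = JSON.parse(build_body("A dog", generationConfig: {temperature: 0.2}))
puts replaced["generationConfig"].key?("responseModalities")

# Restating responseModalities alongside the extra option keeps both.
kept = JSON.parse(build_body("A dog", generationConfig: {
  temperature: 0.2,
  responseModalities: ["TEXT", "IMAGE"]
}))
puts kept["generationConfig"].keys.inspect
```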

#edit(image:, prompt:, model: "gemini-2.0-flash-exp-image-generation", **params) ⇒ LLM::Response::Image

Note:

The prompt should make it clear you want to generate an image, or you might unexpectedly receive a purely textual response. This is due to how Gemini implements image generation under the hood.

Edit an image

Examples:

llm = LLM.gemini(ENV["KEY"])
res = llm.images.edit image: LLM::File("cat.png"), prompt: "Add a hat to the cat"
File.binwrite "hatoncat.png", res.images[0].binary

Parameters:

  • image (LLM::File)

    The image to edit

  • prompt (String)

    The prompt

  • params (Hash)

    Other parameters (see Gemini docs)

Returns:

  • (LLM::Response::Image)

Raises:

See Also:



# File 'lib/llm/providers/gemini/images.rb', line 67

def edit(image:, prompt:, model: "gemini-2.0-flash-exp-image-generation", **params)
  req  = Net::HTTP::Post.new("/v1beta/models/#{model}:generateContent?key=#{secret}", headers)
  body = JSON.dump({
    contents: [{parts: [{text: prompt}, format_content(image)]}],
    generationConfig: {responseModalities: ["TEXT", "IMAGE"]}
  }.merge!(params)).b
  req.body_stream = StringIO.new(body)
  res = request(http, req)
  LLM::Response::Image.new(res).extend(response_parser)
end
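
Unlike #create, the #edit source converts the JSON to a binary string with `String#b` and hands it to the request through `body_stream`. The sketch below isolates those two steps using only the standard library. The `streamed_post` helper, the `/example` path, and the explicit headers are assumptions made for illustration; the library's own `headers` and `request` helpers are not shown on this page, so how it sets these is not known here.

```ruby
require "json"
require "net/http"
require "stringio"

# Hypothetical helper: build a POST whose JSON body is streamed from a StringIO.
def streamed_post(path, payload)
  body = JSON.dump(payload).b # binary (ASCII-8BIT) copy of the JSON text
  req = Net::HTTP::Post.new(path, "Content-Type" => "application/json")
  # Net::HTTP needs Content-Length (or Transfer-Encoding: chunked)
  # when the body is supplied via body_stream.
  req["Content-Length"] = body.bytesize.to_s
  req.body_stream = StringIO.new(body)
  req
end

req = streamed_post("/example", {contents: [{parts: [{text: "Add a hat to the cat"}]}]})
puts req.body_stream.string.encoding
puts req["Content-Length"]
```

Streaming from an IO object means Net::HTTP reads the payload in chunks instead of taking it as a single string, which is presumably why #edit uses it for requests that embed image data.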

#create_variation ⇒ Object

Raises:

  • (NotImplementedError)

    This method is not implemented by Gemini



# File 'lib/llm/providers/gemini/images.rb', line 81

def create_variation
  raise NotImplementedError
end
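
Because #create_variation always raises, code written against several providers may want to guard the call. One detail worth knowing: in Ruby, `NotImplementedError` descends from `ScriptError` rather than `StandardError`, so a bare `rescue` will not catch it. The sketch below shows this with a hypothetical `StubImages` class standing in for the Gemini images object; `try_variation` is likewise an illustrative name, not part of the library.

```ruby
# Hypothetical stand-in that behaves like LLM::Gemini::Images#create_variation.
class StubImages
  def create_variation
    raise NotImplementedError
  end
end

# Hypothetical helper: return the variation result, or nil when the provider
# does not implement variations. NotImplementedError must be rescued by name,
# since a bare `rescue` only catches StandardError and its subclasses.
def try_variation(images)
  images.create_variation
rescue NotImplementedError
  nil
end

result = try_variation(StubImages.new)
puts result.nil? ? "variations are not supported by this provider" : "got a variation"
```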