Streaming
Introduction
The streaming API can be useful when you want to stream a conversation in real time, or when you want to avoid potential read timeouts during the generation of a response.
The stream option can be set to an IO object, or to the value true, to enable streaming. At the end of the request, bot.chat returns the same response as the non-streaming version, which allows you to process the response in the same way:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm)
bot.chat(stream: $stdout) do |prompt|
  prompt.system "You are my math assistant."
  prompt.user "Tell me the answer to 5 + 15"
  prompt.user "Tell me the answer to (5 + 15) * 2"
  prompt.user "Tell me the answer to ((5 + 15) * 2) / 10"
end.to_a
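The stream option can also be set to true rather than an IO object. A minimal sketch of that variant follows; it assumes stream: true merely enables streaming under the hood, so nothing is printed while chunks arrive, and the drained response is processed as usual:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm)
# Assumption for this sketch: stream: true enables streaming without
# directing chunks to an IO, which helps avoid read timeouts while
# the response is generated.
bot.chat "Tell me the answer to 5 + 15", role: :user, stream: true
bot.messages.drain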
Scopes
- Conversation-level
There are three different ways to use the streaming API. It can be enabled for the duration of a conversation by passing the stream option through LLM::Bot#initialize. Note that in this case, we call LLM::Buffer#drain to start the request:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm, stream: $stdout)
bot.chat "Hello", role: :user
bot.messages.drain
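Because the stream option accepts any IO object, the conversation does not have to be streamed to standard output. The sketch below writes the stream to a log file instead; the filename is illustrative:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
# File is an IO subclass, so it is a valid stream target.
file = File.open("stream.log", "w")
bot = LLM::Bot.new(llm, stream: file)
bot.chat "Hello", role: :user
bot.messages.drain
file.close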
- Block-level
The streaming API can be enabled for the duration of a block given to the LLM::Bot#chat method by passing the stream option to it:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm)
bot.chat(stream: $stdout) do |prompt|
  prompt.system "You are my math assistant."
  prompt.user "Tell me the answer to 5 + 15"
end.to_a
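The block form returns the same messages as the non-streaming API, so the array returned by to_a can be processed once the stream has finished. A short sketch, assuming each message responds to role and content as elsewhere in the llm.rb documentation:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm)
messages = bot.chat(stream: $stdout) do |prompt|
  prompt.system "You are my math assistant."
  prompt.user "Tell me the answer to 5 + 15"
end.to_a
# Assumption: each message responds to #role and #content.
messages.each { |message| print "[#{message.role}] ", message.content, "\n" }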
- Single request
The streaming API can also be enabled for a single request by passing the stream option to the chat method without a block. Note that in this case, we call LLM::Buffer#drain to start the request:
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm)
bot.chat "You are my math assistant.", role: :system, stream: $stdout
bot.chat "Tell me the answer to 5 + 15", role: :user
bot.messages.drain
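Once the buffer has been drained, the conversation can be read back from bot.messages just like a non-streaming conversation. For example, assuming messages expose an assistant? predicate as in the llm.rb README, the assistant replies could be printed after the drain:
# Continues the example above: print only the assistant's replies.
bot.messages.select(&:assistant?).each do |message|
  print "[#{message.role}] ", message.content, "\n"
end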