Changelog
Unreleased
v11.2.0
Changes since v11.1.0.
This release adds LLM::Function#skill? and LLM::Tool#skill? so
callers can inspect whether a function or tool is backed by a skill.
It introduces LLM::Transport::Request as a transport-agnostic request
object so providers no longer depend directly on Net::HTTP request
classes, and adds an optional Curb (libcurl) backend alongside symbolic
transport shortcuts such as transport: :curb.
MCP and A2A clients now accept persistent: true matching provider configuration.
Several fixes land for tool return callback emission, function comparison by
tool call ID, function array filtering, skill tool inheritance, and JSON generator
state compatibility on Ruby 4.
Add
Add
LLM::Function#skill?
Addskill?toLLM::Functionso callers can check whether a function is backed by a skill tool.Add
LLM::Tool.skill?andLLM::Tool#skill?
Add class-levelskill?and instance-levelskill?toLLM::Tool, matching the existingmcp?anda2a?pattern.Add
LLM::Transport::Request
AddLLM::Transport::Requestas a transport-agnostic request object and update providers to build requests without depending directly on Net::HTTP request classes. The built-in Net::HTTP transports still accept existing Net::HTTP request objects through a compatibility bridge, while alternative transports can handle the generic request shape directly.Add optional Curb transport support
AddLLM::Transport::Curb, an optional libcurl-backed transport that can be selected withtransport: :curb. Providers already emitLLM::Transport::Requestobjects, so the Curb backend can execute requests without routing through Net::HTTP.Add symbolic transport shortcuts
Allow providers, MCP HTTP clients, and A2A HTTP clients to accept transport shortcuts such astransport: :curbandtransport: :net_http_persistent.Add persistent HTTP selection to MCP and A2A clients
Allow MCP and A2A HTTP clients to acceptpersistent: true, matching provider configuration and selecting the persistent Net::HTTP transport by default.
Fix
Support JSON generation state on Ruby 4
Handle JSON generator state objects in the standard JSON adapter so schema objects serialize correctly when Ruby 4 calls customto_jsonmethods during provider request generation.Emit tool return callbacks for direct context waits
EmitLLM::Stream#on_tool_returnwhenLLM::Context#waitexecutes pending tool work directly instead of drainingLLM::Stream::Queue.Emit confirmed tool return callbacks once
EmitLLM::Stream#on_tool_returnfor confirmed and cancelled tool calls, and exclude confirmed functions from later waits so mixed confirmed and unconfirmed tool batches do not execute confirmed tools twice.Compare functions by tool call ID
AddLLM::Function#==,#eql?, and#hashso pending function collections can compare tool calls by provider-assigned ID instead of object identity.Preserve function array behavior after filtering
PreserveLLM::Function::Arraybehavior when subtracting function arrays so filtered tool batches can still spawn through the normal function array API.Prevent skills from inheriting skill-backed tools
Exclude skill-backed tools when a skill sub-agent usestools: inherit, preventing skills loaded through a parent context from being recursively exposed to nested skill agents.
v11.1.0
Changes since v11.0.0.
This release adds the inherit directive for skill sub-agents so they can
inherit access to the local, MCP, and A2A tools available to their parent
agent. It introduces class-level required %i[...] declarations to
LLM::Schema and wraps LLM::Function#arguments in LLM::Object for
method-style argument access. The OpenTelemetry tracer now samples all spans
regardless of environment, and the tool-call loop repair step prevents stale
history from being sent on follow-up requests.
Add
Add support for the
inheritdirective in skills
Add support for theinheritdirective so a skill sub-agent can inherit access to the local, MCP, and A2A tools available to its parent agent.Add class-level
required %i[...]support toLLM::Schema
Add class-levelrequired %i[...]declarations toLLM::Schema, so schema classes can mark existing properties as required the same wayLLM::Toolparams already can.Wrap function arguments in
LLM::Object
WrapLLM::Function#argumentsinLLM::Object, so function implementations can read arguments with method-style access while still invoking runners with keyword arguments.
Fix
Ensure all traces are sampled regardless of environment
Explicitly passSamplers::ALWAYS_ONwhen creating the OpenTelemetryTracerProviderso the in-memory exporter always captures every span, regardless of theOTEL_TRACES_SAMPLERenvironment variable.Always close the tool call loop before sending follow-up requests
Add a repair step inContext#talkthat closes assistant tool-call messages without matching tool responses before the next provider request is sent. This prevents stale tool-call history from being sent on follow-up requests, which some providers reject as invalid.
v11.0.0
Changes since v10.0.0.
This release removes several deprecated or unused APIs, including the #chat
alias from contexts and agents, the LLM::Function#register alias, and the
unused positional llm argument from MCP constructors. Generated MCP and A2A
tools are no longer added to the global tool registry by default.
On the additions side, it introduces the A2A (Agent2Agent) protocol client,
a new #ask convenience interface on contexts and agents, one-shot stdio MCP
requests outside #session, LLM::Function#def as a short alias for
LLM::Function#define, LLM::File#exist?, and LLM::Tool.a2a?.
Breaking
Remove the unused
llmargument from MCP clients
Remove the unused positionalllmargument fromLLM::MCP.new,LLM::MCP.stdio,LLM::MCP.http, andLLM.mcp.Stop globally registering generated MCP and A2A tools
Generated tools returned byLLM::Tool.mcp(...)andLLM::Tool.a2a(...)are no longer added to the globalLLM::Tool.registryorLLM::Function.registry. They still work when passed directly to a context or agent, but registry-based lookup now only sees normal loadedLLM::Toolsubclasses.Remove
LLM::Function#register
Remove theLLM::Function#registeralias and preferLLM::Function#defineorLLM::Function#defwhen binding a function to its implementation. Theregisteralias was too easy to confuse with the class-levelLLM::Tool.registerandLLM::Function.registerregistry APIs.Remove the
#chatalias from contexts and agents
Remove theLLM::Context#chatandLLM::Agent#chataliases. Prefer#talkfor all context and agent turns.
Add
Add
LLM::Function#def
AddLLM::Function#defas a short alias forLLM::Function#definewhen binding a function instance to its implementation.Add
LLM::MCP#session
AddLLM::MCP#sessionas an alias forLLM::MCP#run, and prefer it in examples for scoped stdio MCP sessions that should stay alive across discovery and tool calls.Add
#askto contexts and agents
AddLLM::Context#askandLLM::Agent#askas a RubyLLM-compatible convenience interface over#talk.#askaccepts a prompt, optionalwith:attachments, an optionalstream:target, and an optional block for streamed chunks, and returns anLLM::Response.Add
LLM::File#exist?
AddLLM::File#exist?as a small convenience wrapper for checking whether a local file exists on disk.Allow one-shot stdio MCP requests outside
#session
Allowmcp.tools,mcp.prompts,mcp.find_prompt(...), andmcp.call_tool(...)to work outsidemcp.sessionby starting and stopping a stdio transport on demand when needed. This makes stdio MCP usable without an explicit session block, while keepingmcp.sessionas the preferred pattern for efficient, stateful stdio workflows.Add A2A client support
AddLLM::A2A, a client for the Agent2Agent (A2A) protocol with REST and JSON-RPC bindings. Remote agent skills can be exposed asLLM::Toolclasses and used throughLLM::ContextorLLM::Agent, and the client also supports direct messaging, streaming, task operations, push notification configuration, extended agent cards, persistent HTTP transport selection, and optional RESTbase_pathprefixing.
Refactor shared MCP/A2A HTTP transport setup into
LLM::Transport::Utils, and extend
LLM::Transport::StreamDecoder to accept a callback block directly.
- Add
LLM::Tool.a2a?
AddLLM::Tool.a2a?and mark generated A2A-backed tool classes so callers can distinguish them from local or MCP tools.
Fix
Fix context and agent JSON serialization through
LLM.json
FixLLM::Context#to_jsonandLLM::Agent#to_jsonto serialize throughLLM.json.dump(...)instead of plainto_json.Fix block-form ORM agent DSL forwarding
Fix block-formmodel { ... },tools { ... }, andschema { ... }declarations in the ActiveRecord and Sequel agent wrappers so persisted agent models configure the internal agent class the same way asLLM::Agent.Fix missing
skillsin ORM agent wrappers
Fix the ActiveRecord and Sequel agent wrappers to exposeskills, so persisted agent models can declare skills the same way asLLM::Agent.Fix
acts_as_agent#ctxreturn type
Fix the ActiveRecordacts_as_agentwrapper so itsctxhelper returns the wrappedLLM::Agentinstead of returning the underlyingLLM::Contextdirectly.
v10.0.0
Changes since v9.0.0.
This release removes the LLM::Context#respond method, and
also removes the deprecated LLM::Bot alias. All class-level
agent tunables can now be resolved lazily via a Symbol (method name),
or a Proc. The LLM::Agent class can now confirm a tool call
before it happens, and the LLM::Schema class has been extended
to support Array[String,Integer] as a shorthand for
Array[AnyOf[String, Integer]]. The LLM::Stream class has
had its public method surface reduced to help avoid accidental
collisions.
Breaking
Unify context turns under
#talk
RemoveLLM::Context#respondand route responses-mode turns throughLLM::Context#talkwithmode: :responsesinstead.Remove the
LLM::Botalias
Remove the backward-compatibleLLM::Botalias forLLM::Context. UseLLM::Contextdirectly instead.
Add
Add shared option resolution through
LLM::Utils
AddLLM::Utils.resolve_optionfor resolving configured values as literals, procs, symbol-named methods, or duplicated hashes, and use it in agent and ORM option resolution paths.Resolve all class-level agent tunables via Proc
Letmodel,tools,skills,schema,stream, andtracerdeclared with a block be lazily evaluated against the agent instance at initialization time, matching howstreamandtraceralready worked.
Add LLM::Agent#params for direct access to the underlying context
parameters.
Ported from mruby-llm.
Support
Array[...]schema and tool param types
LetLLM::Schemaproperties andLLM::Toolparams acceptArray[...]type declarations, including mixed item unions that are serialized asanyOfarray items.Add
LLM::Provider#key?
Addkey?to providers so callers can check whether a non-blank API key has been configured.Add agent tool confirmation hooks
AddLLM::Agent.confirmandLLM::Agent#on_tool_confirmationso selected tools can be approved or cancelled before execution. Pending tool resolution now relies onLLM::Context#functionsso confirmed tools are not executed twice when mixed with unconfirmed tool calls.Add
LLM::Function#spawn(:call).wait
Add task-shaped sequential execution support for directLLM::Function#spawn(:call).wait.
Fix
- Reduce private internal methods on
LLM::Stream
Removetool_not_foundand__tools__fromLLM::Stream. The__tools__logic is inlined directly into__find__since that was its only caller. Thetool_not_foundutility method was unused externally and added unnecessary surface to LLM::Stream.
Ported from mruby-llm.
v9.0.0
Changes since v8.1.0.
This release deepens llm.rb's transport and cost-tracking surface. It
replaces the old mutable persist! API with constructor-driven transport
selection, removes #call from contexts and agents in favor of explicit
ctx.wait(:call), makes queued stream waits strategy-free, and deletes
the unused LLM::Utils module.
It adds cache read/write token tracking
with corresponding cost components, audio and image token pricing,
LLM::Context#functions? for queue-aware tool loops,
LLM::Agent.stream DSL support, and exposes #stream readers on
contexts and agents.
The HTTP transport layer has been refactored around shared backends so providers, MCP, and custom transports all use the same normalized response interface.
Breaking
Remove
#callas a context and agent tool-loop API
RemoveLLM::Context#call(:functions)andLLM::Agent#call(:functions). Tool loops should usectx.wait(:call)oragent.wait(:call)instead. The ActiveRecord and Sequel wrappers no longer expose#callpassthroughs for stored llm.rb contexts.Make HTTP transport selection constructor-driven
Remove publicpersist!and.persistentmutation APIs from providers, transports, and MCP clients. Select persistent behavior at construction time withpersistent: true,LLM::Transport.net_http,LLM::Transport.net_http_persistent, or an explicittransport:override.Make queued stream waits strategy-free
ChangeLLM::Stream::Queue#waitto resolve queued work by the actual task types already present in the queue instead of accepting an external wait strategy.LLM::Stream#wait(...)remains compatible but now ignores its arguments when delegating to the queue.Remove unused
LLM::Utils
Delete theLLM::Utilsmodule and remove its remaining unused provider includes and top-level require.
Add
Expose
#streamreaders on contexts and agents
Add publicLLM::Context#streamandLLM::Agent#streamaccessors so callers can inspect the active stream object directly.Track cache read and write tokens in usage
Addcache_read_tokensandcache_write_tokenstoLLM::Usageand preserve them through completion usage adaptation and context usage aggregation.Add
LLM::Context#functions?for queue-aware tool loops
Addfunctions?toLLM::Contextand the ActiveRecord and Sequel wrappers so callers can detect pending tool work through either the bound stream queue or unresolved functions, and update the docs to preferwhile ctx.functions?overctx.functions.any?in tool-loop examples.Add
:callas a first-class wait strategy
Add:callto pending-function wait paths soctx.wait(:call)can prefer queued streamed work when present and otherwise fall back to direct sequential function execution throughspawn(:call).wait.Read provider cache usage into completion responses
Read cache read tokens from provider usage metadata, including OpenAIusage.prompt_tokens_detailsand Anthropicusage.cache_read_input_tokens. Read Anthropic cache write tokens fromusage.cache_creation_input_tokens, and expose explicit zero-valuedcache_write_tokensmethods on providers that do not report cache creation usage.Extend cost tracking with cache write pricing
ExtendLLM::Costwithcache_read_costs,cache_write_costs, andreasoning_costsalongside the existinginput_costsandoutput_costs. Add#to_hfor structured cost insight and updatectx.costto calculate all available components from registry pricing data.Price input and output audio separately
Trackinput_audio_tokensandoutput_audio_tokensin usage and includeinput_audio_costsandoutput_audio_costsinLLM::Costso multimodal requests report accurate audio spend.Track image tokens in input cost reporting
Addinput_image_tokensto usage and includeinput_image_costsinLLM::Costusing the model's generic input rate so image-bearing prompts report their input spend.Add
LLM::Agent.streamDSL support
Let agents define a defaultstreamthrough the class DSL, including block-based stream construction so each agent instance can resolve its stream the same waytracerdoes.
Change
Refactor HTTP transports around shared backends
SplitNet::HTTPandNet::HTTP::Persistentinto separateLLM::Transportimplementations, move HTTP-specific request helpers and response execution into the shared transport layer, and let MCP HTTP wrap those transports instead of maintaining a separate transient/persistent client split.Share transport overrides across providers and MCP
Let both provider construction andLLM::MCP.http(...)acceptLLM::Transportinstances or classes as HTTP transport overrides, so callers can reuse the same transport implementation across the runtime.Let custom transports adapt their own response objects
Introduce a transport response interface so custom transports can adapt backend-specific response objects to one normalized shape and have them work with the existing provider execution and error-handling code.
v8.1.0
Changes since v8.0.0.
This release adds Amazon Bedrock provider support through the Converse
API, including AWS SigV4 request signing, event stream decoding,
structured output through schema:, and a models.dev-backed registry.
It exposes llm.models.all for Bedrock via the ListFoundationModels
API and adds LLM::Object#transform_values! for in-place value
transformation. Several Bedrock-specific fixes land as well, including
response id exposure, blank text block suppression in tool turns, and
DSML tool-marker filtering in streamed text.
Add
Add AWS Bedrock provider support
AddLLM.bedrock(...)with Bedrock Converse chat support, AWS SigV4 request signing, Bedrock event stream decoding, structured output support throughschema:, and models.dev-backedbedrock.jsonregistry generation.Add AWS Bedrock Models endpoint support
Addllm.models.allfor Bedrock via the ListFoundationModels API, including SigV4 signing for the control-plane endpoint and normalizedLLM::Modelcollection responses.Add
LLM::Object#transform_values!
LetLLM::Objecttransform stored values in place through#transform_values!.
Fix
Expose response ids on Bedrock completion responses
Read the Bedrock request id intoLLM::Response#idfor completion responses adapted from the Converse API.Avoid blank assistant text blocks in Bedrock tool turns
Stop replaying assistant tool-call messages with empty text content blocks that Bedrock rejects.Suppress Bedrock DSML tool markers in streamed text
Filter\"<|DSML|function_calls\"markers out of streamed Bedrock assistant text so tool-call sentinels do not leak into user-visible output.
v8.0.0
Changes since v7.0.0.
This release adds Unix-fork concurrency for process-isolated tool
execution, extends LLM::Object with #merge and #delete, and drops
Ruby 3.2 support due to segfaults observed with the :fork path. It
promotes LLM::Pipe to the top-level namespace and adds
persistent: true on LLM::MCP.http for direct persistent transport
configuration. LLM::Function#runner is exposed as public API, agent
tracer overrides are supported, fiber execution now uses Fiber.schedule,
missing optional dependencies raise clearer LLM::LoadError guidance,
and ActiveRecord wrapper plumbing is deduplicated between acts_as_llm
and acts_as_agent.
Breaking
- Drop Ruby 3.2 support
Stop supporting Ruby 3.2 due to a segfault observed with the:forktool concurrency strategy.
Add
Add
LLM::Object#merge
LetLLM::Objectreturn a new wrapped object when merging hash-like data through#merge.Add
LLM::Object#delete
LetLLM::Objectdelete keys directly through#delete.
Change
Add fork-based tool concurrency
Add:forkas a new concurrency strategy forLLM::Function#spawn,LLM::Function::Array#wait, andLLM::Agent.concurrencythat runs class-based tools in isolated child processes. Fork-backed tools support tracer callbacks,on_interrupt/on_cancelhooks, andalive?checks. Requires thexchangem for inter-process communication with:fork. This is especially useful for tools that need process isolation, such as running shell commands or handling unsafe data.Promote
LLM::Pipefrom MCP namespace to top-level
MoveLLM::MCP::PipetoLLM::Pipeso the pipe abstraction is available outside MCP internals. The new class adds abinmode:option for binary pipes.LLM::MCP::Commandand related MCP transport code have been updated to useLLM::Pipe.Allow
persistent: trueonLLM::MCP.http
LetLLM::MCP.http(...)enable persistent HTTP transport directly throughpersistent: trueat construction time.Expose
LLM::Function#runneras public API
Promote the internal runner instantiation to a publicrunnermethod onLLM::Function, so callers can inspect or reuse the resolved tool instance that a function wraps.Allow agent instance tracer overrides
LetLLM::Agent.new(..., tracer: ...)override the class-level tracer for that agent instance.Make
:fiberuse scheduler-backed fibers
Change:fibertool execution to useFiber.scheduleand requireFiber.scheduler, instead of wrapping direct calls in raw fibers. This gives:fibera real cooperative concurrency model instead of acting as a thin wrapper around sequential execution.Read stored values from zero-argument
LLM::Objectmethod calls
Let calls likeobj.delete,obj.fetch,obj.merge,obj.key?,obj.dig,obj.slice, orobj.keysreturn a stored value when that method name exists as a key and no arguments are given.Harden
LLM::Objectagainst arbitrary key names
Move internal lookup logic offLLM::Objectinstances and onto the singleton class instead, making stored keys likemethod_missingmore resilient while preserving normal dynamic field access.Deduplicate ActiveRecord wrapper plumbing
Move shared ActiveRecord wrapper defaults and utility methods intoLLM::ActiveRecord, reducing duplication betweenacts_as_llmandacts_as_agent.Raise clearer errors for missing optional runtime dependencies
Route optionalasync,xchan, andnet/http/persistentloads throughLLM.requireso missing runtime gems raiseLLM::LoadErrorwith installation guidance instead of leaking rawLoadErrorexceptions.
Fix
Avoid
RuntimeErrorfromAsync::Task.currentlookups
CheckAsync::Task.current?before reading the current Async task so provider transports fall back toFiber.currentwithout raising when no Async task is active.Serialize
LLM::Objectvalues correctly throughLLM.json
MakeLLM::Object#to_jsoncallLLM.json.dump(to_h, ...)soLLM::Objectvalues serialize through the llm.rb JSON adapter.
v7.0.0
Changes since v6.1.0.
This release turns agent tool-loop limit errors into in-band advisory
returns so the LLM can react to rate limits and continue the loop. It
adds tool_attempts: nil as a way to opt out of advisory tool-limit
returns entirely, and fixes the default provider HTTP path to keep
net-http-persistent optional when not explicitly enabled.
Breaking
Return in-band tool-loop limit errors from agents
Stop raisingLLM::ToolLoopErrorwhen an agent exhausts its tool loop attempt budget, and instead send advisoryLLM::Function::Returnerrors back through the model so the LLM can react to the rate limit in-band and continue the loop.Allow
tool_attempts: nilto disable advisory tool-limit returns
Keep the defaulttool_attemptsbudget at25, but treat an explicittool_attempts: nilas an opt-out that disables advisory tool-limit returns entirely.
Fix
- Keep
net-http-persistentoptional on normal HTTP requests
Stop the default provider HTTP path from loadingnet/http/persistentunless persistent transport support is explicitly enabled.
v6.1.0
Changes since v6.0.0.
This release tightens interrupt and compaction behavior for long-running
contexts. It adds LLM::Buffer#rindex, supports percentage-based token
thresholds in LLM::Compactor, tracks persisted compaction state through
context serialization, reliably interrupts Async-backed requests, preserves
valid tool-call history on cancellation, keeps concurrent skill tool loops
running on streamed agents, and returns zero-valued usage objects when no
provider usage has been recorded yet.
Change
Add
LLM::Buffer#rindex
AddLLM::Buffer#rindexas a direct forward to the underlying message array so callers can find the last matching message index through the buffer API.Support percentage compaction token thresholds
LetLLM::Compactoraccepttoken_threshold:values like\"90%\"so compaction can trigger at a percentage of the active model context window.
Fix
Interrupt Async-backed requests reliably
Track request ownership through the provider transport so contexts use the active Async task when available, lettingctx.interrupt!reliably cancel streamed requests under Async runtimes and surface them asLLM::Interrupt.Preserve valid tool-call history on cancellation
Append cancelled tool-return messages for unresolved tool calls duringctx.interrupt!so follow-up provider requests do not fail with invalid tool-call history after pending tool work is cancelled.Preserve concurrent skill tool loops on streamed agents
Propagate the active agent concurrency through the effective request stream so nested skill agents keep using queuedwait(...)tool execution instead of falling back to direct:callexecution.Track persisted compaction state on contexts
Mark contexts as compacted afterLLM::Compactor#compact!, persist and restore that state through context serialization, and clear it after the next successful model response.Return zero-valued usage objects from contexts
MakeLLM::Context#usageconsistently return anLLM::Object, using a zero-valued usage object when no provider usage has been recorded yet.
v6.0.0
Changes since v5.4.0.
This release simplifies the ORM persistence contract around serialized
data state, removing the assumption of reserved provider, model, and
usage columns. Provider selection must now come from provider: hooks,
model defaults come from context: or agent DSL, and usage is read from the
serialized runtime state. Alongside this breaking change, Sequel JSON and
JSONB persistence is fixed, ractor-backed tools now fire tracer callbacks,
and LLM::RactorError is raised for unsupported ractor tool work.
Change
- Simplify ORM persistence to serialized
datastate
Change the built-in ActiveRecord and Sequel wrappers to treat serializeddataas the persistence contract, instead of assuming reservedprovider,model, and usage columns. Provider selection must now come fromprovider:hooks that resolve a realLLM::Providerinstance, model defaults come fromcontext:or agent DSL, andusageis read from the serialized runtime state.
Fix
Fix Sequel JSON and JSONB persistence
Load Sequel PostgreSQL JSON support whenplugin :llmis configured withformat: :jsonor:jsonb, and wrap structured payloads correctly so persisted context state can be stored in PostgreSQL JSON columns.Trace ractor-backed tool callbacks
Make tool tracers fireon_tool_startandon_tool_finishfor class-based:ractorexecution too, so ractor-backed tool calls show up in tracer callbacks like the other concurrent tool paths.Raise
LLM::RactorErrorfor unsupported ractor tool work
AddLLM::RactorErrorand fail fast when:ractorexecution is requested for unsupported tool types such as skill-backed tools, instead of letting deeper Ruby isolation errors leak out later in execution.Delegate interrupt to concurrent task implementations
MakeLLM::Function::Task#interrupt!delegate to the underlying fork or ractor task when it supports interruption, soctx.interrupt!andtask.interrupt!work correctly for fork- and ractor-backed tool execution.
v5.4.0
Changes since v5.3.0.
This release expands tracer support around agentic execution. It lets
LLM::Agent define scoped tracers through the agent DSL and fixes concurrent
tool execution so those scoped tracers stay attached when work crosses
thread, task, fiber, and skill boundaries.
Change
- Add agent-scoped tracers
LetLLM::Agentclasses definetracer ...ortracer { ... }so an agent can carry its own tracer without replacing the provider's default tracer. The resolved tracer is scoped to that agent's turns, tool loops, and pending tool access. Available through theacts_as_agentand Sequel agent plugintracerDSL too.
Fix
- Preserve scoped tracers across concurrent tool work
Keep agent- and request-scoped tracers attached when tool execution crosses:thread,:task, or:fiberboundaries, including skill execution, so spawned work does not fall back to the provider default tracer.
v5.3.0
Changes since v5.2.1.
This release deepens llm.rb's request-rewriting and tool-definition surface.
It adds transformer lifecycle hooks to LLM::Stream so UIs can surface work
like PII scrubbing before a request is sent, and it adds a more explicit
OmniAI-style tool DSL form with parameter plus separate required
declarations while keeping the older param ... required: true style working.
Change
Add transformer stream lifecycle hooks
Addon_transformandon_transform_finishtoLLM::Streamso UIs can surface request rewriting work such as PII scrubbing before a request is sent to the model.Add a separate
requiredtool DSL form
Addparameteras an alias ofparamand supportrequired %i[...]as a separate declaration, inspired by OmniAI-style tools, while keeping the existingparam ... required: trueform working too.
v5.2.1
Changes since v5.2.0.
This release tightens the streamed queue fix from v5.2.0 for concurrent
workloads. Request-local streams now stay bound long enough for wait to
drain queued work and then clear cleanly so later waits fall back to the
context's configured stream.
Fix
- Reset request-local streams after
waitdrains queued work
Keep per-callstream:bindings alive throughLLM::Context#waitso queued streamed tool work still resolves correctly, then clear the request-local stream after the wait completes to avoid leaking it into later turns.
v5.2.0
Changes since v5.1.0.
This release adds current DeepSeek V4 support through refreshed provider
metadata, including deepseek-v4-flash and deepseek-v4-pro, while fixing
request-local queue handling for concurrent streamed workloads so wait and
interruption use the active per-call stream correctly.
Change
Add
LLM::MCP#runfor scoped MCP client lifecycle
AddLLM::MCP#runso MCP clients can be started for the duration of a block and then stopped automatically, which simplifies the usualstart/stoppattern in examples and application code.Refresh provider model metadata
Add current DeepSeek and OpenAI model metadata todata/and update the Google Gemma model entry to match the current provider naming.
Fix
Reject unsupported DeepSeek multimodal prompt objects early
RaiseLLM::PromptErrorforimage_url,local_file, andremote_filein DeepSeek chat requests instead of sending invalid OpenAI-compatible payloads that the provider rejects at runtime.Preserve DeepSeek reasoning content across tool turns
Replayreasoning_contentwhen serializing prior assistant messages for DeepSeek chat completions, so thinking-mode tool calls can continue into follow-up requests without triggering invalid request errors.Default DeepSeek to
deepseek-v4-flash
ChangeLLM::DeepSeek#default_modeltodeepseek-v4-flashso new contexts and default provider usage align with the current preferred chat model.Use per-call streams when waiting on streamed tool work
Track request-local streams bound throughtalk(..., stream:)andrespond(..., stream:)soLLM::Context#waitand interruption-aware queue handling use the active stream instead of falling back to pending function spawning.
v5.1.0
Changes since v5.0.0.
This release tightens streamed tool execution around the actual request-local
runtime state. It fixes streamed resolution of per-request tools and makes
that streamed path work cleanly with LLM.function(...), MCP tools, bound
tool instances, and normal tool classes.
Fix
Resolve request-local tools during streaming
Resolve streamed tool calls throughLLM::Streamrequest-local tools before falling back to the global registry, so per-request tools and bound tool instances work correctly during streaming.Support
LLM.function(...)and MCP tools in streamed tool resolution
Let streamed tool resolution use the current request tool set, soLLM.function(...), MCP tools, bound tool instances, and normalLLM::Toolclasses all work through the same streamed tool path.
v5.0.0
Changes since v4.23.0.
This release expands llm.rb from an execution runtime into a more explicit
supervision and transformation runtime. It adds context-level guards,
transformers, and loop supervision through LLM::LoopGuard, while deepening
long-lived context behavior through compaction, interruption hooks, and
streamed ctx.spawn(...) tool execution.
Change
Make compactor thresholds explicit
Requiremessage_threshold:andtoken_threshold:to be opted into explicitly, soLLM::Compactoronly compacts automatically when one of those thresholds is configured. Context-window-derived token limits can be computed by the caller when needed.Allow assigning a compactor through
LLM::Context
LetLLM::Contextacceptctx.compactor = ...in addition to the constructorcompactor:option, so compactor config can be assigned or replaced after context initialization.Mark compaction summaries in message metadata
Mark compaction summaries withextra[:compaction]andLLM::Message#compaction?, so applications can detect or hide synthetic summary messages in conversation history.Add cooperative tool interruption hooks
Letctx.interrupt!notify queued tool work throughon_interrupt, so running tools can clean up cooperatively when a context is cancelled.Add
LLM::Contextguards
Add a newguardcapability toLLM::Contextso execution can be supervised at the runtime level. The built-inLLM::LoopGuarddetects repeated tool-call patterns and stops stuck agentic loops through in-bandLLM::GuardErrorreturns.LLM::Agentenables this guard by default.Add
LLM::Contexttransformers
Add a newtransformercapability toLLM::Contextso prompts and params can be rewritten before provider requests are sent. This makes it possible to apply context-wide behaviors such as PII scrubbing or request-level param injection without rewriting everytalkandrespondcall site.
v4.23.0
Changes since v4.22.0.
This release expands llm.rb's runtime surface for long-lived contexts and
stateful tools. It adds built-in context compaction through LLM::Compactor,
lets explicit tools: arrays accept bound LLM::Tool instances, and fixes
OpenAI-compatible no-arg tool schemas for stricter providers such as xAI.
Change
Add
LLM::Compactorfor long-lived contexts
Add built-in context compaction throughLLM::Compactor, so older history can be summarized, retained windows can stay bounded, compaction can run on its ownmodel:, thresholds can be configured explicitly, andLLM::Streamcan observe the lifecycle throughon_compactionandon_compaction_finish.Allow bound tool instances in explicit tool lists
Let explicittools:arrays acceptLLM::Toolinstances such asMyTool.new(foo: 1), so tools can carry bound state without changing the global tool registry model.
Fix
- Fix xAI/OpenAI-compatible no-arg tool schemas
Send an empty object schema for tools without declared parameters instead ofnull, so stricter providers such as xAI accept mixed tool sets that include no-arg tools.
v4.22.0
Changes since v4.21.0.
This release deepens the runtime shape of llm.rb. It reduces helper-method surface on persisted ORM models, expands real ORM coverage, and makes skills behave more like bounded sub-agents with inherited recent context and proper instruction injection.
Change
Reduce ActiveRecord wrapper model surface
Move helper methods such as option resolution, column mapping, serialization, and persistence intoUtilsfor the ActiveRecord wrappers so wrapped models include fewer internal helper methods.Reduce Sequel wrapper model surface
Move helper methods such as option resolution, column mapping, serialization, and persistence intoUtilsfor the Sequel wrappers so wrapped models include fewer internal helper methods.Expand ORM integration coverage
Add broader ActiveRecord and Sequel coverage for persisted context and agent wrappers, including real SQLite-backed records and cassette-backed OpenAI persistence paths.Make skills inherit recent parent context
RunLLM::Skillwith a curated slice of recent parent user and assistant messages, prefixed withRecent context:, so skills behave more like task-scoped sub-agents instead of instruction-only helpers.
Fix
Fix Sequel
plugin :agentload order
Require the shared Sequel plugin support fromLLM::Sequel::Agentsoplugin :agentcan load independently without raisinguninitialized constant LLM::Sequel::Plugin.Make skill execution inherit parent context request settings
RunLLM::Skillthrough a parentLLM::Contextinstead of a bare provider so nested skill agents inherit context-level settings such asmode: :responses,store: false, streaming, and other request defaults, while still keeping skill-local tools and avoiding parent schemas.Keep agent instructions when history is preseeded
InjectLLM::Agentinstructions once unless a system message is already present, so agents and nested skills still get their instructions when they start with inherited non-system context.
v4.21.0
Changes since v4.20.2.
This release expands higher-level composition in llm.rb. It adds Sequel agent
persistence through plugin :agent and introduces directory-backed skills
that load from SKILL.md, resolve named tools, and plug directly into
LLM::Context and LLM::Agent.
Change
Add
plugin :agentfor Sequel models
Add Sequel support forplugin :agent, similar to ActiveRecord'sacts_as_agent, so models can wrapLLM::Agentwith built-in persistence.Load directory-backed skills through
LLM::ContextandLLM::Agent
Addskills:toLLM::Contextandskills ...toLLM::Agentso directories withSKILL.mdcan be loaded, resolved into tools, and run through the normal llm.rb tool path.
v4.20.2
Changes since v4.20.1.
This patch release improves runtime behavior around interruption and mixed concurrency waits. It also rounds out response API uniformity for Google completion responses.
Fix
Expose Google completion response IDs through
.id
AddLLM::Response#idsupport to Google completion responses so tracer and caller code can rely on the same API used by other providers.Track interrupt ownership on the active request
BindLLM::Contextinterruption to the fiber runningtalkorrespondsointerrupt!works correctly when requests are started outside the context's initialization fiber.
Change
- Allow mixed concurrency strategies in
wait(...)
LetLLM::Context#wait,LLM::Stream#wait, andLLM::Agent.concurrencyaccept arrays such as[:thread, :ractor]so mixed tool sets can wait on more than one concurrency strategy.
v4.20.1
Changes since v4.20.0.
This patch release fixes ORM option resolution in the Sequel and
ActiveRecord wrappers. Symbol-based provider: and context: hooks now
resolve correctly, and internal default option constants are referenced
explicitly instead of relying on nested constant lookup.
Fix
Fix symbol-based ORM option hooks for provider and context hashes
Makeprovider:andcontext:resolve symbol hooks through the model in the Sequel plugin and ActiveRecord wrappers instead of falling back to an empty hash.Fix ORM wrapper constant lookup for option defaults
Qualify internalEMPTY_HASH/DEFAULTSreferences in the Sequel plugin and ActiveRecord wrappers so option resolution does not depend on nested constant lookup quirks.
v4.20.0
Changes since v4.19.0.
This release adds better support for tagged prompt content. LLM::Context
can now serialize and restore image_url, local_file, and remote_file
content cleanly, and LLM::Message now exposes helpers for inspecting
tagged image and file attachments.
Change
Round-trip tagged prompt objects through
LLM::Context
TeachLLM::Contextserialization and restore to preserveimage_url,local_file, andremote_filecontent acrossto_json/restore.Add attachment helpers to
LLM::Message
Addimage_url?,image_urls,file?, andfilesso callers can inspect messages for tagged image and file content more directly.
v4.19.0
Changes since v4.18.0.
This release tightens the ActiveRecord and ORM integration layer. It adds
inline agent DSL blocks to acts_as_agent so agent defaults can be defined
where the wrapper is declared, and it exposes the resolved provider through
public llm methods on the ActiveRecord and Sequel wrappers.
Change
Make ORM provider access public through
llm
Expose the resolved provider on the Sequel plugin and the ActiveRecordacts_as_llm/acts_as_agentwrappers through a publicllmmethod.Allow inline agent DSL blocks in
acts_as_agent
Let ActiveRecord models configuremodel,tools,schema,instructions, andconcurrencydirectly inside theacts_as_agentdeclaration block.
v4.18.0
Changes since v4.17.0.
This release improves tracing and tool execution behavior across llm.rb.
It makes provider tracers default to the provider instance, adds
LLM::Provider#with_tracer for scoped overrides, restores tool tracing for
concurrent and streamed tool execution, extends streamed tracing to MCP tools,
and adds symbol-based ORM option hooks alongside experimental ractor tool
concurrency.
Change
Make provider tracers default to the provider instance
Changellm.tracer = ...so it sets a provider default tracer instead of relying on scoped fiber-local state alone. This makes tracer configuration behave more predictably across normal tasks, threads, and fibers that share the same provider instance.Add
LLM::Provider#with_tracerfor scoped overrides
Addwith_traceras the opt-in escape hatch for request- or turn-scoped tracer overrides. Use it when you want temporary tracing on the current fiber without replacing the provider's default tracer.Trace concurrent tool calls outside ractors
Make tool tracing fire correctly when functions run through:thread,:task, or:fiberconcurrency. Experimental:ractorexecution still does not emit tool tracer events.Trace streamed tool calls, including MCP tools
Bind stream metadata throughLLM::Stream#extraso streamed tool calls inherit tracer and model context before they are handed toon_tool_call. This restores tool tracing for streamed MCP and local tool execution.Support symbol-based ORM option hooks
Letprovider:,context:, andtracer:on the Sequel plugin and the ActiveRecordacts_as_llm/acts_as_agentwrappers resolve through model method names as well as procs.Add experimental ractor tool concurrency
Add:ractorsupport toLLM::Function#spawn,LLM::Function::Array#wait,LLM::Stream#wait, andLLM::Agent.concurrencyso class-based tools with ractor-safe arguments and return values can run in Ruby ractors and report their results back into the normal LLM tool-return path. MCP tools are not supported by the current:ractormode, but mixed workloads can still branch ontool.mcp?and choose a supported strategy per tool.:ractoris especially useful for CPU-bound tools, while:task,:fiber, or:threadmay be a better fit for I/O-bound work.
v4.17.0
Changes since v4.16.1.
This release expands agent support across llm.rb. It brings LLM::Agent
closer to LLM::Context, adds configurable automatic tool concurrency
including experimental ractor support for class-based tools,
extends persisted ORM wrappers with more of the context runtime surface and
tracer hooks, and introduces built-in ActiveRecord agent persistence through
acts_as_agent.
Change
Add configurable tool concurrency to
LLM::Agent
Add the class-levelconcurrencyDSL toLLM::Agentso automatic tool loops can run with:call,:thread,:task,:fiber, or experimental:ractorsupport for class-based tools instead of always executing sequentially.Bring
LLM::Agentcloser toLLM::Context
ExpandLLM::Agentso it exposes more of the same runtime surface asLLM::Context, including returns, interruption, mode, cost, context window, structured serialization, and other context-backed helpers, while still auto-managing tool loops.Refresh agent docs and coverage
Update the README and deep dive to explain the current role ofLLM::Agent, add examples that show automatic tool execution and concurrency, and add focused specs for the expanded agent surface and tool-loop behavior.Add ORM tracer hooks for persisted contexts
Addtracer:to both the Sequel plugin andacts_as_llmso models can resolve and assign tracers onto the provider used by their persistedLLM::Context.Bring persisted ORM wrappers closer to
LLM::Context
Expand both the Sequel plugin andacts_as_llmso record-backed contexts expose more of the same runtime surface asLLM::Context, including mode, returns, interruption, prompt helpers, file helpers, and tracer access.Add ActiveRecord agent persistence with
acts_as_agent
Addacts_as_agentfor ActiveRecord models that should wrapLLM::Agent, reusing the same record-backed runtime shape asacts_as_llmwhile letting tool execution be managed by the agent.
v4.16.1
Changes since v4.16.0.
This release tightens ORM persistence by removing an unnecessary JSON
round-trip when restoring structured :json and :jsonb context
payloads.
Change
- Restore structured ORM payloads directly
TeachLLM::Context#restoreto accept parsed data payloads and use that path from the ActiveRecord and Sequel persistence wrappers forformat: :jsonand:jsonb, avoiding a redundantHash -> JSON string -> Hashround-trip on restore.
v4.16.0
Changes since v4.15.0.
This release expands ORM support with built-in ActiveRecord persistence and improves compatibility with OpenAI-compatible gateways, proxies, and self-hosted servers that use non-standard API root paths.
Change
Support OpenAI-compatible base paths
Addbase_path:to provider configuration so OpenAI-compatible endpoints can vary both host and API prefix. This supports providers, proxies, and gateways that keep OpenAI request shapes but use non-standard URL layouts such as DeepInfra's/v1/openai/....Add ActiveRecord context persistence with
acts_as_llm
Add a built-in ActiveRecord wrapper that mirrors the Sequel plugin API so applications can persistLLM::Contextstate on records with default columns, provider/context hooks, validation-backed writes, andformat: :string,:json, or:jsonbstorage.
v4.15.0
Changes since v4.14.0.
Change
Reduce OpenAI stream parser merge overhead
Special-case the most common single-field deltas, streamline incremental tool-call merging, and avoid repeated JSON parse attempts until streamed tool arguments look complete.Cache streaming callback capabilities in parsers
Cache callback support checks once at parser initialization time in the OpenAI, OpenAI Responses, Anthropic, Google, and Ollama stream parsers instead of repeatingrespond_to?checks on hot streaming paths.Reduce OpenAI Responses parser lookup overhead
Special-case the hot Responses API event paths and cache the current output item and content part so streamed output text deltas do less repeated nested lookup work.Add a Sequel context persistence plugin
Addplugin :llmfor Sequel models so apps can persistLLM::Contextstate with default columns and pass provider setup throughprovider:when needed. The plugin now also supportsformat: :string,:json, or:jsonbfor text and native JSON storage when Sequel JSON typecasting is enabled.Improve streaming parser performance
In the local replay-basedstream_parserbenchmark versusv4.14.0(median of 20 samples, 5000 iterations), plain Ruby is a small overall win: the generic eventstream path is about 0.4% faster, the OpenAI stream parser is about 0.5% faster, and the OpenAI Responses parser is about 1.6% faster, with unchanged allocations. Under YJIT on the same benchmark, the generic eventstream path is about 0.9% faster and the OpenAI stream parser is about 0.4% faster, while the OpenAI Responses parser is about 0.7% slower, also with unchanged allocations.
Compared to v4.13.0, the larger v4.14.0 streaming gains still
hold. The generic eventstream path remains dramatically faster than
v4.13.0, the OpenAI stream parser remains modestly faster, and the
OpenAI Responses parser is roughly flat to slightly better depending
on runtime. In other words, current keeps the large eventstream win
from v4.14.0, adds only small incremental changes beyond that, and
does not turn the post-v4.14.0 parser work into another large
benchmark jump.
v4.14.0
Changes since v4.13.0.
This release adds request interruption for contexts, reworks provider HTTP internals for lower-overhead streaming, and fixes MCP clients so parallel tool calls can safely share one connection.
Add
- Add request interruption support
AddLLM::Context#interrupt!,LLM::Context#cancel!, andLLM::Interruptfor interrupting in-flight provider requests, inspired by Go's context cancellation.
Change
Rework provider HTTP transport internals
Rework provider HTTP aroundLLM::Provider::Transport::HTTPwith explicit transient and persistent transport handling.Reduce SSE parser overhead
Dispatch raw parsed values to registered visitors instead of building anEventobject for every streamed line.Reduce provider streaming allocations
Decode streamed provider payloads directly inLLM::Provider::Transport::HTTPbefore handing them to provider parsers, which cuts allocation churn and gives a smaller streaming speed bump.Reduce generic SSE parser allocations
Keep unread event-stream buffer data in place until compaction is worthwhile, which lowers allocation churn in the remaining generic SSE path.Improve streaming parser performance
In the local replay-basedstream_parserbenchmark versusv4.13.0(median of 20 samples, 5000 iterations): Plain Ruby: the generic eventstream path is about 53% faster with about 32% fewer allocations, the OpenAI stream parser is about 11% faster with about 4% fewer allocations, and the OpenAI Responses parser is about 3% faster with unchanged allocations. YJIT on the current parser benchmark harness: the current tree is about 26% faster than non-YJIT on the generic eventstream path, about 18% faster on the OpenAI stream parser, and about 16% faster on the OpenAI Responses parser, with allocations unchanged.
Fix
Support parallel MCP tool calls on one client
Route MCP responses by JSON-RPC id so concurrent tool calls can share one client and transport without mismatching replies.Use explicit MCP non-blocking read errors
UseIO::EAGAINWaitReadablewhile continuing to retry onIO::WaitReadable.
v4.13.0
Changes since v4.12.0.
This release expands MCP prompt support, improves reasoning support in the OpenAI Responses API, and refreshes the docs around llm.rb's runtime model, contexts, and advanced workflows.
Add
- Add
LLM::MCP#promptsandLLM::MCP#find_promptfor MCP prompt support.
Change
- Rework the README around llm.rb as a runtime for AI systems.
- Add a dedicated deep dive guide for providers, contexts, persistence, tools, agents, MCP, tracing, multimodal prompts, and retrieval.
Fix
All of these fixes apply to MCP:
- fix(mcp): raise
LLM::MCP::MismatchErroron mismatched response ids. - fix(mcp): normalize prompt message content while preserving the original payload.
All of these fixes apply to OpenAI's Responses API:
- fix(openai): emit
on_reasoning_contentfor streamed reasoning summaries. - fix(openai): skip
previous_response_idonstore: falsefollow-up calls. - fix(openai): fall back to an empty object schema for tools without params.
- fix(openai): preserve original tool-call payloads on re-sent assistant tool messages.
- fix(openai): emit
output_textfor assistant-authored response content. - fix(openai): return
nilforsystem_fingerprinton normalized response objects.
v4.12.0
Changes since v4.11.1.
This release expands advanced streaming and MCP execution while reframing llm.rb more clearly as a system integration layer for LLMs, tools, MCP sources, and application APIs.
Add
- Add
persistentas an alias forpersist!on providers and MCP transports. - Add
LLM::Stream#on_tool_returnfor observing completed streamed tool work. - Add
LLM::Function::Return#error?.
Change
- Expect advanced streaming callbacks to use
LLM::Streamsubclasses instead of duck-typing them onto arbitrary objects. Basic#<<streaming remains supported.
Fix
- Fix Anthropic tools without params by always emitting
input_schema. - Fix Anthropic tool-only responses to still produce an assistant message.
- Fix Anthropic tool results to use the
userrole. - Fix Anthropic tool input normalization.
v4.11.1
Changes since v4.11.0.
Fix
- Cast OpenTelemetry tool-related values to strings.
Otherwise they're rejected by opentelemetry-sdk as invalid attributes.
v4.11.0
Changes since v4.10.0.
Add
- Add
LLM::Streamfor richer streaming callbacks, includingon_content,on_reasoning_content, andon_tool_callfor concurrent tool execution. - Add
LLM::Stream#waitas a shortcut forqueue.wait. - Add
LLM::Context#waitas a shortcut for the configured stream'swait. - Add
LLM::Context#call(:functions)as a shortcut forfunctions.call. - Add
LLM::Function.registryand enhanced support for MCP tools inLLM::Tool.registryfor tool resolution during streaming. - Add normalized
LLM::Responsefor OpenAI Responses, providingcontent,content!,messages/choices,usage, andreasoning_content. - Add
mode: :responsestoLLM::Contextfor routingtalkthrough the Responses API. - Add
LLM::Context#returnsfor collecting pending tool returns from the context. - Add persistent HTTP connection pooling for repeated MCP tool calls via
LLM.mcp(http: ...).persist!. - Add explicit MCP transport constructors via
LLM::MCP.stdio(...)andLLM::MCP.http(...).
Fix
- Fix Google tool-call handling by synthesizing stable ids when Gemini does not provide a direct tool-call id.
v4.10.0
Changes since v4.9.0.
Add
- Add HTTP transport for MCP with
LLM::MCP::Transport::HTTPfor remote servers - Add JSON Schema union types (
any_of,all_of,one_of) with parser integration - Add JSON Schema type array union support (e.g.,
\"type\": [\"object\", \"null\"]) - Add JSON Schema type inference from
const,enum, ordefaultfields
Change
- Update
LLM::MCPconstructor for exclusivehttp:orstdio:transport - Update
LLM::MCPdocumentation for HTTP transport support
v4.9.0
Changes since v4.8.0.
Add
- Add fiber-based concurrency with
LLM::Function::FiberGroupandLLM::Function::TaskGroupclasses for lightweight async execution. - Add
:thread,:task, and:fiberstrategy parameter toLLM::Function#spawnfor explicit concurrency control. - Add stdio MCP client support, including remote tool discovery and
invocation through
LLM.mcp,LLM::Context, and existing function/tool APIs. - Add model registry support via
LLM::Registry, including model metadata lookup, pricing, modalities, limits, and cost estimation. - Add context access to a model context window via
LLM::Context#context_window. - Add tracking of defined tools in the tool registry.
- Add
LLM::Schema::Enum, enablingEnum[...]as a schema/tool parameter type. - Add top-level Anthropic system instruction support using Anthropic's provider-specific request format.
- Add richer tracing hooks and extra metadata support for LangSmith/OpenTelemetry-style traces.
- Add rack/websocket and Relay-related example work, including MCP-focused examples.
- Add concurrent tool execution with
LLM::Function#spawn,LLM::Function::Array(call,wait,spawn), andLLM::Function::ThreadGroup. - Add
LLM::Function::ThreadGroup#alive?method for non-blocking monitoring of concurrent tool execution. - Add
LLM::Function::ThreadGroup#valuealias forThreadGroup#waitfor consistency with Ruby'sThread#value.
Change
- Rename
LLM::SessiontoLLM::Contextthroughout the codebase to better reflect the concept of a stateful interaction environment. - Rename
LLM::GeminitoLLM::Googleto better reflect provider naming. - Standardize model objects across providers around a smaller common interface.
- Switch registry cost internals from
LLM::EstimatetoLLM::Cost. - Update image generation defaults so OpenAI and xAI consistently return base64-encoded image data by default.
- Update
LLM::Botdeprecation warning from v5.0 to v6.0, giving users more time to migrate toLLM::Context. - Rework the README and screencast documentation to better cover MCP, registry, contexts, prompts, concurrency, providers, and example flow.
- Expand the README with architecture, production, and provider guidance while improving readability and example ordering.
Fix
- Fix local schema
$refresolution inLLM::Schema::Parser. - Fix multiple MCP issues around stdio env handling, request IDs, registry interaction, tool registration, and filtering of MCP tools from the standard tool registry.
- Fix stream parsing issues, including chunk-splitting bugs and safer handling of streamed error responses.
- Fix prompt handling across contexts, agents, and provider adapters so prompt turns remain consistent in history and completions.
- Fix several tool/context issues, including function return wrapping, tool lookup after deserialization, unnamed subclass filtering, and thread-safety around tool registry mutations.
- Fix Google tool-call handling to preserve
thoughtSignature. - Fix
LLM::Tracer::Loggerargument handling. - Fix packaging/docs issues such as registry files in the gemspec and stale provider docs.
- Fix Google provider handling of
nilfunction IDs during context deserialization. - Fix MCP stdio transport by increasing poll timeout for better reliability.
- Fix Google provider to properly cast non-Hash tool results into Hash format for API compatibility.
- Fix schema parser to support recursive normalization of
Array,LLM::Object, and nested structures. - Fix DeepSeek provider to tolerate malformed tool arguments.
- Fix
LLM::Function::TaskGroup#alive?to properly delegate toAsync::Task#alive?. - Fix various RuboCop errors across the codebase.
- Fix DeepSeek provider to handle JSON that might be valid but unexpected.
Notes
Notable merged work in this range includes:
feat(function): add fiber-based concurrency for async environments (#64)feat(mcp): add stdio MCP support (#134)Add LLM::Registry + cost support (#133)Consistent model objects across providers (#131)Add rack + websocket example (#130)feat(gemspec): add changelog URI (#136)feat(function): alias ThreadGroup#wait as ThreadGroup#value (#62)- README and screencast refresh across
#66,#68,#71, and#72 chore(bot): update deprecation warning from v5.0 to v6.0fix(deepseek): tolerate malformed tool argumentsrefactor(context): Rename Session as Context (#70)
Comparison base:
- Latest tag:
v4.8.0(6468f2426ee125823b7ae43b4af507b125f96ffc) - HEAD used for this changelog:
915c48da6fda9bef1554ff613947a6ce26d382e3