Changelog

Unreleased

Changes since v5.4.0.

v5.4.0

Changes since v5.3.0.

This release expands tracer support around agentic execution. It lets LLM::Agent define scoped tracers through the agent DSL and fixes concurrent tool execution so those scoped tracers stay attached when work crosses thread, task, fiber, and skill boundaries.

Change

  • Add agent-scoped tracers
    Let LLM::Agent classes define tracer ... or tracer { ... } so an agent can carry its own tracer without replacing the provider's default tracer. The resolved tracer is scoped to that agent's turns, tool loops, and pending tool access. The same tracer DSL is also available through acts_as_agent and the Sequel agent plugin.
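
A minimal sketch of the two tracer DSL forms above; SupportAgent, BillingAgent, and MyTracer are illustrative stand-ins rather than library classes.

    class SupportAgent < LLM::Agent
      tracer MyTracer.new        # direct form; scoped to this agent's turns and tool loops
    end

    class BillingAgent < LLM::Agent
      tracer { MyTracer.new }    # block form
    end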

Fix

  • Preserve scoped tracers across concurrent tool work
    Keep agent- and request-scoped tracers attached when tool execution crosses :thread, :task, or :fiber boundaries, including skill execution, so spawned work does not fall back to the provider default tracer.

v5.3.0

Changes since v5.2.1.

This release deepens llm.rb's request-rewriting and tool-definition surface. It adds transformer lifecycle hooks to LLM::Stream so UIs can surface work like PII scrubbing before a request is sent. It also adds a more explicit OmniAI-style tool DSL form, with parameter declarations plus a separate required list, while keeping the older param ... required: true style working.

Change

  • Add transformer stream lifecycle hooks
    Add on_transform and on_transform_finish to LLM::Stream so UIs can surface request rewriting work such as PII scrubbing before a request is sent to the model.

  • Add a separate required tool DSL form
    Add parameter as an alias of param and support required %i[...] as a separate declaration, inspired by OmniAI-style tools, while keeping the existing param ... required: true form working too.
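
A minimal sketch of the two tool DSL forms named in the second item above; the tool classes and parameter names are illustrative, and anything beyond the param / parameter / required keywords is an assumption.

    class Weather < LLM::Tool
      # Existing form: requiredness declared inline per parameter.
      param :city, required: true
    end

    class Forecast < LLM::Tool
      # OmniAI-style form: declare parameters, then list the required ones separately.
      parameter :city
      parameter :days
      required %i[city]
    end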

v5.2.1

Changes since v5.2.0.

This release tightens the streamed queue fix from v5.2.0 for concurrent workloads. Request-local streams now stay bound long enough for wait to drain queued work and then clear cleanly so later waits fall back to the context's configured stream.

Fix

  • Reset request-local streams after wait drains queued work
    Keep per-call stream: bindings alive through LLM::Context#wait so queued streamed tool work still resolves correctly, then clear the request-local stream after the wait completes to avoid leaking it into later turns.

v5.2.0

Changes since v5.1.0.

This release adds current DeepSeek V4 support through refreshed provider metadata, including deepseek-v4-flash and deepseek-v4-pro, while fixing request-local queue handling for concurrent streamed workloads so wait and interruption use the active per-call stream correctly.

Change

  • Add LLM::MCP#run for scoped MCP client lifecycle
    Add LLM::MCP#run so MCP clients can be started for the duration of a block and then stopped automatically, which simplifies the usual start/stop pattern in examples and application code (sketched below).

  • Refresh provider model metadata
    Add current DeepSeek and OpenAI model metadata to data/ and update the Google Gemma model entry to match the current provider naming.
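
A minimal sketch of the LLM::MCP#run lifecycle from the first item above; the stdio: value and what happens inside the block are assumptions.

    mcp = LLM.mcp(stdio: "my-mcp-server")   # transport setup; argument shape assumed
    mcp.run do
      # use the client while it runs, e.g. discover tools for a context or agent
    end
    # the client is stopped automatically once the block returns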

Fix

  • Reject unsupported DeepSeek multimodal prompt objects early
    Raise LLM::PromptError for image_url, local_file, and remote_file in DeepSeek chat requests instead of sending invalid OpenAI-compatible payloads that the provider rejects at runtime.

  • Preserve DeepSeek reasoning content across tool turns
    Replay reasoning_content when serializing prior assistant messages for DeepSeek chat completions, so thinking-mode tool calls can continue into follow-up requests without triggering invalid request errors.

  • Default DeepSeek to deepseek-v4-flash
    Change LLM::DeepSeek#default_model to deepseek-v4-flash so new contexts and default provider usage align with the current preferred chat model.

  • Use per-call streams when waiting on streamed tool work
    Track request-local streams bound through talk(..., stream:) and respond(..., stream:) so LLM::Context#wait and interruption-aware queue handling use the active stream instead of falling back to pending function spawning.
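
A minimal sketch of the per-call stream binding and wait behavior from the last item above; the context, stream object, and prompt are illustrative.

    ctx.talk("Summarize the report", stream: my_stream)  # request-local stream for this call
    ctx.wait   # drains queued streamed tool work through the active per-call stream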

v5.1.0

Changes since v5.0.0.

This release tightens streamed tool execution around the actual request-local runtime state. It fixes streamed resolution of per-request tools and makes that streamed path work cleanly with LLM.function(...), MCP tools, bound tool instances, and normal tool classes.

Fix

  • Resolve request-local tools during streaming
    Resolve streamed tool calls through LLM::Stream request-local tools before falling back to the global registry, so per-request tools and bound tool instances work correctly during streaming.

  • Support LLM.function(...) and MCP tools in streamed tool resolution
    Let streamed tool resolution use the current request tool set, so LLM.function(...), MCP tools, bound tool instances, and normal LLM::Tool classes all work through the same streamed tool path.
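
A minimal sketch of a mixed per-request tool set resolved through the streamed path described above; every name here is an illustrative stand-in, and whether tools: is passed per call or on the context is an assumption.

    # my_function built with LLM.function(...), mcp_tool discovered from an MCP client,
    # MyTool an ordinary LLM::Tool class, MyTool.new(...) a bound instance of it.
    tools = [my_function, mcp_tool, MyTool, MyTool.new(region: "eu")]
    ctx.talk("Check status", tools: tools, stream: my_stream)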

v5.0.0

Changes since v4.23.0.

This release expands llm.rb from an execution runtime into a more explicit supervision and transformation runtime. It adds context-level guards, transformers, and loop supervision through LLM::LoopGuard, while deepening long-lived context behavior through compaction, interruption hooks, and streamed ctx.spawn(...) tool execution.

Change

  • Make compactor thresholds explicit
    Require message_threshold: and token_threshold: to be opted into explicitly, so LLM::Compactor only compacts automatically when one of those thresholds is configured. Context-window-derived token limits can be computed by the caller when needed.

  • Allow assigning a compactor through LLM::Context
    Let LLM::Context accept ctx.compactor = ... in addition to the constructor compactor: option, so compactor config can be assigned or replaced after context initialization (sketched below).

  • Mark compaction summaries in message metadata
    Mark compaction summaries with extra[:compaction] and LLM::Message#compaction?, so applications can detect or hide synthetic summary messages in conversation history.

  • Add cooperative tool interruption hooks
    Let ctx.interrupt! notify queued tool work through on_interrupt, so running tools can clean up cooperatively when a context is cancelled.

  • Add LLM::Context guards
    Add a new guard capability to LLM::Context so execution can be supervised at the runtime level. The built-in LLM::LoopGuard detects repeated tool-call patterns and stops stuck agentic loops through in-band LLM::GuardError returns. LLM::Agent enables this guard by default.

  • Add LLM::Context transformers
    Add a new transformer capability to LLM::Context so prompts and params can be rewritten before provider requests are sent. This makes it possible to apply context-wide behaviors such as PII scrubbing or request-level param injection without rewriting every talk and respond call site.
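
A minimal sketch touching the compactor and interruption items above; the threshold value, the way the compactor is constructed, and the ctx.messages collection are assumptions.

    ctx.compactor = LLM::Compactor.new(message_threshold: 50)  # compaction only runs once a threshold is configured
    ctx.messages.reject(&:compaction?)                         # hide synthetic summary messages
    ctx.interrupt!                                             # queued tool work is notified through on_interrupt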

v4.23.0

Changes since v4.22.0.

This release expands llm.rb's runtime surface for long-lived contexts and stateful tools. It adds built-in context compaction through LLM::Compactor, lets explicit tools: arrays accept bound LLM::Tool instances, and fixes OpenAI-compatible no-arg tool schemas for stricter providers such as xAI.

Change

  • Add LLM::Compactor for long-lived contexts
    Add built-in context compaction through LLM::Compactor, so older history can be summarized, retained windows can stay bounded, compaction can run on its own model:, thresholds can be configured explicitly, and LLM::Stream can observe the lifecycle through on_compaction and on_compaction_finish.

  • Allow bound tool instances in explicit tool lists
    Let explicit tools: arrays accept LLM::Tool instances such as MyTool.new(foo: 1), so tools can carry bound state without changing the global tool registry model.
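
A minimal sketch of a bound tool instance in an explicit tools: array, as described in the last item above; MyTool and where the tools: array is passed are illustrative.

    tool = MyTool.new(foo: 1)                # instance state travels with the tool
    ctx.talk("Run the check", tools: [tool])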

Fix

  • Fix xAI/OpenAI-compatible no-arg tool schemas
    Send an empty object schema for tools without declared parameters instead of null, so stricter providers such as xAI accept mixed tool sets that include no-arg tools.

v4.22.0

Changes since v4.21.0.

This release deepens the runtime shape of llm.rb. It reduces helper-method surface on persisted ORM models, expands real ORM coverage, and makes skills behave more like bounded sub-agents with inherited recent context and proper instruction injection.

Change

  • Reduce ActiveRecord wrapper model surface
    Move helper methods such as option resolution, column mapping, serialization, and persistence into Utils for the ActiveRecord wrappers so wrapped models include fewer internal helper methods.

  • Reduce Sequel wrapper model surface
    Move helper methods such as option resolution, column mapping, serialization, and persistence into Utils for the Sequel wrappers so wrapped models include fewer internal helper methods.

  • Expand ORM integration coverage
    Add broader ActiveRecord and Sequel coverage for persisted context and agent wrappers, including real SQLite-backed records and cassette-backed OpenAI persistence paths.

  • Make skills inherit recent parent context
    Run LLM::Skill with a curated slice of recent parent user and assistant messages, prefixed with "Recent context:", so skills behave more like task-scoped sub-agents instead of instruction-only helpers.

Fix

  • Fix Sequel plugin :agent load order
    Require the shared Sequel plugin support from LLM::Sequel::Agent so plugin :agent can load independently without raising uninitialized constant LLM::Sequel::Plugin.

  • Make skill execution inherit parent context request settings
    Run LLM::Skill through a parent LLM::Context instead of a bare provider so nested skill agents inherit context-level settings such as mode: :responses, store: false, streaming, and other request defaults, while still keeping skill-local tools and avoiding parent schemas.

  • Keep agent instructions when history is preseeded
    Inject LLM::Agent instructions once unless a system message is already present, so agents and nested skills still get their instructions when they start with inherited non-system context.

v4.21.0

Changes since v4.20.2.

This release expands higher-level composition in llm.rb. It adds Sequel agent persistence through plugin :agent and introduces directory-backed skills that load from SKILL.md, resolve named tools, and plug directly into LLM::Context and LLM::Agent.

Change

  • Add plugin :agent for Sequel models
    Add Sequel support for plugin :agent, similar to ActiveRecord's acts_as_agent, so models can wrap LLM::Agent with built-in persistence.

  • Load directory-backed skills through LLM::Context and LLM::Agent
    Add skills: to LLM::Context and skills ... to LLM::Agent so directories with SKILL.md can be loaded, resolved into tools, and run through the normal llm.rb tool path.
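
A minimal sketch of directory-backed skills from the last item above; the agent class and directory path are illustrative, and the exact argument shape of the skills DSL is an assumption.

    class ResearchAgent < LLM::Agent
      skills "./skills/research"   # directory containing SKILL.md, resolved into tools
    end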

v4.20.2

Changes since v4.20.1.

This patch release improves runtime behavior around interruption and mixed concurrency waits. It also rounds out response API uniformity for Google completion responses.

Fix

  • Expose Google completion response IDs through .id
    Add LLM::Response#id support to Google completion responses so tracer and caller code can rely on the same API used by other providers.

  • Track interrupt ownership on the active request
    Bind LLM::Context interruption to the fiber running talk or respond so interrupt! works correctly when requests are started outside the context's initialization fiber.

Change

  • Allow mixed concurrency strategies in wait(...)
    Let LLM::Context#wait, LLM::Stream#wait, and LLM::Agent.concurrency accept arrays such as [:thread, :ractor] so mixed tool sets can wait on more than one concurrency strategy.
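
A minimal sketch of the mixed-strategy wait described above; the context and stream objects are illustrative.

    ctx.wait([:thread, :ractor])        # context-level wait across both strategies
    my_stream.wait([:thread, :ractor])  # LLM::Stream#wait accepts the same array form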

v4.20.1

Changes since v4.20.0.

This patch release fixes ORM option resolution in the Sequel and ActiveRecord wrappers. Symbol-based provider: and context: hooks now resolve correctly, and internal default option constants are referenced explicitly instead of relying on nested constant lookup.

Fix

  • Fix symbol-based ORM option hooks for provider and context hashes
    Make provider: and context: resolve symbol hooks through the model in the Sequel plugin and ActiveRecord wrappers instead of falling back to an empty hash.

  • Fix ORM wrapper constant lookup for option defaults
    Qualify internal EMPTY_HASH / DEFAULTS references in the Sequel plugin and ActiveRecord wrappers so option resolution does not depend on nested constant lookup quirks.

v4.20.0

Changes since v4.19.0.

This release adds better support for tagged prompt content. LLM::Context can now serialize and restore image_url, local_file, and remote_file content cleanly, and LLM::Message now exposes helpers for inspecting tagged image and file attachments.

Change

  • Round-trip tagged prompt objects through LLM::Context
    Teach LLM::Context serialization and restore to preserve image_url, local_file, and remote_file content across to_json / restore.

  • Add attachment helpers to LLM::Message
    Add image_url?, image_urls, file?, and files so callers can inspect messages for tagged image and file content more directly.
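
A minimal sketch of the attachment helpers from the last item above; how the message is obtained is an assumption.

    msg = ctx.messages.last
    msg.image_url?    # tagged image_url content present?
    msg.image_urls    # the tagged image URLs
    msg.file?         # local_file / remote_file content present?
    msg.files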

v4.19.0

Changes since v4.18.0.

This release tightens the ActiveRecord and ORM integration layer. It adds inline agent DSL blocks to acts_as_agent so agent defaults can be defined where the wrapper is declared, and it exposes the resolved provider through public llm methods on the ActiveRecord and Sequel wrappers.

Change

  • Make ORM provider access public through llm
    Expose the resolved provider on the Sequel plugin and the ActiveRecord acts_as_llm / acts_as_agent wrappers through a public llm method.

  • Allow inline agent DSL blocks in acts_as_agent
    Let ActiveRecord models configure model, tools, schema, instructions, and concurrency directly inside the acts_as_agent declaration block.
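
A minimal sketch of the inline acts_as_agent DSL block from the second item above, plus the public llm accessor from the first; the model class, column conventions, and the values passed to each DSL method are assumptions.

    class Assistant < ActiveRecord::Base
      acts_as_agent do
        model "gpt-4o-mini"
        instructions "You are a support assistant."
        concurrency :thread
      end
    end

    Assistant.new.llm   # the resolved provider, exposed publicly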

v4.18.0

Changes since v4.17.0.

This release improves tracing and tool execution behavior across llm.rb. It makes provider tracers default to the provider instance, adds LLM::Provider#with_tracer for scoped overrides, restores tool tracing for concurrent and streamed tool execution, extends streamed tracing to MCP tools, and adds symbol-based ORM option hooks alongside experimental ractor tool concurrency.

Change

  • Make provider tracers default to the provider instance
    Change llm.tracer = ... so it sets a provider default tracer instead of relying on scoped fiber-local state alone. This makes tracer configuration behave more predictably across normal tasks, threads, and fibers that share the same provider instance.

  • Add LLM::Provider#with_tracer for scoped overrides
    Add with_tracer as the opt-in escape hatch for request- or turn-scoped tracer overrides. Use it when you want temporary tracing on the current fiber without replacing the provider's default tracer.

  • Trace concurrent tool calls outside ractors
    Make tool tracing fire correctly when functions run through :thread, :task, or :fiber concurrency. Experimental :ractor execution still does not emit tool tracer events.

  • Trace streamed tool calls, including MCP tools
    Bind stream metadata through LLM::Stream#extra so streamed tool calls inherit tracer and model context before they are handed to on_tool_call. This restores tool tracing for streamed MCP and local tool execution.

  • Support symbol-based ORM option hooks
    Let provider:, context:, and tracer: on the Sequel plugin and the ActiveRecord acts_as_llm / acts_as_agent wrappers resolve through model method names as well as procs.

  • Add experimental ractor tool concurrency
    Add :ractor support to LLM::Function#spawn, LLM::Function::Array#wait, LLM::Stream#wait, and LLM::Agent.concurrency so class-based tools with ractor-safe arguments and return values can run in Ruby ractors and report their results back into the normal LLM tool-return path. MCP tools are not supported by the current :ractor mode, but mixed workloads can still branch on tool.mcp? and choose a supported strategy per tool. :ractor is especially useful for CPU-bound tools, while :task, :fiber, or :thread may be a better fit for I/O-bound work.
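
A minimal sketch of per-tool strategy selection for the experimental :ractor mode in the last item above; how pending functions are enumerated, the fn.tool accessor, and the spawn argument shape are assumptions.

    ctx.functions.each do |fn|
      strategy = fn.tool.mcp? ? :thread : :ractor   # MCP tools are not supported in :ractor mode
      fn.spawn(strategy)
    end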

v4.17.0

Changes since v4.16.1.

This release expands agent support across llm.rb. It brings LLM::Agent closer to LLM::Context, adds configurable automatic tool concurrency including experimental ractor support for class-based tools, extends persisted ORM wrappers with more of the context runtime surface and tracer hooks, and introduces built-in ActiveRecord agent persistence through acts_as_agent.

Change

  • Add configurable tool concurrency to LLM::Agent
    Add the class-level concurrency DSL to LLM::Agent so automatic tool loops can run with :call, :thread, :task, :fiber, or experimental :ractor support for class-based tools instead of always executing sequentially (sketched below).

  • Bring LLM::Agent closer to LLM::Context
    Expand LLM::Agent so it exposes more of the same runtime surface as LLM::Context, including returns, interruption, mode, cost, context window, structured serialization, and other context-backed helpers, while still auto-managing tool loops.

  • Refresh agent docs and coverage
    Update the README and deep dive to explain the current role of LLM::Agent, add examples that show automatic tool execution and concurrency, and add focused specs for the expanded agent surface and tool-loop behavior.

  • Add ORM tracer hooks for persisted contexts
    Add tracer: to both the Sequel plugin and acts_as_llm so models can resolve and assign tracers onto the provider used by their persisted LLM::Context.

  • Bring persisted ORM wrappers closer to LLM::Context
    Expand both the Sequel plugin and acts_as_llm so record-backed contexts expose more of the same runtime surface as LLM::Context, including mode, returns, interruption, prompt helpers, file helpers, and tracer access.

  • Add ActiveRecord agent persistence with acts_as_agent
    Add acts_as_agent for ActiveRecord models that should wrap LLM::Agent, reusing the same record-backed runtime shape as acts_as_llm while letting tool execution be managed by the agent.
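
A minimal sketch of the class-level concurrency DSL from the first item above; the agent class is illustrative.

    class PlannerAgent < LLM::Agent
      concurrency :task   # automatic tool loops spawn tools with the :task strategy instead of running them sequentially
    end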

v4.16.1

Changes since v4.16.0.

This release tightens ORM persistence by removing an unnecessary JSON round-trip when restoring structured :json and :jsonb context payloads.

Change

  • Restore structured ORM payloads directly
    Teach LLM::Context#restore to accept parsed data payloads and use that path from the ActiveRecord and Sequel persistence wrappers for format: :json and :jsonb, avoiding a redundant Hash -> JSON string -> Hash round-trip on restore.

v4.16.0

Changes since v4.15.0.

This release expands ORM support with built-in ActiveRecord persistence and improves compatibility with OpenAI-compatible gateways, proxies, and self-hosted servers that use non-standard API root paths.

Change

  • Support OpenAI-compatible base paths
    Add base_path: to provider configuration so OpenAI-compatible endpoints can vary both host and API prefix. This supports providers, proxies, and gateways that keep OpenAI request shapes but use non-standard URL layouts such as DeepInfra's /v1/openai/... prefix (sketched below).

  • Add ActiveRecord context persistence with acts_as_llm
    Add a built-in ActiveRecord wrapper that mirrors the Sequel plugin API so applications can persist LLM::Context state on records with default columns, provider/context hooks, validation-backed writes, and format: :string, :json, or :jsonb storage.
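
A minimal sketch of base_path: from the first item above; the constructor, key, and host values are assumptions, while base_path: and the /v1/openai prefix come from the entry itself.

    llm = LLM.openai(
      key: ENV["KEY"],
      host: "api.deepinfra.com",   # OpenAI-compatible gateway host
      base_path: "/v1/openai"      # non-standard API prefix
    )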

v4.15.0

Changes since v4.14.0.

Change

  • Reduce OpenAI stream parser merge overhead
    Special-case the most common single-field deltas, streamline incremental tool-call merging, and avoid repeated JSON parse attempts until streamed tool arguments look complete.

  • Cache streaming callback capabilities in parsers
    Cache callback support checks once at parser initialization time in the OpenAI, OpenAI Responses, Anthropic, Google, and Ollama stream parsers instead of repeating respond_to? checks on hot streaming paths.

  • Reduce OpenAI Responses parser lookup overhead
    Special-case the hot Responses API event paths and cache the current output item and content part so streamed output text deltas do less repeated nested lookup work.

  • Add a Sequel context persistence plugin
    Add plugin :llm for Sequel models so apps can persist LLM::Context state with default columns and pass provider setup through provider: when needed. The plugin now also supports format: :string, :json, or :jsonb for text and native JSON storage when Sequel JSON typecasting is enabled (sketched below).

  • Improve streaming parser performance
    In the local replay-based stream_parser benchmark versus v4.14.0 (median of 20 samples, 5000 iterations), plain Ruby is a small overall win: the generic eventstream path is about 0.4% faster, the OpenAI stream parser is about 0.5% faster, and the OpenAI Responses parser is about 1.6% faster, with unchanged allocations. Under YJIT on the same benchmark, the generic eventstream path is about 0.9% faster and the OpenAI stream parser is about 0.4% faster, while the OpenAI Responses parser is about 0.7% slower, also with unchanged allocations.

Compared to v4.13.0, the larger v4.14.0 streaming gains still hold. The generic eventstream path remains dramatically faster than v4.13.0, the OpenAI stream parser remains modestly faster, and the OpenAI Responses parser is roughly flat to slightly better depending on runtime. In other words, this release keeps the large eventstream win from v4.14.0, adds only small incremental changes beyond that, and does not turn the post-v4.14.0 parser work into another large benchmark jump.
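
A minimal sketch of the Sequel context persistence plugin from the fourth item above; the model class is illustrative, and where format: is passed is an assumption.

    class Conversation < Sequel::Model
      plugin :llm, format: :jsonb   # persist LLM::Context state in native JSON storage
    end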

v4.14.0

Changes since v4.13.0.

This release adds request interruption for contexts, reworks provider HTTP internals for lower-overhead streaming, and fixes MCP clients so parallel tool calls can safely share one connection.

Add

  • Add request interruption support
    Add LLM::Context#interrupt!, LLM::Context#cancel!, and LLM::Interrupt for interrupting in-flight provider requests, inspired by Go's context cancellation.

Change

  • Rework provider HTTP transport internals
    Rework provider HTTP around LLM::Provider::Transport::HTTP with explicit transient and persistent transport handling.

  • Reduce SSE parser overhead
    Dispatch raw parsed values to registered visitors instead of building an Event object for every streamed line.

  • Reduce provider streaming allocations
    Decode streamed provider payloads directly in LLM::Provider::Transport::HTTP before handing them to provider parsers, which cuts allocation churn and gives a smaller streaming speed bump.

  • Reduce generic SSE parser allocations
    Keep unread event-stream buffer data in place until compaction is worthwhile, which lowers allocation churn in the remaining generic SSE path.

  • Improve streaming parser performance
    In the local replay-based stream_parser benchmark versus v4.13.0 (median of 20 samples, 5000 iterations), under plain Ruby the generic eventstream path is about 53% faster with about 32% fewer allocations, the OpenAI stream parser is about 11% faster with about 4% fewer allocations, and the OpenAI Responses parser is about 3% faster with unchanged allocations. Under YJIT on the current parser benchmark harness, the current tree is about 26% faster than the non-YJIT run on the generic eventstream path, about 18% faster on the OpenAI stream parser, and about 16% faster on the OpenAI Responses parser, with allocations unchanged.

Fix

  • Support parallel MCP tool calls on one client
    Route MCP responses by JSON-RPC id so concurrent tool calls can share one client and transport without mismatching replies.

  • Use explicit MCP non-blocking read errors
    Use IO::EAGAINWaitReadable while continuing to retry on IO::WaitReadable.

v4.13.0

Changes since v4.12.0.

This release expands MCP prompt support, improves reasoning support in the OpenAI Responses API, and refreshes the docs around llm.rb's runtime model, contexts, and advanced workflows.

Add

  • Add LLM::MCP#prompts and LLM::MCP#find_prompt for MCP prompt support.

Change

  • Rework the README around llm.rb as a runtime for AI systems.
  • Add a dedicated deep dive guide for providers, contexts, persistence, tools, agents, MCP, tracing, multimodal prompts, and retrieval.

Fix

All of these fixes apply to MCP:

  • fix(mcp): raise LLM::MCP::MismatchError on mismatched response ids.
  • fix(mcp): normalize prompt message content while preserving the original payload.

All of these fixes apply to OpenAI's Responses API:

  • fix(openai): emit on_reasoning_content for streamed reasoning summaries.
  • fix(openai): skip previous_response_id on store: false follow-up calls.
  • fix(openai): fall back to an empty object schema for tools without params.
  • fix(openai): preserve original tool-call payloads on re-sent assistant tool messages.
  • fix(openai): emit output_text for assistant-authored response content.
  • fix(openai): return nil for system_fingerprint on normalized response objects.

v4.12.0

Changes since v4.11.1.

This release expands advanced streaming and MCP execution while reframing llm.rb more clearly as a system integration layer for LLMs, tools, MCP sources, and application APIs.

Add

  • Add persistent as an alias for persist! on providers and MCP transports.
  • Add LLM::Stream#on_tool_return for observing completed streamed tool work.
  • Add LLM::Function::Return#error?.

Change

  • Expect advanced streaming callbacks to use LLM::Stream subclasses instead of duck-typing them onto arbitrary objects. Basic #<< streaming remains supported.
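
A minimal sketch of an LLM::Stream subclass using callbacks from this range; the callback bodies and argument names are assumptions.

    class MyStream < LLM::Stream
      # Streamed assistant text.
      def on_content(chunk)
        print(chunk)
      end

      # Completed streamed tool work.
      def on_tool_return(ret)
        warn("tool failed") if ret.error?
      end
    end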

Fix

  • Fix Anthropic tools without params by always emitting input_schema.
  • Fix Anthropic tool-only responses to still produce an assistant message.
  • Fix Anthropic tool results to use the user role.
  • Fix Anthropic tool input normalization.

v4.11.1

Changes since v4.11.0.

Fix

  • Cast OpenTelemetry tool-related values to strings.
    Otherwise they're rejected by opentelemetry-sdk as invalid attributes.

v4.11.0

Changes since v4.10.0.

Add

  • Add LLM::Stream for richer streaming callbacks, including on_content, on_reasoning_content, and on_tool_call for concurrent tool execution.
  • Add LLM::Stream#wait as a shortcut for queue.wait.
  • Add LLM::Context#wait as a shortcut for the configured stream's wait.
  • Add LLM::Context#call(:functions) as a shortcut for functions.call.
  • Add LLM::Function.registry and enhanced support for MCP tools in LLM::Tool.registry for tool resolution during streaming.
  • Add normalized LLM::Response for OpenAI Responses, providing content, content!, messages / choices, usage, and reasoning_content.
  • Add mode: :responses to LLM::Context for routing talk through the Responses API.
  • Add LLM::Context#returns for collecting pending tool returns from the context.
  • Add persistent HTTP connection pooling for repeated MCP tool calls via LLM.mcp(http: ...).persist!.
  • Add explicit MCP transport constructors via LLM::MCP.stdio(...) and LLM::MCP.http(...).

Fix

  • Fix Google tool-call handling by synthesizing stable ids when Gemini does not provide a direct tool-call id.

v4.10.0

Changes since v4.9.0.

Add

  • Add HTTP transport for MCP with LLM::MCP::Transport::HTTP for remote servers
  • Add JSON Schema union types (any_of, all_of, one_of) with parser integration
  • Add JSON Schema type array union support (e.g., "type": ["object", "null"])
  • Add JSON Schema type inference from const, enum, or default fields

Change

  • Update LLM::MCP constructor for exclusive http: or stdio: transport
  • Update LLM::MCP documentation for HTTP transport support

v4.9.0

Changes since v4.8.0.

Add

  • Add fiber-based concurrency with LLM::Function::FiberGroup and LLM::Function::TaskGroup classes for lightweight async execution.
  • Add :thread, :task, and :fiber strategy parameter to LLM::Function#spawn for explicit concurrency control.
  • Add stdio MCP client support, including remote tool discovery and invocation through LLM.mcp, LLM::Context, and existing function/tool APIs.
  • Add model registry support via LLM::Registry, including model metadata lookup, pricing, modalities, limits, and cost estimation.
  • Add context access to a model context window via LLM::Context#context_window.
  • Add tracking of defined tools in the tool registry.
  • Add LLM::Schema::Enum, enabling Enum[...] as a schema/tool parameter type.
  • Add top-level Anthropic system instruction support using Anthropic's provider-specific request format.
  • Add richer tracing hooks and extra metadata support for LangSmith/OpenTelemetry-style traces.
  • Add rack/websocket and Relay-related example work, including MCP-focused examples.
  • Add concurrent tool execution with LLM::Function#spawn, LLM::Function::Array (call, wait, spawn), and LLM::Function::ThreadGroup.
  • Add LLM::Function::ThreadGroup#alive? method for non-blocking monitoring of concurrent tool execution.
  • Add LLM::Function::ThreadGroup#value alias for ThreadGroup#wait for consistency with Ruby's Thread#value.

Change

  • Rename LLM::Session to LLM::Context throughout the codebase to better reflect the concept of a stateful interaction environment.
  • Rename LLM::Gemini to LLM::Google to better reflect provider naming.
  • Standardize model objects across providers around a smaller common interface.
  • Switch registry cost internals from LLM::Estimate to LLM::Cost.
  • Update image generation defaults so OpenAI and xAI consistently return base64-encoded image data by default.
  • Update LLM::Bot deprecation warning from v5.0 to v6.0, giving users more time to migrate to LLM::Context.
  • Rework the README and screencast documentation to better cover MCP, registry, contexts, prompts, concurrency, providers, and example flow.
  • Expand the README with architecture, production, and provider guidance while improving readability and example ordering.

Fix

  • Fix local schema $ref resolution in LLM::Schema::Parser.
  • Fix multiple MCP issues around stdio env handling, request IDs, registry interaction, tool registration, and filtering of MCP tools from the standard tool registry.
  • Fix stream parsing issues, including chunk-splitting bugs and safer handling of streamed error responses.
  • Fix prompt handling across contexts, agents, and provider adapters so prompt turns remain consistent in history and completions.
  • Fix several tool/context issues, including function return wrapping, tool lookup after deserialization, unnamed subclass filtering, and thread-safety around tool registry mutations.
  • Fix Google tool-call handling to preserve thoughtSignature.
  • Fix LLM::Tracer::Logger argument handling.
  • Fix packaging/docs issues such as registry files in the gemspec and stale provider docs.
  • Fix Google provider handling of nil function IDs during context deserialization.
  • Fix MCP stdio transport by increasing poll timeout for better reliability.
  • Fix Google provider to properly cast non-Hash tool results into Hash format for API compatibility.
  • Fix schema parser to support recursive normalization of Array, LLM::Object, and nested structures.
  • Fix DeepSeek provider to tolerate malformed tool arguments.
  • Fix LLM::Function::TaskGroup#alive? to properly delegate to Async::Task#alive?.
  • Fix various RuboCop errors across the codebase.
  • Fix DeepSeek provider to handle JSON that might be valid but unexpected.

Notes

Notable merged work in this range includes:

  • feat(function): add fiber-based concurrency for async environments (#64)
  • feat(mcp): add stdio MCP support (#134)
  • Add LLM::Registry + cost support (#133)
  • Consistent model objects across providers (#131)
  • Add rack + websocket example (#130)
  • feat(gemspec): add changelog URI (#136)
  • feat(function): alias ThreadGroup#wait as ThreadGroup#value (#62)
  • README and screencast refresh across #66, #67, #68, #71, and #72
  • chore(bot): update deprecation warning from v5.0 to v6.0
  • fix(deepseek): tolerate malformed tool arguments
  • refactor(context): Rename Session as Context (#70)

Comparison base:

  • Latest tag: v4.8.0 (6468f2426ee125823b7ae43b4af507b125f96ffc)
  • HEAD used for this changelog: 915c48da6fda9bef1554ff613947a6ce26d382e3