About
This post is a beginner-friendly introduction to
Data_Wrap_Struct, the macro that connects C data to Ruby
objects in mruby. If you have written mruby C extensions before, you may
have noticed that some Ruby objects are backed by C structs rather than
instance variables. The Data macros are how that works. They
allocate a special kind of Ruby object called RData, attach
a C pointer to it, and register a destructor so the C memory is freed
automatically when the Ruby object is garbage collected.
By the end you should understand when to use
Data_Wrap_Struct, how it differs from plain instance
variables, and how to avoid the common pitfalls that trip up
newcomers.
Background
What is RData?
Every Ruby object in mruby is backed by a C struct that starts with
the MRB_OBJECT_HEADER macro:
struct RData {
MRB_OBJECT_HEADER;
struct iv_tbl *iv;
const mrb_data_type *type;
void *data;
};
The data field holds a void* to whatever C
memory you want to associate with the object. The type field
carries a name (for debugging) and a destructor function pointer (called
when the object is collected). The iv field holds instance
variables, just like any other Ruby object -- so you can use both a C
struct and Ruby instance variables on the same object.
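To make the relationship between the type descriptor, the data pointer, and the destructor concrete, here is a standalone model in plain C. It does not use mruby at all: toy_data_type, toy_rdata, toy_wrap, and toy_collect are hypothetical stand-ins invented for this sketch, and the real collector does considerably more. It only illustrates the idea that the object carries a descriptor and that collection calls the descriptor's destructor on the attached pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-in for mrb_data_type: a name plus a destructor. */
typedef struct toy_data_type {
  const char *struct_name;
  void (*dfree)(void *ptr);
} toy_data_type;

/* Hypothetical stand-in for struct RData (object header and iv table omitted). */
typedef struct toy_rdata {
  const toy_data_type *type;
  void *data;
} toy_rdata;

/* Counts destructor calls so the effect is observable. */
static int toy_free_calls = 0;

static void
toy_counting_free(void *ptr)
{
  toy_free_calls++;
  free(ptr);
}

static const toy_data_type toy_int_type = { "toy_int", toy_counting_free };

/* Attach a C pointer and its descriptor to a new object. */
static toy_rdata *
toy_wrap(const toy_data_type *type, void *ptr)
{
  toy_rdata *obj = malloc(sizeof(*obj));
  if (obj == NULL) return NULL;
  obj->type = type;
  obj->data = ptr;
  return obj;
}

/* Model of collection: run the destructor, then drop the object itself. */
static void
toy_collect(toy_rdata *obj)
{
  if (obj == NULL) return;
  if (obj->type != NULL && obj->type->dfree != NULL) {
    obj->type->dfree(obj->data);
  }
  free(obj);
}

/* Wraps one int, collects it, and returns the number of destructor calls. */
static int
toy_demo(void)
{
  int *value = malloc(sizeof(*value));
  toy_rdata *obj;

  if (value == NULL) return -1;
  *value = 42;
  obj = toy_wrap(&toy_int_type, value);
  if (obj == NULL) { free(value); return -1; }
  toy_free_calls = 0;
  toy_collect(obj);
  return toy_free_calls;
}
```

The caller never frees value directly; ownership moves to the wrapper as soon as toy_wrap succeeds, which is the same contract the Data macros set up.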
When to use it
Use Data_Wrap_Struct when your Ruby object needs to hold
a C resource that cannot be represented as a Ruby value. Typical examples
include:
- A file descriptor or FILE* handle
- A pointer to a native library struct (struct jail, struct curl)
- An mmap'd region of memory
- A database connection handle
If your object only stores integers, strings, and other Ruby values,
you do not need Data_Wrap_Struct -- instance variables
suffice.
Defining a data type
Before you can wrap a struct, you need an mrb_data_type
descriptor. This is a small struct with a name and a destructor:
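#include <stdlib.h>        /* free */
#include <mruby.h>
#include <mruby/class.h>   /* mrb_class_ptr, MRB_SET_INSTANCE_TT */
#include <mruby/data.h>

/* Destructor: runs when the wrapping Ruby object is collected. */
static void
process_info_free(mrb_state *mrb, void *ptr)
{
  free(ptr);
}

/* Type descriptor: a name plus the destructor above. */
static const mrb_data_type process_info_type = {
  "process_info", process_info_free
};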
Explanation
- process_info_free is called when the Ruby object is garbage collected. If you skip this, your C struct leaks.
- The name "process_info" identifies the type in error messages from the type-checking functions.
Wrapping a struct
The Data_Wrap_Struct macro allocates a new
RData object and attaches your C pointer to it. It returns
struct RData*, not an mrb_value. To hand it to
Ruby you wrap it with mrb_obj_value:
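struct process_info {
  pid_t pid;
  char name[256];
};

static mrb_value
mrb_process_new(mrb_state *mrb, mrb_value self)
{
  struct RClass *klass = mrb_class_ptr(self);
  struct process_info *info = calloc(1, sizeof(*info));

  /* calloc can fail; raise instead of dereferencing NULL. */
  if (info == NULL) {
    mrb_raise(mrb, E_RUNTIME_ERROR, "cannot allocate process_info");
  }
  info->pid = getpid();                              /* needs <unistd.h> */
  strlcpy(info->name, "self", sizeof(info->name));   /* BSD libc, <string.h> */
  return mrb_obj_value(
    Data_Wrap_Struct(mrb, klass, &process_info_type, info));
}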
Explanation
- mrb_class_ptr(self) extracts the struct RClass* from the mrb_value that represents the class itself. In a class method, self is the class, not an instance.
- Data_Wrap_Struct(mrb, klass, &process_info_type, info) allocates the RData object and makes it an instance of klass.
- mrb_obj_value(...) converts the struct RData* into an mrb_value that Ruby can use. The function name is mrb_obj_value, not mrb_value_obj.
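The allocate-and-populate step before the wrap can be factored into a small helper. The sketch below is standalone C with no mruby dependency. toy_process_info and toy_process_info_create are hypothetical names, the name buffer is deliberately tiny so truncation is visible, and snprintf stands in for strlcpy on the assumption that you want the code to build on a libc that does not ship strlcpy.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct; the 8-byte name buffer is only for the truncation demo. */
struct toy_process_info {
  long pid;
  char name[8];
};

/* Allocate, zero, and fill the struct. Returns NULL if allocation fails. */
static struct toy_process_info *
toy_process_info_create(long pid, const char *name)
{
  struct toy_process_info *info = calloc(1, sizeof(*info));

  if (info == NULL) return NULL;
  info->pid = pid;
  /* snprintf always NUL-terminates and truncates to the buffer size. */
  snprintf(info->name, sizeof(info->name), "%s", name);
  return info;
}

/* Returns 1 if the fields were stored and the long name was truncated safely. */
static int
toy_process_info_demo(void)
{
  struct toy_process_info *info =
    toy_process_info_create(42, "a-very-long-name");
  int ok;

  if (info == NULL) return -1;
  ok = (info->pid == 42 && strcmp(info->name, "a-very-") == 0);
  free(info);
  return ok;
}
```

In an extension, the pointer returned by such a helper is what you would hand to Data_Wrap_Struct, after checking it for NULL.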
Retrieving the pointer
To get your C struct back from a wrapped object, use
Data_Get_Struct or DATA_PTR:
static mrb_value
mrb_process_pid(mrb_state *mrb, mrb_value self)
{
  struct process_info *info;

  /* Four arguments: state, object, expected type, destination pointer. */
  Data_Get_Struct(mrb, self, &process_info_type, info);
  return mrb_fixnum_value(info->pid);
}
Explanation
- Data_Get_Struct checks that the object's mrb_data_type matches &process_info_type. If it does not match, it raises TypeError. This prevents passing the wrong type of object to the method.
- The last argument is a pointer variable that you declare yourself. The macro assigns the wrapped pointer to it; it does not declare the variable or take a separate C type name.
- If you do not need type checking (e.g. in a hot path where you know the type), use DATA_PTR(self) to get the raw void*.
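The difference between a checked and an unchecked lookup can be modelled in standalone C. Everything here (toy_data_type, toy_rdata, toy_data_ptr, toy_check_get_ptr) is a hypothetical stand-in rather than mruby's code. The sketch assumes a type is identified by the address of its descriptor, and it returns NULL on a mismatch instead of raising, which is closer to an optional check than to Data_Get_Struct.

```c
#include <stddef.h>

/* Hypothetical descriptor: only the name matters for this sketch. */
typedef struct toy_data_type {
  const char *struct_name;
} toy_data_type;

/* Hypothetical wrapped object. */
typedef struct toy_rdata {
  const toy_data_type *type;
  void *data;
} toy_rdata;

static const toy_data_type toy_process_type = { "process_info" };
static const toy_data_type toy_socket_type  = { "socket_info" };

/* Unchecked: hand back whatever pointer is attached. */
static void *
toy_data_ptr(const toy_rdata *obj)
{
  return obj->data;
}

/* Checked: hand back the pointer only if the descriptor address matches. */
static void *
toy_check_get_ptr(const toy_rdata *obj, const toy_data_type *expected)
{
  if (obj == NULL || obj->type != expected) return NULL;
  return obj->data;
}

/* Returns 1 if the right descriptor is accepted and the wrong one rejected. */
static int
toy_lookup_demo(void)
{
  int pid = 7;
  toy_rdata obj = { &toy_process_type, &pid };

  return toy_check_get_ptr(&obj, &toy_process_type) == &pid
      && toy_check_get_ptr(&obj, &toy_socket_type) == NULL
      && toy_data_ptr(&obj) == &pid;
}
```

The unchecked path would happily return a socket struct to code expecting a process struct, which is why it is only appropriate where the type is already known.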
Setting the instance type
This is the most common mistake. When you define a class with
mrb_define_class, mruby sets its default instance type to
MRB_TT_OBJECT. But Data_Wrap_Struct produces
objects of type MRB_TT_CDATA. If the class expects
MRB_TT_OBJECT, mruby rejects the allocation with an error
like:
(mirb):in find: allocation failure of BSD::Process (TypeError)
The fix is to tell mruby that instances of your class are
RData objects. Use MRB_SET_INSTANCE_TT right
after defining the class:
void
mrb_mruby_bsd_process_gem_init(mrb_state *mrb)
{
  struct RClass *mod = mrb_define_module(mrb, "BSD");
  struct RClass *proc = mrb_define_class_under(
    mrb, mod, "Process", mrb->object_class);

  MRB_SET_INSTANCE_TT(proc, MRB_TT_CDATA);
  mrb_define_class_method(mrb, proc, "find",
    mrb_process_find, MRB_ARGS_REQ(1));
}
Explanation
- MRB_SET_INSTANCE_TT(proc, MRB_TT_CDATA) sets bits 0-4 of the class's flags field to MRB_TT_CDATA. This tells mruby's allocator to create a struct RData instead of a struct RObject when the class is instantiated.
- Without this, Data_Wrap_Struct raises a TypeError because the class's instance type constraint rejects MRB_TT_CDATA.
- The macro is defined in include/mruby/class.h:
  #define MRB_SET_INSTANCE_TT(c, tt) ((c)->flags = (((c)->flags & ~MRB_INSTANCE_TT_MASK) | (char)(tt)))
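The clear-then-set step in that macro is easier to follow with plain numbers. This standalone sketch repeats it outside mruby. The mask value 0x1F is an assumption that follows the "bits 0-4" description above, and the type numbers and the extra flag bit are arbitrary placeholders rather than mruby's real values; the headers of your mruby version are the authority on those.

```c
#include <stdint.h>

/* Assumed mask covering bits 0-4; placeholder type numbers, not mruby's. */
#define TOY_INSTANCE_TT_MASK 0x1Fu
#define TOY_TT_OBJECT        1u
#define TOY_TT_CDATA         2u
#define TOY_OTHER_FLAG       (1u << 19)   /* some unrelated class flag */

/* Same shape as the macro: clear the masked bits, then OR in the new type. */
static uint32_t
toy_set_instance_tt(uint32_t flags, uint32_t tt)
{
  return (flags & ~TOY_INSTANCE_TT_MASK) | (tt & TOY_INSTANCE_TT_MASK);
}

/* Read the instance type back out of the flags word. */
static uint32_t
toy_instance_tt(uint32_t flags)
{
  return flags & TOY_INSTANCE_TT_MASK;
}
```

Starting from TOY_OTHER_FLAG | TOY_TT_OBJECT, setting TOY_TT_CDATA replaces only the low bits and leaves the unrelated flag in place.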
The complete example
Here is a complete mrbgem that wraps a process info struct, including
both the MRB_SET_INSTANCE_TT fix and the full round-trip
from allocation to access:
/* src/bsd_process.c */
#include <stdio.h>       /* snprintf */
#include <stdlib.h>      /* calloc, free */
#include <sys/types.h>   /* pid_t */
#include <mruby.h>
#include <mruby/class.h>
#include <mruby/data.h>
#include <mruby/error.h>

struct process_info {
  pid_t pid;
  char name[256];
};

/* Destructor: releases the struct when the Ruby object is collected. */
static void
process_info_free(mrb_state *mrb, void *ptr)
{
  free(ptr);
}

static const mrb_data_type process_info_type = {
  "process_info", process_info_free
};

/* BSD::Process.find(pid): allocate a struct and wrap it. */
static mrb_value
mrb_process_find(mrb_state *mrb, mrb_value self)
{
  mrb_int pid;
  struct RClass *klass = mrb_class_ptr(self);
  struct process_info *info;

  mrb_get_args(mrb, "i", &pid);
  info = calloc(1, sizeof(*info));
  if (info == NULL) {
    mrb_raise(mrb, E_RUNTIME_ERROR, "cannot allocate process_info");
  }
  info->pid = (pid_t)pid;
  snprintf(info->name, sizeof(info->name), "process-%ld", (long)pid);
  return mrb_obj_value(
    Data_Wrap_Struct(mrb, klass, &process_info_type, info));
}

/* BSD::Process#pid: type-checked access to the wrapped struct. */
static mrb_value
mrb_process_pid(mrb_state *mrb, mrb_value self)
{
  struct process_info *info;

  Data_Get_Struct(mrb, self, &process_info_type, info);
  return mrb_fixnum_value(info->pid);
}

void
mrb_mruby_bsd_process_gem_init(mrb_state *mrb)
{
  struct RClass *mod = mrb_define_module(mrb, "BSD");
  struct RClass *proc = mrb_define_class_under(
    mrb, mod, "Process", mrb->object_class);

  MRB_SET_INSTANCE_TT(proc, MRB_TT_CDATA);
  mrb_define_class_method(mrb, proc, "find",
    mrb_process_find, MRB_ARGS_REQ(1));
  mrb_define_method(mrb, proc, "pid",
    mrb_process_pid, MRB_ARGS_NONE());
}

/* mrbgems expect a matching finalizer, even if it has nothing to do. */
void
mrb_mruby_bsd_process_gem_final(mrb_state *mrb)
{
}
# mrblib/bsd_process.rb
module BSD
  class Process
    # The C code defines .find and #pid.
    # Add convenience methods in Ruby:
    def to_s
      "#<BSD::Process pid=#{pid}>"
    end
  end
end

# spec/bsd_process_spec.rb
describe "BSD::Process" do
  describe ".find" do
    it "returns a process for a given PID" do
      proc = BSD::Process.find(3189)
      expect(proc).must_be_instance_of BSD::Process
      expect(proc.pid).must_equal 3189
    end
  end
end

Minitest.run(ARGV) || exit(1)
Constructor pattern
If you want Ruby-side new to work with your data-wrapped
class, define initialize in C. The self in
initialize is already an allocated RData object
(because mruby's Class#allocate creates it), so you use
mrb_data_init to attach the struct:
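static mrb_value
mrb_process_init(mrb_state *mrb, mrb_value self)
{
  mrb_int pid;
  struct process_info *info;

  mrb_get_args(mrb, "i", &pid);
  info = calloc(1, sizeof(*info));
  if (info == NULL) {
    mrb_raise(mrb, E_RUNTIME_ERROR, "cannot allocate process_info");
  }
  info->pid = (pid_t)pid;
  /* self already exists; attach the pointer and its type descriptor. */
  mrb_data_init(self, info, &process_info_type);
  return self;
}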
Explanation
- mrb_data_init(self, info, &process_info_type) sets the data and type fields on the already-allocated RData. The object was created by Class#allocate, so you do not call Data_Wrap_Struct here.
- mrb_data_init is defined in include/mruby/data.h:
  MRB_INLINE void
  mrb_data_init(mrb_value v, void *ptr, const mrb_data_type *type)
  {
    mrb_assert(mrb_data_p(v));
    DATA_PTR(v) = ptr;
    DATA_TYPE(v) = type;
  }
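One detail worth guarding against is initialize running more than once on the same object, because a second call that simply overwrites the pointer would leak the first struct. The standalone model below shows the release-before-attach order. toy_rdata, toy_release, and toy_reinit are hypothetical names for this sketch and none of it is mruby API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an object with an attached C pointer. */
typedef struct toy_rdata {
  void *data;
} toy_rdata;

static int toy_release_calls = 0;   /* counts releases so the effect is visible */

static void
toy_release(void *ptr)
{
  if (ptr == NULL) return;
  toy_release_calls++;
  free(ptr);
}

/* Attach new_ptr, releasing whatever was attached before. */
static void
toy_reinit(toy_rdata *obj, void *new_ptr)
{
  if (obj->data != NULL) {
    toy_release(obj->data);
  }
  obj->data = new_ptr;
}

/* Initializes the same object twice, tears it down, returns the release count. */
static int
toy_reinit_demo(void)
{
  toy_rdata obj = { NULL };
  void *first = malloc(16);
  void *second = malloc(16);

  if (first == NULL || second == NULL) {
    free(first);
    free(second);
    return -1;
  }
  toy_release_calls = 0;
  toy_reinit(&obj, first);    /* nothing to release yet */
  toy_reinit(&obj, second);   /* releases first */
  toy_release(obj.data);      /* final teardown releases second */
  return toy_release_calls;
}
```

In a real initialize, the equivalent steps would be reading DATA_PTR(self), passing any non-NULL pointer to the destructor, and only then calling mrb_data_init with the new struct.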
Common mistakes
Missing MRB_SET_INSTANCE_TT
Without it, Data_Wrap_Struct raises
TypeError:
allocation failure of BSD::Process (TypeError)
Returning struct RData* directly
Data_Wrap_Struct returns struct RData*. If
you return that directly from a C function that should return
mrb_value, the compiler complains about incompatible types.
Always wrap it:
return mrb_obj_value(Data_Wrap_Struct(...));
Typo in mrb_obj_value
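The function is mrb_obj_value, not mrb_value_obj. On compilers that
still accept implicit function declarations, writing mrb_value_obj(...)
declares a function that returns int, causing a type mismatch (newer
compilers reject the undeclared call outright):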
error: returning 'int' from a function
with incompatible result type 'mrb_value'
Forgetting the destructor
If you do not provide a dfree callback, your C struct
leaks when the Ruby object is garbage collected. If your struct contains
nested allocations, the destructor must free those too.
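A destructor for a struct with a nested allocation might be shaped like the standalone sketch below. named_process and its helpers are hypothetical, and plain malloc and free stand in for whichever allocator your extension uses. The counter exists only so the demo can show that the inner buffer was released.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct with a separately allocated member. */
struct named_process {
  int pid;
  char *name;   /* heap-allocated copy of the name */
};

static int nested_frees = 0;   /* counts inner frees for the demo */

static struct named_process *
named_process_create(int pid, const char *name)
{
  struct named_process *p = calloc(1, sizeof(*p));
  size_t len;

  if (p == NULL) return NULL;
  len = strlen(name) + 1;
  p->name = malloc(len);
  if (p->name == NULL) { free(p); return NULL; }
  memcpy(p->name, name, len);
  p->pid = pid;
  return p;
}

/* Destructor body: free inner allocations first, then the struct. NULL-safe. */
static void
named_process_destroy(void *ptr)
{
  struct named_process *p = ptr;

  if (p == NULL) return;
  if (p->name != NULL) {
    free(p->name);
    nested_frees++;
  }
  free(p);
}

/* Creates and destroys one struct; returns how many inner buffers were freed. */
static int
named_process_demo(void)
{
  struct named_process *p = named_process_create(42, "init");

  if (p == NULL) return -1;
  nested_frees = 0;
  named_process_destroy(p);
  named_process_destroy(NULL);   /* must be harmless */
  return nested_frees;
}
```

In an mruby dfree callback the same body would sit inside a function with the (mrb_state *mrb, void *ptr) signature shown earlier.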
Using Data_Wrap_Struct when instance variables suffice
If your object only stores simple Ruby values (strings, integers,
arrays), use mrb_iv_set instead.
Data_Wrap_Struct adds complexity and requires the
MRB_SET_INSTANCE_TT setup.
Data_Make_Struct
mruby also provides Data_Make_Struct, which combines
allocation of the C struct with the Ruby object wrap:
#define Data_Make_Struct(mrb, klass, strct, type, sval, data_obj) do { \
  (data_obj) = Data_Wrap_Struct(mrb, klass, type, NULL); \
  (sval) = (strct*)mrb_malloc(mrb, sizeof(strct)); \
  { static const strct zero = { 0 }; *(sval) = zero; }; \
  (data_obj)->data = (sval); \
} while (0)
This allocates the C struct through mrb_malloc, mruby's own allocator,
and zero-initialises it. The sval variable lets you populate the struct
after allocation. Because the memory comes from mrb_malloc, the
destructor for such a type should release it with mrb_free(mrb, ptr)
rather than free(ptr). Data_Make_Struct is useful when you do not need
custom allocation logic, but Data_Wrap_Struct with calloc gives you more
control.
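The allocate-then-zero idiom inside that macro can be tried on its own. The sketch below copies the static const zero trick into a hypothetical TOY_MAKE_STRUCT macro that uses plain malloc instead of mrb_malloc and has no Ruby object involved. toy_process_info is likewise made up for the demo.

```c
#include <stdlib.h>

/* Hypothetical macro: allocate a struct and copy an all-zero instance into it. */
#define TOY_MAKE_STRUCT(strct, sval) do { \
  (sval) = (strct *)malloc(sizeof(strct)); \
  if ((sval) != NULL) { \
    static const strct zero = { 0 }; \
    *(sval) = zero; \
  } \
} while (0)

struct toy_process_info {
  int pid;
  char name[16];
};

/* Returns 1 if every inspected field of the fresh struct is zero. */
static int
toy_make_demo(void)
{
  struct toy_process_info *info;
  int ok;

  TOY_MAKE_STRUCT(struct toy_process_info, info);
  if (info == NULL) return -1;
  ok = (info->pid == 0 && info->name[0] == '\0' && info->name[15] == '\0');
  free(info);
  return ok;
}
```

Assigning from a zeroed static instance initialises every member to its zero value, which is why the struct can be populated field by field afterwards without a separate memset.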
Reference
Data API summary
| Macro/Function | Purpose |
|---|---|
| Data_Wrap_Struct(mrb, klass, type, ptr) | Allocates an RData, returns struct RData* |
| mrb_obj_value(ptr) | Converts struct RData* to mrb_value |
| Data_Get_Struct(mrb, obj, type, var) | Extracts pointer with type check |
| DATA_PTR(obj) | Gets raw void* without type check |
| DATA_TYPE(obj) | Gets the mrb_data_type* of an object |
| mrb_data_init(v, ptr, type) | Attaches data to an existing RData |
| MRB_SET_INSTANCE_TT(c, tt) | Sets instance type for a class |
| mrb_data_type | Struct with name and destructor |
Conclusion
Data_Wrap_Struct is the bridge between C memory and Ruby
objects in mruby. It gives you a Ruby object that holds a C pointer, with
automatic cleanup through the GC. The pattern is simple: define a type
descriptor with a destructor, set the class instance type to
MRB_TT_CDATA, wrap your struct with
Data_Wrap_Struct, and retrieve it with
Data_Get_Struct.
Real-world mrbgems like mruby-jail, mruby-curl, and mruby-termbox2 use this pattern to wrap native library handles. The earlier post on how to write C with mruby covers more of the general mruby C API if you are new to writing mruby extensions.
Further topics worth exploring include building Ruby-side classes on
top of data-wrapped C objects, using mrb_data_check_get_ptr
for optional type checks, and the GC arena management required when
allocating many RData objects in a loop.