Using a C Library from Ruby

Credit: Garrett Rooney

Problem

You'd like to use a library in your Ruby program, but the library's implemented in C and there are no bindings.

Solution

Write a Ruby extension that wraps the C library with Ruby classes and methods.

Let's say we want to give a Ruby interface to C's file functions (yes, the File class already does this, but it makes a good example). We want to make it possible to open a disk file and read from it a byte at a time.

Just as in Recipe 22.1, you'll need a C file that implements the actual extension. This one is called stdio.c. It's got an Init_stdio function that defines a Ruby module (Stdio), a Ruby class (Stdio::File), and some methods for that class.

The file_allocate function corresponds to the Stdio::File constructor. Because it's a constructor, we must also define some hook functions to create and destroy the underlying resources (in this case, a filehandle and the memory it uses):

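#include <stdio.h>
#include "ruby.h"

static VALUE rb_mStdio;
static VALUE rb_cStdioFile;

struct file {
  FILE *fhandle;
};

/* Nothing to mark: the struct holds no Ruby objects. */
static void file_mark(struct file *f)
{
}

/* Close the file only if it was opened, then free the struct. */
static void file_free(struct file *f)
{
  if (f->fhandle)
    fclose(f->fhandle);
  free(f);
}

/* Defined after the callbacks so both are declared before use. */
static VALUE file_allocate(VALUE klass)
{
  struct file *f = malloc(sizeof(*f));
  f->fhandle = NULL;
  return Data_Wrap_Struct(klass, file_mark, file_free, f);
}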

The file_open function implements the Stdio::File#open method:

static VALUE file_open(VALUE object, VALUE fname)
{
  struct file *f;

  Data_Get_Struct(object, struct file, f);
  /* StringValueCStr checks that fname is a String and returns its C string. */
  f->fhandle = fopen(StringValueCStr(fname), "r");
  return Qnil;
}

file_readbyte implements the Stdio::File#readbyte method:

static VALUE file_readbyte(VALUE object)
{
  char buffer[2] = { 0, 0 };
  struct file *f;

  Data_Get_Struct(object, struct file, f);
  if (!f->fhandle)
    rb_raise(rb_eRuntimeError, "Attempt to read from closed file");
  fread(buffer, 1, 1, f->fhandle);
  return rb_str_new2(buffer);
}

Finally, our Init_stdio function defines the Stdio module, the File class, and that class's allocator and two instance methods:

void Init_stdio(void)
{
  rb_mStdio = rb_define_module("Stdio");
  rb_cStdioFile = rb_define_class_under(rb_mStdio, "File", rb_cObject);
  rb_define_alloc_func(rb_cStdioFile, file_allocate);
  rb_define_method(rb_cStdioFile, "open", file_open, 1);
  rb_define_method(rb_cStdioFile, "readbyte", file_readbyte, 0);
}

As before, you'll need an extconf.rb file that knows how to compile your C library:

# extconf.rb
require 'mkmf'

dir_config("stdio")
create_makefile("stdio")

Once the C library is compiled, you can use it from Ruby as though it were a Ruby library:

open('foo.txt', 'w') { |f| f << 'foo' }

require 'stdio'
f = Stdio::File.new
f.open('foo.txt')
f.readbyte   # => "f"
f.readbyte   # => "o"
f.readbyte   # => "o"

Discussion

The basic idea when writing a Ruby extension is to create a C data structure and wrap it in a Ruby object. The C data structure gives you someplace to store whatever data you need, so you can access it in your C methods. You're creating a primitive form of object-oriented programming in C.

Ruby provides some macros to help with this. Data_Wrap_Struct wraps a C data structure in a Ruby object. It takes a pointer to your data structure, along with a few pointers to callback functions, and returns a VALUE. The Data_Get_Struct macro takes that VALUE and gives you back a pointer to your C data structure.

You usually use Data_Wrap_Struct inside your class's allocate function (called by the constructor), and Data_Get_Struct inside its instance methods. In the example above, the file_allocate function creates a C struct (containing a pointer of type FILE *) and passes it into Data_Wrap_Struct to get a VALUE. The functions for the instance methods, file_open and file_readbyte, both take a VALUE as an argument, and pass it into Data_Get_Struct to get the C struct back.
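The wrap-and-unwrap pattern can be sketched in plain C, without Ruby at all. This is only an analogy, not how Ruby implements these macros: the box type and the box_wrap, box_get, and box_destroy functions are hypothetical stand-ins for a Ruby object, Data_Wrap_Struct, Data_Get_Struct, and the garbage collector's cleanup step. The file_free callback here takes a void pointer and bumps a counter so its effect can be observed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Ruby object that carries a C pointer. */
typedef struct box {
    void *data;                /* the wrapped C struct */
    void (*mark)(void *);      /* may be NULL, like the "mark" callback */
    void (*free_fn)(void *);   /* releases the wrapped struct */
} box;

/* Analogue of Data_Wrap_Struct: pair a C pointer with its callbacks. */
static box *box_wrap(void (*mark)(void *), void (*free_fn)(void *), void *data)
{
    box *b = malloc(sizeof(*b));
    assert(b != NULL);
    b->data = data;
    b->mark = mark;
    b->free_fn = free_fn;
    return b;
}

/* Analogue of Data_Get_Struct: hand the C pointer back to a method. */
static void *box_get(box *b)
{
    return b->data;
}

/* What a collector would do once the object is unreachable. */
static void box_destroy(box *b)
{
    if (b->free_fn)
        b->free_fn(b->data);
    free(b);
}

struct file {
    FILE *fhandle;
};

static int files_freed = 0;    /* counts calls to file_free, for illustration */

/* "free" callback: close the file if it is open, then release the struct. */
static void file_free(void *p)
{
    struct file *f = p;
    if (f->fhandle)
        fclose(f->fhandle);
    free(f);
    files_freed++;
}
```

An allocator would call box_wrap once, each instance method would start with box_get, and box_destroy would run only when nothing refers to the box any longer.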

So what about those callback functions? There are three of them: an "allocate" function, a "mark" function, and a "free" function. The "allocate" function, registered with rb_define_alloc_func, is called whenever an object is created. The other two, passed to Data_Wrap_Struct, have to do with garbage collection.

Ruby's garbage collector uses a mark-and-sweep algorithm: it runs through all the "live" objects in the system, marking them to note that it was able to reach them. Then it destroys every object that it couldn't reach: by definition, those objects are no longer in use, and don't need to be kept around in memory. To make this work, you need to provide two callbacks: one that marks the Ruby objects your object refers to as reachable, and one that frees the underlying resources once your object is unreachable.
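A toy version of mark-and-sweep makes the two phases concrete. This is a minimal sketch under simplifying assumptions, not Ruby's collector: the heap is a fixed array, each object holds at most one reference, and there is a single root. All names (struct obj, obj_new, mark, sweep, collect) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* Hypothetical toy object with one optional reference to another object. */
struct obj {
    int in_use;        /* slot holds a live allocation */
    int marked;        /* set during the mark phase */
    struct obj *ref;   /* object this one refers to, or NULL */
};

static struct obj heap[MAX_OBJECTS];

/* Claim a free slot; returns NULL if the toy heap is full. */
static struct obj *obj_new(struct obj *ref)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = 1;
            heap[i].marked = 0;
            heap[i].ref = ref;
            return &heap[i];
        }
    }
    return NULL;
}

/* Mark phase: flag an object and everything reachable from it. */
static void mark(struct obj *o)
{
    while (o != NULL && !o->marked) {
        assert(o->in_use);   /* a live object must not point at a freed one */
        o->marked = 1;
        o = o->ref;
    }
}

/* Sweep phase: release unmarked objects and clear the marks on survivors.
   Returns the number of objects released. */
static int sweep(void)
{
    int freed = 0;
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!heap[i].in_use)
            continue;
        if (heap[i].marked) {
            heap[i].marked = 0;
        } else {
            heap[i].in_use = 0;   /* a real collector would run the "free" callback here */
            freed++;
        }
    }
    return freed;
}

/* One collection cycle starting from a single root. */
static int collect(struct obj *root)
{
    mark(root);
    return sweep();
}
```

In this sketch, following the ref pointer inside mark plays the role of the "mark" callback, and the line that clears in_use is where the "free" callback would run.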

In this case, both functions are simple. The "free" callback simply closes the filehandle and calls the C free function. The "mark" callback doesn't need to do anything, since this object doesn't refer to any other Ruby objects.

If your object does contain references to other Ruby objects, all you need to do is explicitly mark them (by calling the rb_gc_mark function) in your "mark" callback. This example goes a bit further than it needs to by defining an empty mark callback; it could accomplish the same thing by passing in a NULL function pointer.

To summarize: if your library doesn't define its own data structures, define your own C struct. Implement methods that translate Ruby arguments into their C equivalents, call the library functions you're interested in, then translate the return values back into Ruby data structures, so that the rest of the Ruby program can use them.

See Also
