Writing Ruby Extensions in C - Part 10, Hashes

 
This is the tenth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. The ninth post focused on arrays. This post will look at hashes.

Hashes


The nice thing about hashes in ruby C extensions is that they act very much like the ruby hashes they represent. There are a few functions to know about:
  • rb_hash_new() - create a new ruby Hash
  • rb_hash_aset(hash, key, value) - set the hash key to value
  • rb_hash_aref(hash, key) - get the value for hash key
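  • rb_hash_delete(hash, key) - remove key from the hash and return the value that was stored there (Qnil if the key was not present)
  • rb_hash_foreach(hash, callback, args) - call callback for each key,value pair in the hash. Callback must have a prototype of int (*cb)(VALUE key, VALUE val, VALUE in)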

An example will demonstrate this:

 1) int do_print(VALUE key, VALUE val, VALUE in) {
 2)      fprintf(stderr, "Input data is %s\n", StringValueCStr(in));
 3)
 4)      fprintf(stderr, "Key %s=>Value %s\n", StringValueCStr(key),
 5)              StringValueCStr(val));
 6)
 7)      return ST_CONTINUE;
 8) }
 9)
10) VALUE result;
11) VALUE val;
12)
13) result = rb_hash_new();
14) // result is now {}
15) rb_hash_aset(result, rb_str_new2("mykey"),
16)              rb_str_new2("myvalue"));
17) // result is now {"mykey"=>"myvalue"}
18) rb_hash_aset(result, rb_str_new2("anotherkey"),
19)              rb_str_new2("anotherval"));
20) // result is now {"mykey"=>"myvalue",
21) //                "anotherkey"=>"anotherval"}
22) rb_hash_aset(result, rb_str_new2("mykey"),
23)              rb_str_new2("differentval"));
24) // result is now {"mykey"=>"differentval",
25) //                "anotherkey"=>"anotherval"}
26) val = rb_hash_aref(result, rb_str_new2("mykey"));
27) // result is now {"mykey"=>"differentval",
28) //                "anotherkey"=>"anotherval"},
29) // val is "differentval"
30) rb_hash_delete(result, rb_str_new2("mykey"));
31) // result is now {"anotherkey"=>"anotherval"}
32)
33) rb_hash_foreach(result, do_print, rb_str_new2("passthrough"));

Most of this is pretty straightforward. The most interesting part of this is line 33, where we perform an operation on all elements in the hash by utilizing a callback. This callback is defined on lines 1 through 8, and takes in the key, value, and the user data provided to the original rb_hash_foreach() call. The return code from the callback defines what happens to the processing of the rest of the hash. If the return value is ST_CONTINUE, then the rest of the hash is processed as normal. If the return value is ST_STOP, then no further processing of the hash is done. If the return value is ST_DELETE, then the current hash key is deleted from the hash and the rest of the hash is processed. If the return value is ST_CHECK, then the hash is checked to see if it has been modified during this operation. If so, processing of the hash stops.
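To make the return codes a little more concrete, here is a rough model of the traversal in plain ruby (ruby rather than C so that it can be run without building an extension). The names foreach_model, drop_short_values and keys_until are made up for this sketch, and the symbols :continue, :stop and :delete simply stand in for ST_CONTINUE, ST_STOP and ST_DELETE; this is not how the interpreter implements rb_hash_foreach(), and ST_CHECK is left out.

```ruby
# Hypothetical ruby model of rb_hash_foreach(); the symbols stand in for the
# ST_CONTINUE, ST_STOP and ST_DELETE return codes of the C callback.
def foreach_model(hash, arg)
  # Walk a snapshot of the pairs so that deleting from hash is safe.
  hash.to_a.each do |key, val|
    case yield(key, val, arg)
    when :stop   then break              # ST_STOP: leave the rest untouched
    when :delete then hash.delete(key)   # ST_DELETE: drop this key, keep going
    end                                  # :continue (ST_CONTINUE): next pair
  end
  hash
end

# Delete every pair whose value is shorter than min characters.
def drop_short_values(hash, min)
  foreach_model(hash, min) do |_key, val, limit|
    val.length < limit ? :delete : :continue
  end
end

# Collect keys until one equals stop_key, then stop the traversal.
def keys_until(hash, stop_key)
  seen = []
  foreach_model(hash, stop_key) do |key, _val, stop|
    next :stop if key == stop
    seen << key
    :continue
  end
  seen
end
```

The third block parameter plays the role of the "in" argument of the C callback: whatever was handed to the traversal is passed through unchanged on every call.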


 

 Writing Ruby Extensions in C - Part 9, Arrays

 
This is the ninth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. This post will focus on arrays.

Arrays


The nice thing about arrays in ruby C extensions is that they act very much like the ruby arrays they represent. There are a few functions to know about:
  • rb_ary_new() - create a new array with 0 elements. Elements can be added later using rb_ary_push(), rb_ary_store(), or rb_ary_unshift().
  • rb_ary_new2(size) - create a new, empty array with room preallocated for size elements. The array still has a length of 0 until elements are added
  • rb_ary_store(array, index, value) - put the ruby value into array at index. This can be used to create sparse arrays; intervening elements that have not yet had values assigned will be set to nil
  • rb_ary_push(array, value) - put value at the end of the array
  • rb_ary_unshift(array, value) - put value at the start of the array
  • rb_ary_pop(array) - pop the last element of array off and return it
  • rb_ary_shift(array) - remove the first element of array and return it
  • rb_ary_entry(array, index) - examine array element located at index without changing array
  • rb_ary_dup(array) - copy array and return the copy
  • rb_ary_to_s(array) - invoke the "to_s" method on the array. Note that this concatenates the array elements together without spacing, so is not generally useful
  • rb_ary_join(array, string_object) - create a string by converting each element of the array to a string separated by string_object. If string_object is Qnil, then no separator is used
  • rb_ary_reverse(array) - reverse the order of all of the elements in array
  • rb_ary_to_ary(ruby_object) - create an array out of any ruby object. If the object is already an array, a reference to the same object is returned. If the object supports the "to_ary" method, then "to_ary" is invoked on the object and the result is returned. If neither of the previous are true, then a new array with 1 element containing the object is returned

An example should make most of this clear:

VALUE result, elem, arr2, mystr;

result = rb_ary_new();
// result is now []
rb_ary_push(result, INT2FIX(1));
// result is now [1]
rb_ary_push(result, INT2FIX(2));
// result is now [1, 2]
rb_ary_unshift(result, INT2FIX(0));
// result is now [0, 1, 2]
rb_ary_store(result, 3, INT2FIX(3));
// result is now [0, 1, 2, 3]
rb_ary_store(result, 5, INT2FIX(5));
// result is now [0, 1, 2, 3, nil, 5]
elem = rb_ary_pop(result);
// result is now [0, 1, 2, 3, nil] and elem is 5
elem = rb_ary_shift(result);
// result is now [1, 2, 3, nil] and elem is 0
elem = rb_ary_entry(result, 0);
// result is now [1, 2, 3, nil] and elem is 1
arr2 = rb_ary_dup(result);
// result is now [1, 2, 3, nil] and arr2 is [1, 2, 3, nil]
mystr = rb_ary_to_s(result);
// result is now [1, 2, 3, nil] and mystr is 123
mystr = rb_ary_join(result, rb_str_new2("-"));
// result is now [1, 2, 3, nil] and mystr is 1-2-3-
rb_ary_reverse(result);
// result is now [nil, 3, 2, 1]
rb_ary_shift(result);
// result is now [3, 2, 1]
result = rb_ary_to_ary(rb_str_new2("hello"));
// result is now ["hello"]
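Since these functions mirror the ruby Array methods so closely, the same walkthrough can be run in plain ruby as a cross-check. This is only a sketch: array_walkthrough is a made-up helper name, and the pairing of each ruby method with a C function in the comments is my reading of the list above rather than something taken from the interpreter source. rb_ary_to_s is left out on purpose.

```ruby
# Replays most of the C walkthrough with the corresponding ruby methods and
# returns the interesting intermediate values.
def array_walkthrough
  result = []                 # rb_ary_new
  result.push(1)              # rb_ary_push
  result.push(2)
  result.unshift(0)           # rb_ary_unshift
  result[3] = 3               # rb_ary_store
  result[5] = 5               # rb_ary_store past the end pads with nil
  sparse  = result.dup        # rb_ary_dup
  popped  = result.pop        # rb_ary_pop
  shifted = result.shift      # rb_ary_shift
  first   = result[0]         # rb_ary_entry
  joined  = result.join("-")  # rb_ary_join
  result.reverse!             # rb_ary_reverse works in place
  { sparse: sparse, popped: popped, shifted: shifted, first: first,
    joined: joined, reversed: result, wrapped: Array("hello") }
end
```

Calling array_walkthrough and inspecting the returned hash shows the same values as the comments in the C listing.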

 

 Writing Ruby Extensions in C - Part 8, Strings

 
This is the eighth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. This post will talk about strings.

Dealing with Strings


It is fairly easy to convert C-style strings to ruby string objects, and vice-versa. There are a few functions to know about:
  • rb_str_new(c_str, length) - take the char * c_str pointer and a length in, and return a ruby string object. Note that c_str does *not* have to be NULL terminated; this is one way to deal with binary data
  • rb_str_new2(c_str) - take the NULL terminated char * c_str pointer in, and return a ruby string object
  • rb_str_dup(ruby_string_object) - take ruby_string_object in and return a copy
  • rb_str_plus(string_object_1, string_object_2) - concatenate string_object_1 and string_object_2 and return the result without modifying either object
  • rb_str_times(string_object_1, fixnum_object) - concatenate string_object_1 with itself fixnum_object number of times and return the result
  • rb_str_substr(string_object, begin, length) - return the substring of string_object starting at position begin and going for length characters. A negative begin counts back from the end of the string. If length is less than 0, then "nil" is returned. If begin is past the end of the string or before the beginning of the string, then "nil" is returned. Otherwise, this function returns the length characters starting at begin, though the result may be cut short if there are not enough characters left in the string
  • rb_str_cat(string_object, c_str, length) - take the char * c_str pointer and length in, and concatenate onto the end of string_object
  • rb_str_cat2(string_object, c_str) - take the NULL-terminated char *c_str pointer in, and concatenate onto the end of string_object
  • rb_str_append(string_object_1, string_object_2) - concatenate string_object_2 onto string_object_1
  • rb_str_concat(string_object, ruby_object) - concatenate ruby_object onto string_object. If ruby_object is a FIXNUM between 0 and 255, then it is first converted to a character before concatenation. Otherwise it behaves exactly the same as rb_str_append
  • StringValueCStr(ruby_object) - take ruby_object in, attempt to convert it to a String, and return the NULL terminated C-style char *
  • StringValue(ruby_object) - take ruby_object in and attempt to convert it to a String. Assuming this is successful, the C char * pointer for the string is available via the macro RSTRING_PTR(return_value) and the length of the string is available via the macro RSTRING_LEN(return_value). This is useful to retrieve binary data out of a String object

An example should make most of this clear:

VALUE result, str2, substr;

result = rb_str_new2("hello");
// result is now "hello"
str2 = rb_str_dup(result);
// result is now "hello", str2 is now "hello"
result = rb_str_plus(result, rb_str_new2(" there"));
// result is now "hello there"
result = rb_str_times(result, INT2FIX(2));
// result is now "hello therehello there"
substr = rb_str_substr(result, 0, 2);
// result is now "hello therehello there", substr is "he"
substr = rb_str_substr(result, -2, 2);
// result is now "hello therehello there", substr is "re"
substr = rb_str_substr(result, -2, 5);
// result is now "hello therehello there", substr is "re"
// (substring was cut short because the length goes past the end of the string)
substr = rb_str_substr(result, 0, -1);
// result is now "hello therehello there", substr is Qnil
// (length is negative)
substr = rb_str_substr(result, 23, 1);
// result is now "hello therehello there", substr is Qnil
// (requested start point after end of string)
substr = rb_str_substr(result, -23, 1);
// result is now "hello therehello there", substr is Qnil
// (requested start point before beginning of string)
rb_str_cat(result, "wow", 3);
// result is now "hello therehello therewow"
rb_str_cat2(result, "bob");
// result is now "hello therehello therewowbob"
rb_str_append(result, rb_str_new2("again"));
// result is now "hello therehello therewowbobagain"
rb_str_concat(result, INT2FIX(33));
// result is now "hello therehello therewowbobagain!"
fprintf(stderr, "Result is %s\n", StringValueCStr(result));
// "hello therehello therewowbobagain!" is printed to stderr
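The boundary rules of rb_str_substr() are the part most likely to trip people up, so here they are again as a plain ruby sketch that can be run without compiling anything. It relies on String#[] with a start and a length following the same rules as rb_str_substr(), which is my understanding of how the two relate; substr_cases is a made-up helper name.

```ruby
# Evaluates the substring corner cases for any string and returns them by name.
def substr_cases(str)
  {
    head:         str[0, 2],                  # first two characters
    tail:         str[-2, 2],                 # negative start counts from the end
    clipped:      str[-2, 5],                 # length runs past the end: cut short
    negative_len: str[0, -1],                 # negative length: nil
    past_end:     str[str.length + 1, 1],     # start after the end: nil
    before_start: str[-(str.length + 1), 1],  # start before the beginning: nil
    at_end:       str[str.length, 1]          # start exactly at the end: ""
  }
end
```

The last case is one the C listing does not cover: a start position exactly equal to the string length yields an empty string rather than nil.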


 

 Writing Ruby Extensions in C - Part 7, Numbers

 
This is the seventh in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. This post will talk about numbers.

Dealing with numbers


Numbers are pretty easy to deal with in a ruby C extension. There are two possible types of Ruby integers: FIXNUMs and Bignums. FIXNUMs are very fast since they just use the native long type of the architecture. However, one bit of that long is reserved to mark the value as a FIXNUM, so the range of a FIXNUM is limited to one-half of the range of the native long type. If larger (or smaller) numbers need to be manipulated, Bignums are full-blown ruby objects that can represent integers of any size, at a performance cost. The ruby C extension API has support for converting native integer types to ruby FIXNUMs and Bignums and vice-versa. Some of the functions are:
  • INT2FIX(int) - take an int and convert it to a FIXNUM object (but see INT2NUM below)
  • LONG2FIX(long) - synonym for INT2FIX
  • CHR2FIX(char) - take a single char (a byte value, 0x00-0xff) and convert it to a FIXNUM object
  • INT2NUM(int) - take an int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object. Since this does the right thing in all circumstances, this should always be used in place of INT2FIX
  • LONG2NUM(long) - synonym for INT2NUM
  • UINT2NUM(unsigned int) - take an unsigned int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • ULONG2NUM(unsigned long int) - synonym for UINT2NUM
  • LL2NUM(long long) - take a long long int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • ULL2NUM(unsigned long long) - take an unsigned long long int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • OFFT2NUM(off_t) - take an off_t and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • FIX2LONG(fixnum_object) - take a FIXNUM object and return the long representation (but see NUM2LONG below)
  • FIX2ULONG(fixnum_object) - take a FIXNUM object and return the unsigned long representation (but see NUM2ULONG below)
  • FIX2INT(fixnum_object) - take a FIXNUM object and return the int representation (but see NUM2INT below)
  • FIX2UINT(fixnum_object) - take a FIXNUM object and return the unsigned int representation (but see NUM2UINT below)
  • NUM2LONG(numeric_object) - take a FIXNUM or Bignum object in and return the long representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2LONG
  • NUM2ULONG(numeric_object) - take a FIXNUM or Bignum object in and return the unsigned long representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2ULONG
  • NUM2INT(numeric_object) - take a FIXNUM or Bignum object in and return the int representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2INT
  • NUM2UINT(numeric_object) - take a FIXNUM or Bignum object in and return the unsigned int representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2UINT
  • NUM2LL(numeric_object) - take a FIXNUM or Bignum object in and return the long long representation
  • NUM2ULL(numeric_object) - take a FIXNUM or Bignum object in and return the unsigned long long representation
  • NUM2OFFT(numeric_object) - take a FIXNUM or Bignum object in and return the off_t representation
  • NUM2DBL(numeric_object) - take a FIXNUM or Bignum object in and return the double representation
  • NUM2CHR(ruby_object) - take ruby_object in and return the char representation of the object. If ruby_object is a string, then the char of the first character in the string is returned. Otherwise, NUM2INT is run on the object and the result is returned
For this particular topic I'll omit the example. There aren't really a lot of interesting things to show or odd corner cases that you need to deal with when working with numbers. 
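That said, the FIXNUM range limit mentioned at the top of this post is easy to sanity-check with a little arithmetic. The sketch below is plain ruby and assumes the conventional layout, where one bit of the machine word is the tag and the remaining bits hold a signed two's complement value. The helper names fixnum_range and fits_in_fixnum? are invented for this example, and the word size is passed in as a parameter rather than detected.

```ruby
# Returns [min, max] for a FIXNUM stored in a word of word_bits bits,
# assuming one bit is reserved as the tag.
def fixnum_range(word_bits)
  value_bits = word_bits - 1            # bits left over after the tag bit
  max = 2**(value_bits - 1) - 1         # largest signed value in those bits
  min = -(2**(value_bits - 1))          # smallest signed value in those bits
  [min, max]
end

# True if n could be stored as a FIXNUM under the same assumption.
def fits_in_fixnum?(n, word_bits = 64)
  min, max = fixnum_range(word_bits)
  n >= min && n <= max
end
```

Under that assumption a 64-bit long gives a FIXNUM range of -2**62 through 2**62 - 1, and anything outside it is where the *2NUM conversions would have to fall back to a Bignum.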

 

 Writing Ruby Extensions in C - Part 6, Catch/Throw

 
This is the sixth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. This post will talk about ruby catch and throw blocks.

Catch/Throw

In ruby, raising exceptions is used to transfer control out of a block of code when something goes wrong. Ruby has a second mechanism for transferring control out of blocks called catch/throw. Any ruby block can be labelled via catch(), and then any line of code within that block can throw() to terminate the rest of the block. This also works with nested catch/throw blocks so an inner nested throw could throw all the way back out to the outer block. Essentially, they are a fancy goto mechanism; see [1] for some examples. How can we catch and throw from within our C extension module? Like exceptions, we accomplish this through callbacks.
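For anyone who has not used catch/throw on the ruby side, here is a small nested example in plain ruby. It is just an illustration written for this post: find_cell and the :found and :skip_row tags are made-up names.

```ruby
# Returns [row, column] of the first cell equal to target, skipping the rest
# of any row that contains a nil, or nil if nothing matches.
def find_cell(rows, target)
  catch(:found) do
    rows.each_with_index do |row, i|
      catch(:skip_row) do
        row.each_with_index do |cell, j|
          throw :skip_row if cell.nil?             # inner throw: abandon this row
          throw :found, [i, j] if cell == target   # outer throw: leave both loops
        end
      end
    end
    nil                                            # block value when nothing is thrown
  end
end
```

The second argument to throw becomes the value of the matching catch, which is the same role the second parameter of rb_throw() plays in C.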

To set up a catch in a C extension, the rb_catch() function is used. rb_catch() takes 3 parameters: the first parameter is the name of the catch block, the second parameter is the callback to invoke in block context, and the third parameter is data to be passed to the callback. The callback returns a VALUE and, as in the example below, takes three VALUE parameters: the catch tag, the data that was passed to rb_catch(), and self.

To return to a catch point in a C extension, the rb_throw() function is used. rb_throw() takes two parameters: the name of the catch block to return to, and the return value (which can be any valid ruby object, including Qnil). If rb_throw() is executed, control is returned from the point of the rb_throw() to the end of the rb_catch() block, and execution continues from there.

An example can demonstrate much of this. First let's look at the C code to implement an example catch/throw:

 1) static VALUE m_example;
 2)
 3) static VALUE catch_cb(VALUE val, VALUE args, VALUE self) {
 4)     /* the block's result becomes the result of rb_catch() */
 5)     return rb_yield(args);
 6) }
 7)
 8) static VALUE example_method(VALUE klass) {
 9)     VALUE res;
10)
11)     if (!rb_block_given_p())
12)         rb_raise(rb_eStandardError, "Expected a block");
13)
14)     res = rb_catch("catchpoint", catch_cb, rb_str_new2("val"));
15)     if (TYPE(res) != T_FIXNUM)
16)         rb_throw("catchpoint", Qnil);
17)
18)     return res;
19) }
20)
21) void Init_example() {
22)     m_example = rb_define_module("Example");
23)
24)     rb_define_module_function(m_example, "method",
25)                               example_method, 0);
26) }

Lines 21 through 26 set up the extension module, as described elsewhere.

Lines 8 through 19 implement the module function "method". Line 11 checks if a block is given; if not, an exception is raised on line 12. Line 14 sets up an rb_catch() named "catchpoint". The callback catch_cb() will be executed, and a new string of "val" will be passed into the callback. Lines 3 through 6 implement the callback; the value is yielded to the block initially passed into "method", and the block's result is returned from the callback, which makes it the return value of rb_catch(). Line 15 checks that value; if it is not a number, then line 16 does an rb_throw(). Since the rb_catch() on line 14 has already returned by then, no "catchpoint" is active any more, so ruby reports the uncaught throw to the calling code as an exception. If the value from the block is a number, then it is returned at line 18. Note that this particular sequence of calls is contrived, since the value returned from the block is just returned to the caller. Still, I think it is a good example of what can be done with rb_catch() and rb_throw().

Now let's look at some example ruby code that might utilize the above code:

require 'example'

# if the method were to be called like this, an exception would be
# raised since no block is given
# retval = Example::method

# if the method were to be called like this, an exception would be
# raised since the return value from the block is not a number
# retval = Example::method {|input|
#     "hello"
# }

# this works properly, since the return value is a number
retval = Example::method {|input|
    puts "Input is #{input}"
    6
}
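
To try out the three calls without building the extension, the behaviour described above can be approximated in plain ruby. This is only a model of the description, not the extension itself: method_model is a made-up name, and the throw with no matching catch is expected to surface as an exception (UncaughtThrowError, a subclass of ArgumentError, on current rubies).

```ruby
# Plain ruby stand-in for Example::method as described in this post.
def method_model
  raise StandardError, "Expected a block" unless block_given?

  # catch returns the value of its block, here whatever the caller's block returns
  res = catch(:catchpoint) { yield "val" }

  # The catch above has already finished, so this throw has nothing to land on.
  throw :catchpoint, nil unless res.is_a?(Integer)
  res
end
```

Calling method_model { |input| 6 } returns 6, while a block that returns "hello" ends in the uncaught throw error.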

 

 Writing Ruby Extensions in C - Part 5, Exceptions

 
This is the fifth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. This post will focus on creating and handling exceptions.

Exceptions

When a method implementation in a ruby C extension encounters an error, the typical response is to raise an exception (a value indicating error can also be returned, but that is not idiomatic). The exception to be raised can either be one of the built-in exception classes, or a custom defined exception class. The built-in exception classes are:
  • rb_eException
  • rb_eStandardError
  • rb_eSystemExit
  • rb_eInterrupt
  • rb_eSignal
  • rb_eFatal
  • rb_eArgError
  • rb_eEOFError
  • rb_eIndexError
  • rb_eStopIteration
  • rb_eRangeError
  • rb_eIOError
  • rb_eRuntimeError
  • rb_eSecurityError
  • rb_eSystemCallError
  • rb_eThreadError
  • rb_eTypeError
  • rb_eZeroDivError
  • rb_eNotImpError
  • rb_eNoMemError
  • rb_eNoMethodError
  • rb_eFloatDomainError
  • rb_eLocalJumpError
  • rb_eSysStackError
  • rb_eRegexpError
  • rb_eScriptError
  • rb_eNameError
  • rb_eSyntaxError
  • rb_eLoadError

Extension modules should usually define a custom exception class for errors related directly to the extension, and use one of the built-in exception classes for standard errors. The custom exception class should generally be a subclass of rb_eException or rb_eStandardError, though if the module has special needs any of the built-in exception classes can be used. Example:

 1) static VALUE m_example;
 2) static VALUE e_ExampleError;
 3)
 4) static VALUE exception_impl(VALUE klass, VALUE input) {
 5)     if (TYPE(input) != T_FIXNUM)
 6)         rb_raise(rb_eTypeError, "invalid type for input");
 7)
 8)     if (NUM2INT(input) == -1)
 9)         rb_raise(e_ExampleError, "input was -1");
10)     return Qnil;
11) }
12)
13) void Init_example() {
14)     m_example = rb_define_module("Example");
15)
16)     e_ExampleError = rb_define_class_under(m_example, "Error",
17)                                            rb_eStandardError);
18)
19)     rb_define_module_function(m_example, "exception_example",
20)                               exception_impl, 1);
21) }
Line 14 sets up the extension module. Line 16 creates the custom exception class as a subclass of rb_eStandardError. Now if the extension module runs into a situation that it can't accept, it can raise e_ExampleError, which ruby code sees as an exception of type Example::Error. Line 19 defines a module function that demonstrates the use of standard and custom exceptions. If Example::exception_example is called with an argument that is not a number, it raises the TypeError exception on line 6 (side-note: Check_Type should really be used to do this type of checking, but for example purposes we omit that). If Example::exception_example is called with a number argument that is -1, then the custom exception Example::Error is raised on line 9. Otherwise, the method succeeds and Qnil is returned.

Raising exceptions

There are a few different ways to raise exceptions:
  • rb_raise(error_class, error_string, ...) - the main interface for raising exceptions. A new exception object of class type error_class is created and then raised, with the error message set to error_string (plus any printf-style arguments)
  • rb_fatal(error_string, ...) - a function for raising an exception of type rb_eFatal with the error message set to error_string (plus any printf-style arguments). After this call the entire ruby interpreter will exit, so extension modules typically should not use it
  • rb_bug(error_string, ...) - prints out the error string (plus any printf-style arguments) and then calls abort(). Since this call doesn't allocate an error object or do any of the other typical exception handling steps, it isn't technically a function to raise exceptions. This function should only be used when a bug in the interpreter is found, and as such, should not be used by extension modules
  • rb_sys_fail(error_string) - raises an exception based on errno. Ruby defines a separate class for each of the errno values (such as Errno::EAGAIN, Errno::EACCES, etc), and this function will raise an exception of the type that corresponds to the current errno
  • rb_notimplement() - raises an exception of rb_eNotImpError. This is used when a particular function is implemented on one platform, but possibly not on other platforms that ruby supports
  • rb_exc_new2(error_class, error_string) - allocate a new exception object of type error_class, and set the error message to error_string. Note that rb_exc_new2() does not accept printf-style options, so the string will have to be fully-formed before passing it to rb_exc_new2()
  • rb_exc_raise(error_object) - a low-level interface to raise exceptions that have been allocated by rb_exc_new2()
  • rb_exc_fatal(error_object) - a low-level interface to raise a fatal exception that has been allocated by rb_exc_new2(). After this call the entire ruby interpreter will exit, so extension modules typically should not use it
The example below shows the use of rb_raise() and rb_exc_raise(), which are the only two calls that extension modules should really use.

 1) static VALUE m_example;
 2) static VALUE e_ExampleError;
 3)
 4) static VALUE example_method(VALUE klass, VALUE input) {
 5)     VALUE exception;
 6)
 7)     if (TYPE(input) != T_FIXNUM)
 8)         rb_raise(rb_eTypeError, "invalid type for input");
 9)
10)     if (NUM2INT(input) < 0) {
11)         exception=rb_exc_new2(e_ExampleError, "input was < 0");
12)         rb_iv_set(exception, "@additional_info",
13)                   rb_str_new2("additional information"));
14)         rb_exc_raise(exception);
15)     }
16)
17)     return Qnil;
18) }
19)
20) void Init_example() {
21)     m_example = rb_define_module("Example");
22)
23)     e_ExampleError = rb_define_class_under(m_example, "Error",
24)                                            rb_eStandardError);
25)     rb_define_attr(e_ExampleError, "additional_info", 1, 0);
26)
27)     rb_define_module_function(m_example, "method",
28)                               example_method, 1);
29) }
Lines 20 through 29 show the module initialization. Since this is described in more detail elsewhere, I'll only point out line 25, where a custom attribute for the error class e_ExampleError is defined. When an error occurs in the extension module, additional error information can be placed into that attribute, and any caller can look inside of the error object to retrieve that additional information.

Lines 4 through 18 implement an example method that takes one and only one input parameter. Line 7 checks to see if the input value is a number, and if not an exception is raised with rb_raise() on line 8. Line 10 checks to see if the number is less than 0. If it is, then a new exception object of type e_ExampleError is allocated on line 11 with rb_exc_new2(), and the additional_info attribute of the object is set to "additional information" on line 12. As with most other things, the value that additional_info is set to can be any valid ruby object. Line 14 then raises the exception. This example shows very clearly the power of rb_exc_new2() and rb_exc_raise(), in that additional error information can be passed through to callers.
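
From the caller's point of view, the extra attribute is read like any other. The sketch below is a plain ruby stand-in for the extension (so it can be run without compiling anything), followed by the kind of rescue a caller might write. ExampleModel and check are made-up names, and the comments naming C functions only indicate which call each ruby line is meant to correspond to.

```ruby
# Plain ruby stand-in for the module and error class defined in the C code.
module ExampleModel
  class Error < StandardError
    attr_reader :additional_info            # like rb_define_attr(..., 1, 0)
  end

  def self.check(input)
    raise TypeError, "invalid type for input" unless input.is_a?(Integer)

    if input < 0
      err = Error.new("input was < 0")      # like rb_exc_new2
      err.instance_variable_set(:@additional_info,
                                "additional information")  # like rb_iv_set
      raise err                             # like rb_exc_raise
    end
    nil
  end
end

# How a caller might pick the extra information back out of the exception.
def describe_failure(input)
  ExampleModel.check(input)
  "ok"
rescue ExampleModel::Error => e
  "#{e.message} (#{e.additional_info})"
end
```

Because the attribute is read-only from ruby, callers can inspect it but the only place it gets set is where the exception is created.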

Handling exceptions

The other half of dealing with exceptions in an extension module is handling exceptions in C code when they are thrown from ruby functions. How is that done since C has no raise/rescue type mechanism? Through callbacks.

There are a few functions that can be used for handling exceptions:
  • rb_ensure(cb, cb_args, ensure, ensure_args) - Call function cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. When cb() finishes, regardless of whether it completes successfully or raises an exception, call ensure with ensure_args. The ensure function must take in a single VALUE parameter and return VALUE
  • rb_protect(cb, cb_args, line_pointer) - Call cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. If an exception is raised by cb(), rb_protect() stores a non-zero state value in the int that line_pointer points to and returns control to the caller; if cb() completes normally, the int is set to 0. After any cleanup, it is the responsibility of the caller to pass that state to rb_jump_tag() so that the exception continues to propagate
  • rb_jump_tag(line) - resume the jump that was intercepted by rb_protect(), using the state it saved (internally this is a longjmp). No code after this statement will be executed
  • rb_rescue(cb, cb_args, rescue, rescue_args) - Call function cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. If cb() raises a StandardError (or any subclass of it), rescue is called with rescue_args; other exceptions are not rescued. The rescue callback should take in two VALUE parameters and return VALUE

Another example should make some of this clear:

 1) static VALUE cb(VALUE args) {
 2)     if (TYPE(args) != T_FIXNUM)
 3)         rb_raise(rb_eTypeError, "expected a number");
 4)     return Qnil;
 5) }
 6)
 7) static VALUE ensure(VALUE args) {
 8)     fprintf(stderr, "Ensure value is %s\n",
 9)               StringValueCStr(args));
10)     return Qnil;
11) }
12)
13) static VALUE rescue(VALUE args, VALUE exception_object) {
14)     fprintf(stderr, "Rescue args %s, object classname %s\n",
15)             StringValueCStr(args),
16)             rb_obj_classname(exception_object));
17)     return Qnil;
18) }
19)
20) VALUE res;
21) int exception;
22)
23) res = rb_ensure(cb, INT2NUM(0), ensure, rb_str_new2("data"));
24) res = rb_ensure(cb, rb_str_new2("bad"), ensure,
25)                 rb_str_new2("data"));
26)
27) res = rb_protect(cb, INT2NUM(0), &exception);
28) res = rb_protect(cb, rb_str_new2("bad"), &exception);
29) if (exception) {
30)     fprintf(stderr, "Failed cb\n");
31)     rb_jump_tag(exception);
32) }
33)
34) res = rb_rescue(cb, INT2NUM(0), rescue, rb_str_new2("data"));
35) res = rb_rescue(cb, rb_str_new2("bad"), rescue,
36)                 rb_str_new2("data"));
Line 23 kicks off the action with a call to rb_ensure(). In this first rb_ensure, we pass a FIXNUM object to cb(), which means that no exception is raised. Because of the rb_ensure(), however, the ensure() callback on lines 7 through 11 is called anyway and does some printing.

Line 24 passes a String object to cb(), which causes cb() to raise an exception. Because of the rb_ensure, the ensure() callback on lines 7 through 11 is called and does some printing. Importantly, after ensure() is called the exception is propagated, so in reality none of the code after line 24 will be executed (we'll ignore this fact for the sake of this example).

Line 27 uses rb_protect() to call the callback; since a FIXNUM object is passed, no exception is raised. Note that if the call that is being wrapped by rb_protect() does not raise an exception, exception is always initialized to 0.

Line 28 uses rb_protect() to call cb() with a String object, which causes an exception to be raised. Because rb_protect() is being used, control will be returned to the calling code at line 29, and that code can then check for the exception. Since an exception was raised, the "exception" integer will have a non-0 number and the code can do whatever we need to clean up and then propagate the exception further with rb_jump_tag() on line 31.

Line 34 uses the rb_rescue() wrapper to call cb(). Since a FIXNUM object is passed to cb(), no exception is raised and no callbacks other than cb() are called.

Line 35 uses rb_rescue() to call cb() with a String object, which causes an exception to be raised and the rescue() callback to be executed. The rescue() callback on lines 13 through 18 takes two arguments: the VALUE originally passed to rb_rescue() as its rescue_args parameter, and the exception_object that was raised. Based on the exception_object, the rescue() callback can choose to handle this exception or not.
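
For readers more at home in Ruby, the three wrappers map roughly onto the pure-Ruby constructs below. This is only a sketch: the cb method is a hypothetical stand-in for the C cb() callback (assumed here to raise TypeError unless it is given an Integer), the with_* helper names are invented for the illustration, and the log array merely records which branches ran.

```ruby
# Hypothetical stand-in for the C cb() callback: raises unless given an Integer.
def cb(arg)
  raise TypeError, "expected a number" unless arg.is_a?(Integer)
  arg
end

# Roughly rb_ensure(cb, arg, ensure, data): the ensure code always runs,
# and any exception keeps propagating afterwards.
def with_ensure(arg, log)
  cb(arg)
ensure
  log << "ensure"
end

# Roughly rb_protect(cb, arg, &exception): the caller gets control back and
# can inspect whether something was raised. (The C function also intercepts
# non-exception jumps such as throw; this sketch does not.)
def with_protect(arg)
  [cb(arg), nil]
rescue Exception => e
  [nil, e]
end

# Roughly rb_rescue(cb, arg, rescue, data): only StandardError and its
# subclasses reach the rescue code, whose return value becomes the result.
def with_rescue(arg, log)
  cb(arg)
rescue StandardError => e
  log << "rescue #{e.class}"
  nil
end

log = []
with_ensure(0, log)                # no exception, ensure still runs
begin
  with_ensure("bad", log)          # ensure runs, then TypeError propagates
rescue TypeError
  log << "propagated"
end
res, err = with_protect("bad")     # control returns; err holds the TypeError
with_rescue(0, log)                # nothing raised, rescue code is skipped
with_rescue("bad", log)            # rescue code runs
```

Re-raising err after cleanup (raise err if err) would be the Ruby counterpart of calling rb_jump_tag().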

Example

Before finishing this post, I'll leave you with another example. When writing ruby code, the full begin..rescue block goes something like:

begin
  ...
rescue FooException => e
  ...
rescue
  ...
else
  ...
ensure
  ...
end

How would we implement this in C?

static VALUE foo_exception_rescue(VALUE args) {
    fprintf(stderr, "foo_exception_rescue value is %s\n",
            StringValueCStr(args));
    return Qnil;
}

static VALUE other_exception_rescue(VALUE args) {
    fprintf(stderr, "other_exception_rescue value is %s\n",
            StringValueCStr(args));
    return Qnil;
}

/* Plays the role of the two "rescue" clauses: dispatch on the class. */
static VALUE rescue(VALUE args, VALUE exception_object) {
    if (strcmp(rb_obj_classname(exception_object),
               "FooException") == 0)
        return foo_exception_rescue(args);
    else
        return other_exception_rescue(args);
}

/* Placeholder for the body of the "begin" block; it may raise. */
static VALUE body(VALUE args) {
    return Qnil;
}

/* Runs body() with rescue() as its exception handler. */
static VALUE cb(VALUE args) {
    return rb_rescue(body, args, rescue, rb_str_new2("data"));
}

/* Plays the role of the "ensure" clause. */
static VALUE ensure(VALUE args) {
    fprintf(stderr, "Ensure args %s\n", StringValueCStr(args));
    return Qnil;
}

VALUE res;

res = rb_ensure(cb, INT2NUM(0), ensure, rb_str_new2("data"));

This example implements almost all of the functionality of the ruby begin..rescue block. What it does not implement is the "else" clause; I have not yet come up with a good way to do that. If you think of something to make this example work for the "else" clause, please leave a comment.
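
For reference, this is the Ruby behaviour that a complete C translation would have to reproduce: the else clause runs only when the begin body finishes without raising, and it runs before ensure. The sketch below is plain Ruby rather than extension code, and FooException and the run helper are names made up for the illustration.

```ruby
class FooException < StandardError; end

# Returns the list of clauses that ran, in the order they ran.
def run(error = nil)
  trace = ["begin"]
  begin
    raise error if error
  rescue FooException
    trace << "rescue FooException"
  rescue
    trace << "rescue other"
  else
    trace << "else"
  ensure
    trace << "ensure"
  end
  trace
end

no_error  = run                               # body succeeds
foo_error = run(FooException.new("foo"))      # first rescue clause
other     = run(RuntimeError.new("other"))    # bare rescue clause
```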

 

 Writing Ruby Extensions in C - Part 4, Types and Return Values

 
This is the fourth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. This short post will focus on some details of method implementation, including how to check the types that are being passed to extension methods and the legal return values from the extension methods.

Ruby types


When implementing a ruby method in C, the method may expect certain arguments to be of a certain type. For instance, it is possible that the ruby method expects a number, and only a number, as the input parameter. The ruby C extension API provides several functions to check if an input parameter is a certain type:
  • TYPE(ruby_object) - return the builtin type of ruby_object. The builtin types distinguish between things like TrueClass, FalseClass, FIXNUM, etc. It explicitly does not distinguish between complicated object types; use either CLASS_OF() or rb_obj_classname() for that. The builtin types that may be returned are:
    • T_NONE
    • T_NIL
    • T_OBJECT
    • T_CLASS
    • T_ICLASS
    • T_MODULE
    • T_FLOAT
    • T_STRING
    • T_REGEXP
    • T_ARRAY
    • T_FIXNUM
    • T_HASH
    • T_STRUCT
    • T_BIGNUM
    • T_FILE
    • T_TRUE
    • T_FALSE
    • T_DATA
    • T_MATCH
    • T_SYMBOL
    • T_BLKTAG
    • T_UNDEF
    • T_VARMAP
    • T_SCOPE
    • T_NODE
  • NIL_P(ruby_object) - test if ruby_object is the nil object
  • Check_Type(ruby_object, builtin_type) - check to make sure that ruby_object is of type builtin_type (one of the T_* types listed above). If it is not, a TypeError exception is raised
  • CLASS_OF(ruby_object) - return the ruby class VALUE that corresponds to ruby_object. Note that this can distinguish between built-in class types (such as rb_cSymbol) as well as more complicated class types (such as those defined by the API user)
  • rb_obj_classname(ruby_object) - return the char * string representation of the class corresponding to ruby_object
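
In Ruby terms, these checks correspond roughly to the following. This is an analogy rather than extension code: check_type is a hypothetical helper that mimics what Check_Type() does in C (raising TypeError on a mismatch), and Widget is a made-up user-defined class.

```ruby
# Hypothetical Ruby mirror of Check_Type(): raise TypeError on a mismatch.
def check_type(obj, klass)
  unless obj.is_a?(klass)
    raise TypeError, "wrong argument type #{obj.class} (expected #{klass})"
  end
  obj
end

class Widget; end   # stands in for a class defined by an extension

value = Widget.new
is_nil     = value.nil?          # roughly NIL_P(value)
klass      = value.class         # roughly CLASS_OF(value) for ordinary objects
class_name = value.class.name    # roughly rb_obj_classname(value)

type_error = begin
  check_type("not a number", Integer)
  nil
rescue TypeError => e
  e
end
```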

Return values


Every ruby method implemented in C has to return a VALUE. This VALUE can either be a ruby object (such as that returned by INT2NUM), or one of the special values:
  • Qnil - ruby "nil"
  • Qtrue - ruby "true"
  • Qfalse - ruby "false"

Methods that are expected to either succeed or raise an exception typically return Qnil to indicate success. 

 

 Writing Ruby Extensions in C - Part 3, Extension Initialization

 
This is the third in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The posts from here on out will focus on the C code. This post talks about initializing the module and setting up classes.

Initializing the module


There is a bit of magic involved with initially loading the extension module into ruby. Assuming the extension module is called "example", then the C code that implements the extension must have an initialization function that looks like:

 1) static VALUE m_example;
 2)
 3) void Init_example() {
 4)     m_example = rb_define_module("Example");
 5)     example_library_initialize();
 6) }

Line 1 sets up the variable that holds the reference to the module. Line 3 is a function that must be called "Init_<extension_name>", take no parameters, and return nothing. When the ruby interpreter encounters a line of code such as "require 'example'", it will call this initialization function to set things up. Line 4 actually defines the module for us and calls it "Example". Finally, line 5 does whatever initialization is necessary for the library that is being wrapped. In this case, it just calls the example_library_initialize() function.

Defining classes, constants, and methods


Once the module itself has been initialized, functions, classes, methods, and attributes can be added to it. These are pretty easy to use:
  • rb_define_module_function(module, "function_name", implementation, number_of_args) - define function_name for module. Assuming the module is called "Example", functions like this can be invoked from ruby code like:

    out = Example::function_name

    The implementation should be a C function that takes number_of_args and returns a VALUE. See "Implementing methods" below for more explanation of implementation of methods in C.
  • rb_define_class_under(module, "class_name", super_class) - define a new class named "class_name" under the module. super_class can be one of the pre-defined types (rb_cObject, rb_cArray, etc) or a class that has been defined in this module.
  • rb_define_method(class, "method_name", implementation, number_of_args) - define a new method for class. The implementation should be a C function that takes number_of_args and returns a VALUE. See "Implementing methods" below for more explanation of implementation of methods in C.
  • rb_define_const(class, "CONST", value) - define a new constant for class with value. Assuming the module is called "Example" and the class is called "Class", these can be accessed in ruby code like:

    puts Example::Class::CONST

    The value can be any legal ruby type.
  • rb_define_attr(class, "attr_name", read, write) - define a new attribute for class called attr_name. The read and write parameters should each be 0 or 1, depending on whether you want a read implementation and/or a write implementation for this attribute, respectively.
  • rb_define_singleton_method(class, "method_name", implementation, number_of_args) - define a new singleton method for class. The implementation should be a C function that takes number_of_args and returns a VALUE. See "Implementing methods" below for more explanation of implementation of methods in C.

Implementing methods


Using the above functions, it is pretty straightforward to define module functions, class methods, and singleton methods. There is a bit of work necessary to understand the C implementation of these methods. The first thing to realize is that the "number_of_args" as the last parameter of the rb_define_* call defines how many parameters the method will take. For no parameters, you would pass 0, for one parameter you would pass 1, etc. When you go to implement the method in C, your C function must take that number of parameters, plus one leading parameter for the receiver, that is, the object (or module) the method was called on (this will be shown in the example below).

You can also pass -1, which tells ruby that you want to take optional arguments. When you go to implement the method in C, the C function must take exactly 3 arguments: int argc, VALUE *argv, VALUE klass. The argc parameter defines how many arguments were passed, the argv parameter is all of the arguments in an array, and the last parameter is the receiver. To properly parse the arguments, rb_scan_args(argc, argv, "format", ...) should be called. A brief explanation of rb_scan_args is below; for more information, see the document at [1].

The first two arguments to rb_scan_args() are the argc and argv passed into the function. The third argument is a string that defines how many required and how many optional parameters the method takes. The last parameters are pointers to VALUEs that receive the arguments. For instance, to have 1 required and 2 optional parameters to the method, format should be "12" and 3 additional VALUE pointers should be passed to rb_scan_args(). To have no required and 1 optional parameter to the method, format should be "01" and 1 additional VALUE pointer should be passed to rb_scan_args(). Note that if fewer than the number of required parameters, or more than the total number of parameters, are passed to the method, an ArgumentError exception will be raised. Optional arguments are set to the value that was passed, if any, or to "nil" otherwise.
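
One way to read these format strings from the Ruby side, as a rough sketch: "12" behaves like a method with one required and two optional parameters, and "01" like a method with a single optional parameter. The method names below are invented for the illustration.

```ruby
# Ruby-level counterpart of rb_scan_args(argc, argv, "12", &req, &opt1, &opt2)
def one_required_two_optional(req, opt1 = nil, opt2 = nil)
  [req, opt1, opt2]
end

# Ruby-level counterpart of rb_scan_args(argc, argv, "01", &opt)
def zero_required_one_optional(opt = nil)
  [opt]
end

filled  = one_required_two_optional(1, 2)   # the unfilled optional is nil
default = zero_required_one_optional        # no arguments at all is fine

# Omitting a required argument raises ArgumentError, as in the C version.
too_few = begin
  one_required_two_optional
  nil
rescue ArgumentError => e
  e
end
```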

Let's take a look at an example to show all of this off:

 1) static VALUE m_example;
 2) static VALUE c_example;
 3)
 4) static VALUE mymethod(VALUE c, VALUE arg) {
 5)      fprintf(stderr, "Called mymethod with one argument\n");
 6)      return Qnil;
 7) }
 8)
 9) static VALUE myvariablemethod(int argc, VALUE *argv, VALUE c) {
10)      VALUE optional;
11)
12)      fprintf(stderr, "Called myvariablemethod with variable "
                         "arguments\n");
13)
14)      rb_scan_args(argc, argv, "01", &optional);
15)
16)      return Qnil;
17) }
18)
19) void Init_example() {
20)     m_example = rb_define_module("Example");
21)     c_example = rb_define_class_under(m_example, "Class",
                                          rb_cObject);
22)
23)     rb_define_attr(c_example, "my_readonly_attr", 1, 0);
24)     rb_define_attr(c_example, "my_readwrite_attr", 1, 1);
25)
26)     rb_define_const(c_example, "MYCONST", INT2NUM(5));
27)
28)     rb_define_method(c_example, "mymethod", mymethod, 1);
29)     rb_define_method(c_example, "myvariablemethod",
                         myvariablemethod, -1);
30) }

Lines 19 through 30 are the entry point for the extension. Line 20 defines and stores the module called "Example". Line 21 defines and stores the class "Class" under the module "Example". Line 23 defines a new read-only attribute for the class; this is equivalent to attr_reader in ruby code. This is read-only because the 3rd parameter is 1 and the 4th parameter is 0, meaning to generate a read method but no write method for this attribute. Line 24 defines a new read-write attribute for the class; this is equivalent to attr_accessor in ruby code. This is read-write because the 3rd parameter is 1 and the 4th parameter is 1, meaning to generate both read and write methods. Line 26 defines a new constant for the class called "MYCONST" with a value of 5; this can be accessed in ruby code via Example::Class::MYCONST. Line 28 defines a new method for "Example::Class" called "mymethod" that takes exactly one parameter. Line 29 defines a new method for "Example::Class" called "myvariablemethod" that takes a variable number of parameters.

Now that we have looked at the extension initialization, we can examine the implementation of the methods. Lines 4 through 7 implement the "mymethod" method; the first parameter is the receiver (the object the method was called on), and the second parameter is the required argument. Lines 9 through 17 implement the "myvariablemethod" method. As described earlier, this takes the number of arguments in argc, the argument array in argv, and the receiver in c. Line 14 uses rb_scan_args to define zero required arguments and one optional argument. We pass the address of the VALUE "optional" to rb_scan_args(); if an argument is given, this will be filled in with the argument, otherwise it will be set to "nil".
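
Seen from the Ruby side, the extension above defines roughly what the following pure-Ruby code would. It is a sketch for comparison only: the method bodies simply return nil instead of printing, and nothing here is generated by the C code.

```ruby
module Example
  class Class
    attr_reader   :my_readonly_attr     # rb_define_attr(..., 1, 0)
    attr_accessor :my_readwrite_attr    # rb_define_attr(..., 1, 1)

    MYCONST = 5                         # rb_define_const(..., INT2NUM(5))

    # rb_define_method(..., 1): exactly one argument
    def mymethod(arg)
      nil
    end

    # rb_define_method(..., -1) plus rb_scan_args "01": one optional argument
    def myvariablemethod(optional = nil)
      nil
    end
  end
end

obj = Example::Class.new
obj.my_readwrite_attr = 42
```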

[1] http://www.oreillynet.com/ruby/blog/2007/04/c_extension_authors_use_rb_sca_1.html

Update: edited to make the examples readable 

 

 Writing Ruby Extensions in C - Part 2, RDoc

 
This is the second in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. This post focuses on documentation generation.

RDoc and ri


RDoc is the documentation generation system for ruby. The general idea is that the source code is marked up with specially-formatted comments, and then the rdoc tool is run against the source to generate the documentation. The output from this is either HTML documentation, or ri documentation, or both. Generating rdoc documentation is a simple matter of:
  1. Annotating the source code with the appropriate tags. The basic form of an RDoc tag is:
    
    /*
     * call-seq:
     *   obj.method(required, optional=0) -> retval
     *
     * Call +wrappedLibraryFunction
     * +[http://www.example.org/docs.html#wrappedLibraryFunction]
     * to execute wrappedLibraryFunction.  This method takes a
     * single required argument, and one optional argument that
     * defaults to 0 if not specified.  It returns retval, which
     * can be any valid ruby object
     */
    

    Most of my own knowledge about RDoc syntax comes from [1]; it is highly suggested reading. For more real-world examples of markup, please look at the ruby-libvirt bindings[3]; all of the methods are properly marked-up for RDoc.
  2. Adding appropriate task(s) to the Rakefile. This is very easy as rake has pre-defined tasks for generating RDoc documentation:
    
    1) require 'rake/rdoctask'
    2)
    3) RDOC_FILES = FileList["README.rdoc", "ext/example.c"]
    4)
    5) Rake::RDocTask.new do |rd|
    6)     rd.main = "README.rdoc"
    7)     rd.rdoc_dir = "doc/site/api"
    8)     rd.rdoc_files.include(RDOC_FILES)
    9) end
    10)
    11) Rake::RDocTask.new(:ri) do |rd|
    12)     rd.main = "README.rdoc"
    13)     rd.rdoc_dir = "doc/ri"
    14)     rd.options << "--ri-system"
    15)     rd.rdoc_files.include(RDOC_FILES)
    16) end
    

    Line 1 pulls in the rake rdoctask that does most of the work for us. Line 3 defines the files that will be looked at for generating the rdoc. Note that the order of files is important; if there are dependencies between C files, the earlier dependencies must be listed first. Lines 5 through 9 define the main rdoc task. By default Rake::RDocTask creates a task called "rdoc", so nothing needs to be supplied for that. The "main" attribute of the rd specifies where the top-level documentation comes from. The "rdoc_dir" attribute specifies where the output will go. The "rdoc_files" attribute specifies which files to look at; here we point it at the list defined at line 3. With this task in place, we can now execute:
    
    $ rake rdoc
    

    at the command-line and the rdoc files will be generated from the C files and placed in doc/site/api. Lines 11 through 16 look very similar to the previous rdoc command, with a couple of differences. First, since we supply a symbol to the Rake::RDocTask.new method, we get a task named "ri" instead of rdoc. Second, we specify an option in line 14 that tells rdoc to generate the ri documentation instead of the HTML rdoc documentation. Execution is again easy:
    
    $ rake ri
    

    This will generate the ri documentation from the C files and place the output in doc/ri.
While the idea behind RDoc is very cool, the actual implementation is a little bit weak for C extensions. RDoc just cannot handle several common C idioms:
  • Using a macro to define constants - I used to have code like:
    
    #define DEF_DOMSTATE(name) rb_define_const(c_domain, #name, INT2NUM(VIR_DOMAIN_##name))
    DEF_DOMSTATE(NOSTATE);
    DEF_DOMSTATE(RUNNING);
    

    in ruby-libvirt. This was nice because I didn't have to repeat myself on every definition line. Since RDoc couldn't handle the macro, I had to remove all of these to get proper RDoc documentation.
  • Classes and methods split across multiple files - this one is an absolute deal-breaker for me. ruby-libvirt consists of around 7500 lines of C code, and having all of that in one file is just not feasible. Instead I have the code split along functional lines, which makes maintenance much easier. However, RDoc as of ruby 1.8.7 cannot follow the dependencies across different files, and hence almost none of my documentation was being generated. Luckily I found a patch[2] that makes RDoc smart enough to work across different files, but it sucks because I have to continually patch my local Ruby version. Maybe 1.9 fixes this in a better way; the RDoc parser seems to have been completely re-written, so there is hope on that front.
  • Having methods for a class defined in a different file - this one isn't a C idiom as such, but it seems like a simple thing. Given the nature of the ruby-libvirt bindings, I used to have all of the methods concerning a particular class (say, Libvirt::Network) in the same file. That included the lookup and definition methods, which are technically methods of class Libvirt::Connect (e.g. network = conn.lookup_network_by_name('netname')). However, RDoc also cannot handle this, so I was missing the RDoc documentation for all of the lookup and definition methods. I've now changed this to have all of the lookup and definition methods in the connect.c file, but it clutters that file unnecessarily. Again, maybe the Ruby 1.9 rewrite of RDoc fixes this.
That being said, RDoc is the canonical Ruby way to generate documentation, so whatever limitations it has must be worked around. The above is just a list of problems that I have come across that need workarounds in order to properly generate RDoc documentation.
[1] http://www.rubyfleebie.com/an-introduction-to-rdoc/
[2] http://marc.info/?l=ruby-core&m=110691458204738&w=2
[3] http://libvirt.org/git/?p=ruby-libvirt.git;a=tree

Update: edited to make the example RDoc tagging readable
Update: edited to make the references readable
Update: edited to fix up minor formatting problem 

 

This series is reposted from outside the Great Firewall; the original is at http://clalance.blogspot.jp/2011/01/writing-ruby-extensions-in-c-part-1.html

 Writing Ruby Extensions in C - Part 1, Project Setup

 
Earlier this year, I took over maintainership of the ruby-libvirt bindings[1]. While I had been contributing to the bindings on and off for the last couple of years, taking over maintainership has led me to learn about a whole range of issues deep inside ruby. Subsequently, I've found that while there is information scattered around the internet about writing these bindings, comprehensive guides (with examples) seem to be lacking. This series of blog posts aims to be a guide for anyone interested in some of the finer details of writing ruby extensions in C. All of these notes apply to Ruby 1.8. In theory, most of this also applies to Ruby 1.9, but I have not personally tested it or done much with Ruby 1.9, so your mileage may vary.

This information is culled from various places around the internet, along with reading the ruby source code and banging my head against a wall until things worked. The most useful resources I have found, besides the ruby sources, are at [2] and [3].

This first post will talk about the general structure of a ruby extension project, including documentation and building. Further posts will talk about programming considerations, including defining classes and methods, memory management, etc.

(NOTE: actually writing ruby extensions by hand seems to be kind of passe nowadays. Apparently FFI[4] is all the rage. That being said, I still find this a useful exercise, if to nobody but myself)

Directory structure

The directory structure of a ruby project is flexible, though most of the ruby extensions that I have seen follow a very similar pattern. Usually the top-level of the project contains a directory listing that looks like:
COPYING
NEWS
Rakefile
README.rdoc
doc/
ext/
The COPYING file contains the license for the project. The NEWS file typically contains information about releases. The Rakefile defines rake targets for the project (see the section about Rakefiles for more information). The README.rdoc file contains the header information that will be used when generating the RDoc documentation; see the post about RDoc for more details. The doc subdirectory contains any additional documentation about the project, including the code for the website, example usage of the code, etc. The ext/ directory typically contains the C source code for the extension module, which can be in any number of files (though note the caveat in the RDoc post about automatically generating RDoc documentation from multiple C files). The ext/ directory also contains the extconf.rb file (see the extconf and mkmf section), which controls how to build the extension.

mkmf and extconf

extconf and mkmf are the parts of the ruby extension build system that generate the header files and Makefile(s) needed to compile the C part of the program. Like the Rakefile, it is run through ruby so has all of the power of ruby at its disposal. A file named extconf.rb is generally placed in the ext/ subdirectory of the project, and extconf requires mkmf to do all of the heavy lifting for it. An example extconf.rb looks like:

 1) require 'mkmf'
 2)
 3) RbConfig::MAKEFILE_CONFIG['CC'] = ENV['CC'] if ENV['CC']
 4)
 5) extension_name = 'example'
 6)
 7) unless pkg_config('library_to_link_to')
 8)     raise "library_to_link_to not found"
 9) end
10)
11) have_func('useful_function', 'library_to_link_to/lib.h')
12) have_type('useful_type', 'library_to_link_to/lib.h')
13)
14) create_header
15) create_makefile(extension_name)
Line 1 just pulls in the mkmf module, which is what does all of the hard work here.

Line 3 isn't strictly necessary, but gives the ability to easily use alternate compilers to build the extension. Since mkmf detects the compiler at Makefile creation time, this isn't very interesting until you consider static analysis tools, which tend to substitute the standard compiler with their own enhanced version. By having this line of code at the top, extconf.rb is prepared to let these static analysis tools do their thing (and help improve your code).

Line 5 defines the extension name, which is used later.

Lines 7 through 9 do a pkg-config check to see if the library necessary to build this extension exists. Typically you will need to have the development package of the library you want to use installed, including the header files. If the library cannot be found, an exception will be raised and no Makefiles will be generated. Note that this is a required first step; all of the have_*() functions later on work by trying to compile and link a program with the function, type, or constant that you are looking for, so they need to know where to find the library to link against.

Line 11 uses the mkmf function have_func() to determine if the library installed on the build system has the function 'useful_function' defined in the header file 'library_to_link_to/lib.h'. If the function is found, then a macro called HAVE_USEFUL_FUNCTION will be defined in extconf.h (which all of the C files in the project should #include).

Line 12 uses the mkmf function have_type() to determine if the library installed on the build system has the type 'useful_type' defined in the header file 'library_to_link_to/lib.h'. If the type is found, then a macro called HAVE_TYPE_USEFUL_TYPE will be defined in extconf.h.

Line 14 actually creates the header file extconf.h, based on the results from all of the previous have_*() functions. The extconf.h file should be #include'd by all of the C files in the project to gain access to the HAVE_* macros that extconf defines.

Line 15 creates the Makefile based on all of the previous information.
While the recommended way to invoke the extconf.rb is through the Rakefile (see the next section), you can also run it by hand to test it out. If the extconf.rb file is located in the recommended ext/ subdirectory, you can run:
$ cd ext
$ ruby extconf.rb
The mkmf commands should run, and if everything goes smoothly, the extconf.h and Makefile will be generated inside of the ext/ subdirectory. If things do not succeed, the output to stdout, or to mkmf.log should help to debug the problem.

Rakefile

Once the extconf is in place, the next step is to create a Rakefile. As the name suggests, Rakefiles are the ruby analog to Makefiles; they allow automation of arbitrary tasks with possible dependencies between them. They also only re-build pieces of the code that have changed since the last invocation. The main difference between Rakefiles and Makefiles is that Rakefiles are written in ruby, so you have the full power of ruby at your disposal.
With that said, let's take a look at a Rakefile. I'll preface this discussion by saying that I don't know all that much about Rakefiles, other than the bare minimum to get them working. There are additional resources out on the web to describe them in depth[5], so if you want to know more, please look there.

 1) require 'rake/clean'
 2)
 3) EXT_CONF = 'ext/extconf.rb'
 4) MAKEFILE = 'ext/Makefile'
 5) MODULE = 'ext/example.so'
 6) SRC = Dir.glob('ext/*.c')
 7) SRC << MAKEFILE
 8)
 9) CLEAN.include [ 'ext/*.o', 'ext/depend', MODULE ]
10) CLOBBER.include [ 'config.save', 'ext/mkmf.log', 'ext/extconf.h',
                      MAKEFILE ]
11)
12) file MAKEFILE => EXT_CONF do |t|
13)     Dir::chdir(File::dirname(EXT_CONF)) do
14)         unless sh "ruby #{File::basename(EXT_CONF)}"
15)             $stderr.puts "Failed to run extconf"
16)             break
17)         end
18)     end
19) end
20) file MODULE => SRC do |t|
21)     Dir::chdir(File::dirname(EXT_CONF)) do
22)         unless sh "make"
23)             $stderr.puts "make failed"
24)             break
25)         end
26)     end
27) end
28) desc "Build the native library"
29) task :build => MODULE
Line 1 brings in the rake task that we care about. There are many more pre-defined rake tasks available; some of them will be described in further posts.

Lines 3 through 7 set up some ruby constants that we will use later on. The important point to note here is that we have the full power of ruby available to us, including doing directory globs, array concatenation, etc.
Lines 9 and 10 set up the list of files that will get removed during the CLEAN and CLOBBER steps, respectively. 'rake clean' will clean out the development files listed in the CLEAN variable, and 'rake clobber' will clean out the development files in the CLEAN and CLOBBER variables.

Lines 12 through 29 are the meat of the build task. Lines 28 and 29 set up the start of the dependency chain; any time the rake target of "build" is entered, it depends on everything in MODULE (which is 'ext/example.so'). When rake encounters that, it goes looking for any other dependencies that MODULE may have. In this case, we've defined that MODULE depends on SRC, which is a list of all C files in ext/, plus the Makefile. Since the Makefile is going to be auto-generated by mkmf, we have another dependency between the Makefile and EXT_CONF (which is responsible for generating the makefile). At this point we've reached the end of our dependency chain, so the block at lines 13 through 18 is executed, which produces the Makefile. Once that is done rake goes back up the dependency chain and executes the block at lines 21 to 26, which actually does the build using make. At the end of all of this, the extension module should be properly built (assuming no compile errors, of course).

Gem

The ruby gem system aims to be a package manager for pieces of ruby code. While my personal opinion is that this system re-invents operating system package managers (poorly), gems are an integral part of the ruby experience. Gems can be easily built using a few Rakefile commands, and they are generally registered at http://rubygems.org. A few minor additions to the Rakefile are used to set up the task:

 1) require 'rake/gempackagetask'
 2)
 3) PKG_FILES = FileList[
 4)     "Rakefile", "COPYING", "README", "NEWS", "README.rdoc",
 5)     "ext/*.[ch]", "ext/MANIFEST", "ext/extconf.rb",
 6) ]
 7)
 8) SPEC = Gem::Specification.new do |s|
 9)     s.name = "example"
10)     s.version = "1.0"
11)     s.email = "list@example.com"
12)     s.homepage = "http://example.org/"
13)     s.summary = "C bindings"
14)     s.files = PKG_FILES
15)     s.required_ruby_version = '>= 1.8.1'
16)     s.extensions = "ext/extconf.rb"
17)     s.author = "List of Authors"
18)     s.rubyforge_project = "None"
19)     s.description = "C Bindings"
20) end
21)
22) Rake::GemPackageTask.new(SPEC) do |pkg|
23)     pkg.need_tar = true
24)     pkg.need_zip = true
25) end
Line 1 brings in the rake gempackagetask. Lines 3 through 6 define the files that we want included in the package; ruby globs can be used here. Lines 8 through 20 are the meat of the gem specification, and are pretty straightforward; just replace the fields with ones appropriate for your project. Finally, lines 22 through 25 define the task itself. To actually build the gem, you would now run:

$ rake gem

[1] http://libvirt.org/ruby
[2] http://ruby-doc.org/core
[3] http://ruby-doc.org/docs/ProgrammingRuby/html/ext_ruby.html
[4] https://github.com/ffi/ffi
[5] http://jasonseifer.com/2010/04/06/rake-tutorial

Update: modified some of the examples to make sure the code wasn't cut-off