Rails Dive: Module MimeResponds

As I'm learning rails, I occasionally hit sections of code that don't make any sense to me. When I saw the following (fairly common) block of code within the rails controller, it wasn't obvious to me exactly what was happening:
-------------
  respond_to do |format|
    format.html # index.html.erb
    format.xml  { render :xml => @posts }
  end
-------------
Sure, I know I can add lines like this
-------------
  respond_to do |format|
    format.html # index.html.erb
    format.xml  { render :xml => @posts }
    format.json { render :json => @posts }
  end
-------------
and it JustWorks, but I'm skeptical of almost everything and that seems like sorcery. So, I spent some quality time with rails and my debugger and found that there are really 10 steps involved in making that block work. For reference, I'm using the following:
  • ruby (1.8.6 p287 [universal-darwin9.0])
  • rails (2.3.3)
  • ruby-debug (0.10.3)
  • ruby-debug-base (0.10.3)
  • ruby-debug-ide (0.4.6)
  • NetBeans 6.7

1.) mime_responds.rb, lines 157-159:
-------------
  Mime::SET.each do |mime|
    generate_method_for_mime(mime)
  end
-------------
As part of rails startup, the MimeResponds module is loaded through a reflectively added include statement. This include causes the class declarations in that module to execute, and the above little block of code to run in the included Responder class (<rant>Lines 157-159? Yet run on load? Why isn't this near the head of the file, where it might be noticed?</rant>). For each mime in Mime::SET (a constant Array holding every registered Mime type: the ones Rails registers itself, plus any you add in your config/initializers/mime_types.rb, also loaded during initialization), generate_method_for_mime is run.

2.) mime_responds.rb, lines 147-155:
-------------
  def self.generate_method_for_mime(mime)
    sym = mime.is_a?(Symbol) ? mime : mime.to_sym
    const = sym.to_s.upcase
    class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def #{sym}(&block)                
        custom(Mime::#{const}, &block)  
      end                                   
    RUBY
  end
-------------
Here are the comments I had to clip out for space:
# def html(&block)
#   custom(Mime::HTML, &block)
# end
It's a good example of what's being added to the Responder class by reflection - a small, simple method for each mime type that calls Responder's custom method.
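To convince myself the trick isn't Rails-specific, here's a minimal sketch of the same pattern in plain Ruby. The names are my own stand-ins, not Rails code: FORMATS plays the part of Mime::SET and TinyResponder the part of Responder.

```ruby
# Hypothetical stand-ins: FORMATS plays the role of Mime::SET, and
# TinyResponder the role of Rails' Responder class.
FORMATS = [:html, :xml, :json]

class TinyResponder
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Every generated method funnels into this one, as in the Rails code.
  def custom(format, &block)
    @calls << format
  end

  def self.generate_method_for_format(format)
    # The heredoc is evaluated as source code inside the class, so this
    # defines an ordinary instance method named after the format.
    class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def #{format}(&block)
        custom(:#{format}, &block)
      end
    RUBY
  end

  # Runs once, while the class body is being loaded.
  FORMATS.each { |format| generate_method_for_format(format) }
end

responder = TinyResponder.new
responder.html
responder.json { "ignored in this sketch" }
p responder.calls                       # [:html, :json]
p TinyResponder.method_defined?(:xml)   # true
```

Nobody ever typed "def xml", yet the method is there as soon as the class has loaded.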

3.) *_controller.rb (based on your controller name)
-------------
  respond_to do |format|
    format.html # index.html.erb
    format.xml  { render :xml => @posts }
  end
-------------
Finally we can start looking at the code I was originally interested in... but only execute part of it. (yaaarrrg!)

respond_to is an instance method in MimeResponds, and in this case we're passing it the do |format|... block as an argument.

4.) mime_responds.rb, lines 102-105:
-------------
  def respond_to(*types, &block)
    raise ArgumentError, "CLIP" unless types.any? ^ block
    block ||= lambda { |responder| types.
       each { |type| responder.send(type) } }
    responder = Responder.new(self)
    block.call(responder)
    responder.respond
  end
-------------
The "CLIP" text in there is mine, just to make the line fit. What it says is this: "respond_to takes either types or a block, never both". It's just a logical XOR to make sure you don't try to pass it both kinds of arguments. In our case, we just passed it the block - so far so good.

The second (and third here, just to make it fit) line builds a default block out of the types if no block was passed (that is, if block is nil). We did pass one, so we just keep going and create a new Responder object.
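The argument handling is easier to see in isolation. This is a stripped-down sketch, not the Rails method: tiny_respond_to and Recorder are names I made up, and the Recorder only notes which format methods were called.

```ruby
# Hypothetical stand-in for Responder: it just records what was called.
class Recorder
  attr_reader :seen

  def initialize
    @seen = []
  end

  def html
    @seen << :html
  end

  def xml
    @seen << :xml
  end
end

def tiny_respond_to(*types, &block)
  # XOR: exactly one of "some types" or "a block" must be present.
  #   true  ^ nil   => true     false ^ a_proc => true
  #   true  ^ proc  => false    false ^ nil    => false
  raise ArgumentError, "pass either types or a block, never both" unless types.any? ^ block

  # No block given: build one that calls each type's method in turn.
  block ||= lambda { |responder| types.each { |type| responder.send(type) } }

  recorder = Recorder.new
  block.call(recorder)
  recorder.seen
end

p tiny_respond_to(:html, :xml)          # [:html, :xml]
p tiny_respond_to { |format| format.xml }   # [:xml]
```

Passing both a type and a block, or passing nothing at all, trips the ArgumentError.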

5.) mime_responds.rb, lines 111-125:
-------------
  class Responder #:nodoc:
    def initialize(controller)
      @controller = controller
      @request    = controller.request
      @response   = controller.response

      if ActionController::Base.use_accept_header
        @mime_type_priority =     
          Array(Mime::Type.lookup_by_extension(
            @request.parameters[:format]) ||
            @request.accepts)
      else
        @mime_type_priority = [@request.format]
      end

      @order     = []
      @responses = {}
    end
-------------
The previous call of Responder.new calls the initialize method shown above. Basically, it's initializing a bunch of instance variables, the most important of which (to us, anyway) is @mime_type_priority. If the request carries an explicit format parameter, that wins; otherwise it's going to use the methods in Mime::Type to build an array of mime types, with its values ordered by the preferences in the Accept header the client browser sent. In my case, it turned Firefox's request for "text/html, application/xhtml+xml, application/xml; q=0.9,*/*; q=0.8" into
[0] = "text/html"
[1] = "application/xml"
[2] = "*/*"
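
The ordering itself is nothing exotic. Here's a simplified sketch of sorting an Accept header by its q values; accept_priority is my own hypothetical helper, not the Rails parser, and it assumes a well-formed header.

```ruby
# Simplified, hypothetical Accept-header ordering; NOT Rails' parser.
def accept_priority(header)
  entries = header.split(",").each_with_index.map do |part, index|
    name, *params = part.strip.split(/\s*;\s*/)
    q_param = params.find { |param| param.start_with?("q=") }
    # A type with no q parameter counts as q=1.0.
    q = q_param ? q_param.sub("q=", "").to_f : 1.0
    [name, q, index]
  end

  # Highest q first; the original position breaks ties.
  entries.sort_by { |_name, q, index| [-q, index] }.map(&:first)
end

header = "text/html, application/xhtml+xml, application/xml; q=0.9,*/*; q=0.8"
p accept_priority(header)
# ["text/html", "application/xhtml+xml", "application/xml", "*/*"]
```

Unlike what I saw in the debugger, the sketch keeps application/xhtml+xml as its own entry. My guess (and it is only a guess) is that Rails folds it into text/html as a synonym, which would explain why it's missing from the list above.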


6.) mime_responds.rb, line 106:
-------------
  def respond_to(*types, &block)
    raise ArgumentError, "CLIP" unless types.any? ^ block
    block ||= lambda { |responder| types.
       each { |type| responder.send(type) } }
    responder = Responder.new(self)
    block.call(responder)
    responder.respond
  end
-------------
Now that we have a new responder object, we can run the block that piqued my interest in the first place.

7.) *_controller.rb (based on your controller name)
-------------
  respond_to do |format|
    format.html # index.html.erb
    format.xml  { render :xml => @posts }
  end
-------------
For each line of the block, we're now executing the little methods we created by reflection back in step 2. If you're debugging with NetBeans, the IDE will jump now to the code shown in step 2, where we created the methods. A totally reasonable thing to do, in my opinion. It's only confusing if you didn't know that code had already been executed (for example, because it was executed on include but buried two thirds of the way through the file...)

8.) mime_responds.rb, lines 127-137
-------------
  def custom(mime_type, &block)
    mime_type = mime_type.is_a?(Mime::Type) ? mime_type : 
      Mime::Type.lookup(mime_type.to_s)

    @order << mime_type

    @responses[mime_type] ||= Proc.new do        
      @response.template.template_format = mime_type.to_sym
      @response.content_type = mime_type.to_s
      block_given? ? block.call : @controller.
        send(:render, :action => @controller.action_name)
    end
  end
-------------
Each of those reflectively created methods just calls the custom method, passing in the appropriate mime type. After making sure that the passed-in value is, in fact, a Mime::Type (in our case it is - we reference them explicitly in our created methods; anything else gets looked up by name), we append a new entry to the @order array, and add a new entry to the @responses hash, both of which were set up in Responder's initialize method.

Since @responses did not have a value for the mime type (it was empty when we started executing our reflection-generated methods, and each method is a different mime type), the lookup evaluates to nil, so ||= stores the block from the method above (Proc.new ... end) as the new value. Nothing gets rendered yet; the Proc just sits there until somebody calls it. The end result of what we're doing (after we've executed the custom method for each mime type in our controller) is that the @responses hash will have a key for each of the different mime types we're set up to handle, and each value will be the block for handling it. Quite cool.
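
Here's the same bookkeeping as a small sketch, with the Rails parts swapped out. TinyRegistry is a hypothetical class of mine, symbols stand in for Mime::Type objects, and a string stands in for the default render call.

```ruby
# Hypothetical stand-in for the @order / @responses bookkeeping.
class TinyRegistry
  attr_reader :order, :responses

  def initialize
    @order     = []
    @responses = {}
  end

  def custom(format, &block)
    @order << format
    # ||= only assigns when the key has no value yet, so the first
    # registration for a format wins. The Proc is stored, not run.
    @responses[format] ||= Proc.new do
      block ? block.call : "default render for #{format}"
    end
  end
end

registry = TinyRegistry.new
registry.custom(:html)
registry.custom(:xml) { "<posts/>" }
registry.custom(:xml) { "never used" }

p registry.order                   # [:html, :xml, :xml]
p registry.responses.keys          # [:html, :xml]
p registry.responses[:xml].call    # "<posts/>"
p registry.responses[:html].call   # "default render for html"
```

Note that @order grows on every call, while @responses keeps only the first block it was given for each format.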

9.) mime_responds.rb, line 107:
-------------
  def respond_to(*types, &block)
    raise ArgumentError, "CLIP" unless types.any? ^ block
    block ||= lambda { |responder| types.
       each { |type| responder.send(type) } }
    responder = Responder.new(self)
    block.call(responder)
    responder.respond
  end
-------------
Home Stretch!

10.) mime_responds.rb, lines 172-190
-------------
  def respond
    for priority in @mime_type_priority
      if priority == Mime::ALL
        @responses[@order.first].call
        return
      else
        if @responses[priority]
          @responses[priority].call
          return # mime type match found, be happy and return
        end
      end
    end

    if @order.include?(Mime::ALL)
      @responses[Mime::ALL].call
    else
      @controller.send :head, :not_acceptable
    end
  end
-------------
In our base case, it's actually a very simple method. It steps through each mime type in @mime_type_priority, which was the array of mime types (ordered by preference) we created in step 5. If the client says it will take anything (Mime::ALL), we run the first block we registered and return. Otherwise, if @responses has an entry for that mime type, it executes the block from step 8 and then returns. If not, we roll over to the next-most-preferred response type, and try again. If we reach the end of @mime_type_priority and still haven't returned from the method (they never asked for Mime::ALL, and we didn't register a handler for it either), we send back a bare 406 Not Acceptable header. In our case, this method should have returned on the first iteration - html was the first type we set up to handle.
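
To check my reading of that loop, here's a sketch of the same selection logic with plain strings and lambdas. pick_response and ALL are my own hypothetical names, and returning :not_acceptable stands in for the head :not_acceptable call.

```ruby
ALL = "*/*"   # stand-in for Mime::ALL

def pick_response(priorities, order, responses)
  priorities.each do |priority|
    # The client will take anything: use the first format we registered.
    return responses[order.first].call if priority == ALL
    # Exact match on something we know how to render.
    return responses[priority].call if responses[priority]
  end

  # Nothing matched; fall back to a catch-all handler if we have one.
  order.include?(ALL) ? responses[ALL].call : :not_acceptable
end

responses = {
  "text/html"       => lambda { "rendered html" },
  "application/xml" => lambda { "rendered xml" }
}
order = ["text/html", "application/xml"]

p pick_response(["text/html", "application/xml", "*/*"], order, responses)  # "rendered html"
p pick_response(["application/json", "application/xml"], order, responses)  # "rendered xml"
p pick_response(["application/json", "*/*"], order, responses)              # "rendered html"
p pick_response(["application/json"], order, responses)                     # :not_acceptable
```

The third call is the interesting one: the client's first choice is something we can't produce, so the */* entry hands it whatever we listed first.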

With that return, our block-of-interest is complete. We've parsed what the client can handle, mapped that (in order) onto what we can handle, and done our best to respond in their most preferred mime type. So it's not actually sorcery - it's reflection!

(Which, really? Is a little like sorcery.)

Personal Web Crawlers

There are well over a trillion pages on the internet.

Most of them are not good. Everyone may have their own definition, but even the stuff we know isn't good gets posted anyway. Storage is free, and thinking is hard! Brute force your way to excellence!

But there are great pages out there too. Original ideas, brilliant insights, and stunningly different perspectives. How do I navigate the volume? How do I find the good pages without having to sort through all the rest?

Social search could help, and social browsing might too. But while some of my friends are a good indicator of what I might like, a lot of them aren't. I don't typically make friends with people like me, I make friends with people *unlike* me. So not everything they like is good, by my definition.

Recommendation engines (which assert that people who like what I have liked will continue to do so) seem to do a little better. But if we're already at the point where we're working like mad for a 10% improvement, that doesn't seem like the way forward either.

I wonder how I might teach a computer to make quality decisions like I do. Are there mathematical harmonics behind articles I like? Do the authors share certain determinable characteristics? Do they tend to use similar references? (And if they did, would that bias my stream to the point where it was no longer useful?)

One day, I will have my own web crawler.