Tuesday, March 20, 2007

Extreme Rails programming

What counts as extreme Rails programming? How do you know you are coding in a zone where no man has gone before?

It's the feeling of standing on top of Mount Everest, with the world beneath your feet. It's the feeling of watching history being made before your eyes.

So how do you get there?

Got dumped by your girlfriend because you love Rails more than her? Uh-uh.

Solving a problem no one has thought of before in Ruby on Rails? Negative.

Writing Rails code on the moon? Nah.

Today I looked over a programming pair's shoulders, and I immediately realized that they were in a programming zone where no one had ever gone before. So I took a picture as proof.

And it is...

Writing a Rails app in vim, with a Dvorak keyboard.



This is real, and it's priceless.

So do you know what key you press to yank a line?

(Thanks to Muness Alrubaie and David Vollbracht for this shot)

Tuesday, March 06, 2007

Rails migration pitfalls

When we create a migration, we are essentially writing a "delta" script whose intent is to move the database, both its schema and its data, from point A to point B. Rails migrations even let you roll back from B to A. A lot of developers get the schema part of the delta right, because it is the most intriguing part, but they overlook the data part of their migrations.

Take a look at the following migration:

class MoveColumnFooFromTableAToTableB < ActiveRecord::Migration
  def self.up
    remove_column :table_a, :foo
    add_column :table_b, :foo, :string
  end

  def self.down
    add_column :table_a, :foo, :string
    remove_column :table_b, :foo
  end
end


This migration drops column "foo" from Table A, and adds column "foo" to Table B. From the schema point of view, the migration achieves what it's supposed to do - moving a column. But this delta script is flawed from the data point of view.

Imagine your application is at 1.0, and users have inserted 20 million rows into Table A. This migration is part of your release 1.5 upgrade. You run it against Table A. Poof! The column is moved, but you just lost *all* of the data in column "foo" forever. Table B now has a "foo" column with no data in it. Things went wrong, so you roll back, right? Try it. Table A gets its "foo" column back, but every value in it is NULL. The data are gone.

Worse yet, your boss is standing behind you, giving you 15 minutes to fix the whole mess. "Database migration sucks...", you mumble.

This migration fails to migrate the data from point A to point B. So, what should developers have done differently? Well, here's one way to do it:

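  def self.up
    add_column :table_b, :foo, :string
    # Copy the data before dropping the source column. UPDATE ... JOIN is MySQL
    # syntax, and the join assumes rows in both tables share the same id.
    execute("UPDATE table_b INNER JOIN table_a ON table_b.id = table_a.id SET table_b.foo = table_a.foo")
    remove_column :table_a, :foo
  end

  def self.down
    add_column :table_a, :foo, :string
    # Copy the data back before dropping the column from table_b.
    execute("UPDATE table_a INNER JOIN table_b ON table_a.id = table_b.id SET table_a.foo = table_b.foo")
    remove_column :table_b, :foo
  end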


This delta script accounts for the data as well, so it migrates what it is meant to migrate: schema and data.
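
To see why the order of the steps matters, here is a toy model in plain Ruby, with no database involved. Tables are just arrays of hashes, and lossy_move and safe_move are made-up helper names that mimic the two versions of self.up. It is only a sketch of the idea, not how Rails executes a migration.

```ruby
# Toy model: each table is an array of row hashes keyed by column name.

# Mimics the flawed migration: drop the column first, then add an empty one.
def lossy_move(table_a, table_b)
  table_a.each { |row| row.delete(:foo) }   # remove_column :table_a, :foo
  table_b.each { |row| row[:foo] = nil }    # add_column :table_b, :foo
end

# Mimics the fixed migration: add, copy the data across, and only then drop.
def safe_move(table_a, table_b)
  table_b.each { |row| row[:foo] = nil }    # add_column :table_b, :foo
  foo_by_id = {}
  table_a.each { |row| foo_by_id[row[:id]] = row[:foo] }
  table_b.each { |row| row[:foo] = foo_by_id[row[:id]] }  # UPDATE ... JOIN on id
  table_a.each { |row| row.delete(:foo) }   # remove_column :table_a, :foo
end

# Fresh sample data for each run (hypothetical rows, matching ids assumed).
def sample_tables
  [[{ :id => 1, :foo => 'x' }, { :id => 2, :foo => 'y' }],
   [{ :id => 1 }, { :id => 2 }]]
end

a, b = sample_tables
lossy_move(a, b)
lossy_result = b.collect { |row| row[:foo] }  # => [nil, nil]

a, b = sample_tables
safe_move(a, b)
safe_result = b.collect { |row| row[:foo] }   # => ["x", "y"]
```

The only difference between the two helpers is when the source column gets dropped, and that alone decides whether the data survive.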

Here is another gotcha situation:

  def self.up
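    add_column :table_a, :foo, :string, :null => false
  end


When the table already contains rows, like our 20 million row table, those rows have no value for the new column "foo", so it would have to be NULL in each of them. Depending on your database, the migration either fails with a NOT NULL constraint violation or silently fills the column with an implicit default. Give the column an explicit :default, or add it as nullable, backfill it, and only then add the constraint.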
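
Here is a rough sketch of that gotcha in plain Ruby. ToyTable is a made-up class, not part of Rails, and it models the strict behavior where the database refuses the change; real databases differ in how they react, so treat this as an illustration only.

```ruby
# Toy model of a table that enforces NOT NULL when a column is added.
class ToyTable
  attr_reader :rows, :not_null_columns

  def initialize(rows)
    @rows = rows
    @not_null_columns = []
  end

  # Loosely mirrors add_column's :null and :default options (assumed subset).
  def add_column(name, options = {})
    default = options[:default]
    if options[:null] == false && default.nil? && !@rows.empty?
      raise ArgumentError, "#{name} is NOT NULL but existing rows have no value for it"
    end
    @rows.each { |row| row[name] = default }
    @not_null_columns << name if options[:null] == false
  end
end

table = ToyTable.new([{ :id => 1 }, { :id => 2 }])

# No default: the populated table cannot satisfy the constraint.
begin
  table.add_column(:foo, :null => false)
  rejected = false
rescue ArgumentError
  rejected = true
end

# With a default, every existing row gets a value and the constraint holds.
table.add_column(:bar, :null => false, :default => '')
```

Run the same thing against an empty ToyTable and both calls succeed, which is exactly why this kind of migration looks fine on a developer's empty database.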

So be careful when you write these migrations. My advice is to sanity-check every migration by running it against a database whose tables are populated with data. It may not be a bad idea to have a CI build run all migrations against a database full of data on every migration check-in, if your migrations are meant to migrate data.
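
Such a sanity check can be quite small. The sketch below is plain Ruby with made-up helper names (column_snapshot and data_preserved?), and it assumes you have already fetched the rows as hashes before and after the migration; how you fetch them is up to you. It also assumes the column values can be sorted.

```ruby
# Hypothetical helper: collects the non-NULL values of one column, sorted so
# that row order does not matter when two snapshots are compared.
def column_snapshot(rows, column)
  rows.collect { |row| row[column] }.compact.sort
end

# True when the column holds the same non-NULL values before and after.
def data_preserved?(rows_before, rows_after, column)
  column_snapshot(rows_before, column) == column_snapshot(rows_after, column)
end

# Sample rows standing in for table_a before and table_b after the migration.
before     = [{ :id => 1, :foo => 'x' }, { :id => 2, :foo => 'y' }]
after_good = [{ :id => 2, :foo => 'y' }, { :id => 1, :foo => 'x' }]
after_bad  = [{ :id => 1, :foo => nil }, { :id => 2, :foo => nil }]

data_preserved?(before, after_good, :foo)  # => true
data_preserved?(before, after_bad, :foo)   # => false
```

A check like this only catches lost or changed values. It says nothing about whether each value ended up on the right row, so a stricter version would compare values by id.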

Enhance Array#collect to become magical

Do you do this often?

customers.collect { |customer| customer.name }
customers.collect { |customer| [customer.name, customer.id] }


Array#collect is indeed very powerful. But I still find myself repeatedly declaring a block variable just to reference the element I am iterating over. I couldn't care less whether I call it |customer| or |c|.

What if I enhance the Array class to do the following:

customers.collect_name
customers.collect_name_and_id


DRY-ness... The inspiration came from ActiveRecord's magic find methods, such as find_by_name_and_id.

class Array

  # Handles calls such as collect_name or collect_name_and_id by sending each
  # named method to every element; anything else falls through to super.
  def method_missing(method_sym, *args)
    if collect_by_method?(method_sym)
      attributes = fetch_collect_attributes(method_sym)
      if attributes.size == 1
        block = lambda { |element| element.send(attributes.first) }
      else
        block = lambda { |element| attributes.collect { |attribute| element.send(attribute) } }
      end
      self.collect(&block)
    else
      super
    end
  end

  private

  def collect_by_method?(method_sym)
    method_sym.to_s =~ /\Acollect_/
  end

  def fetch_collect_attributes(method_sym)
    # Strip only the leading "collect_" prefix, then split the rest on "_and_".
    attributes = method_sym.to_s.sub(/\Acollect_/, '').split(/_and_/)
    raise ArgumentError, "Array#collect_* requires at least one method name after it, e.g. #collect_id" if attributes.empty?
    attributes
  end

end


And of course, code is no good without tests:

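require 'test/unit'
require 'ostruct'

# TestStruct is not defined in this post; an OpenStruct stand-in is assumed here.
TestStruct = OpenStruct

class ArrayTest < Test::Unit::TestCase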

  def test_collect_raises_exception_with_no_parameter
    assert_raise ArgumentError do
      [].collect_
    end
  end

  def test_collect_with_one_parameter
    array = []
    array << TestStruct.new(:id => 1, :foo => 'Foo 1')
    array << TestStruct.new(:id => 2, :foo => 'Foo 2')
    assert_equal ['Foo 1', 'Foo 2'], array.collect_foo
  end

  def test_collect_with_multiple_parameter
    array = []
    array << TestStruct.new(:id => 1, :foo => 'Foo 1', :bar => 'Bar 1')
    array << TestStruct.new(:id => 2, :foo => 'Foo 2', :bar => 'Bar 2')
    assert_equal [ ['Foo 1', 'Bar 1'], ['Foo 2', 'Bar 2'] ], array.collect_foo_and_bar
  end

  def test_collect_does_not_interfere_default_method_missing
    assert_raise NoMethodError do
      [].foo
    end
  end

end
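
One thing a bare method_missing does not give you is an honest respond_to?: the array answers collect_name, yet claims it does not respond to it. Reopening Array also changes every array in the process. The sketch below is one possible variation, not a replacement for the code above: it uses a subclass with the made-up name MagicArray, adds Ruby's respond_to_missing? hook, and calls public_send so private methods stay private. Person is a hypothetical struct used only for the demo.

```ruby
# A subclass keeps the magic away from ordinary arrays.
class MagicArray < Array
  # Matches collect_<attr> and collect_<attr>_and_<attr>...
  MAGIC_COLLECT = /\Acollect_(.+)\z/

  def method_missing(method_sym, *args, &block)
    match = MAGIC_COLLECT.match(method_sym.to_s)
    return super unless match

    attributes = match[1].split('_and_')
    if attributes.size == 1
      collect { |element| element.public_send(attributes.first) }
    else
      collect { |element| attributes.collect { |attribute| element.public_send(attribute) } }
    end
  end

  # Keeps respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(method_sym, include_private = false)
    method_sym.to_s =~ MAGIC_COLLECT ? true : super
  end
end

Person = Struct.new(:id, :name)
people = MagicArray.new([Person.new(1, 'Ann'), Person.new(2, 'Bob')])

people.collect_name             # => ["Ann", "Bob"]
people.collect_name_and_id      # => [["Ann", 1], ["Bob", 2]]
people.respond_to?(:collect_id) # => true
```

Note that this version no longer raises ArgumentError for a bare collect_ call. Because the pattern requires at least one character after the prefix, that call simply ends in NoMethodError.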