Rails supports database migrations out of the box. They let Rails programmers be more Agile: do less big design up front (BDUF) on the database and instead change the schema as requirements change and the business asks for it. Writing a migration is as easy as running script/generate migration from the command line and then executing it with rake db:migrate.
Database migrations are useful for two reasons. The first is to let developers change the database schema, which can be as simple as a single ALTER TABLE statement to add a new column. The second, and more important, is to migrate the all-important data that already exists in the database. But if your application is still under development for its initial release, migrations may not be buying you much, because chances are no one cares how your table structure got to where it is now, and you may not have a whole lot of data.
We all use migrations feverishly, probably because most Rails books, references, and tutorials begin with: "Let's start by creating a database migration, in it we insert some data into the newly created tables, and learn how to do XYZ." Thus, your db/migrate folder ends up stuffed with migrations that do everything: create new tables; alter existing tables; create data for those tables; update data created by previous migrations; and perhaps all of the above, without any justifiable reason for what those migrations are trying to migrate. In practice, it is quite time-consuming for a developer to run 200+ migrations every time he blows away a database, which is not uncommon. Not only that, sometimes you have multiple migrations that basically cancel out each other's changes as customers change their minds back and forth. As a result, you could be creating migrations left and right when there is no real beneficiary, that is, no database loaded with data.
Sometimes your QA team has its own data set for testing your app, and thus their loaded database is a beneficiary. While that is true, I prefer their data to be scripted and freshly generated by ActiveRecord models (using create!) every time instead of being migrated over and over, because as my application domain model expands, I want not just QA data but all datasets to be cleansed and validated by my model validations. There is no guarantee that after running a remove_column migration the QA dataset does not violate any business logic. Keeping all this data valid while database migrations are being rapidly created is very hard.
To take advantage of the fact that your development environment has nothing to lose until the initial release, while keeping all your data valid (for all environments) the whole time your app is under development, follow these three simple steps:
Step 1:
Have one migration file, 001_release_one_schema.rb, that captures the creation of every database object. For example, all your create_table and add_index calls, views, triggers (*yikes*), etc. After this migration, your database should contain all database objects for your Rails app but in a "blank", data-less state.
$ cat db/migrate/001_release_one_schema.rb
class ReleaseOneSchema < ActiveRecord::Migration
  def self.up
    create_table "foos" do |t|
    end
    # ... and many others ...
  end

  def self.down
    # ... ... ...
  end
end
Step 2:
Create a rake task to populate all the reference data your application needs in order to run. Reference data is data that your application cannot change through its screens but that is essential for it to run, for example the currencies that populate a drop-down in your app. A lot of drop-down list data is reference data.
$ cat lib/tasks/data.rake
namespace :data do
  desc "Loads a default dataset of both reference and user data into database."
  task :load => [ :environment,
                  :configuration,
                  :reference_data,
                  :user_data ]

  private

  # Default to the 'slim' dataset unless DATASET is given on the command line.
  task :configuration do
    ENV['DATASET'] ||= 'slim'
  end

  # Run the reference data script for the chosen dataset.
  task :reference_data do
    require "#{RAILS_ROOT}/db/data/reference_data/#{ENV['DATASET']}"
  end
end
$ cat db/data/reference_data/slim.rb
@us_dollar = Currency.create! :name => "US Dollar"
@yen = Currency.create! :name => "Yen"
@euro = Currency.create! :name => "EURO"
... ... ...
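The rake task above requires whatever file DATASET names, so a mistyped dataset name only surfaces as a LoadError from require. A minimal sketch of an earlier, friendlier check follows. The helper name dataset_script and its error message are my own assumptions, not part of the original tasks:

```ruby
# Hypothetical helper (not in the original rake file): build the path to a
# dataset script and fail early, with a readable message, if it is missing.
#   root    - application root (RAILS_ROOT in the tasks above)
#   kind    - "reference_data" or "user_data"
#   dataset - dataset name; falls back to 'slim' like the :configuration task
def dataset_script(root, kind, dataset = nil)
  dataset = 'slim' if dataset.nil? || dataset.empty?
  path = File.join(root, 'db', 'data', kind, "#{dataset}.rb")
  unless File.exist?(path)
    raise ArgumentError, "Unknown dataset '#{dataset}' (expected #{path})"
  end
  path
end

# Possible use inside the task, in place of the interpolated string:
#   require dataset_script(RAILS_ROOT, 'reference_data', ENV['DATASET'])
```

Whether the extra check is worth it depends on how many datasets you end up with. With only slim and one or two others, the plain require is probably enough.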
Step 3:
Create a rake task to populate all the user data your app needs in order to run. User data is data that a user can create or update within your application, but that must also be present for the Rails app to function properly on the day it launches. For example, for your flashy PayPal application, the fee structure that determines how it charges its users: an application administrator is allowed to raise or lower fees in your app.
namespace :data do
  private

  # Run the user data script for the chosen dataset.
  task :user_data do
    require "#{RAILS_ROOT}/db/data/user_data/#{ENV['DATASET']}"
  end
end
$ cat db/data/user_data/slim.rb
Fee.create! :amount => 1_000, :currency => @us_dollar
Fee.create! :amount => 1_500, :currency => @yen
Fee.create! :amount => 2_500, :currency => @euro
... ... ...
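The user data script can refer to @us_dollar, @yen and @euro because a required file is evaluated with Ruby's top-level object as self, so instance variables set in one script are still there when the next one runs. The sketch below shows this with two throwaway files. The file names and the string values are stand-ins I made up so it runs without Rails or a database:

```ruby
require 'tmpdir'

# Two temporary scripts stand in for reference_data/slim.rb and
# user_data/slim.rb. Plain strings replace the ActiveRecord objects.
Dir.mktmpdir do |dir|
  reference = File.join(dir, 'reference.rb')
  user      = File.join(dir, 'user.rb')

  File.write(reference, "@us_dollar = 'US Dollar'\n")
  File.write(user,      "@fee_currency = @us_dollar\n")

  # Both files run with the top-level object as self, so the variable
  # set by the first script is visible to the second.
  require reference
  require user
end

puts TOPLEVEL_BINDING.eval('@fee_currency')
```

This also means the order matters: the user data script only works if the reference data script ran first, which the prerequisite order of the data:load task takes care of.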
There are several benefits of managing your database schema and data this way:
- It is easier and faster to re-populate your entire database to the latest schema from scratch with data, since there are no extraneous migrations.
- It is faster to locate and update the data your application needs, because it always lives in your dataset generation scripts.
- No need to worry about outdated or removed ActiveRecord classes, or about redeclaring them inside the migration file itself. There are no "legacy" ActiveRecord models.
- All data is valid all the time because it is created through Model.create! and checked by your model validations.
- Easy to specify datasets to load by preference for DEV (slim, loaded), BA (story sign-off), QA (scenario-based), or demo (full) environments. e.g. rake data:load DATASET=slim RAILS_ENV=qa
- No worries about broken or incomplete migrations. Less code, less trouble.
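The point about validity rests on create! raising an exception when a validation fails, so a dataset script stops at the first bad record instead of quietly loading it. The sketch below imitates that behaviour with a plain Ruby class so it can run without Rails. The Currency class, its name rule and the RecordInvalid error here are stand-ins I wrote for illustration, not the real ActiveRecord classes:

```ruby
# Plain-Ruby stand-in for an ActiveRecord model; NOT the real ActiveRecord API.
# It imitates one behaviour only: create! raises when validation fails.
class RecordInvalid < StandardError; end

class Currency
  attr_reader :name

  def initialize(attrs)
    @name = attrs[:name]
  end

  # Hypothetical validation rule: a currency needs a non-blank name.
  def valid?
    !name.to_s.strip.empty?
  end

  def self.create!(attrs)
    record = new(attrs)
    raise RecordInvalid, "Validation failed: Name can't be blank" unless record.valid?
    record
  end
end

Currency.create!(:name => 'US Dollar')   # passes validation

begin
  Currency.create!(:name => '')          # a stale row in a dataset script
rescue RecordInvalid => e
  puts "dataset load aborted: #{e.message}"
end
```

In a real dataset script you would not rescue the error. Letting it propagate is what makes rake data:load fail loudly as soon as the data and the model disagree.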
Now, after your Rails app goes to a 1.0 production release, you should switch this back to the normal Rails database migration style. I suspect your application users won't be too happy if you blow away their data every time you roll out a minor update or a major release... (or not?)