
Sunday, June 22, 2008

Object-Oriented Programming with Rails ActiveRecord

Have you seen this before?

@posts = Post.find :all, :conditions => { :user_id => @user.id }

This happens to new Rails developers quite a bit. This code correctly retrieves an array of posts and sends them to the view for rendering. So what's wrong with it?

The key thing here is that this can also be done in a better way:

@posts = @user.posts

When you use the former method to retrieve @posts, you are actually thinking in database terms, along the lines of "in the database table 'posts' (and therefore the ActiveRecord model 'Post'), retrieve the rows whose foreign key column 'user_id' has the value @user.id." ActiveRecord is a pattern for accessing your data from objects, but you have to combine it with the object-oriented power of Ruby to create beautiful code. Using Rails and Ruby does not automagically make anyone a better programmer. One can still write Java/C# in the same procedural style as C. Knowing how to leverage the best of all worlds is what makes you a better problem solver.

Therefore, think like an object bigot. Think in objects, and not in database foreign key column values. Whenever you see any Rails code in the pattern of this:

:model_id => @model.id

Just stop. "Objects on Rails" sounds a lot better than "Ruby on Foreign Keys."
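For what it's worth, @user.posts works because of an association declared on the model (has_many :posts). The same idea can be shown outside Rails entirely; here is a plain-Ruby sketch (all names hypothetical) where the association-style method hides the foreign-key lookup from callers:

```ruby
# Plain-Ruby sketch (no Rails) of why @user.posts reads better:
# the association method hides the foreign-key query from callers.
class Post
  attr_reader :user_id, :title

  def initialize(user_id, title)
    @user_id = user_id
    @title = title
  end
end

class User
  ALL_POSTS = [] # stand-in for the posts database table

  attr_reader :id

  def initialize(id)
    @id = id
  end

  # The moral equivalent of has_many :posts -- callers never see :user_id
  def posts
    ALL_POSTS.select { |post| post.user_id == id }
  end
end
```

Callers now say user.posts and think in objects; the user_id plumbing stays inside the class that owns it.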

Wednesday, June 18, 2008

Learning encapsulation should come before learning Rails

Do you see anything wrong in this one line of code?

puts 'Drink up your milk!' if @milk.expiration_date < 2.days.from_now


I think there is. In particular, the programmer can do better. It is about a programming concept we have all heard of and should be familiar with: encapsulation. If you are into testing, which you ought to be, you can probably identify such encapsulation violations by how smelly your tests are; I believe in Test Driven Design. Going back to the line of code: what's wrong with it?

The pattern of this code is: you have code somewhere that yanks state (expiration_date) out of an object (@milk) and then interrogates that state (< 2.days.from_now). By no means is this a rule, as exceptions do exist. But when this happens, the programming siren in your head should go off as if the CI build were broken: can this be a method on the object @milk itself?

Ruby is a powerful programming language. It allows you to do all sorts of fancy programming: dynamically altering classes, adding methods to only a selected few object instances, all while being duck-typed and object-oriented at the same time. There are examples galore in Rails itself. But to enable all that magic, your ActiveRecord domain models have getters and setters on all of their state (i.e. data). While that makes their state convenient to access, sometimes you have to be careful. With great power comes great responsibility, and that responsibility comes down to you.

You might not think this applies to you. But have you ever written code like this in Ruby?

MyMailer.deliver_product_returned_notification if @product.state == State::Returned

total = @line_items.sum { |line_item| line_item.price * line_item.quantity }

if [:admin, :superuser].include?(@user.role)
...
end


Now, let's look at these rewritten examples:

puts 'Drink up your milk!' if @milk.expiring?

MyMailer.deliver_product_returned_notification if @product.returned?

total = @line_items.sum(&:subtotal)

if @user.superuser?
...
end


Forget about the loops, the blocks, the hype, and everything about Ruby for a sec. Code is code. Not only does the code become more readable when you try to enrich your domain by naming things correctly, you could very well also be uncovering business concepts that weren't previously clear or accurate. If you have a domain model, you are modeling it against domain concepts for your app to interact with. You'd better be sure it is right, or else a lot of time will be wasted on code that only ever solves half of the business problem.
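To make the first refactor concrete, here is a minimal plain-Ruby sketch of a hypothetical Milk class (no Rails, so plain Date arithmetic stands in for 2.days.from_now). The expiration question lives on the object that owns the state:

```ruby
require 'date'

class Milk
  attr_reader :expiration_date

  def initialize(expiration_date)
    @expiration_date = expiration_date
  end

  # Encapsulates the "expires within 2 days" rule so callers never
  # interrogate the raw expiration_date themselves
  def expiring?
    expiration_date < Date.today + 2
  end
end

puts 'Drink up your milk!' if Milk.new(Date.today + 1).expiring?
```

The business rule now has exactly one home; if "2 days" ever changes, no view code needs to know.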

Saturday, April 05, 2008

Pragmatic Rake tasks organization

Do you have a lot of rake tasks on your Rails projects? I am sure you do. Rake tasks are handy, and developing your next-big-idea Rails app is a lot less attractive without them. Most of the time you drop a new foo.rake file into your lib/tasks folder, write code in it, and get out. Over the course of development, that folder is likely to look like your favorite Halloween party house (ahem, Neal's...), except on an everyday basis.

If you want your rake tasks to be more manageable, here are a few tips:

1) Namespace your rake tasks.

Namespaces in rake are nice. They make sure all tasks in a namespace are cohesive as an area of work. Still, your job is to keep any single namespace from getting too crowded. For example, the db namespace that Rails provides very easily becomes crowded with tasks of all sorts, because, well, on a Rails application almost everything you do is related to your database: drop/create/migrate your database, importing reference/application data, database integration points with other applications, db-related background tasks, backups... Phew! How about creating a new namespace for each of import, integration, background, and backup?

2) Name your rake files by their namespaces.

Even if you namespace all your rake tasks, you are still likely to end up with a bunch of .rake files in lib/tasks. But now that you have your tasks grouped by meaningful, not-so-crowded namespaces, you should name your files after those namespaces: for example, db.rake, data.rake, import.rake, integration.rake, and backup.rake. Can you guess where rake db:awesome is implemented? You certainly don't know what it does, but you got that right: in lib/tasks/db.rake.
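As a sketch of tips (1) and (2) together, a hypothetical lib/tasks/backup.rake might look like this (the require and extend lines are only needed to run the snippet outside of rake itself; in a real .rake file you would omit them):

```ruby
require 'rake'
extend Rake::DSL # lets us use the rake DSL in a plain Ruby script

# lib/tasks/backup.rake -- the file is named after the namespace it defines
namespace :backup do
  desc 'Back up the database'
  task :db do
    puts 'backing up database...'
  end

  desc 'Back up uploaded assets'
  task :assets do
    puts 'backing up assets...'
  end
end
```

With this in place, rake backup:db and rake backup:assets are immediately guessable, and so is the file they live in.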

3) Try not to nest more than 2 namespaces.

This tip is not a rule; the point is don't overdo it. Sometimes that third-level namespace really makes sense. For example, db:create:all from Rails creates databases for all environments. It isn't for everything, though. For example, I prefer rake data:load over rake db:data:load. Not only is it shorter, but the db namespace is simply implied. Shallow nesting forces you to name your namespaces more descriptively, which makes it easier to navigate to their implementations.

4) Move your rake file and its external ruby files into a sub-folder, named by that namespace.

When your rake tasks are complicated enough, you will obviously use objects to do the job for you. When that happens, instead of having lib/tasks/integration.rake, you will want a sub-folder lib/tasks/integration/, with files like integration.rake and integration_awesome_klass.rb in it. There really is no need to contaminate your lib folder with rake-specific files.

Shameless Plug: Still using NAnt? Check out how to organize your NAnt scripts as well, on this blog post.

Friday, January 11, 2008

Format your data using Rails composed_of Value Objects

A requirement comes in as usual, saying, "I want prices to be formatted like '$2,000.00' on all rhtml's."

Your programming brain then associates the word "formatted" with "presentation". After all, the price is simply stored in a decimal column in the database. And off you go, wrapping number_to_currency(@book.price) in all of your views, just like any tutorial says.

It's all good until the same formatting is required for all emails being sent, all web service requests being issued, all Excel/CSV reports being generated... everything. Since you did the formatting with the ActionView helper method number_to_currency, you cannot do the same in places where ActionView helper methods are not available to you. So you start creating wrapper classes around your Book class, maybe a presenter, to keep this "presentation" logic outside of your pristine business domain model.

class Book < ActiveRecord::Base
end

class BookPresenter < SimpleDelegator
  include ActionView::Helpers::NumberHelper

  def price
    number_to_currency __getobj__.price
  end

  def isbn
    ... ...
  end
end

class BooksController < ApplicationController
  def show
    @book = BookPresenter.new Book.find(params[:id])
  end
end

<p>
  ISBN: <%= @book.isbn %>
  Price: <%= @book.price %>
</p>


Presenting formatted data is extremely common in any application, be it a price, date/time, name, address, SSN, ISBN, or even a baseball game score. A lot of this information is so simple that it can be stored in a single database column and represented by a primitive type like String, DateTime, or Fixnum in Ruby. Because they are so simple in nature, programmers think all it takes to format them is to wrap a helper method around them in the view. But this approach may backfire, as in the example above.

Alternatively, one can create a value object with a #to_s method that does all the formatting for you. That way, everywhere that needs formatting can ask a book object for it, and does not have to wrap a presenter around it for something as simple as formatting.

class Book < ActiveRecord::Base
  composed_of :isbn, :class_name => 'ISBN'
  composed_of :price, :class_name => 'Money'
end

class ISBN
  def to_s
    ... ...
  end
end

class Money
  include ActionView::Helpers::NumberHelper

  def initialize(price)
    @price = price
  end

  def to_s
    number_to_currency @price
  end
end

class BooksController < ApplicationController
  def show
    @book = Book.find params[:id]
  end
end

<p>
  ISBN: <%= @book.isbn %>
  Price: <%= @book.price %>
</p>


So what's the difference? The primary difference is that every Book object now knows how to format its information correctly. When a book needs to be used in various places, be it on an rhtml page, in a web service request, or in an Excel/CSV report, those places can interrogate the book object directly instead of "first fetch a book object, then wrap it with a presenter".

Some people will yell now, "but aren't you putting presentation logic into your domain model?" Yes, kinda. One of the toughest decisions a developer always has to make is distinguishing between domain logic and non-domain logic. Now if I may ask: if my formatting is happening in multiple places, is it domain logic? I tend to say it is. Also, from experience, while it is extremely rare for me to put such presentation/formatting logic in an Entity Object domain model class (e.g. Book), because such classes are almost always complicated enough with just domain behaviors, having formatting logic in these usually less complicated Value Objects (e.g. Money) makes a lot of sense to me. And in Rails, composed_of provides that convenience almost effortlessly.

The way Rails hands you these ActionView::Helpers is super useful, but every once in a while we must step back and ask ourselves, "how can I use them beyond the tutorial way in my really complicated application?"

Tip #1: You can also have your #to_s methods take an additional parameter to customize the formatting behavior. Take a look at how DATE_FORMATS are used in Time#to_s.

Tip #2: You may not need a new class for the sake of using composed_of every time you have a new formatting requirement, either. Sometimes you can get away with just extending the Ruby primitive types with a custom method/format, for example Price: <%= @book.price.to_s :money %>.
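To illustrate the Money value object idea without pulling in ActionView at all, here is a hypothetical plain-Ruby version whose #to_s does the '$2,000.00' formatting itself (a sketch for non-negative amounts only):

```ruby
class Money
  def initialize(amount)
    @amount = amount
  end

  # Formats 2000 as "$2,000.00": fix the amount to two decimals, then
  # insert thousands separators into the whole-number part
  def to_s
    whole, fraction = format('%.2f', @amount).split('.')
    "$#{whole.gsub(/(\d)(?=(\d{3})+$)/, '\\1,')}.#{fraction}"
  end
end

puts Money.new(2000) # => $2,000.00
```

Because puts calls #to_s, every consumer (view, mailer, CSV export) gets the same formatting for free.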

Tuesday, December 04, 2007

The last "D" in TDD means more than just "Development"

When asked "Do you write tests?", a lot of developers these days will say "of course." However, not everyone can claim to be doing TDD (Test Driven Development) correctly. Test Driven Development says: a developer writes a test that fails first, then writes code to make the test pass, refactors when possible, and repeats. That is most people's TDD rhythm, and for the most part it is fairly easy to follow. But to reach the next level, one has to understand TDD as a tool: TDD means more than just testing your own code. Here are a couple of tips on what the last "D" means:

Discipline

It takes a great deal of discipline even to write a failing test before writing the actual code. Sometimes we write a little pseudo-code here, move a method definition there, or change code elsewhere to see if more new code needs to be written after it, and sooner than you think you are writing the actual implementation of the methods you wanted to test (Test Afterwards Development, anyone?). Other times you write the test but are too anxious to even run it and see it fail. And other times you jump into the actual code immediately when you see your new test fail, even though it is failing for unexpected reasons.

Don't fall into these traps. If anything is true, testing is hard, but it is at the same time rewarding and fun. What's also true is that it will pay off. Write the failing test, draw up the list of tests you will need to write, and satisfy them one by one. Discipline is the cornerstone of becoming a better programmer.

Design

Does it take too long to write a test? Are tests running too slowly? Are your tests difficult to read? Are they so brittle they fail all the time? Hang in there! Ever had the feeling of seeing code in the codebase, written by someone else on your team, that irks the living hell out of your mind? Well, it is time for you to get some of that feedback about your own code. Yay, your code sucks! Your tests are telling you that! Let's address these one by one.

Slow-running tests? You shouldn't be hitting a database or web service in your unit tests, because you can mock/stub them out. Difficult to mock/stub out? There probably is a better design for the classes your tests are hitting. Ever heard of Inversion of Control (or Dependency Injection)? Master them. True TDD masters use them extensively.
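As a hypothetical sketch of what that looks like in practice (all names invented): inject the slow dependency through the constructor, and the unit test hands in a fake that never touches the network:

```ruby
# The real gateway might call a remote weather service; the test never will.
class WeatherReport
  def initialize(gateway)
    @gateway = gateway
  end

  def summary
    @gateway.current_temperature > 30 ? 'hot' : 'mild'
  end
end

# Hand-rolled stub standing in for the slow web service dependency
class FakeGateway
  def initialize(temperature)
    @temperature = temperature
  end

  def current_temperature
    @temperature
  end
end

report = WeatherReport.new(FakeGateway.new(35))
puts report.summary # => hot
```

The test exercises WeatherReport's logic in microseconds, and the design that makes it testable (depending on an injected collaborator) is also the design that decouples it.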

Unreadable tests? Is it because of too many mocks/stubs? Or is it that the code is 500 lines long and doing high-octane 720-double-backflip logic? Either way, you have to learn to like small objects. Check out this blog post of mine.

Something hard to test? Tests too brittle? Perhaps you have encapsulation issues in your class design. If your classes reference 15 other neighbors, of course they are hard to mock/stub. Chances are you have to spend time debugging your tests to find out what's wrong! Heard of the Law of Demeter? Even if you have, take a look at this highly entertaining yet informative post. It might change your perspective a little.

The bottom line is, TDD is a way to guide you to writing good code, but only if you know how to use it as a tool. Now that you know, hopefully you will have a new perspective next time you write a test.

Thursday, November 15, 2007

Seeing Rails Resources Clearly

REST-fulness is probably one of the hot topics you will face when you walk into the Rails world today. One of the terms you will notice when you dig deeper into Rails REST-ful routes is "resources". In Rails, after running the scaffold resource generator, you will get a database table migration, a domain model class, a controller, and 4 CRUD views. This is the skeleton that Rails provides for you to build upon.

However, a lot of Rails developers don't know how to bend this structure when it comes to non-CRUD operations (eg. search for something, reset password, login, various ajax actions etc.)

I have a solution for you: Routes Driven Development - Let your named routes guide you. Here's how.

As an example, suppose you have a Rails app that an admin can create/remove users that access your application. You have a database table named Users, which has a password column.

In your routes.rb, with a single line of:

map.resources :users

You automatically get these academically REST-ful urls:

GET    /users          # => users_url
POST   /users
GET    /users/new      # => new_user_url
GET    /users/:id/edit # => edit_user_url
GET    /users/:id      # => user_url
PUT    /users/:id
DELETE /users/:id


Then, there comes a requirement that says "From the login page, a user who forgot his password can go to a page to see directions on how to retrieve his forgotten password." We are all familiar with this concept, right?

The next thing you know, the developer jumps into the code:

class UsersController < ActionController::Base
  def index...
  def show...
  def new...
  def edit...
  def create...
  def update...
  def destroy...

  def forgot_password
    # new code here
  end
end


And in your routes.rb:

map.resources :users
map.forgot_password '/users/forgot_password', :controller => 'users', :action => 'forgot_password', :method => :get


Oops. And you just made a wrong move in building upon the Rails skeleton. You put the action in the wrong controller.

Most developers assume that a Rails "resource" is synonymous with a database table. This assumption comes from how most people start their Rails applications, as with our Users example. With that in mind, since password is a column rather than a table in the database, it should not have its own controller, and thus no model, no view, etc. Still, we've got to jam the code somewhere, hence the Users controller, the table it belongs to. But this one-table, one-controller assumption is not always true.

Instead, I would keep my users controller free from password related operations and do this:

class PasswordsController < ActionController::Base
  def forgot
    # new code here
  end
end

map.resource :password, :member => { :forgot => :get }


This gives me an url of:

  GET: /passwords/forgot    # => Which gives you forgot_password_url


My named route forgot_password_url tells me I am doing the right thing, because it is extremely readable. Had I not created a new passwords controller and instead put the action in my users controller, using the same technique to define my named route, I would have ended up with the not only incorrect but also nasty forgot_password_user_url(:id), per the naming convention (<action>_<controller>_url). Definitely not what I want to pass to any link_to() or button_to().

Notice that you just created a controller (and its view) with no database table backing it whatsoever. It doesn't even have a model class. But then again, how much more complicated can it be when the customer is asking for a page that shows directions?

By putting actions in the wrong controller, your controller actions will lack cohesiveness (i.e. actions don't operate on the right thing), which inhibits sharing; you will also run into problems sharing views (pages, partials, rjs, helpers), like "why am I rendering a partial that lives in an obscure folder?", etc. In short, a lot of bad code.

Using this routes driven technique to drive your actions and controllers, can you identify which action/controller combination is correct for a 'Login' button?

1) create_session_url # => :controller => 'sessions', :action => 'create', :method => :post
2) login_user_url     # => :controller => 'users', :action => 'login', :method => :post
3) login_url          # => map.login '/login', :controller => 'whatever', :action => 'anything', :method => :post


The true answer is, it doesn't really matter, as long as your controller is responsible for the correct "resource" and the named route is screamingly readable to anyone using it. No matter which controller you put your actions in, you can always fall back to creating a named route for it, as in option (3) above. The bottom line is, Rails "resources" might not be what you and others think they are.

As homework, go read more about Rails resources. Not only can you specify :member, but learn how to use other options like :new, :collection, and nested routes. There is also a singular "resource" method alongside the plural "resources" method. Get familiar with them. The rdoc page is IMHO pretty poorly documented, but hopefully this article gives you a reason to explore it.

Monday, January 01, 2007

Use your test classes to identify smelly code

How can you tell if you have smelly code? Your test classes actually can tell you a lot about it. Here are a few unit test anti-patterns:

Big test class
Hopefully you have one unit test class per class/module/whatever-that-contains-behavior you write code for. When your test class starts becoming unwieldy and difficult to read and understand, your Class Under Test* is doing too much (i.e. God Object). So what counts as difficult to read? It is different for everyone. For me, when a unit test class stretches beyond 4-5 page scrolls, it is starting to become too big. Think small, think manageable.

*By Class Under Test, I mean the class (e.g. class Laptop) that is to be tested by the test class (e.g. class LaptopTest < Test::Unit::TestCase).

Prefix test method names for categorization purposes
When you have to categorize your test methods in your test class by prefixing some text for grouping to improve readability, it is an indication your Class Under Test is doing too much:

class EBayChristmasShopperTest < Test::Unit::TestCase
  def test_browse_fetch_pricing_details_for_item...
  def test_browse_froogle_compare_item_prices...
  def test_browse_amazon_compare_item_prices...
  def test_auction_overbid_item...
  def test_auction_underbid_item...
  def test_payment_paypal_purchase_item...
  def test_payment_paypal_buy_it_now...
  def test_payment_credit_card_purchase_item...
  def test_payment_credit_card_one_click_checkout...

Instead of one EBayChristmasShopper object doing all of this, how about having the Shopper delegate some of these functionalities (browsing, auction, etc.) to other, smaller objects (like ItemBrowser and Auctioneer)? This way you can write more well-intended unit tests for ItemBrowser and Auctioneer, and significantly simplify the EBayChristmasShopper class.

To me, one good way to categorize your unit test methods is to group them by the names of your public methods:

class IPodTest < Test::Unit::TestCase
  def test_prev_skips_to_previous_song...
  def test_prev_and_hold_rewinds_during_playback...
  def test_menu_displays_song_list_during_playback...
  def test_play_starts_playback...
  def test_play_pauses_playback_when_playing...
  def test_play_resumes_playback_when_paused...
  def test_next_skips_to_next_song...
  def test_next_and_hold_fast_forwards_during_playback...


class IPod
  def prev...
  def menu...
  def play...
  def next...
end

This way, each of your public methods has a group of unit tests responsible for its behavior. In addition, it's easy to find out which method is missing tests (psssst....). But most importantly, you can easily see and logically group functionalities together, and refactor them into new classes should your Class Under Test become hideous to look at.

Mocks/Stubs frenzy
All too often, developers write code to make things work the way they want at the expense of complicating a fundamental of object-oriented programming: object-to-object communication. Sure, the code works, but the tests tell you a different story about how you got it to work. Take a look at the following example:

class DriverTest < Test::Unit::TestCase
  def test_drive_vehicle_forward_moves_appropriate_car_parts
    car = Car.new
    car.radio.expects(:off)
    car.clutch.expects(:shift_to_drive)
    car.acceleration_paddle.expects(:push)
    car.steering_wheel.expects(:turn).with(:straight)
    car.ignition.expects(:engine_on)
    car.timing_belt.expects(:run)
    car.muffler.expects(:stinks)
    car.battery.expects(:charging)


    driver = Driver.new(car)
    driver.driving_directions = 'FORWARD'
    driver.drive
  end
end

Did I lose you at the mock call for my car radio? Driving forward, in my mind, needs only half of those moving parts (ignition, clutch, steering wheel, acceleration). The others? I don't care (at least not when I only want to move the car forward as a driver). Sure the code works, but an overdose of mocks and stubs is a good indicator that you are breaking encapsulation and/or coupling your class to too many other objects.

Long test methods
This indicates your Class Under Test requires a lot of set up in order to be useful, or your object is very coupled to the rest of the world (remember, someone once said decoupling is good?), or your Class Under Test is very long and procedural. Remember test_drive_vehicle_forward above?

Reduce the size of your test methods by re-thinking your object-to-object communication paths. Maybe introducing an additional object to do all the dirty set up work would help decouple the Class Under Test from the rest of the world immensely. Maybe this is a common object-coupling design problem that someone has already solved succinctly with a design pattern, and all you need to do is google for it. There are always ways around it.
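One hypothetical way to shrink that set-up, sketched in plain Ruby with invented names: move the dirty construction work into a small builder object, so each test states only the detail it cares about:

```ruby
class Car
  attr_reader :ignition, :clutch

  def initialize(ignition, clutch)
    @ignition = ignition
    @clutch = clutch
  end
end

# Test helper: supplies sensible defaults; tests override only what matters
class CarBuilder
  def initialize
    @ignition = :standard_ignition
    @clutch = :standard_clutch
  end

  def with_clutch(clutch)
    @clutch = clutch
    self
  end

  def build
    Car.new(@ignition, @clutch)
  end
end

# A test that only cares about the clutch stays one line long:
car = CarBuilder.new.with_clutch(:racing_clutch).build
```

When construction details change, only the builder changes, not every test method that ever built a Car.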

Wednesday, September 13, 2006

Ruby constructor dependency injection... or not?

Dependency injection has proven to be something a black-belt unit tester must know about if you are serious about unit testing. If you have written some sort of unit tests, would you be jealous if I told you that for some of us, running all of our unit tests takes under a second? In C# and Java, actively practising Dependency Injection makes mocking and stubbing out dependencies much easier; tests become easy to write, and they run fast because they do not make time-consuming calls. In fact, constructor injection is one of my favourite design techniques:

public class Laptop
{
    private IBattery battery;

    public string PowerStatus = "Off";

    public Laptop(IBattery battery)
    {
        this.battery = battery;
    }

    public void PowerOn()
    {
        if (battery.Meter >= 10)
        {
            // Booting Vista...
            PowerStatus = "On";
        }
    }
}



Then to unit test Laptop, you could use NMock like such:

[Test]
public void PowerOnFailsWhenBatteryIsTooLow()
{
    mocks = new Mockery();
    mockBattery = mocks.NewMock<IBattery>();
    Expect.Once.On(mockBattery).GetProperty("Meter").Will(Return.Value(9));
    Laptop laptop = new Laptop(mockBattery);

    laptop.PowerOn();

    Assert.AreEqual("Off", laptop.PowerStatus);
    mocks.VerifyAllExpectationsHaveBeenMet();
}



It may not be worth it to mock out Battery, but think about a lengthy web service class.

That's all true in C# and Java. Now in Ruby, I don't even need to constructor inject my Battery instance to unit test my Laptop class. I can just as easily unit test my Laptop class without having to inject my mock Battery:

class Laptop
  def initialize
    @battery = Battery.new
    @power_status = :off
  end

  def power_on
    if @battery.meter >= 10
      # Booting Mac OS X
      @power_status = :on
    end
  end

  def power_status
    @power_status
  end
end

class LaptopTest < Test::Unit::TestCase
  def test_power_on_fails_when_battery_too_low
    Battery.any_instance.expects(:meter).returns(9)
    laptop = Laptop.new
    laptop.power_on
    assert_equal :off, laptop.power_status
  end
end


Basically I am mocking using Ruby's Stubba/Mocha, but I don't even need to write an extra constructor to inject the Class Under Test's dependencies. No IBattery interface, nothing! This is some cool trickery from programming in a dynamic language like Ruby, and I am discovering these things every day with my colleagues!

I know you are going to say "well, I can use reflection to do the same thing...", and I will tell you: sure, you try to do it in a readable manner and with one line of code. I didn't say you can't do it in C# or Java; I am just saying this is how I can do it with one line of highly readable Ruby. Happy programming.

Friday, August 18, 2006

The "Not-Enough-Objects" anti-pattern

Definition:
This is a pattern that every single programmer in the world has used at one point or another in their programming career. Classes become unnecessarily bloated and multi-talented in their ability to single-handedly accomplish almost anything and everything an application requires.

Root Causes:
Being too lazy to create new classes, shoving new but unaligned behaviors into existing objects; ignorance of good object communication and interaction; lack of concern for each object's reason for existence; failure to realize the importance of programming to interfaces; putting off code refactoring forever (the "kitchen sink" syndrome 1); embracing code babies2 as if they are the next American Idols; failure to effectively organize, categorize, and classify application functionality.

UML:
eg.

Customer
+ OrderSoda() ~20 LOC
+ Sleep() ~100 LOC
+ RPCCallToMom() ~250 LOC
+ PlayTVGames() ~500 LOC
+ WebServiceCallToWalkMyDog() ~3,000 LOC
+ RepeatRoutineEveryDay() ~275 LOC
.
.
.
(many more methods)

Symptoms:
When it appears in your code, it makes your code demoralizingly hard to read or comprehend; it becomes extremely difficult to touch the code without introducing new errors; the code becomes very brittle; the code starts to look like POOP (Procedural Object-Oriented Programming); it exponentially increases your programmers' after-hours caffeine consumption to life-threatening levels, along with their hostility toward others (e.g. "I will kill the xxx if I ever find out who wrote this").

How to Avoid:
Avoid 1,000-line classes and 200-line methods; avoid class behaviors that are mutually exclusive; avoid class states that are mutually exclusive; avoid giving classes multiple responsibilities in favor of delegating to new objects that the classes use; avoid too many if-else statements in favor of object hierarchies (see Strategy); effectively categorize and organize your classes/namespaces/files/folder structures in a meaningful and human-friendly fashion; encourage collective code ownership; frequently communicate the intent of each object with team members; feverishly unit test the heck out of each class and object using mocks and stubs; use TDD as test-driven design, thinking about how other objects will use the object you are unit testing; pair-program with someone who reminds you that code quality is of utmost importance.

Short List of Skills Required to Avoid:
1. Programming to abstraction and not implementation
2. Object encapsulation
3. "Tell, Don't Ask" and the Law (or guideline) of Demeter (link), go back to #2 if you do not understand
4. Short methods/classes with well-intended names in favor of plethora of comments
5. Unit testing
6. All your unit tests run in sub-10 secs. because you understand Mock vs. Stub (link)
7. Dependency Injection (link)
8. Decoupling and cohesiveness (related to object-to-object communications)
9. Knowledge of design patterns of all types
10. Knowledge of anti-patterns

After-Thought:
Have we understood why good programmers are rare?

(1) kitchen sink syndrome: if the sink is clean, it stays clean. Once dirty dishes go in and no one bothers cleaning up, dishes start piling up and never get cleaned.
(2) code babies: a snippet or chunk of code, or application functionality, given birth by its genetically related programmer, that no one else on the planet understands or dares to understand.

Wednesday, July 19, 2006

Applying PicoContainer.NET with Presentation Patterns - Part II

(Part I)

Let's think about what Pico is good at in a practical sense. I think Pico is great at wiring dependencies together. That means it is good at wiring dependencies from your Presentation Model/Presenter layer back, because those are all classes you have complete control over. For your Views, at least if you develop in Visual Studio .NET, the restriction that they have default constructors conflicts with the way Pico works. There are ways around it (setter injection, as discussed in Part I), but they are ugly.

But what if I don't register my Views into my Pico container like other classes? Then the question becomes: how can my View get a reference to its dependent Presentation Model/Presenter?

Before I show you my solution, I have to say one more thing. The method InitializeComponent() refers to controls as components for a reason: every control in .NET is a component. What is a component? It is something you put in a container, and every container holds many components. So yes, these controls/widgets are all placed in a container (not Pico) by the .NET control hierarchy framework. What does each control being a component in this .NET container hierarchy enable? It gives a component access to functionality provided by other components in the same container. Let me rephrase: each View, being a component, can have access to other classes like a Presentation Model/Presenter, so long as they are all in the same .NET container hierarchy. Didn't I just say that this container framework is not Pico? Yes, I did, but that doesn't mean we must use only one or the other. We can use both together, each to its strengths, to achieve the best of both worlds. Let me explain.

We have already said that using a Pico container to host everything from the Presentation Model layer back is a good thing. So here is some example code on how to do this:

        private static IMutablePicoContainer createMutablePicoContainer()

        {

            IMutablePicoContainer pico = new DefaultPicoContainer();

            pico.RegisterComponentImplementation(typeof(IPageOnePM), typeof(PageOnePM));

            pico.RegisterComponentImplementation(typeof(IPageTwoPM), typeof(PageTwoPM));

            pico.RegisterComponentImplementation(typeof(IWebService), typeof(WebService));

            return pico;

        }



In order to allow the Views to have access to other components (Presentation Model/Presenter), we have to create a .NET container. It is just a class that subclasses System.ComponentModel.Container. I am going to call this an ApplicationContainer to avoid confusion with our Pico container.

        internal class ApplicationContainer : System.ComponentModel.Container

        {

            // ...

        }



To put our Views into this ApplicationContainer, instantiate an instance of it, and add the Form object that you will start your application with, like this:

        [STAThread]

        static void Main()

        {

            ApplicationContainer applicationContainer = new ApplicationContainer();

 

            MainForm form = new MainForm();

            applicationContainer.Add(form);

 

            Application.Run(form);

        }



From our Views, the method they will use to get access to other components is GetService(Type serviceType). When this method is called from within a View that has been put in a System.ComponentModel.Container, by default it asks the containing container to traverse all registered components and see which one can provide this "service". If such a component is found, the container returns it, and the requesting component now has a reference to that object. How does the container traverse its registered components and decide what to hand back? Interestingly, a System.ComponentModel.Container also has its own GetService() method to do just that. Now, since we have our own subclass ApplicationContainer, we can override its GetService() so that when our ApplicationContainer object receives a service request from any of its components, it also looks to see if Pico has what the requesting component wants. More concretely, when a View calls GetService(typeof(MyPresentationModel)) to get its Presentation Model/Presenter dependency, ApplicationContainer will ask the already fully registered Pico container to return an instance of that class if it is found, like such:

        internal class ApplicationContainer : System.ComponentModel.Container

        {

            private IMutablePicoContainer _pico = createMutablePicoContainer();

 

            protected override object GetService(Type service)

            {

                object instance = _pico.GetComponentInstanceOfType(service);

 

                if (instance != null)

                {

                    return instance;

                }

 

                return base.GetService(service);

            }

        }



To sum it all up, you need to do the following to get things to work:
1. Create an ApplicationContainer class subclassing System.ComponentModel.Container.
2. Add your starting Form into the ApplicationContainer instance, prior to starting it.
3. Set up a Pico container within the ApplicationContainer instance.
4. Register all dependencies that your Presentation Model/Presenter classes will need in your Pico container. You do not need to register your Views.
5. Override the ApplicationContainer's GetService() method, making it look further into its already set-up Pico container for anything it should return.

Now from your Views, you can gain access to their dependency, in this case a Presentation Model/Presenter, by calling:

        private void PageOneView_Load(object sender, System.EventArgs e)

        {

            // Add this View to ApplicationContainer. Otherwise we have to instantiate

            // each and every view in public static void Main() and do the adding there.

            base.FindForm().Container.Add(this);

 

            // This GetService() call will now find what we need in ApplicationContainer.

            IPageOnePM service = (IPageOnePM)base.GetService(typeof(IPageOnePM));

        }



Then modify your Presentation Model's constructor to take an additional parameter for the web service class. Your class can now use the web service functionality, while in unit testing you can mock/stub it out!

    public interface IWebService

    {

    }

 

    public class WebService : IWebService

    {

    }

 

    public class PageOnePM : PresentationModel, IPageOnePM

    {

        private IWebService webservice;

 

        public PageOnePM(IWebService webservice)

        {

            this.webservice = webservice;

        }

    }

 

    // ApplicationContainer class

    private static IMutablePicoContainer createMutablePicoContainer()

    {

        IMutablePicoContainer pico = new DefaultPicoContainer();

        // ...

        pico.RegisterComponentImplementation(typeof(IWebService), typeof(WebService));

        // ...

        return pico;

    }



So now, we have eliminated the problem of a child user control not knowing how to get its Presentation Model/Presenter, because a user control is also, well, a component in the same ApplicationContainer. We have also solved the ugly problem of setter-injecting child user controls' dependencies. And we did not modify a single View default constructor! We can now happily use the Visual Studio .NET IDE for WYSIWYG design while maintaining good OO design, plus Inversion of Control for our dependencies.

After-thought: Though the title of these two posts says Presentation Patterns, in my mind they are geared more towards Presentation Model, due to what I see as the difference in directional references between Presentation Model and Model-View-Presenter. I mentioned this briefly in an earlier post here.

By the way, note that each page can now have access to multiple Presentation Model objects, instead of the 1-to-1 relationship a page used to have with its Presentation Model. So one can do something similar to Ruby on Rails, where each View can make calls to multiple Controllers which operate on various Models to complete the desired action! This makes code-sharing between Presentation Models much easier, and each Presentation Model can be named not after its page but by application functionality!

Applying PicoContainer.NET with Presentation Patterns - Part I

There are a couple of favourite presentation-layer design patterns that I have been consistently using to build .NET applications: Presentation Model and Model-View-Presenter. Both are extremely handy when it comes to making the code-behind of your View more testable by delegating its responsibilities to another layer of code. From that layer, you can start mixing in various flavors of dependency injection, stubs, and mocks to mount an all-out unit-testing assault on your code.

In many cases, on a per View basis (like a page or user control), there is a Presentation Model class, or a Presenter class, sitting behind it ready to receive a call from its corresponding View, then execute the behavior called upon.

PicoContainer.NET (Pico from here on) encourages more decoupled object-oriented class design by helping you manage your classes' dependencies. As a quick example, suppose you have a class iPod (you know what it is, right?) and a class Battery. Obviously an iPod depends on a battery. However, the iPod class creates its own Battery instance at construction time and cannot take any other battery type (to the dismay of iPod users). Now your iPod stops working and you have to figure out why. How do you know whether the problem lies in the iPod, or in its battery? You would have to be extremely creative to isolate the fault. This is why good code minimizes dependencies. Suppose, however, your iPod receives a Battery instance through its constructor. Then in the unit tests of your iPod class, you can mock/stub the battery out and start exclusively interrogating and verifying the behaviors of the iPod class itself. The iPod can now be injected with a mock Battery instance, which makes the class more testable, and its tests more "unit" in scope, not touching other parts of your code. You can then easily write more automated unit tests that run in milliseconds.

The question now is: who is responsible for passing a Battery instance into the iPod? That's where Pico shines. If your iPod has a constructor that takes in the Battery dependency, you register both your iPod and Battery classes into a Pico container. When you then ask the container for an iPod instance, the Battery instance (since it is also registered) is instantiated automagically, and you can start using your iPod object without hard-coding the iPod class to instantiate its own Battery.
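To make this concrete, here is a minimal sketch using the same PicoContainer.NET registration API shown later in this post. The iPod, Battery, and IBattery classes are the illustrative examples from above, not a real API:

```
public interface IBattery { }

public class Battery : IBattery { }

public class iPod
{
    private IBattery battery;

    // The dependency comes in through the constructor instead of
    // being created internally with "new Battery()".
    public iPod(IBattery battery)
    {
        this.battery = battery;
    }
}

// Registration and resolution:
IMutablePicoContainer pico = new DefaultPicoContainer();
pico.RegisterComponentImplementation(typeof(IBattery), typeof(Battery));
pico.RegisterComponentImplementation(typeof(iPod), typeof(iPod));

// Pico sees that iPod's constructor needs an IBattery, instantiates
// the registered Battery, and injects it for you.
iPod player = (iPod)pico.GetComponentInstanceOfType(typeof(iPod));
```

In a unit test, you would instead construct the iPod directly with `new iPod(mockBattery)` and never touch the container at all.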

Because Pico is such a great tool, I have used it in various .NET projects of mine. Since Pico uses a registration scheme, and whatever you put into a Pico container gets its dependency-wiring functionality for free, many things tend to be put into it. After all, Pico encourages good design, right? As a result, Views, Presentation Models, Presenters, etc., all get registered into the Pico container. The Pico container is set up in public static void Main(), and when you call Application.Run(form) you pass it into your form object. From there, all of your Views can programmatically ask for their dependencies. Life is good.

Why are Views being put into the container? Because every View needs a delegating Presentation Model or Presenter to handle its behavior; that is, every View "depends" on a Presentation Model or Presenter. As a result, they all get registered into Pico in order for the form to navigate the user to the correct View.

Life is good - until stuff happens. Since constructor injection is the preferred way to inject dependencies, if your View "depends" on something, then your View needs a constructor taking in its dependencies before Pico can start wiring them on your behalf. Every .NET programmer knows that every View (whether a Form or a User Control) has a default parameterless constructor containing a default call to InitializeComponent(). The Visual Studio .NET IDE uses this constructor to support WYSIWYG editing at design time. Tampering with or removing this constructor and/or the InitializeComponent() call will get your IDE into trouble supporting your View at design time.

A second problem: consider a User Control. It is also a View, albeit one used and instantiated by its parent View (maybe another User Control). Since the child user control instance is created by the auto-generated code in the parent View's InitializeComponent() call, you cannot modify the child user control's default constructor to take in its "dependency" Presentation Model/Presenter and then hope that its parent Form can somehow inject the child control's dependencies inside its InitializeComponent() method. Even if you could modify this auto-generated code, your IDE design-time support would be in jeopardy. Because of this, some Pico programmers buck the constructor-injection coding consistency: they pass the child user control's dependencies into the parent View's constructor and use setter injection to inject them after the parent Form's InitializeComponent() call. That way, at least, you can leave the child user control's default constructor alone and continue to use VS's strong IDE support, while still managing your child control's dependencies.

What to do? Now comes Part II - my attempt to get the best of all worlds: use Pico to manage dependencies, and keep VS.NET IDE WYSIWYG support.

Thursday, March 16, 2006

Pragmatic NAnt scripting

Why am I talking about NAnt when the rest of the world is in love with Ruby and Rake? Well, I still think NAnt has its value. Rake is still pretty raw (it doesn't even have an NUnit task yet), and until it has matured there is a huge group of developers out there not wanting to spend a great deal of time learning Ruby/Rake. Plus, most companies that have already adopted NAnt are stuck with their NAnt code. This article provides some insights on how to give such companies business value by making NAnt scripts more maintainable.

Based on my experiences with NAnt, I have concluded that the following are true:

  1. NAnt scripts are very hard to maintain, for reasons such as:

    • Target dependencies are all intermingled into one big unmanageable web.
    • There is no consistency across NAnt scripts; there is no coding convention for them.
  2. The scripts have to handle a lot if you think about it:
    • Pre-build (like build artifact directory creation)
    • The CI (Continuous Integration) cycle: GetLatest, Compile, UnitTests, OtherTests.
    • Post-build (like package, stage, distribute, deploy, blah...)
  3. Build scripts are nasty and no one wants to inherit them. That's why names like "build monkey" (and even more provocative ones) get attached to whoever initially starts the script, and tend to stick until that person leaves the project. It is a thankless job.
  4. Build scripts are, nevertheless, crucial to any project. A good set of build scripts helps developers write higher-quality code, BAs verify story functionality quicker, QA mitigate environment nightmares, and the deployment team streamline the deployment process.

And here are some tips that I have found useful for extending the maintainability and lifespan of your build scripts.

Have multiple build script files
I would break down the one monolithic projectname.build file I see in many projects into several smaller build files named after their build function. For example, the second file below is called Sesame.build. It has one and only one purpose: to build the project and test it. I am putting these two build functions together in one build file because the targets involved in testing the build output are usually small, and the two usually go hand in hand; conveniently co-locating them in one file is, I think, better than creating a small and isolated test.build file. Other build disciplines, such as package & deploy, can be a big deal on their own, so if complexity warrants I will give them a separate file. The only encapsulation we can really apply to build scripts is separating them into files.

<?xml version="1.0" ?>
<project name="cruise" default="all">

    <target name="all" depends="get, build, tag" description="Target executed by CCNet." />

    <target name="get" depends="get.get_latest" description="Gets the latest source code from Subversion." />
    <target name="build" depends="build.build" description="Compile and build the source code." />
    <target name="tag" depends="tag.tag" description="Tag the successful build by the naming convention tags\\CRUISE-B999" />

    <target name="get.get_latest">
        
    </target>

    <target name="build.build">
        <nant buildfile="Sesame.build" target="all" />
    </target>

    <target name="tag.tag">
        
    </target>

</project>


<?xml version="1.0" ?>
<project name="Sesame" default="all">

    <target name="all" depends="build, test" description="Compile, build, and test the Sesame project." />

    <target name="build" depends="build.compile, build.database" description="Compiles the .NET source code and setup local database instance." />
    <target name="test" depends="test.unit_test, test.other_test" description="Runs unit tests and functional tests." />

    <target name="build.compile">
            ...
    </target>

    <target name="build.database">
            ...
    </target>

    <target name="test.unit_test">
            ...
    </target>

    <target name="test.other_test">
            ...
    </target>

</project>


NAnt target categorizations and naming convention
The ultimate sin of unmaintainable build scripts is more or less attributable to the over-proliferation of target dependencies. When targets start having a complex web of dependencies, build scripts take an enormous amount of courage and time to repair.

By breaking all targets in a single NAnt build file down into three categories, and through coding consistency and naming conventions, build scripts can last a very long time. I find the following categorization of the NAnt targets in your build scripts useful.

<?xml version="1.0" ?>
<project name="Sesame" default="all">

    <!-- Level 1 -->
    <target name="all" depends="build, test" description="Compile, build, and test the Sesame project." />

    <!-- Level 2 -->
    <target name="build" depends="build.compile, build.database" description="Compiles the .NET source code and setup local database instance." />
    <target name="test" depends="test.unit_test, test.other_test" description="Runs unit tests and functional tests." />

    <!-- Level 3 -->
    <target name="build.compile">
        <!-- Compile using the build.solution_configuration property value... -->
    </target>

    <target name="build.database">
        <!-- Rebuild database using the build.database_server property value... -->
    </target>

    <target name="test.unit_test">
            
    </target>

    <target name="test.other_test">
            
    </target>

</project>


Points of interests:

  • Level 1: These are targets that orchestrate the order of execution of various Level 2 targets, never Level 3 targets. They only contain depends, and never have a target body. They are the common entry points into the build function the script file represents (eg. target "all" in cruise.build will be called by CruiseControl.NET to kick off the CI build process). They must have descriptions. I prefer names to be underscore-delimited.

  • Level 2: These are targets that group Level 3 targets together into a cohesive unit of work. For example, a "clean" target might do a few things altogether to clean a build: build artifact directories, build results directories, VS.NET bin/obj/VSWebCache, etc. Again, they never have a target body. These targets also have descriptions. I again prefer their names underscored.

  • Level 3: These targets will *never* contain depends. They will *only* do one very specific piece of work. In addition, their names should be namespaced by a Level 2 target name, using a period, followed by their function, so that they are easily distinguishable from other targets. This helps newcomers recognize them and treat them differently. They never have descriptions (think of these as private methods in a class, each doing one very specific thing for you).
The namespacing naming convention can be expanded to properties as well.

Use properties to consolidate paths
Have a file (eg. common_paths.build) that extensively uses NAnt properties to consolidate your source tree's folder structure. This will save a lot of code duplication in your build scripts in the long run if you intend to keep them for a while.



<?xml version="1.0" ?>
<project name="common_paths.nant" default="all">
    <property name="trunk.dir" value="." />

    <property name="build_output.dir" value="${trunk.dir}\build_output" />
    <property name="build_results.dir" value="${trunk.dir}\build_results" />
    <property name="source.dir" value="${trunk.dir}\source" />
    <property name="tools.dir" value="${trunk.dir}\tools" />

    <property name="dotnet.dir" value="${source.dir}\dotnet" />
    <property name="dts.dir" value="${source.dir}\dts" />
    <property name="sql.dir" value="${source.dir}\sql" />

    <property name="nunit.dir" value="${tools.dir}\nunit" />
    <property name="nant.dir" value="${tools.dir}\nant" />
    <property name="nantcontrib.dir" value="${tools.dir}\nantcontrib" />

</project>


Use the target description attribute
NAnt provides a -projecthelp command line switch to list all of the targets in a given build file. When you give targets a description, those targets get first-class recognition in the listing as "Main Targets":



Combining this tip with the naming convention for Level 3 targets can be a very powerful technique for improving the readability of your build scripts. As a bonus tip, consider also implementing a target named "help" to display this -projecthelp target list.
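Such a "help" target can simply shell out to NAnt itself with the -projecthelp switch. A sketch, reusing the ${nant.dir} property from common_paths.build above; the NAnt.exe location and the buildfile name are assumptions that will vary per project:

```xml
<target name="help" description="Lists the main targets of this build file.">
    <exec program="${nant.dir}\NAnt.exe">
        <arg value="-projecthelp" />
        <arg value="-buildfile:Sesame.build" />
    </exec>
</target>
```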

Use depends or call, but not both
If you start using both, you will impair the readability of your build scripts. They more or less do the same thing anyway. I would pick depends over call, because newcomers to NAnt are much more likely to learn about depends before call.
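To illustrate, here is the same "test" target from the Sesame example written both ways; I would standardize on the first form:

```xml
<!-- Preferred: declare the ordering with depends. -->
<target name="test" depends="test.unit_test, test.other_test" />

<!-- Avoid: the same ordering expressed with call. -->
<target name="test">
    <call target="test.unit_test" />
    <call target="test.other_test" />
</target>
```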

Know your property inheritance
NAnt's property inheritance is a convenience that is commonly overlooked and under-valued. Many people choose the <call> task to preserve locally declared property values, rather than experimenting with the various kinds of property inheritance. I am publishing my findings here.

<?xml version="1.0" ?>
<project name="example" default="example">

    <include buildfile="common_paths.nant" unless="${property::exists('trunk.dir')}" />

    <target name="example" depends="calling_script.properties">
        <nant buildfile="example_two.build" inheritall="true">
            <properties>
                <property name="solution.configuration" value="DEBUG" />
                <property name="build_output.dir" value="c:\modified_build_output" />
            </properties>
        </nant>
    </target>

    <target name="calling_script.properties">
        <property name="calling_script.depends_inherited" value="Inherited from depends clause." />        
    </target>

</project>


<?xml version="1.0" ?>
<project name="example_two">

    <include buildfile="common_paths.nant" unless="${property::exists('trunk.dir')}" />

    <echo message="1) Explicitly inheriting property from nant task tag: solution.configuration='${solution.configuration}'" />
    <echo message="2) Explicitly overriding property in calling script's nant task tag: build_output.dir='${build_output.dir}'" />
    <echo message="3) Implicitly inheriting property from calling depends clause: calling_script.depends_inherited='${calling_script.depends_inherited}'" />

    <echo message="Is (1) readonly? ${property::is-readonly('solution.configuration')}" />
    <echo message="Is (2) readonly? ${property::is-readonly('build_output.dir')}" />
    <echo message="Is (3) readonly? ${property::is-readonly('calling_script.depends_inherited')}" />

</project>




1 & 2) Inherit properties through the <nant> task
You can call the <nant> task and include a <properties> set within its tags, setting the inheritall attribute to true, to pass those properties along to the build script being invoked. In addition, as shown in (2), you can override the already-loaded property "build_output.dir" in the properties section before passing it to the callee build script.

If you use this style of property passing, it is a very good idea to mark the properties readonly="true", so that your customization of script behavior cannot be reset by the callee script.
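For example, a variation of the <nant> call above, using the readonly attribute that NAnt's <property> element supports:

```xml
<nant buildfile="example_two.build" inheritall="true">
    <properties>
        <!-- readonly="true" prevents the callee script from overwriting these. -->
        <property name="solution.configuration" value="DEBUG" readonly="true" />
        <property name="build_output.dir" value="c:\modified_build_output" readonly="true" />
    </properties>
</nant>
```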

3) It is important to know that properties declared in an earlier target of a depends clause can be used downstream in later depends targets, even inside another callee build script. This technique is useful when you do not want to load all properties for all targets up front. (The side effect is that the last target executed has access to every property declared in every previously executed target, but I am okay with that; I run into property nightmares far less often than build-script nightmares.) The following readable script can therefore be achieved:

<?xml version="1.0" ?>
<project name="Sesame" default="all">

    <target name="all" depends="build, test" description="Compile, build, and test the Sesame project." />
    
    <target name="build" depends="build.properties, build.compile, build.database" description="Compiles the .NET source code and setup local database instance." />

    <target name="build.properties">
        <property name="build.solution_configuration" value="DEBUG" unless="${property::exists('build.solution_configuration')}" />
        <property name="build.database_server" value="localhost" unless="${property::exists('build.database_server')}" />
    </target>

    <target name="build.compile">
        <!-- Compile using the build.solution_configuration property value... -->
    </target>

    <target name="build.database">
        <!-- Rebuild database using the build.database_server property value... -->
    </target>

</project>

I am sure there are lots of other tips as well, but that's it for this post. Comments and feedback welcome.