Dealing With Technical Debt

The Rails project I’m talking about is over three years old and has seen
commits from 27 developers in that period. These developers were
co-workers, freelancers, off-shore developers and designers with different
levels of expertise.

Technical Debt Inventory

Needless to say, like most projects of this size and age, this one has plenty
of technical debt. Let’s make an inventory.

The test suite takes approximately eighty minutes to run. These are all
RSpec tests, including features. Luckily we can split the entire suite up
into smaller parts using Travis, but even then the entire thing takes about
forty minutes.

The main cause of these slow specs is a lack of understanding about how to
write good tests. For example, to test whether search and pagination work,
someone thought it fine to create fifty ActiveRecord objects in a before
block.
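
One way to keep such a spec cheap, sketched here in plain Ruby rather than RSpec and ActiveRecord: make the page size configurable, so that three records are enough to cross a page boundary. The Paginator class and its per_page option are hypothetical and only illustrate the idea; they are not taken from this project.

```ruby
# Hypothetical stand-in for a paginated search; not code from the project.
class Paginator
  def initialize(records, per_page:)
    @records  = records
    @per_page = per_page
  end

  # Returns the records on the given 1-based page, or [] past the last page.
  def page(number)
    @records.each_slice(@per_page).to_a.fetch(number - 1, [])
  end

  def total_pages
    (@records.size / @per_page.to_f).ceil
  end
end

# Three in-memory records with a page size of two cross a page boundary,
# which is all a pagination test really needs.
records   = %w[alpha beta gamma]
paginator = Paginator.new(records, per_page: 2)

p paginator.page(1)       # the full first page
p paginator.page(2)       # the remainder
p paginator.total_pages
```

In a real spec the same trick would mean overriding the page size for the test and creating a handful of records instead of fifty.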

These big offenders are easily remedied. Others are a bit harder, as
some specs require an elaborate tree of models to test functionality.

This project has no cucumbers to discuss and review features with
Product Owners. Cucumbers would have been a great way to discuss features
with both product owners and team members. But at the start of the project
Cucumber was deemed too much hassle and RSpec feature specs were used
instead.

The main issue now is that the line between feature specs and unit specs
is a thin one and people often mistake one for the other. This is probably
one of the reasons so many specs create many models and make the entire
suite slow.

There are parts of the code that are not well tested – or not at all. This
is a dangerous one. Because nobody monitored test coverage or did proper
code reviews from the start, some patches of code are poorly tested and
others are not tested at all.

Various factors contributed to this, but a lack of pair programming and
code review, combined with pressure to finish a product, are the main ones.

Lots of smells and lots of noise. All the points above indicate there
is a certain amount of technical debt in this project. Code smells are
another indicator. To name a few I’ve seen over the past few months:

  • Long methods
  • Conditional Complexity
  • Data Clumps
  • Alternative Classes With Different Interfaces
  • Indecent Exposure
  • Uncommunicative Names
  • Divergent Change
  • Shotgun Surgery
  • Lazy Classes
  • Inappropriate Intimacy
  • Train Wrecks (or Message Chains)
  • Feature Envy

All of these found their way into the project, sneaking in despite our best
efforts to pair program and code review almost religiously.

Why does this matter?

The product we ship works. Although the specs take a long time to run, and
there are some code smells in there, the suite is actually green and we
have confidence the product works as expected.

So why complain about technical debt? In my opinion as a software
craftsman, software should

  1. work
  2. be clean and readable
  3. allow for change
  4. be well tested

Well, the code works. But whether it is clean, readable, easy to change and
well tested is debatable.

Responding to change is still possible, but it’s a bit more painful than
it should be. There are some untested patches of code that we would
really like to see tested. And of course, code should be readable and
easy to understand. We don’t always have these things and it sometimes
holds back progress.

So, you have some Technical Debt on your hands

Recently two new developers joined this project and they immediately asked
the right question:

What can we do to make this project better?

In my opinion, as a software developer, this is the one magic question.
What can you do, right now, to make this project better?

Besides the fact that you ask yourself what you yourself can do, there’s
another important component in that question: better.

It’s not about fixing all the things. It’s about improving the things you
touch just a tiny bit. Over the course of a few weeks or months these small
improvements start to add up and really make a difference.

It’s often called the Boy Scout Mentality. Leave things behind in a better
state than you found them in.

So, although the project works fine, we as developers are now constantly
improving the code and specs we touch.

Eliminating Technical Debt

Just like 100% Test Coverage, Low Technical Debt is not an end goal, but
an ongoing mission.

Your application will always have some form of technical debt. Even if you
have the best programmers in the world, it will happen.

What you can do is focus on minimizing technical debt.

The team currently working on this project has formulated a few basic
guidelines to help us bring down its technical debt. We are not going to do
it overnight, but in a few months’ time we will have made quite a dent.

Guidelines

The overall objective is this:

Whatever you do, leave it in a better shape than you found it in. This
goes for code, documentation, specs, whatever.

When implementing features or changes, we use a few simple steps to help us
along:

  • Is this code I’m about to change well tested?
    If not, fix it now. Update those specs to be more efficient, handle
    the edge cases you see popping up right now. Write cucumbers.
  • With your improved specs green, refactor the existing code
    Your new spec suite is fast and elegant. Use it to refactor the
    existing codebase to make change easier.

Congratulations, you have just made your life much easier. Note that you have
not written a single line of code yet for your new feature or change. But the
cleaning up you just did will greatly reduce the time it takes to write this
feature.

  • Write specs for your change or feature
    This is quite easy now, as it should add nicely to the changes you made
    in the first step.
  • Implement your feature or change
    Because you refactored most of the code smells out in the second step, you
    should be able to easily change your code.

These are four easy steps to minimize technical debt in your project.

You should not wait until you hit the technical debt limit of your project.
Minimizing technical debt is just as important as writing tests. It should
be a team priority from Day 1.

Minimizing technical debt is not all that different from doing proper
test driven development. It requires rigorous discipline and skill. And just
like with tests, you will not handle that technical debt after this crunch
period.

Next time you open a project, ask yourself, “How can I leave this project
in a better state than I found it?”

Backup PostgreSQL from a Rails project to Amazon S3

Creating a backup of a (PostgreSQL) database in your Rails application should be easy. Then again, playing Eddie Van Halen’s Eruption looks easy too. It took me about a day and a half to sort this out and get it up and running.

So I figured, why not share it with the world? It might be helpful to someone.

Creating a location, credentials and permission on Amazon

The bucket

Our backup will be uploaded to an Amazon S3 bucket named my-app.backups. This bucket needs to be created first; for that, just follow the AWS documentation.

Lifecycle

It’s OK for backups older than 30 days to be removed (at least in our case). Instead of managing this programmatically, you can define a rule for your bucket that takes care of cleaning up old backups. In the Amazon S3 console, select the bucket, choose Properties and open Lifecycle. Here you add a rule to have files of a specific age cleaned up for you.
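
If you would rather keep the rule in a file than click it together in the console, the same 30-day expiration can be written as a lifecycle configuration document. The sketch below only builds and prints that JSON; the rule ID is a made-up name, and applying the document to the bucket (for example with the AWS command line tools) is left out because we did not do it that way ourselves.

```ruby
require 'json'

# Assumed values: a hypothetical rule ID and the 30-day retention from the text.
lifecycle = {
  'Rules' => [
    {
      'ID'         => 'expire-old-backups',  # hypothetical rule name
      'Status'     => 'Enabled',
      'Filter'     => { 'Prefix' => '' },    # empty prefix: every object in the bucket
      'Expiration' => { 'Days' => 30 }       # delete objects 30 days after creation
    }
  ]
}

puts JSON.pretty_generate(lifecycle)
```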

IAM

We create a separate user system.my-app.backup for performing the backup. We also define a group Backup, add the user to it and give this group access to the bucket. Creating a user and a group is very straightforward with the IAM Management Console, but keep the following in mind:

  • Credentials for a user are only supplied once, so make sure you store them in a safe place.
  • The simplest way to define permissions is by using the Policy Generator, but it still took me a few hours to get this working, because I misunderstood the format to use when defining a Resource (hence the /*, which grants the group access to the contents of the bucket). In the end, I got this (working) permission:

    {
      "Version": "2012-10-17", 
      "Statement": [
        {
          "Sid": ...,
          "Effect": "Allow",
          "Action": [
            "s3:*"
          ],
          "Resource": [
            "arn:aws:s3:::my-app.backups/*"
          ]
        }
      ]
    }
    

The Amazon stuff is all set up. Before we switch to our code, a quick word about where to store credentials.

Location of credentials

We like to keep credentials out of our codebase. Instead we define environment variables and use these in our application code (see below). The credentials for our Amazon user are stored in /etc/environment:

export S3_BACKUP_KEY=*******  
export S3_BACKUP_SECRET=*******

Maybe there is a better place to define environment variables, but this works for us.

Coding the backup

We use SettingsLogic for maintaining application settings, such as user credentials, that may change per environment, but other solutions work just as well. SettingsLogic uses config/application.yml and has a key for every environment. Since we can also use ERB in YAML, we can define credentials for S3 that use the key and secret we defined earlier as environment variables:

backup:
  bucket: 'my-app.backups'
  connection_settings:
    aws_access_key_id: <%= ENV['S3_BACKUP_KEY'] %>
    aws_secret_access_key: <%= ENV['S3_BACKUP_SECRET'] %>
    region: 'eu-west-1'

As you can see, we also defined other S3 settings, to keep them all in one place.

When you start a Rails console and enter Settings.backup.connection_settings.aws_access_key_id you will see the key you defined in your environment file.
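
To see what happens under the hood without starting Rails, here is a minimal sketch of the two steps: ERB renders the template first, then YAML parses the result. The key and secret values are placeholders set inside the script purely for illustration, and the inline template stands in for config/application.yml.

```ruby
require 'yaml'
require 'erb'

# Placeholder values for this sketch; real credentials come from the environment.
ENV['S3_BACKUP_KEY']    = 'example-key'
ENV['S3_BACKUP_SECRET'] = 'example-secret'

# Inline stand-in for config/application.yml.
template = <<~YAML
  production:
    backup:
      bucket: 'my-app.backups'
      connection_settings:
        aws_access_key_id: <%= ENV['S3_BACKUP_KEY'] %>
        aws_secret_access_key: <%= ENV['S3_BACKUP_SECRET'] %>
        region: 'eu-west-1'
YAML

# Step 1: ERB replaces the <%= %> tags with the environment values.
rendered = ERB.new(template).result
# Step 2: YAML parses the rendered text into nested hashes.
settings = YAML.load(rendered)

puts settings['production']['backup']['connection_settings']['aws_access_key_id']
```

If a variable is not set, the tag renders as an empty string and YAML reads the value as nil, so a nil key in the console usually means the environment file was not loaded.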

The backup gem provides an easy-to-use interface for handling backups, and version 4 suits our needs. Although backup advises NOT to include the gem in your Gemfile, we do want to store the configuration for the backup in the Rails application. How can we run backup outside of our application environment and still use this configuration and Rails-specific values? This is how we did it.

  • First you install the backup gem:

    gem install backup

  • Run the Generator to create a general configuration and a model for a backup. Our model is called db_backup and is automatically stored in config/models/db_backup.rb by the generator.

In the :db_backup model you define things like the database name, user, password and credentials for S3. We have already defined these in database.yml and application.yml, so we want to reuse them. Include the following lines at the beginning of the model:

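require 'yaml'
require 'erb'

# Fall back to development when RAILS_ENV is not set.
rails_env       = ENV['RAILS_ENV'] || 'development'
# Relative to this file (config/models/db_backup.rb), ../../ resolves to config/.
database_yml    = File.expand_path('../../database.yml', __FILE__)
db_config       = YAML.load_file(database_yml)[rails_env]
application_yml = File.expand_path('../../application.yml', __FILE__)
# application.yml contains ERB tags, so render it before parsing the YAML.
app_config      = YAML.load(ERB.new(File.read(application_yml)).result)[rails_env]['backup']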

Now db_config and app_config can be used to configure our :db_backup model:

Model.new(:db_backup, 'Database backup for my app') do
  database PostgreSQL do |db|
    db.name               = db_config['database']
    db.username           = db_config['username']
    db.password           = db_config['password'].to_s # nil not allowed
    db.host               = "localhost"
  end

  store_with S3 do |s3|
    s3.access_key_id     = app_config['connection_settings']['aws_access_key_id']
    s3.secret_access_key = app_config['connection_settings']['aws_secret_access_key']
    s3.region            = app_config['connection_settings']['region']
    s3.bucket            = app_config['bucket']
    s3.path              = "/#{rails_env}"
    s3.fog_options = {
      path_style: true
    }
  end

  compress_with Gzip
end

This configuration should be very straightforward, except for the path_style key; set this to true if your bucket name contains dots.
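
Why the dots matter becomes clearer when you look at the two URL shapes S3 accepts. The helpers below are illustrative only (they are not part of backup or fog) and simply print both forms for our bucket. As far as I understand it, the virtual-hosted form puts the bucket name in the hostname, where the extra dots no longer match Amazon’s wildcard SSL certificate, while the path-style form keeps the bucket name in the path.

```ruby
# Illustrative helpers only; the hostnames follow the public S3 URL patterns.
def virtual_hosted_url(bucket, key, region)
  "https://#{bucket}.s3.#{region}.amazonaws.com/#{key}"
end

def path_style_url(bucket, key, region)
  "https://s3.#{region}.amazonaws.com/#{bucket}/#{key}"
end

bucket = 'my-app.backups'
key    = 'production/db_backup.tar' # simplified key for the example

puts virtual_hosted_url(bucket, key, 'eu-west-1') # bucket name becomes part of the host
puts path_style_url(bucket, key, 'eu-west-1')     # bucket name stays in the path
```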

Perform the backup

To perform a backup you use the following command (replace :environment and :path/:to/:app):

RAILS_ENV=:environment backup perform -t db_backup -c :path/:to/:app/config/backup.rb

The catch here is not to cd into the application directory first when you’re using rvm (or another version manager), since the backup command is not available there.

backup will upload a file to Amazon S3 and put it in the my-app.backups bucket under :environment/:model/:timestamp/db_backup.tar.

Schedule the backup

With whenever you can define cron jobs in Ruby. A few caveats here:

  • we only want to back up our production and staging environments
  • when calling the backup we need to pass the RAILS_ENV variable
  • we need to pass a path to backup.rb located in the application’s config folder

In config/schedule.rb we can define a custom job_type that uses whenever’s variables. This job_type is only called for specific environments, so the cron job is only defined for those:

job_type :backup, "source ~/.rvm/scripts/rvm && RAILS_ENV=:environment backup perform -t :task -c :path/config/backup.rb"

case @environment
when 'production', 'staging'
  every 1.day, at: '10:04pm' do
    backup 'db_backup'
  end
end

db_backup is the name of our model. You can omit source ~/.rvm/scripts/rvm if you don’t use rvm. Here :path, :environment and :task are whenever variables, so leave them as is.
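
To make the placeholder mechanism a bit more tangible, here is a simplified imitation of the substitution. This is not whenever’s own code, the render_job helper is hypothetical, and the deploy path is made up; it only shows how the three variables end up in the final command.

```ruby
# Simplified imitation of template substitution; not whenever's own code.
TEMPLATE = 'RAILS_ENV=:environment backup perform -t :task -c :path/config/backup.rb'

def render_job(template, values)
  # Replace each :placeholder that has a value; leave unknown ones untouched.
  template.gsub(/:\w+/) { |match| values.fetch(match[1..-1].to_sym, match) }
end

command = render_job(TEMPLATE,
                     environment: 'production',
                     task: 'db_backup',
                     path: '/var/www/my-app/current') # hypothetical deploy path
puts command
```

The argument you pass to the job in the every block (here 'db_backup') is what ends up in :task.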

Backup on deployment

Since we already have a setup, why not use it to back up the database when we deploy a new release? What we need to keep in mind is that the backup gem needs to be installed system-wide, so it’s a good thing to have this taken care of by a Capistrano task as well.

We use Capistrano 2, but I’m sure most of the code applies to v3.

In config/deploy.rb we define a task my-app:backup:install that only installs the backup gem if needed and then performs a backup. We call this task in the deploy:create_symlink hook so the current_path points to the folder of the release you’re currently deploying.

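# The namespace is quoted because a bare :my-app is not a valid Ruby symbol.
namespace :'my-app' do
  namespace :backup do
    task :install do
      # Install the gem only when it is missing, then run the backup.
      run "if ! [ $(gem list backup -i) = true ]; then gem install backup --no-ri --no-rdoc ; fi && RAILS_ENV=#{rails_env} backup perform -t db_backup -c #{File.join(current_path, 'config/backup.rb')}"
    end
  end
end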

after "deploy:create_symlink", "my-app:backup:install"

Restore the backup

Restoring a backup is (currently) done by hand. Of course this can be automated, but here’s our process.

The backup creates a plain-text SQL script file so we can restore it using psql. psql will ask to enter the user’s password. You can find this in database.yml.

  • Download the appropriate db_backup.tar file from Amazon S3 to your server in the /tmp folder
  • Stop the web server, recreate and restore the database and start the server:

    cd :path/:to/:app/current
    sudo apachectl stop
    bundle exec rake db:drop db:create; tar -xvf /tmp/db_backup.tar -C /tmp && gunzip -c /tmp/db_backup/databases/PostgreSQL.sql.gz | psql -U :user :database
    sudo apachectl start

Again, replace :path/:to/:app, :user and :database with the proper values. :user is the user defined in database.yml for the specified environment.
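
If you do want to automate this at some point, a first step could be a small script that builds the commands from your own values. The sketch below only assembles and prints the steps; it does not run anything. The restore_commands helper, the example path, user and database name are all made up for illustration.

```ruby
require 'shellwords'

# Builds the restore steps as shell commands without running them.
# All arguments are placeholders you supply yourself.
def restore_commands(app_path:, user:, database:, archive: '/tmp/db_backup.tar')
  dump = '/tmp/db_backup/databases/PostgreSQL.sql.gz'
  [
    "cd #{app_path.shellescape}",
    'sudo apachectl stop',
    'bundle exec rake db:drop db:create',
    "tar -xvf #{archive.shellescape} -C /tmp",
    "gunzip -c #{dump} | psql -U #{user.shellescape} #{database.shellescape}",
    'sudo apachectl start'
  ]
end

commands = restore_commands(app_path: '/var/www/my-app/current',
                            user: 'my_app',
                            database: 'my_app_production')
puts commands
```

Printing the commands first and pasting them by hand keeps a human in the loop, which feels about right for something as destructive as db:drop.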

And that’s it. I hope you’ll never need it, so you can spend more time on Eruption. But if you do, may your backup be restored within seconds.

Building GitHub Pull Requests with Jenkins

We at Kabisa are pretty heavy Jenkins users. We live and breathe Test Driven Development, so a CI server continuously running our tests and building our apps is vital.

About a year ago we switched some projects over to Travis CI, because it had such awesome GitHub pull request integration.

Nowadays it’s possible to make Jenkins build your GitHub pull requests as well. Let me show you how.

Kabisa supports Rails Girls

As the first employee of Kabisa, I’ve seen Kabisa grow from a small three-person start-up to the business we are today. And that business consists of men. Not mostly men, just men. All developers are men, no women. And that’s not a company policy.

Throughout my career as a software engineer I’ve never encountered many women. Even in college the number of female students was less than 1%. Understandably, it’s not easy for girls or women to find their way into a male-dominated industry. Just take a look at firefighters and the military. But there should be no obstacles preventing girls from pursuing a career in technology or software engineering.

CamelOne 2013

Last month I had the privilege of attending the CamelOne 2013 Conference thanks to Kabisa. The conference gave me the opportunity to meet some of the creators and core committers of today’s most popular open source Apache Foundation products, like Apache ActiveMQ and Apache ServiceMix, but most of all Apache Camel. Besides that, the keynotes I attended opened up some new perspectives, which I’m already putting into practice as we speak.