Tuesday, July 9, 2013

Uploading CSV files to a Rails Application using ActiveAdmin

Recently I had to upload CSV files to a Rails application via Active Admin. I had an inkling that I was not the first to need this, and a short Google search later I was led to this answer on Stack Overflow.

I loved it except for the processing class listed under csv_db. It seemed too limiting in that it requires EVERY column to be present whether or not there is data for it. I recalled a Railscast that offered a much more flexible approach and created this:

require 'csv'

class CsvDb
  class << self
    # Takes a model name (e.g. "product") and an uploaded CSV file and creates
    # a record for each row, using the CSV headers as the attribute names.
    def convert_save(model_name, csv_data)
      begin
        target_model = model_name.classify.constantize
        CSV.foreach(csv_data.path, :headers => true) do |row|
          target_model.create(row.to_hash)
        end
      rescue Exception => e
        # Log anything that goes wrong rather than blowing up the request
        Rails.logger.error e.message
        Rails.logger.error e.backtrace.join("\n")
      end
    end
  end
end


If you replace the code from the Stack Overflow post's csv_db with this, you should be able to load any number of columns you wish. As soon as I figure out updating existing records I will post a follow-up.
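For context, here is roughly how the Active Admin side hands the uploaded file to CsvDb. This is only a sketch mirroring the approach from the Stack Overflow answer: the Product resource and the dump[file] form field are placeholders, and the small upload form that posts to this action is left out.

# app/admin/products.rb (resource and field names are examples)
ActiveAdmin.register Product do
  collection_action :upload_csv, :method => :post do
    # params[:dump][:file] is the file field from a simple upload form
    CsvDb.convert_save("product", params[:dump][:file])
    redirect_to collection_path, :notice => "CSV imported!"
  end
end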

Sunday, April 28, 2013

Flay, a gem to help improve the maintainability of your code

Recently I read a post where the author listed must-have gems for Rails development. Being an avid watcher of Railscasts I knew many of them; however, one, flay, caught my attention. It comes out of the Seattle Ruby group, which also brought us flog.

Flay is along the same lines as flog; it analyzes your code looking for issues. Rather than looking for tortured code, though, it looks for similar or duplicated code blocks. Install it with `gem install flay` and then run it against your files, e.g.:

flay ./app/models/*.rb

A report is generated of the suspect areas like this:

macscott:test-project scottshea$ flay ./app/models/*.rb
Total score (lower is better) = 1666

1) Similar code found in :call (mass = 170)
  ./app/models/level.rb:8
  ./app/models/level.rb:9
  ./app/models/level.rb:10
  ./app/models/level.rb:11
  ./app/models/level.rb:15
  ./app/models/level.rb:17
  ./app/models/level.rb:19
  ./app/models/level.rb:20
  ./app/models/level.rb:22
  ./app/models/level.rb:23

2) Similar code found in :defs (mass = 154)
  ./app/models/item_step.rb:260
  ./app/models/response.rb:195

3) Similar code found in :defs (mass = 138)
  ./app/models/feedback.rb:62
  ./app/models/hint.rb:54
  ./app/models/subtitle.rb:51

4) Similar code found in :call (mass = 136)
  ./app/models/level.rb:12
  ./app/models/level.rb:13
  ./app/models/level.rb:14
  ./app/models/level.rb:16
  ./app/models/level.rb:18
  ./app/models/level.rb:21
  ./app/models/level.rb:24
  ./app/models/level.rb:25

5) IDENTICAL code found in :defn (mass*2 = 128)
  ./app/models/report_generator.rb:7
  ./app/models/summary_report_generator.rb:7

6) IDENTICAL code found in :defn (mass*2 = 120)
  ./app/models/image.rb:17
  ./app/models/sharded_image.rb:23

[truncated]

The total app score of 1666 can be broken down into its individual components, showing the areas that offer the most bang for the buck. For experienced developers working on their own or in a small team Flay may be unnecessary, but on larger projects (like the one I ran it on), or those with beginner or intermediate programmers, it can help keep the codebase maintainable.

I am not sure where 1666 ranks overall (is that really bad? about average?), but the 'lower is better' advice holds true. This Stackoverflow question offers some interpretation of the score, but really the best advice is "don't let it get higher!"
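If you want to keep an eye on the score over time, you can wrap flay in a small rake task. This is just a sketch using flay's Ruby API; the lib/tasks path and the threshold of 2000 are my own choices, not anything flay prescribes:

# lib/tasks/flay.rake
desc "Report code duplication and fail if the flay score gets too high"
task :flay do
  require 'flay'
  threshold = (ENV['FLAY_THRESHOLD'] || 2000).to_i
  flay = Flay.new
  flay.process(*Dir["app/**/*.rb"])   # hand flay every Ruby file under app/
  flay.analyze
  flay.report                         # same output as the command line run above
  abort "Flay score #{flay.total} exceeds #{threshold}" if flay.total > threshold
end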

Monday, April 22, 2013

A rake task to automate deployment and database migration to Heroku

One of my major annoyances with Heroku is that a deploy does not automatically run any pending database migrations. Seriously?! I ran into this again tonight while trying to explain to a seasoned SA that while Heroku will automatically recompile your assets, it will not run your migrations. A Google search turned up this Stackoverflow question, which in turn led to this gist, and I am excited to be trying it out.
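The gist has the details; the general shape of such a task looks something like the sketch below. The task name, the assumption that you deploy with a git push to a heroku remote, and the restart at the end are my own choices here, not a copy of the gist:

namespace :deploy do
  desc "Push to Heroku, run any pending migrations, then restart"
  task :production do
    app = ENV['HEROKU_APP'] || "my-app"        # placeholder app name
    sh "git push heroku master"                # deploy the code
    sh "heroku run rake db:migrate -a #{app}"  # run migrations on the dyno
    sh "heroku restart -a #{app}"              # restart so the app sees the new schema
  end
end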

Saturday, April 20, 2013

Using Sidekiq with Faye to notify client web pages and applications of updates

Recently I was given a small coding challenge: how would I update sites and client applications with notifications of content being released? My mind immediately flew to the faye gem. However, it would be a serious bottleneck if my app had to halt everything to send the notification out to a bunch of browsers, so I added in Sidekiq. In the Content model I put this to call out to Sidekiq:
  def self.release_next
    ContentWorker.perform_async
  end
The ContentWorker then looks like this:
require 'net/http'

class ContentWorker
  include Sidekiq::Worker
  sidekiq_options queue: "content"

  def perform
    next_release = Content.next_release(Content.last_release)
    # Puts the data into JSON format; for actual content it could encode a URL that
    # is then loaded. Note this is a one-element array wrapping a hash, which is why
    # the client below reads json_obj[0].
    vars = [{ "title"       => next_release.title,
              "site"        => next_release.site,
              "released_at" => next_release.released_at.strftime("%l:%M:%S %p %m-%d-%Y") }].to_json
    message = { :channel => "/releases", :data => vars, :ext => { :auth_token => FAYE_TOKEN } }

    # Publish to the Faye server over plain HTTP rather than through a browser client
    uri = URI.parse(FAYE_SUBSCRIPTION_PATH)
    Net::HTTP.post_form(uri, :message => message.to_json)
    next_release.update_attribute(:released_at, DateTime.now)
  end
end
I created a dedicated queue in Sidekiq for these messages and made the Faye subscription path configurable via an initializer file (/config/initializers/faye_configuration.rb):
FAYE_SUBSCRIPTION_PATH = "http://localhost:9292/faye"
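For reference, the Faye server itself is just a Rack application mounted at that path. Here is a minimal sketch of what the rackup file and the server-side check of the auth token can look like; the file name, the use of ENV['FAYE_TOKEN'], and the error message are my own assumptions rather than code from the app above:

# faye.ru -- run with: rackup faye.ru -s thin -E production
require 'faye'

# Reject publishes that do not present the shared token; browsers only use /meta/ channels
class ServerAuth
  def incoming(message, callback)
    if message['channel'] !~ %r{^/meta/}
      unless message['ext'] && message['ext']['auth_token'] == ENV['FAYE_TOKEN']
        message['error'] = 'Invalid authentication token'
      end
    end
    callback.call(message)
  end
end

faye_server = Faye::RackAdapter.new(:mount => '/faye', :timeout => 25)
faye_server.add_extension(ServerAuth.new)
run faye_server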
The client then just listens on the Faye channel and acts on any message that comes through. Here is the JavaScript I wrote for listening to the channel and adding the new release to the top of the view table, just below the headers:
$(function() {
   var faye = new Faye.Client("http://192.168.0.12:9292/faye");
   faye.subscribe("/releases", function(data) {
      add_to_table(data);
   });
});

function add_to_table(data){
    var json_obj = jQuery.parseJSON(data);
    var $content_table = $('#content_table');
    // keep the table at 10 rows by dropping the oldest entry before adding the new one
    if ($content_table.find('tr').length >= 10) {
        $content_table.find("tr:last").remove();
    }
    $("#header").after("<tr><td>" + json_obj[0].title + "</td><td>" + json_obj[0].site + "</td><td>" + json_obj[0].released_at + "</td></tr>");
}
All in all, a nice little challenge. I have not had a chance to try it out beyond my home network yet, so I am not sure how well the solution would perform at scale.

Railscasts used for this:
Faye
Sidekiq

Wednesday, April 3, 2013

Using Unicorn Worker Killer to help reduce queue backlog on Heroku with Unicorn

In light of the update from Heroku CTO Adam Wiggins, and having been part of the efforts to test the larger dynos with Unicorn on Heroku, I felt it necessary to share Unicorn Worker Killer, a tool we have found indispensable.

Based on configurable thresholds for memory use and number of requests served, it kills off a Unicorn worker once a limit is exceeded. For those of you on Heroku this is valuable in that requests then return to the Heroku random routing rather than piling up behind a bloated worker. While there is not an exact correlation between too many requests and memory size, it does help keep individual workers from becoming overloaded.

Add this to your Gemfile:

gem 'unicorn-worker-killer'

The gem suggests that you configure your `config.ru` file for the thresholds. This can be cumbersome on Heroku if you need to test out different settings.

Thankfully you can also control the thresholds via environment variables. In config.ru do:
require 'unicorn/worker_killer'

max_request_min = (ENV['MAX_REQUEST_MIN'] || 3072).to_i
max_request_max = (ENV['MAX_REQUEST_MAX'] || 4096).to_i

# Max requests per worker
use Unicorn::WorkerKiller::MaxRequests, max_request_min, max_request_max

oom_min = (ENV['OOM_MIN'] || 192).to_i * (1024**2)
oom_max = (ENV['OOM_MAX'] || 256).to_i * (1024**2)

# Max memory size (RSS) per worker
use Unicorn::WorkerKiller::Oom, oom_min, oom_max


Then run this on the Heroku command line:
heroku config:add OOM_MIN=192 OOM_MAX=256 MAX_REQUEST_MIN=3072 MAX_REQUEST_MAX=4096 -a unicon-ttm-sandbox
That adds the variables needed to control the thresholds. The example shows the defaults, though we found that dropping OOM_MAX to 216 worked best.

Thursday, March 14, 2013

Heroku plugin providing insight into the Postgres database for your application

Heroku provides a lovely plugin called heroku-pg-extras which can provide some insight into the inner workings of the Postgres databases connected to your application. By default it will hit the database listed in the DATABASE_URL variable for your Heroku app; however, you can specify a particular database using its color URL, e.g. HEROKU_POSTGRES_PURPLE_URL.

The plugin provides the following commands (example invocations follow the list):
  • cache_hit - calculates your cache hit rate (effective databases are at 99% and up)
  • index_usage - calculates your index hit rate (effective databases are at 99% and up)
  • ps - view active queries with their execution time
  • locks - display queries with active locks
  • blocking - display queries holding locks that other queries are waiting on
  • kill - cancel a query; with -f/--force it terminates the connection as well
  • total_index_size - show the total size of all indexes in MB
  • index_size - show the size of each index in MB, descending by size
  • seq_scans - show the count of sequential scans by table, descending by count
  • long_running_queries - show queries taking longer than 5 minutes
  • bloat - show table and index bloat in your database, ordered by most wasteful
  • mandelbrot - show the mandelbrot set
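Each of these is invoked as a pg: subcommand of the Heroku CLI; for example (the app name is a placeholder):

heroku pg:cache_hit -a <your app name>
heroku pg:long_running_queries -a <your app name>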
These commands are based on the stats collection process used by Postgres, which has its own body of literature on collection and analysis, starting with the main Postgres documentation. Keep in mind that the stats accumulate over time, so changes to the application will not be readily apparent if the stats from the previous state of the app are still there. You may therefore want to open a ticket with Heroku to reset the stats first (you need to be a superuser to do this yourself, and Heroku does not give you superuser access).

Disclaimer: I am a contributor to the project.

Errors building Nokogiri on Lion/Mountain Lion

I kept getting an odd error with Nokogiri on my new Macbook:

WARNING: Nokogiri was built against LibXML version 2.9.0, but has dynamically loaded 2.7.8

With help from this post I was able to resolve the issue. It seems that "This happens because the Lion system default libxml2 (loaded at bootstrap) is used, regardless of which libxml2 Nokogiri was built against". No real original work on my part here other than a Google search, but I wanted to spread the effort of Michele Gerarduzzi a little further and give credit where credit is due.

The relevant post: Get rid of Nokogiri LibXML warning on OSX Lion

Thursday, February 14, 2013

Heroku with Unicorn backlog settings and performance

In light of this post from Rap Genius and the subsequent blow-up on Hacker News, I decided to share what we did to tune the request queue with Unicorn on Heroku.

In the app/config/unicorn.rb we changed the backlog line to this:

:backlog => Integer(ENV['UNICORN_BACKLOG'] || 200)

And then we can alter the backlog as necessary via:

heroku config:set UNICORN_BACKLOG=25 -a <app_name>

We found 25 to be a sweet spot for two Unicorn Workers but it may be different with four (which we are experimenting with now). Again this is also relative to the app so your results may (and probably will) be different.

Update:

I neglected to mention that you have to move the port declaration out of config.ru and into unicorn.rb as well. The full line should look something like this:

listen ENV['PORT'], :backlog => Integer(ENV['UNICORN_BACKLOG'] || 200)
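For context, here is roughly where that line sits in a minimal config/unicorn.rb. The worker count and timeout below are just illustrative values, not our production settings:

# config/unicorn.rb
worker_processes Integer(ENV['UNICORN_WORKERS'] || 2)  # example worker count
timeout 30

# listen on the Heroku-assigned port with a tunable backlog
listen ENV['PORT'], :backlog => Integer(ENV['UNICORN_BACKLOG'] || 200)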

Saturday, February 9, 2013

Formatting Source for a blog post

I found this handy blog post for formatting my source code to place here in my last blog post. I encourage you to check it out!

Heroku: Scaling Dynos at night and in the morning

I needed a rake task to auto-scale dynos on Heroku: wind them down at night and wind them back up in the morning. I found this helpful post on Stackoverflow, made a few modifications, and wound up with this:

namespace :scale_dynos do
  require 'heroku-api'

  desc "scales up dynos"
  task :up do
    # wday 0 is Sunday and 6 is Saturday, so weekends can scale to a different maximum
    dyno_max = [6, 0].include?(Time.now.wday) ? ENV['WEEKEND_DYNO_MAX'] : ENV['DYNO_MAX']
    heroku = Heroku::API.new(:api_key => ENV['HEROKU_API_KEY'])
    heroku.post_ps_scale(ENV['APP_NAME'], 'web', dyno_max.to_i)
  end

  desc "scales down dynos"
  task :down do
    dyno_min = ENV['DYNO_MIN']
    heroku = Heroku::API.new(:api_key => ENV['HEROKU_API_KEY'])
    heroku.post_ps_scale(ENV['APP_NAME'], 'web', dyno_min.to_i)
  end
end


It adjusts for the weekend, scaling to a different level if you set the WEEKEND_DYNO_MAX variable. Additionally, here is how I set the Heroku config vars; the add command sets them all at once, while the remove goes through one by one, removing each variable and restarting the app.

heroku config:add DYNO_MIN=1 DYNO_MAX=25 WEEKEND_DYNO_MAX=1 APP_NAME=<your app name> HEROKU_API_KEY=<your api key> -a <your app name>
heroku config:remove DYNO_MIN DYNO_MAX WEEKEND_DYNO_MAX APP_NAME HEROKU_API_KEY -a <your app name>

You will need the heroku-api gem. I also set the Gemfile entry to this:

gem 'heroku-api', :require => false

Since the rake tasks are only called twice a day there was no need to keep the gem loaded in memory the rest of the time; the require 'heroku-api' inside the rake file loads it on demand.
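To actually run these twice a day you need something to kick them off. One simple option is the Heroku Scheduler add-on (my assumption here; any scheduler that can run rake tasks works just as well):

heroku addons:add scheduler:standard -a <your app name>

Then add two daily jobs in the Scheduler dashboard: one running rake scale_dynos:up in the morning and one running rake scale_dynos:down at night.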