After a few months of using Chef, I needed to run commands on many nodes at once. Chef does ship with knife ssh, but it does not provide an interactive way of running commands on a group of nodes and getting an immediate response. And if you want to run one command, wait for the response, then run another, you're out of luck.
Capistrano is a great Ruby tool for orchestrating things. It also provides a nifty interactive shell (cap shell) that is perfect for running commands over SSH on many hosts. One thing that Capistrano lacks, though, is an easy way to integrate with Chef. Or so I thought.
While there's a gem (capistrano-chef) that adds that functionality, its development has slowed down a bit. (There's a new version for Capistrano 3, but my method depends less on third-party gems.)
But as it turns out, we don't need custom gems. Because a Capfile is essentially Ruby, we can tailor it to help Capistrano talk to our Chef server very easily (and with powerful results).
Here's how I have it in my Capfile:
    require 'rubygems'
    require 'chef/config'
    require 'chef/knife'
    require 'chef/data_bag_item'
    require 'chef/search/query'

    # define roles and presets
    set :user, 'myuser'
    set :verbosity, 0
    set :stage, :production
    set :default_shell, "TERM=dumb /bin/bash"

    # Load up our Chef config assuming that it's in $HOME/.chef/
    config = File.expand_path(ENV['HOME'] + "/.chef/knife.rb")
    Chef::Config.from_file(config)

    # query our Chef server to find out all our nodes
    # (search returns [rows, start, total]; .first keeps only the node rows)
    query = Chef::Search::Query.new
    prod_servers = query.search(:node, 'platform:ubuntu AND chef_environment:production').first
    dev_servers = query.search(:node, 'platform:ubuntu NOT chef_environment:production').first

    # compile a list of fqdns for all our nodes, split by Chef role
    app_servers = prod_servers.collect do |w|
      w["fqdn"] if w["roles"].include?("app_server")
    end.compact
    db_servers = prod_servers.collect do |w|
      w["fqdn"] if w["roles"].include?("db_server")
    end.compact

    # bonus group of all prod + dev servers (nodes without an fqdn are dropped)
    all_servers = (prod_servers + dev_servers).collect { |w| w["fqdn"] }.compact

    # Here we just pass the FQDNs of our nodes for Capistrano to run on
    role :app, *app_servers
    role :db, *db_servers
    role :all, *all_servers

    # lastly, we define a task to run on db servers
    # (you'd call this with cap do_something_on_db_servers)
    desc 'List databases on cluster'
    task :do_something_on_db_servers, :roles => :db do
      run "echo 'I am doing something on db servers'"
    end
The important thing to keep in mind is that every time you query your Chef server, Capistrano has to wait for the response before proceeding. So the fewer calls you make to your server, the quicker it is. That's why we do all the processing locally, after the Chef server has replied to our request. Since Capistrano runs this code on every invocation of cap, keeping it lean ensures you don't spend the bulk of your time staring at a blank prompt.
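To make the "query once, process locally" idea concrete, here is a minimal sketch of grouping FQDNs by role in a single local pass. It is not part of my Capfile: the NODES array is hypothetical stand-in data (plain hashes rather than real Chef node objects), and fqdns_by_role is a helper name made up for this illustration.

```ruby
# Hypothetical stand-ins for the nodes returned by one Chef search.
# Plain hashes are used here so the sketch runs without a Chef server.
NODES = [
  { "fqdn" => "app1.example.com", "roles" => ["base", "app_server"] },
  { "fqdn" => "db1.example.com",  "roles" => ["base", "db_server"] },
  { "fqdn" => "app2.example.com", "roles" => ["app_server"] },
  { "fqdn" => nil,                "roles" => ["db_server"] } # no fqdn recorded
].freeze

# Build a role => [fqdn, ...] index in one pass over the search result,
# with no further round trips to the server.
def fqdns_by_role(nodes)
  index = Hash.new { |hash, role| hash[role] = [] }
  nodes.each do |node|
    fqdn = node["fqdn"]
    next if fqdn.nil? # skip nodes we could not connect to anyway
    Array(node["roles"]).each { |role| index[role] << fqdn }
  end
  index
end

SERVERS_BY_ROLE = fqdns_by_role(NODES)
```

With an index like this, each Capistrano role becomes a lookup (for example, role :app, *SERVERS_BY_ROLE["app_server"]), and adding another role does not cost another query.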
Footnote: this only applies to Capistrano <= 2. Capistrano 3 hurts (for this sort of stuff, at least).