Site has moved

My site can now be found at:

Please make sure to follow and subscribe there, as that will be the new source of my posts.

Thanks, and I look forward to seeing you there,


Cross Product in Ruby with Array

I was digging around the Ruby standard library documentation for Enumerable and Array to see whether there was a method to get the cross product (Cartesian product) of two arrays, and found Array#product.

I was showing the code that used this to a teammate who didn’t realize what the product method was doing. When I explained that it returns the cross product of two arrays, he was pleasantly surprised to hear about it, so I realized I should share this for anyone else who may not know about it.

Below is an example usage of the product method, and more can be found on the ruby-doc site for Array.

[1] pry(main)> [1, 2, 3, 4, 5].product [:a, :b, :c]
=> [[1, :a],
 [1, :b],
 [1, :c],
 [2, :a],
 [2, :b],
 [2, :c],
 [3, :a],
 [3, :b],
 [3, :c],
 [4, :a],
 [4, :b],
 [4, :c],
 [5, :a],
 [5, :b],
 [5, :c]]
[2] pry(main)> 
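
Two more things product can do are worth a quick sketch: it accepts several arrays at once, and it takes an optional block. The variable names below (sizes, colors, styles) are made up for illustration.

```ruby
# Illustrative data only; none of these names come from real code.
sizes  = [:small, :large]
colors = [:red, :blue]
styles = [:plain, :striped]

# Passing more than one array gives every combination across all of them.
# Here that is 2 * 2 * 2 = 8 three-element arrays, e.g. [:small, :red, :plain].
combos = sizes.product(colors, styles)

# With a block, each combination is yielded instead of being collected,
# which avoids building the full result array.
labels = []
sizes.product(colors) { |size, color| labels << "#{size}-#{color}" }

puts combos.length
puts labels.inspect
```

The block form is handy when the combinations are only needed one at a time, since the result can grow quickly as arrays are added.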

Hope someone else can find this useful as well.




cronolog and STDERR

At work we use cronolog for automatic rotation of log files for a number of processes since those processes just write to STDOUT and STDERR instead of using a proper logging library. Unfortunately, that means when running the script/program we have to redirect STDERR to STDOUT, and then pipe the results to cronolog, since cronolog reads from STDIN. The result looks something along the lines of the following:

ruby main.rb 2>&1 | cronolog /logs/main.log /logs/main-%Y-%m-%d.log

The problem with this is that if errors are few and far between, as one hopes they should be, it can be really tricky to find the errors amongst the other logging. Ideally, I thought it would be nice to have STDOUT go to one log file, and STDERR get written to an err file for the process.

After some digging into From Bash to Z Shell I found something about process substitution in the Bash shell. After a little experimentation and tweaking, I came up with the following:

ruby main.rb \
     > >(/usr/sbin/cronolog /logs/main.log /logs/main-%Y-%m-%d.log) \
     2> >(/usr/sbin/cronolog /logs/main.err /logs/main-%Y-%m-%d.err)

This allows me to use cronolog with both the STDOUT and STDERR streams. By using cronolog in the process substitution, each output stream is treated as an input stream to its own cronolog process, whereas before I had to combine them into one stream and then pipe the single stream to cronolog as in the first example.
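
To try a setup like this without touching a real service, a small stand-in script that writes to both streams is enough. The sketch below is hypothetical (it is not the main.rb from work, and the messages are invented): it runs a one-line child Ruby process and captures the two streams separately with Open3 from the standard library, which is the same separation the redirects above rely on.

```ruby
require "open3"
require "rbconfig"

# Hypothetical stand-in for main.rb: one line to STDOUT, one to STDERR.
script = '$stdout.puts "processed 10 records"; ' \
         '$stderr.puts "ERROR: record 11 is malformed"'

# capture3 returns the child's STDOUT and STDERR as two separate strings,
# plus the exit status of the child process.
out, err, status = Open3.capture3(RbConfig.ruby, "-e", script)

puts "stdout got: #{out}"
puts "stderr got: #{err}"
```

Saving the one-liner as a file and running it through the redirects above should leave the "processed" line in the .log file and the "ERROR" line in the .err file.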

Hope this can help someone else, and save some hours of digging.



The Value of Mandatory Code Reviews

At a previous job we had mandatory code reviews, where every single change to the code was required to be reviewed before being checked in, even if you were fixing a single typo. At other jobs we did not have any real code review policy; you could get one if you wanted, but it took some nagging to get it to happen. The most value I had seen from code reviews was in the environment where they were mandatory.

To understand the value of mandatory code reviews, first we must understand the value of a single code review.

I write code

Yes. Yes, you do.

Why don’t you have an editor then? Writers have editors to help them make sure they are being coherent in their writing, and to catch errors both syntactic and semantic. Their role is to be a critical eye with the goal of making the end product better. Are you being too wordy? Are you writing for your audience? Are you being too clever?

As a programmer, a code review is your way of getting the code you wrote to be edited. In a code review, you are being edited by not only your peers, but your audience as well. Getting a code review is not only like a writer having an editor review their writing, but also getting a trusted fan to read an early copy of their writing to make sure their audience will be able to follow their narrative.

What was with that rifle?

Code is the story of the system; its point is to communicate to the reader what the system is about. One of my favorite quotes about programming is from the preface to the first edition of Structure and Interpretation of Computer Programs:

[…] a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.

–Hal Abelson, Jerry Sussman and Julie Sussman
Structure and Interpretation of Computer Programs

Otherwise, why are we programming in higher level languages instead of binary and punchcards? Why do we care about meaningful names, not only for items in the program, but also for recurring patterns across multiple codebases? Because higher level concepts and abstractions help us define a language we can use to communicate with each other, so we can express ideas and tell a story with more clarity.

“Remove everything that has no relevance to the story. If you say in the first chapter that there is a rifle hanging on the wall, in the second or third chapter it absolutely must go off. If it’s not going to be fired, it shouldn’t be hanging there.”

–Anton Chekhov

Just as in any other form of a story, when we get the narrative of our code wrong, by introducing extraneous concepts, or poorly weaving in threads of thought from other areas of the code, we are introducing a rifle that never goes off. When we do that we risk that we are no longer telling a story, but merely rambling, leaving the reader confused and trying to understand if there is an underlying point to all of it.

What is the story you are trying to tell?

I am pretty sure it is a safe bet to say that everyone who has programmed for even a small amount of time has gotten so deep into a problem that they lost track of the bigger picture. A similar piece of code to what you just wrote lives elsewhere; you just copied and pasted the same code five times; you have created conditionals six levels deep; you start mixing different levels of abstraction. All of these are easily done when trying to get your head around a problem and *just* get it working, because you are *this* close to being done.

You have been down amongst the trees for so long, you have lost sight of the forest; the code has become a jumbled mess of ideas that no longer tells a story, and leads you to miss what should be obvious in the code. Too many times when getting a code review, the first words had barely left my mouth before I realized that I had forgotten something big. I then had to apologize for wasting the reviewer’s time, and let them know I would get back to them once I had fixed the issue.

Just knowing you are going to walk someone through the change forces you to get back to the big picture and look at the code you just wrote from a different perspective. Even just rubber ducking what you are going to present, before doing the actual review, helps you get your thoughts straight.

If the best way to find what you don’t know is by trying to teach it, shouldn’t you practice teaching your new code first? By practicing teaching your changes, you help yourself to understand the questions that the person you are going to get to review your code might have, or anybody else who looks at the code.

Get out of my head, and into their heart

Being able to empathize with the person who will be doing the code review allows you to get out of your own head and see the code as someone else will see it. Even better than empathizing with that person is to have sympathy for the person who will be reviewing the code, or making the next modification to the code, and have a genuine concern for how she will feel when it comes to be her turn to update the code.

That is the difference between thinking “the next person who touches this code is going to be pissed” and “the next person who touches this will be pissed, but no one should feel that way about code, so I should make this better”.

Being able to take that perspective allows you to get a new view of the code that you probably didn’t have at the time when you were heads down trying to “just get it to work.” Being able to sympathize with the reviewer, as well as the following people who are going to be interacting with the code, is going to make you want to have the code be just that much clearer. Odds are that you will be one of the following people who will be interacting with the code that was just changed, since you are now the last person who touched that code.

Do you want to be working in a code base that makes you angry?

The Golden Rule

Treat others as you would like to be treated.

Ask yourself “If I have to come back again and be the next person to work on this, am I going to be upset?” You would go back and make it nicer if you answered “Yes”, wouldn’t you? Why should anything be any different if you asked that question on behalf of someone else? Don’t you want your teammates to do the same for you? Or do you want them to just think “I don’t care, I just want to be done with this. It can be Jimmy’s problem when he works on this next”?

Don’t forget to think of your teammates when you are going to get your code reviewed as well. You are busy trying to get your tasks done, and don’t want to spend any more time than necessary going over someone else’s code, right? You would be upset if a teammate weren’t even sure what they did to address the issue; if they introduced changes that, had they thought about it for a moment before calling you over, would obviously break other parts of the application; if you made it all the way through the code review only to find an issue breaking major parts of the application that would have been obvious had they taken an extra few minutes to run the tests; or if you had to spend extra time pointing out all of the ways their code does not match the style guide set forth by the team.

Why is your time so much more special than your teammates’ that you can’t spend extra time reviewing the changes on your own first, catching as many errors as you can, and making the review as effortless as possible for her? Isn’t that how you would like to be treated when asked for a code review?

It is not that big of a deal

Making sure the code meets the style guidelines is not really that big of a deal on a single change, is it? Correct, individually it is not, but when every change ignores them and takes its own format, it becomes death by a thousand cuts. While code reviews themselves don’t force you to be less lazy in the code you write, they do put additional pressure on you to make the code clearer.

If done well, you have a reviewer who will be “nitpicking” your code. She will be looking for missed cases, duplicated code, code that belongs elsewhere, and hopefully anything she can see that she doesn’t like about the code. Her job, as the reviewer, is to try and find as much as can be improved in relation to the code you have just changed. It could even be code you didn’t touch, because you have now introduced duplication that should have been gotten rid of. It is your job to set your ego aside, take all the punches that are dished out on the code review, and remember that it is not a personal attack, but is done with the goal of making the software that much better and easier to adapt.

It is also her job as the reviewer to make sure to tell you when she sees things she likes, and when you have just taught her something new. The goal here is to make the code better, and if you can teach the reviewer something that can make the next piece of code they write better, everybody wins.

A rising tide raises all ships

When you get your feedback on the review, the goal is to be able to take things you didn’t think about and integrate them back into the code to make it better. Even better is when you can integrate feedback on things you didn’t even know about. This sharing of knowledge tends to be infectious as well, be it a text editor tip, command line trick, language functionality, or a part of the codebase.

You share a tip with Sally, Sally thinks that is useful, uses it, and shares it during her next few reviews both as reviewer, and the person being reviewed; Jacob and Billy then each use it and share it with Jenny and Clyde, as well as Butch and Megan, respectively. Soon that tip has spread through the team and is making the code base, or development experience, that much better.

Information wants to be free, and code reviews help disseminate information to as many people as possible. The more you can disseminate the information, the better the odds become of someone having critical feedback on it.

How Bazaar

“Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix will be obvious to someone.”

–Eric S. Raymond

The above is Linus’ Law as formulated by Eric S. Raymond in The Cathedral and the Bazaar. While most workplace projects don’t operate at the scale of large successful open source software, there are likely people on your team that have some sacred knowledge about that system, or know the lore behind it at least. A code review helps to spread that knowledge outside of yourself and increase the odds that the other person will know about the changes you just made.

Also by getting different people to do your code reviews instead of always just one person, you help spread that knowledge out, so that even if that one person somehow never finds out about these changes, you are not the only person who knows about this change now.

You sure are Extreme

But I pair program all day, so I have a constant code review done, right? While I agree that pair programming turns the code review knob to 11, to paraphrase Nigel Tufnel, I am not convinced that a thorough code review would not still provide useful feedback. While I haven’t done pair programming as rigorously, or religiously, as others, and do find pair programming very useful, I have found the benefits of getting a different pair of eyes on the code still hold even on code generated from pairing.

If anybody has any feedback about the effectiveness of code reviews on code that was produced using pair programming done rigorously, I would love to get your understanding of how much of a benefit you have seen, or not seen.

Sign on the dotted line

“I got someone to review the code, time to check in!”

Sadly no. I found it works best to say the code is not ready to be checked in to the target branch until all issues raised in the code review have been addressed and signed off by the reviewer. I know, I know, I can hear you saying: “I can’t fix this thing that they want cleaned up because it will break something else.”

All issues have to be *addressed*, and signed off by the reviewer, but they do not all have to be *fixed*. If making a change to fix an issue that the reviewer expressed concern about will cause more bugs, or a much larger change, then that needs to be communicated to the reviewer, and they need to understand the implications of the change they are suggesting. If you can’t make a change they are suggesting, open a dialog and explain why. If they then agree, you have addressed the issue satisfactorily and can move on.

“But they won’t give and are just being stubborn and want to get their way…”

I am making an assumption that we are operating under collective code ownership. Assuming that is the case, if I get a code review from someone I need to remember that it is their code too, that they have a stake in this code base, that everyone is on the same team, and that the team should have the goal of making things better. Don’t forget that somebody took their time to review your code, and work to understand why they are making the suggestions they are. If one person in the review feels the other is truly being unreasonable, it never hurts to have an impartial party help settle decisions.

Once the issues have been addressed, and everyone agrees that the code is the best it can be given the current constraints, then you are allowed to get it checked in and merged into the master/main branch. I would also suggest documenting somewhere in the change the people who reviewed the code.

Thanks for all the fish

One final note on the code review process, and that is to thank the person, or people, who reviewed the code for their time and input, especially if they volunteered to do the code review on their own. They took the time to give you feedback on the code you wrote, so it is always nice to recognize that they didn’t have to be the one to do that, and thank them for their feedback and comments.


While the value a single well done code review provides is independent of how many other code reviews are done, the total value across all code reviews becomes compounded the more code reviews are done. As long as the software keeps changing, the natural state tends to move towards complexity, and only with care and vigilance, do the changes to a software system improve the state of the software.

When code reviews are not mandatory, those who don’t recognize the value of code reviews either don’t ask for them or, if they do, are just looking for a checkmark sign-off showing that they followed the suggestion of having a code review. If you have a team where a good portion of the members see the value of code reviews, then they can help enforce the mandatory nature by mentoring and guiding people on what a good code review consists of.

By having a team decide that mandatory code reviews are to be done on all changes, the team declares as a group that they value the result of well done code reviews, and don’t want people to take short cuts in their review, but to treat code reviews with diligence because they are an important part of the development process. Instilling this culture allows people to hold each other to that standard, even when the team has some developers who don’t like code reviews or don’t believe that code reviews are valuable.

Each well done code review that happens is like getting a small dividend on your changes, as a well done code review helps to make sure the code that is being checked in for a change comes out clean. By their nature, code reviews help to make sure that each change that goes into the system is following The Boy Scout rule. For those who aren’t familiar, The Boy Scouts of America have a rule: “Leave the campground cleaner than you found it.”

When applied to the code, and ensured with mandatory code reviews, I have found this one of the best ways to ensure any technical debt that has been accrued gets payments made against it. Mandatory code reviews when backed by the team have a drastic benefit on the code: every change that is being made gets boy-scouted, and those areas with the most churn are going to reap the most benefits, as they are going to get cleaned up quicker.

Having mandatory code reviews, done well, on every change, ensures that the system is always working towards a state of decreased entropy. Mandatory code reviews provide the framework to self-reinforce on your team that you are going well, and as Uncle Bob Martin so frequently points out, that is the only way to go fast.

I would love to know your experiences with code reviews, so comment below and let me know how code reviews have worked for you, positive or negative.



Software Development Podcasts – 2013 Edition

I was recently chatting with some coworkers about podcasts I listen to, so I thought I should document that list for easy sharing and to find some gems I am missing.

I have taken advantage of my commute time and turned my commute into Automobile University as talked about by Zig Ziglar. I heard this idea via some fitness blogs I was reading where the trainers were talking about ways to continuously improve, and decided I would apply that idea to my commute, walks, or even running errands.

The other thing I have started taking advantage of is the ability of podcast players to play at double speed. Most podcasts out there do well at one-and-a-half or double speed, and I have heard that some players even support three-times speed. This allows you to dramatically increase your consumption rate if you can follow along at those speeds. You may not understand everything that is said, but you can always go back and re-listen to sections if needed, let it broaden your known unknowns, and at the least it should help to remove some of your unknown unknowns.

I did a listing of Software Development Podcasts previously, and am going to try to make this a yearly or bi-yearly update, based on how frequently the podcasts in my rotation change.

.NET Podcasts

Ruby Podcasts

  • Ruby Rogues – Panel discussion on various Ruby related topics and projects.

Clojure Podcasts

  • The Cognicast – Formerly Think Relevance podcast
  • Mostly λazy – Infrequent updates, but enjoyed the episodes that have been released

JavaScript Podcasts

  • JavaScript Jabber – Panel discussion on JavaScript topics, started by the host who started Ruby Rogues. The first episodes were hard to listen to due to some negativity, but I have picked it up again starting with the episodes numbered in the 50s, and am working my way back as I get a chance.

Erlang Podcasts

  • Mostly Erlang – Panel discussion mostly about Erlang, but touches on related topics and other functional programming languages and how they relate to Erlang.


  • The Changelog – Podcast about Open Source projects from The Changelog
  • The Wide Teams Podcast – Hosted by one of the panelists of Ruby Rogues, with a focus on distributed software development, with the goal to find out the good and the bad experiences and help share information on how distributed teams work.
  • Software Engineering Radio – Recently I have only been finding a few shows on topics that seem interesting, but have a large backlog of shows with interesting topics.
  • GitMinutes – Podcast covering Git source control management.

New Comers

These are podcasts that I have only listened to a couple of episodes of, either because they have only released a couple, or have just started trying them.

On my list to check out

  • Food Fight – Podcast on DevOps
  • The Freelancers Show – A show about freelancing, started by the same host as JavaScript Jabber and Ruby Rogues. I would think the information would be relevant even to full time employees working to build their careers.

If you have any other podcasts that are good please feel free to add your list of podcasts that I have left out to the comments.

**Updated 2013-10-24 7:54 CDT to include link to previous list of Software Development Podcasts
**Updated 2013-10-24 22:13 CDT to include The Changelog, a “podcast covering what’s new and interesting in open source”
**Updated 2013-10-24 22:28 CDT to include GitMinutes


DFW Erlang User Group

If you are located in the Dallas/Fort Worth Metroplex and are interested in Erlang, here is the friendly reminder that we have a User Group for you.

It doesn’t matter if you are Joe Armstrong or Robert Virding; have just heard something about how Facebook uses it, or that CouchDB, RabbitMQ and Riak are built on it; or only know it from the cartoon about “writing a map-reduce query in Erlang”: we want you to come join us in building a community around Erlang.

We have our next two meetings scheduled and we will be continuing to cover Études for Erlang.

Please join us at the following locations:



Ruby and Puma – Read error: #<NoMethodError: undefined method `each' for #<String:0x00000001611790>>

I was standing up a Puma web service late into the work day yesterday, and could not figure out why I was getting the error below when I should have been seeing my results as CSV.

2013-05-28 19:00:59 -0500: Read error: #<NoMethodError: undefined method `each' for #<String:0x00000001611790>>
/home/reporting/reporting_mux/shared/bundle/ruby/1.9.1/gems/puma-2.0.1/lib/puma/server.rb:482:in `handle_request'
/home/reporting/reporting_mux/shared/bundle/ruby/1.9.1/gems/puma-2.0.1/lib/puma/server.rb:243:in `process_client'
/home/reporting/reporting_mux/shared/bundle/ruby/1.9.1/gems/puma-2.0.1/lib/puma/server.rb:142:in `block in run'
/home/reporting/reporting_mux/shared/bundle/ruby/1.9.1/gems/puma-2.0.1/lib/puma/thread_pool.rb:92:in `call'
/home/reporting/reporting_mux/shared/bundle/ruby/1.9.1/gems/puma-2.0.1/lib/puma/thread_pool.rb:92:in `block in spawn_thread'

The original return value of the call method was set up as:

[200, {"Content-Type" => "text/html"}, str]

For those with some more experience in using Puma, or Rack, you may even see what the problem was right away. This being my first attempt at standing up a Puma instance, I chased down a few red herrings, but at least they were needed updates. I first tried to update the MIME type to be text/comma-separated-values instead of text/html, and had the results set up to return as an attachment. These changes were kept, as they matched the behavior of an existing implementation we are mirroring, but the call was still erroring out.

I finally stumbled across Puma’s example config.rb file, and buried in there I saw the issue. The last element of the array returned by a Rack call method has to respond to each and yield Strings; an Array of Strings is the simplest thing that does. I was returning a bare String, and that is what was causing the #<NoMethodError: undefined method `each' for #<String:0x00000001611790>>. I changed the last element to be the string wrapped in an array, and voilà, everything worked. So below is what I wound up with after the too long bug hunt.

[200, {"Content-Type" => "text/comma-separated-values",
       "Content-Disposition" => "attachment; filename=#{filename}" }, [str + "\n"]]

Hope this helps someone else with a similar problem, and can save you the long evening of debugging that I went through.
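
As a sketch of the fix in context, here is a minimal Rack-style object whose call method returns the three-element array with the body wrapped in an Array. The class name CsvReport, the filename, and the sample rows are made up for illustration; only the shape of the return value comes from the fix above. It needs no gems to run, since a Rack app is just an object that responds to call.

```ruby
# Hypothetical Rack-style app; the class name and data are illustrative only.
class CsvReport
  def initialize(filename, rows)
    @filename = filename
    @rows = rows
  end

  # Rack calls this with the request environment hash.
  def call(_env)
    str = @rows.map { |row| row.join(",") }.join("\n")
    [200,
     { "Content-Type" => "text/comma-separated-values",
       "Content-Disposition" => "attachment; filename=#{@filename}" },
     [str + "\n"]] # the body must respond to each, so wrap the String
  end
end

app = CsvReport.new("report.csv", [%w[id name], %w[1 widget]])
status, headers, body = app.call({})

# A server such as Puma walks the body with each, roughly like this.
chunks = []
body.each { |chunk| chunks << chunk }

puts status
puts headers["Content-Type"]
puts chunks.inspect
```

Returning a bare String in place of the Array would make the body.each call raise the same NoMethodError shown in the log above on Ruby 1.9 and later.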


Lumberjack – lumberjack.nginx (version 0.1.0)

As I posted last time, lumberjack is my start of a log line analyzer/visualizer project in Clojure. This write up will cover the version 0.1.0 lumberjack.nginx namespace.

Since this is version 0.1.0, and to get it out the door, I am only parsing Nginx log lines in the following format, which matches every log line I have needed to parse so far:

- - [18/Mar/2013:15:20:10 -0500] "PUT /logon" 404 1178 "" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"

The function nginx-logs takes a sequence of Nginx log filenames and calls process-logfile on each one, returning a single sequence of hashes, one per log line.

(defn nginx-logs [filenames]
  (mapcat process-logfile filenames))

The function process-logfile takes a single filename and gets the lines from the file using slurp, and then maps over each of the lines using the function parse-line.

(defn- logfile-lines [filename]
  (string/split-lines (slurp filename)))

(defn process-logfile [filename]
    (map parse-line (logfile-lines filename)))

At this point, this is sufficient for what I need, but I have created an issue on the GitHub project to address large log files by lazily reading in the lines, so the whole log file does not have to reside in memory.

The function parse-line holds a regex and matches each line against it. The result of re-find, match, is a vector of the whole matched string followed by each captured group. parts is a vector of keywords naming those entries in order, so reducing over the indexed parts, starting from an empty hash, associates each keyword with the element of match at the same index.

(def parts [:original

(defn parse-line [line]
  (let [parsed-line {}
        pattern #"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})? - - \[(.*)\] \"(\w+) ([^\"]*)\" (\d{3}) (\d+) \"([^\"]*)\".*"
        match (re-find pattern line)]
    (reduce (fn [memo [idx part]]
                (assoc memo part (nth match idx)))
            parsed-line (map-indexed vector parts))))
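
For readers more at home in Ruby, the same keys-to-captures pairing can be sketched as below. This is not part of Lumberjack: the key names after :original are guesses at what each capture group holds, and the IP address in the sample line is a documentation address added for illustration.

```ruby
# Hypothetical key names, one per entry of the match (whole line first).
PARTS = [:original, :ip, :timestamp, :method, :uri,
         :status, :bytes, :referrer].freeze

# The same pattern as the Clojure version, written as a Ruby regex.
PATTERN = /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})? - - \[(.*)\] "(\w+) ([^"]*)" (\d{3}) (\d+) "([^"]*)".*/

def parse_line(line)
  match = PATTERN.match(line)
  return nil unless match

  # match.to_a is [whole_match, group1, group2, ...], so zipping it with
  # PARTS pairs every key with the capture at the same index.
  PARTS.zip(match.to_a).to_h
end

line = '192.0.2.1 - - [18/Mar/2013:15:20:10 -0500] "PUT /logon" 404 1178 "" ' \
       '"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"'
parsed = parse_line(line)

puts parsed[:method]
puts parsed[:status]
```

Zipping the keys with the match plays the role of the reduce over map-indexed: both walk the key list and the match in step.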

Looking at this again a few days later, I went and created an issue to pull the definition of pattern out into its own definition, outside of the let, and even outside the parse-line function. I also want to go back and remove parsed-line from the let, as it does not need to be declared there; I can just pass the empty hash to the reduce. It was set up that way before I refactored to a reduce, when I was associating keys one at a time to the index of match as I was adding parts of the log entry.

Any comments on this are welcome, and I will be posting details on the other files soon as well.



Lumberjack – Log file parsing and analysis for Clojure

I have just pushed a 0.1.0 version of a new project called Lumberjack. The goal is to be a library of functions to help parse and analyze log files in Clojure.

At work I have to occasionally pull down log files and do some visualization of log files from our Nginx webservers. I decided that this could be a useful project to play with to help me on my journey with Clojure and Open Source Software.

This library will read in a set of Nginx log files from a sequence, and parse them to a structure to be able to analyze them. It currently also provides functionality to be able to visualize the data as a set of time series graphs using Incanter, as that is currently the only graphing library I have seen so far.

A short future list of things I would like to be able to support that come to mind very quickly, and not at all comprehensive:

  • Update to support use of BufferedReader for very long log files so the whole file does not have to reside in memory before parsing, and take advantage of laziness.
  • The ability to only construct records with a subset of the parsed data, such as request type, and timestamp.
  • The ability to parse log lines of different types, e.g. Apache, IIS or other formats
  • Additional graphs other than time series, e.g. bar graphs to show number of hits based off of IP Address.
  • Possibility of using futures, or another concurrency mechanism, to do some of the parsing and transformation of log lines into the data structures when working on large log files.

The above are just some of my thoughts on things that might fit well as updates to this as I start to use this more and flesh out more use cases.

I would love comments on my code, and any other feedback that you may have. This is still early but I wanted to put something out there that might be of some use to others as well.

You can find Lumberjack on my GitHub account at

Thanks for your comments and support.



Error installing iconv

Today I was trying to install the iconv gem, and was getting this error

$ gem install iconv
Building native extensions.  This could take a while...
ERROR:  Error installing iconv:
	ERROR: Failed to build gem native extension.

        <rvm_dir>/rubies/ruby-2.0.0-preview2/bin/ruby extconf.rb
checking for rb_enc_get() in ruby/encoding.h... yes
checking for iconv() in iconv.h... no
checking for iconv() in -liconv... no
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers.  Check the mkmf.log file for more details.  You may
need configuration options.

Provided configuration options:

Gem files will remain installed in <rvm_dir>/gems/ruby-2.0.0-preview2/gems/iconv-1.0.2 for inspection.
Results logged to <rvm_dir>/gems/ruby-2.0.0-preview2/gems/iconv-1.0.2/ext/iconv/gem_make.out

After a bit of searching, I found a number of answers suggesting that I would need to reinstall Ruby via RVM, but a Stack Overflow question had a simpler solution:
gem install iconv -- --with-iconv-dir=/usr/local/Cellar/libiconv/1.13.1

After that, all was well.

