Archive for category Command Line

Log File parsing with Futures in Clojure

As a follow-up to my post Running Clojure shell scripts in *nix environments, here is how I implemented an example that uses futures to parse lines read from standard in, as if the input were piped from tail, and writes the result of parsing each line to standard out.

First, because I want to run this as a script from the command line, I add this as the first line of the script:

#!/usr/bin/env lein exec

I will also want to use the join function from the clojure.string namespace.

 
(use '[clojure.string :only (join)])

When dealing with futures, I knew I would need an agent to wrap standard out, so that writes coming from different threads are applied one at a time.

(def out (agent *out*))

I also wanted to end each line with a newline, so I created a function writeln. The function takes a Java Writer and a line, writes the line followed by a newline, and then flushes the writer:

(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))
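
To see writeln on its own, here is a small sketch that points it at an in-memory java.io.StringWriter instead of standard out. The StringWriter and the name sw are assumptions for illustration only; writeln is repeated from above so the snippet runs by itself.

```clojure
;; writeln as defined in the post.
(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))

;; Hypothetical in-memory writer standing in for *out*.
(def sw (java.io.StringWriter.))

(writeln sw "first")
(writeln sw "second")

(str sw) ; => "first\nsecond\n"
```

Because doto returns the writer, writeln hands back the same object it was given. That is what lets it work as an agent action: the agent's new state is still the writer.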

Next come my function to analyze a line and a function that sends the result of that analysis to the agent via send-off.

(defn analyze-line [line]
  (str line "   " (join "  " (map #(join ":" %) (sort-by val > (frequencies line))))))

(defn process-line [line]
  (send-off out writeln (analyze-line line)))

The analyze-line function is just sample code: it returns a string containing the line followed by the frequency of each character in it, most frequent first. The process-line function takes a line and calls send-off on the agent out, passing the function writeln and the result of calling analyze-line.
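
As a quick check of what analyze-line produces, here is a sketch that calls it on a short sample string. The input "aab" is only an illustrative value; analyze-line is repeated from above so the snippet runs by itself.

```clojure
(use '[clojure.string :only (join)])

;; analyze-line as defined in the post.
(defn analyze-line [line]
  (str line "   " (join "  " (map #(join ":" %) (sort-by val > (frequencies line))))))

;; \a occurs twice and \b once, so a:2 is listed before b:1.
(println (analyze-line "aab"))
;; prints: aab   a:2  b:1
```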

With all of these functions defined, I now just need to loop, reading lines until the input ends (read-line returns nil at that point), and call process-line for each line in a future.

(loop []
  (let [line (read-line)]
    ;; read-line returns nil at end of input, so recur only while lines remain
    (when line
      (future (process-line line))
      (recur))))
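
If the input is finite (a file piped in with cat rather than tail -f), it can be useful to wait until every line has actually been written. The following is only a sketch of one way to do that, not part of the original script: it keeps the futures, derefs them all, and then calls await on the agent. The names sink, shout, process-all and result are hypothetical, shout stands in for analyze-line, and a StringWriter replaces standard out so the result can be inspected.

```clojure
(require '[clojure.string :as str])

;; Hypothetical in-memory stand-in for standard out.
(def sink (java.io.StringWriter.))
(def out (agent sink))

;; writeln as defined in the post.
(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))

;; Hypothetical stand-in for analyze-line: upper-cases the line.
(defn shout [line]
  (str/upper-case line))

(defn process-all
  "Reads lines from *in* until end of input, handles each one in a future,
  then waits for the futures and for the agent's queued writes."
  []
  (let [futures (loop [acc []]
                  (if-let [line (read-line)]
                    (recur (conj acc (future (send-off out writeln (shout line)))))
                    acc))]
    ;; Once every future has returned, every send-off has been queued.
    (doseq [f futures] @f)
    ;; Block until the agent has applied all of the queued writes.
    (await out)
    ;; The order of the lines is not guaranteed, so return them as a set.
    (set (str/split-lines (str sink)))))

(def result
  (with-in-str "hello\ntest\nabacus\n"
    (process-all)))

(println result)
```

In a standalone script, calling (shutdown-agents) after the await lets the JVM exit without waiting for the idle agent threads to time out.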


Running Clojure shell scripts in *nix environments

I was recently trying to create a basic piece of Clojure code to play with “real-time” log file parsing using futures. The longer-term goal of the experiment is to be able to tail -f a log file and pipe that into my Clojure log parser as input.

As I wasn’t sure exactly what I would need to do, I wanted an easy way to run some code quickly, without having to rebuild the jars through Leiningen every time I wanted to try something, and in a manner similar to how I expect to use it if the experiment succeeds.

I created a file test_input with the following lines:

1 hello
2 test
3 abacus
4 qwerty
5 what
6 dvorak

With this in place, my goal was to be able to run something like cat test_input | parser_concept. After a bit of searching I found the lein-exec plugin for Leiningen, and after very minor setup I was able to start iterating with input piped in from elsewhere.

The first step was to open the profiles.clj file in my ~/.lein directory. I made sure lein-exec was specified in my user plugins, like so:

{:user {:plugins [[lein-exec "0.2.1"]
                  ;other plugins for lein
                 ]}}

With this in place I just put the following line at the top of my script.clj file:

#!/usr/bin/env lein exec

I then changed the permissions of the script.clj file to make it executable, and I was able to run the following and have my code run against the input.

cat test_input | ./script.clj

I will be posting a follow-up entry outlining my next step of experimenting with “processing” each line read in as a future.


Remove first and last lines from file in OS X

Just a quick post to help burn this into longer term memory.

Today I had to check some info in a generated CSV file that had a header and a footer row. I only wanted the records in between, so I needed to remove the first and last lines of that CSV after I got the columns I needed.

cut <args> my.csv | tail -n +2 | sed '$d'

The tail -n +2 command outputs its input starting at the second line, which drops the header. The sed '$d' command deletes the last line of its input, which drops the footer.
