Archive for category Command Line
Log File parsing with Futures in Clojure
As the follow-up to my post Running Clojure shell scripts in *nix environments, here is how I implemented an example that uses futures to parse lines read from standard in, as if the input were piped from a tail, and writes the result of parsing each line to standard out.
First, because I want to run this as a script from the command line, I add this as the first line of the script:
#!/usr/bin/env lein exec
I will also want to use the join function from the clojure.string namespace.
(use '[clojure.string :only (join)])
When dealing with futures I knew I would need an agent wrapping standard out, so that writes coming from different threads are applied one at a time rather than interleaved.
(def out (agent *out*))
I also wanted to separate each line by a newline, so I created a function writeln. The function takes a Java Writer and a line, and calls write and flush on the writer for each line passed in:
(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))
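To see what writeln does in isolation, here is a minimal sketch that points it at an in-memory java.io.StringWriter instead of *out*. The StringWriter and the name sw are only there for illustration and are not part of the original script.

```clojure
;; writeln as defined in the post
(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))

;; Hypothetical stand-in for *out*: an in-memory writer we can inspect.
(def sw (java.io.StringWriter.))

(writeln sw "first")
(writeln sw "second")

;; Each call appended the line plus a trailing newline.
(str sw)
;; => "first\nsecond\n"
```

Because doto returns the writer itself, writeln hands back the same Writer it was given, which is what lets it be used as an agent action: the agent's state stays the writer after every call.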
Next I have my function to analyze the line, as well as a function that sends the result of that analysis to the agent via the send-off function.
(defn analyze-line [line]
  (str line " "
       (join " "
             (map #(join ":" %)
                  (sort-by val > (frequencies line))))))

(defn process-line [line]
  (send-off out writeln (analyze-line line)))
The analyze-line function is just some sample code that returns a string containing the line followed by the frequency of each character in it, most frequent first. The process-line function takes a line and calls send-off on the agent out with the function writeln and the result of calling analyze-line on that line.
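As a quick worked example of analyze-line, the sketch below repeats the definition from the post and runs it on a short string. The input "aab" is just an illustrative value, chosen so that no two characters share a count, since the relative order of characters with equal counts is not something to rely on.

```clojure
(require '[clojure.string :refer [join]])

;; analyze-line as defined in the post
(defn analyze-line [line]
  (str line " "
       (join " "
             (map #(join ":" %)
                  (sort-by val > (frequencies line))))))

;; (frequencies "aab") is {\a 2, \b 1}; each entry is joined as "char:count".
(analyze-line "aab")
;; => "aab a:2 b:1"
```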
With all of these functions defined, I now just need to loop continuously, and for each line that is actually read (read-line returns nil once the input is exhausted) call process-line as a future.
(loop []
  (let [line (read-line)]
    (when line
      (future (process-line line)))
    (recur)))
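The loop above never exits, which suits input from a tail -f. For finite input, one possible variation is sketched below: it reads lines with line-seq, waits for every future, and then waits for the agent to drain. This is only a sketch under assumptions and not the original script: the names sink and process-all are hypothetical, the agent wraps an in-memory StringWriter so the result can be inspected, and upper-casing stands in for analyze-line.

```clojure
(require '[clojure.string :as string])

;; Hypothetical in-memory sink standing in for *out*.
(def sink (java.io.StringWriter.))
(def out (agent sink))

(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))

(defn process-line [line]
  ;; upper-case is a stand-in for the post's analyze-line
  (send-off out writeln (string/upper-case line)))

(defn process-all
  "Starts one future per line, then blocks until all lines are written."
  [^java.io.BufferedReader reader]
  (let [futures (doall (map #(future (process-line %)) (line-seq reader)))]
    (doseq [f futures] @f)   ; every send-off has now been dispatched
    (await out)))            ; every dispatched action has now run

(process-all (java.io.BufferedReader.
              (java.io.StringReader. "hello\ntest\nabacus\n")))

;; Futures run concurrently, so compare the lines as a set, not in order.
(set (string/split-lines (str sink)))
```

In a standalone script you would probably also call (shutdown-agents) at the very end so the JVM does not linger on idle pool threads.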
Running Clojure shell scripts in *nix environments
I was recently trying to create a basic piece of Clojure code to play with “real-time” log file parsing by playing with futures. The longer term goal of the experiment is to be able to tail -f a log file and pipe that into my Clojure log parser as input.
As I wasn’t sure exactly what I would need to be doing, I wanted an easy way to run some code quickly without having to rebuild the jars through Leiningen every time I wanted to try something, in a manner similar to the way I am thinking I will be using it if the experiment succeeds.
I created a file test_input with the following lines:
hello
test
abacus
qwerty
what
dvorak
With this in place, my goal was to be able to run something like cat test_input | parser_concept. After a bit of searching I found the lein-exec plugin for Leiningen, and after very minor setup I was able to start iterating with input piped in from elsewhere.
The first step was to open my profiles.clj file in my ~/.lein directory. I made sure lein-exec was specified in my user plugins like so:
{:user {:plugins [[lein-exec "0.2.1"]
                  ;; other plugins for lein
                  ]}}
With this in place I just put the following line at the top of my script.clj file:
#!/usr/bin/env lein exec
I then changed the permissions of the script.clj file to make it executable, and I was able to run the following and have my code run against the input.
cat test_input | ./script.clj
I will be posting a follow-up entry outlining my next step of experimenting with “processing” each line read in as a future.
Remove first and last lines from file in OS X
Posted by Proctor in Command Line, OSX, sed, tail on October 3, 2012
Just a quick post to help burn this into longer term memory.
Today I had to check some info in a generated CSV file that had a header and footer row. I only wanted the records in between, so I needed to remove the first and last lines of that CSV, after I got the columns I needed.
cut <args> my.csv | tail -n +2 | sed '$d'
The tail -n +2 command outputs the input stream/file starting at the second line, which drops the header. The sed '$d' command deletes the last line of its input, which drops the footer.
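For comparison with the Clojure posts above, the same trimming can be sketched on a string of lines. This is only an illustration, not a replacement for the shell pipeline: the function name strip-header-and-footer and the sample text are hypothetical, and the cut step is left out.

```clojure
(require '[clojure.string :as string])

(defn strip-header-and-footer
  "Drops the first and last line of text."
  [text]
  (->> (string/split-lines text)
       rest                   ; like tail -n +2: skip the first line
       butlast                ; like sed '$d': drop the last line
       (string/join "\n")))

(strip-header-and-footer "header\nrow1\nrow2\nfooter")
;; => "row1\nrow2"
```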