On May 14, 2009, at 7:14 AM, Stuart Halloway wrote:
>
> FYI: I am working on an open-source CSV parser in Clojure. Splitting
> on delimiters is rarely enough in my experience.
That would be wonderful. I have wrapped OpenCSV for my own purposes
but would of course prefer not having another library dependency. My
code wound up like this:
(import 'java.io.FileReader 'au.com.bytecode.opencsv.CSVReader)
(defn read-csv [filename]
  ;; .readNext returns nil at end of input, which ends the take-while.
  (let [reader (CSVReader. (FileReader. filename))]
    (map vec (take-while identity (repeatedly #(.readNext reader))))))
I'd welcome anyone's remarks about that.
One thing I have in my CSV toolkit which might be of general utility
is a function which reads the first row sans '#' as column names and
returns hashes. So if you have this CSV file:
#name,age,language
Daniel,27,Clojure
David,26,Python
You get this seq as a result:
({:name "Daniel", :age "27", :language "Clojure"}
 {:name "David", :age "26", :language "Python"})
I'm not quite sure what to call such a function, and I find this
function's implementation seriously ugly. I'd appreciate anyone's
feedback to clean it up.
(defn csv->relation
  "Parses a CSV file and returns a seq of dictionaries."
  [filename]
  (let [data (read-csv filename)
        [[pre-col1 & columns] & rows] data
        col1 (if (= \# (.charAt pre-col1 0))
               (.substring pre-col1 1)
               pre-col1)
        headers (map keyword (cons col1 columns))]
    (map #(apply hash-map (mapcat list headers %))
         rows)))
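For comparison, here is a more compact sketch of the same idea built on zipmap, which pairs each header keyword with the matching cell directly. It is only a sketch under a few assumptions: rows->maps and strip-hash are made-up names, the rows are already parsed into vectors of strings (for example by read-csv), and the input has at least a header row.

```clojure
;; Hypothetical helper (the name rows->maps is an assumption, not existing code).
;; Takes parsed rows: a seq of vectors of strings, header row first.
(defn rows->maps
  [[header & rows]]
  (let [;; Drop a leading '#' if present; safe on an empty string.
        strip-hash (fn [s] (if (.startsWith s "#") (subs s 1) s))
        ;; Only the first header cell can carry the '#' marker.
        ks (map keyword (cons (strip-hash (first header)) (rest header)))]
    ;; zipmap builds one map per row from the header keys and the row's cells.
    (map #(zipmap ks %) rows)))

(rows->maps [["#name" "age" "language"]
             ["Daniel" "27" "Clojure"]
             ["David" "26" "Python"]])
```

Taking rows instead of a filename keeps the parsing and the header handling separate, so something like (rows->maps (read-csv filename)) would give the file-based version.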
Thanks,
--
Daniel Lyons
http://www.storytotell.org -- Tell It!