Re: which data structure should I use?

Gabriel Genellina Fri, 15 Jan 2010 14:28:46 -0800

On Fri, 15 Jan 2010 01:56:24 -0300, Eknath Venkataramani <eknath.i...@gmail.com> wrote:

I have a txt file in the following format:

[code]
"confident" => {
  count => 4,
  trans => {
     "ashahvasahta" => 0.74918568,
    "atahmavaishahvaasa" => 0.09095465,
    "pahraaram\.nbha" => 0.06990729,
         "mailatae" => 0.02856427,
           "utanai" => 0.01929341,
             "anaa" => 0.01578552,
         "uthaanae" => 0.01403157,
         "jaitanae" => 0.01227762,
    },
},
"consumers" => {
  count => 4,
  trans => {
    "upabhaokahtaa" => 0.75144362,
...

and I need to extract "confident" , "ashahvasahta" from the first
record, "consumers",  "upabhaokahtaa" from the second record...
i.e. "word in english" and the "first word in the probable-translations"

The most robust way would be to write a specific parser for this format. It should be easy using pyparsing: http://pyparsing.wikispaces.com/
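
To give an idea of what "a specific parser" means here, below is a minimal hand-written sketch (a tokenizer plus a small recursive-descent parser) that uses only the standard `re` module, not pyparsing. It assumes the grammar is just nested `key => value,` pairs where a value is a number, a quoted string, or a `{ ... }` block; the real file may contain things the excerpt does not show. All names (`tokenize`, `parse_entries`, `parse_value`, `first_translations`, `SAMPLE`) are made up for this example, and it relies on dicts keeping insertion order (Python 3.7+).

```python
import re

# One token: quoted string, number, bare word, or punctuation.
TOKEN_RE = re.compile(r'''
    \s*(?:
        "(?P<string>(?:\\.|[^"\\])*)"   # quoted string, backslash escapes kept as-is
      | (?P<number>-?\d+(?:\.\d+)?)     # integer or decimal
      | (?P<word>[A-Za-z_]\w*)          # bare key such as count or trans
      | (?P<punct>=>|[{},])             # structural tokens
    )''', re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs; raise ValueError on unrecognised input."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError("unexpected input at offset %d" % pos)
        pos = m.end()
        yield m.lastgroup, m.group(m.lastgroup)


def parse_value(tokens, i):
    """Parse one value starting at tokens[i]; return (value, next_index)."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    kind, value = tokens[i]
    if kind == "number":
        return float(value), i + 1
    if kind == "string":
        return value, i + 1
    if (kind, value) == ("punct", "{"):
        return parse_entries(tokens, i + 1, "}")
    raise ValueError("unexpected token %r" % (value,))


def parse_entries(tokens, i, closing):
    """Parse `key => value,` pairs until `closing` (end of input if None)."""
    result = {}
    while i < len(tokens):
        kind, key = tokens[i]
        if closing is not None and (kind, key) == ("punct", closing):
            return result, i + 1
        if kind not in ("string", "word"):
            raise ValueError("expected a key, got %r" % (key,))
        if i + 1 >= len(tokens) or tokens[i + 1] != ("punct", "=>"):
            raise ValueError("expected => after %r" % (key,))
        result[key], i = parse_value(tokens, i + 2)
        if i < len(tokens) and tokens[i] == ("punct", ","):
            i += 1  # the separating (or trailing) comma is optional
    if closing is not None:
        raise ValueError("missing %s" % closing)
    return result, i


def first_translations(text):
    """Return [(english_word, first_translation), ...] in file order."""
    records, _ = parse_entries(list(tokenize(text)), 0, None)
    return [(word, next(iter(rec["trans"]), None))
            for word, rec in records.items()]


# Shortened, complete sample in the same shape as the excerpt above.
SAMPLE = '''
"confident" => {
  count => 4,
  trans => {
     "ashahvasahta" => 0.74918568,
    "atahmavaishahvaasa" => 0.09095465,
    },
},
"consumers" => {
  count => 4,
  trans => {
    "upabhaokahtaa" => 0.75144362,
  },
},
'''

if __name__ == "__main__":
    print(first_translations(SAMPLE))
```

Because the whole structure is parsed, line breaks and indentation do not matter, and a malformed file is reported with an error instead of being silently misread.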

If you can guarantee certain properties (e.g. headwords like "confident" and "consumers" are always on a line of their own; translations appear one per line; there are no line breaks before or after the => sign, etc.) then you could process the file line by line, looking at those separators. But only do that if you are completely sure the format is fixed (e.g. the file is computer-generated, not human-written). Anyway, it isn't much easier than writing a real parser, and the latter is a lot more reliable. Learning how to use a tool like pyparsing is in no way a waste of time.
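
For comparison, the line-by-line approach could look roughly like the sketch below. It is only valid under the assumptions just listed, plus two more that I am reading off the excerpt: headword lines start at column 0 and translation lines are indented. The function name `first_translations_by_line` and the file name in the comment are hypothetical.

```python
import re

# A record starts with a quoted headword at column 0 followed by "=> {".
HEADWORD_RE = re.compile(r'^"([^"]*)"\s*=>\s*\{\s*$')
# A translation line is an indented quoted word followed by "=> <number>,".
TRANSLATION_RE = re.compile(r'^\s+"([^"]*)"\s*=>\s*[-\d.]+\s*,?\s*$')


def first_translations_by_line(lines):
    """Yield (english_word, first_translation) pairs, one per record."""
    headword = None
    for line in lines:
        m = HEADWORD_RE.match(line)
        if m:
            headword = m.group(1)
            continue
        m = TRANSLATION_RE.match(line)
        if m and headword is not None:
            yield headword, m.group(1)
            headword = None  # skip the remaining translations of this record


# Typical use (hypothetical file name):
#     with open("translations.txt", encoding="utf-8") as f:
#         for english, translation in first_translations_by_line(f):
#             print(english, translation)
```

Note that this silently skips anything it does not recognise, which is exactly why it is only safe on machine-generated input.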


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list
