On 8 Jul 2009, at 17:13, Garry Bettle <garry.bet...@gmail.com> wrote:

Hi,

I've been programming for over 20 yrs, but only the last few in python
and then only in dribs and drabs.

I'm having a difficult time parsing a delimited string.

e.g.

100657641~GBP~ACTIVE~0~1~~true~5.0~1247065352508~:
3818854~0~24104.08~4.5~~22.1~false| 4.4~241.67~L~1~4.3~936.0~L~2~4.2~210.54~L~3~| 4.5~19.16~B~1~4.6~214.27~B~2~4.7~802.13~B~3~: 3991404~1~19974.18~4.7~~21.7~false| 4.6~133.01~L~1~4.5~124.83~L~2~4.4~319.33~L~3~| 4.7~86.61~B~1~4.8~247.9~B~2~4.9~142.0~B~3~: 4031423~2~15503.56~6.6~~15.1~false| 6.6~53.21~L~1~6.4~19.23~L~2~6.2~53.28~L~3~| 6.8~41.23~B~1~7.0~145.04~B~2~7.2~37.23~B~3~

That is just a selection of the full string - and I've broken it up
for this email.  It's delimited by : and then by ~ and finally, in
some cases, | (a pipe).

If the string is called m, I thought I could create a list with
m.split(":").  I would like to then first of all find in this list the
entry beginning with e.g. 3991404.

I thought I could pop each item in the list and compare, but that seems
pretty long-winded.

Once the item has been found, e.g. ItemFound =
'3991404~1~19974.18~4.7~~21.7~false| 4.6~133.01~L~1~4.5~124.83~L~2~4.4~319.33~L~3~| 4.7~86.61~B~1~4.8~247.9~B~2~4.9~142.0~B~3~:'

I would like to return the 3rd item delimited with ~, which in this case, is 4.7

Can anyone help?

Many thanks!

Cheers,

Garry
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
I've been dealing with a similar problem myself, parsing input for Project Euler. The way I did it was to map a split function over the list produced by the first split:

# list() keeps this a real list on Python 3, where map returns an iterator
lst = list(map(lambda s: s.split("~"), m.split(":")))

You can get the same effect with a comprehension:

lst = [s.split("~") for s in m.split(":")]
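
As a rough check of what that comprehension produces, here is a minimal sketch run on a shortened copy of the sample from the question (the string is trimmed by hand for brevity, so treat it as illustrative only):

```python
# Shortened, hand-trimmed version of the sample string from the question.
m = ("100657641~GBP~ACTIVE~0~1~~true~5.0~1247065352508~:"
     " 3818854~0~24104.08~4.5~~22.1~false| 4.4~241.67~L~1~:"
     " 3991404~1~19974.18~4.7~~21.7~false| 4.6~133.01~L~1~")

# One sublist per ":"-separated record, one string per "~"-separated field.
lst = [s.split("~") for s in m.split(":")]

print(len(lst))     # 3
print(lst[2][:4])   # [' 3991404', '1', '19974.18', '4.7']
```

Note the leading space in ' 3991404': the sample has whitespace after each ":", so the first field usually needs a strip() before it is compared with anything.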

You can then use a function like the following:

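def find(term):
    # Return field 3 (the 4.7 in the example) of the first matching record.
    for i in lst:
        # strip() because the sample has whitespace after each ":"
        if i[0].strip() == term:
            return i[3]
    return None  # no record starts with term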

Of course, this assumes that you only want the first match, but it would be trivial to modify it to return all matches.
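
One possible way to return all matches is sketched below. The names find_all and records are just illustrative, and the third record in the test string is made up purely to show a repeated id:

```python
def find_all(records, term, index=3):
    """Return the field at `index` from every record whose first field is `term`."""
    # strip() guards against whitespace left over after splitting on ":"
    return [r[index] for r in records if r[0].strip() == term]


# Two records trimmed from the question's sample, plus one invented duplicate.
m = "3991404~1~19974.18~4.7~: 4031423~2~15503.56~6.6~: 3991404~9~1.0~5.1~"
records = [s.split("~") for s in m.split(":")]

print(find_all(records, "3991404"))   # ['4.7', '5.1']
print(find_all(records, "0000000"))   # []
```

Returning a list means "no match" is simply an empty list, so the caller does not have to check for None.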

Does that help? If it doesn't solve the problem, I hope it will at least point you towards how to solve it.

If you really want to speed up the search, you could turn the list of lists into a dict, using the first value in each sublist as a key:

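dct = dict((i[0].strip(), i[1:]) for i in lst)

Then you can access it using the normal dictionary interface. Note that the index shifts down by one, because the key has been sliced off the front of each sublist:

dct["3991404"][2]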

If a key is repeated, the dict keeps only the last record with that key (earlier ones are overwritten during construction), so it really depends on the behaviour you want.
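
If losing earlier records is not what you want, one option (a sketch, not the only way) is to collect every record under its key with collections.defaultdict. The name by_id is illustrative, and the third record in the test string is invented to show a repeated id:

```python
from collections import defaultdict

# Two records trimmed from the question's sample, plus one invented duplicate.
m = "3991404~1~19974.18~4.7~: 4031423~2~15503.56~6.6~: 3991404~9~1.0~5.1~"
records = [s.split("~") for s in m.split(":")]

# Map each id to a list of all records that carry it, in input order.
by_id = defaultdict(list)
for r in records:
    by_id[r[0].strip()].append(r[1:])

print(len(by_id["3991404"]))     # 2
print(by_id["3991404"][0][2])    # 4.7
print(by_id["3991404"][1][2])    # 5.1
```

As with the plain dict, the wanted field sits at index 2 here because the id has been removed from the front of each record.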
---
Richard "Roadie Rich" Lovely
Part of the JNP|UK Famille
www.theJNP.com

(Sent from my iPod - please allow me a few typos: it's a very small keyboard)