Re: RE Help splitting CVS data

Mitya Sirenef Sun, 20 Jan 2013 14:18:04 -0800

On 01/20/2013 05:04 PM, Garry wrote:

I'm trying to manipulate family tree data using Python.
I'm using Linux and Python 2.7.3 and have data files saved as Linux-formatted
csv files.
The data appears in this format:


Marriage,Husband,Wife,Date,Place,Source,Note0x0a
Note: the Source field or the Note field can contain quoted data (same as the 
Place field)

Actual data:
[F0244],[I0690],[I0354],1916-06-08,"Neely's Landing, Cape Gir. Co, MO",,0x0a
[F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,0x0a

code snippet follows:

import os
import re
#I'm using the following regex in an attempt to decode the data:
RegExp2 = "^(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\d{,4}\-\d{,2}\-\d{,2})\,(.*|\".*\")\,(.*|\".*\")\,(.*|\".*\")"
#
line = "[F0244],[I0690],[I0354],1916-06-08,\"Neely's Landing, Cape Gir. Co, MO\",,"
#
(Marriage,Husband,Wife,Date,Place,Source,Note) = re.split(RegExp2,line)
#
#However, this does not decode the 7 fields.
# The following error is displayed:
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
#
# When I use xx the fields apparently get unpacked.
xx = re.split(RegExp2,line)
#

print xx[0]
print xx[1]

[F0244]

print xx[5]

"Neely's Landing, Cape Gir. Co, MO"

print xx[6]
print xx[7]
print xx[8]

Why is there an extra NULL field before and after my record contents?
I'm stuck, comments and solutions greatly appreciated.

Garry



Gosh, you really don't want to use regex to split csv lines like that....
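
As for the extra empty fields: that looks like ordinary re.split behaviour rather than a problem with the data. When the pattern contains capturing groups and matches the whole line, re.split returns the text before the match, then each captured group, then the text after the match, and both of those outer pieces are empty strings here. With seven groups that makes nine values, which would explain the unpacking error. A minimal sketch, using a simplified two-field stand-in pattern (an assumption for illustration, not the original RegExp2):

```python
import re

# Simplified stand-in pattern with two capturing groups; NOT the original RegExp2.
pattern = r"^(\[[A-Z]\d+\]),(\[[A-Z]\d+\])$"
line = "[F0244],[I0690]"

# re.split returns: text before the match, each captured group, text after the match.
parts = re.split(pattern, line)
print(parts)

# re.match(...).groups() returns only the captured groups, with no padding.
m = re.match(pattern, line)
groups = m.groups()
print(groups)
```

If a regex really were needed, re.match(...).groups() would be the closer fit, since it returns only the groups.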

Use the csv module:

>>> s

'[F0244],[I0690],[I0354],1916-06-08,"Neely\'s Landing, Cape Gir. Co, MO",,0x0a'

>>> import csv
>>> r = csv.reader([s])
>>> for l in r: print(l)
...

['[F0244]', '[I0690]', '[I0354]', '1916-06-08', "Neely's Landing, Cape Gir. Co, MO", '', '0x0a']



The argument to csv.reader can be a file object (or a list of lines).
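
A rough sketch of the file-object case, written for Python 3. The io.StringIO object is only a stand-in for an open file so the example is self-contained, and the seven variable names are taken from the header line in the original post. It assumes every row has exactly seven fields and that the trailing 0x0a in the post is just the newline character:

```python
import csv
import io

# Stand-in for an open file; with a real file you would pass the file object instead.
# Sample rows are from the original post, with 0x0a written as a real newline.
data = io.StringIO(
    '[F0244],[I0690],[I0354],1916-06-08,"Neely\'s Landing, Cape Gir. Co, MO",,\n'
    '[F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,\n'
)

records = []
for row in csv.reader(data):
    # Assumes exactly seven fields per row, so the row unpacks cleanly.
    marriage, husband, wife, date, place, source, note = row
    records.append((marriage, husband, wife, date, place, source, note))
    print(marriage, place)
```

The quoted Place field keeps its embedded commas, and the empty Source and Note fields come back as empty strings.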

 - mitya


--
Lark's Tongue Guide to Python: http://lightbird.net/larks/
