I have a CSV data file that can become corrupted (it has already happened),
leaving a NULL byte in the file. The NULL byte makes the csv reader raise a
_csv.Error exception.
I would rather the csv reader returned the lines as best it can and left
it to the subsequent processing of each comma-separated field to deal with
illegal bytes. That way as many lines as possible could be processed and
only the corrupted ones dumped.
Is there a way of getting the csv reader to accept all 256 possible
byte values (with the \r, \n and ',' bytes delimiting lines and fields)?
My test code is:
    import csv

    # fname is the path to the CSV file, set earlier in csvTest.py
    with open( fname, 'rt', encoding='iso-8859-1' ) as csvfile:
        csvreader = csv.reader(csvfile, delimiter=',',
                               quoting=csv.QUOTE_NONE, strict=False )
        data = list( csvreader )
        for ln in data:
            print( ln )
Result:
    >>python36 csvTest.py
    Traceback (most recent call last):
      File "csvTest.py", line 22, in <module>
        data = list( csvreader )
    _csv.Error: line contains NULL byte
Setting strict to False or True makes no difference.
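The only fallback I can think of is to filter the lines before the reader sees them. Below is a minimal sketch of that idea, not a proper answer to the question: the helper names skip_nul_lines and read_csv_skipping_nul are made up for illustration and are not part of the csv module. It assumes quoting=csv.QUOTE_NONE as in my test code, so no field can span more than one line and dropping a whole line is safe.

```python
import csv
import io


def skip_nul_lines(lines, dropped=None):
    """Yield only the lines that contain no NUL character.

    Hypothetical helper: if a list is passed as `dropped`, the 1-based
    numbers of the discarded lines are appended to it.
    """
    for lineno, line in enumerate(lines, start=1):
        if '\x00' in line:
            if dropped is not None:
                dropped.append(lineno)
            continue
        yield line


def read_csv_skipping_nul(csvfile):
    """Return (rows, dropped_line_numbers) for an open text-mode file."""
    dropped = []
    # csv.reader accepts any iterable of strings, so the filter can sit
    # between the file object and the reader.
    reader = csv.reader(skip_nul_lines(csvfile, dropped),
                        delimiter=',', quoting=csv.QUOTE_NONE)
    rows = list(reader)
    return rows, dropped


if __name__ == '__main__':
    # In-memory stand-in for the corrupted file; line 2 holds a NUL.
    sample = io.StringIO('a,b,c\n1,\x002,3\nx,y,z\n')
    rows, dropped = read_csv_skipping_nul(sample)
    for ln in rows:
        print(ln)
    print('dropped lines:', dropped)
```

With the real file I would pass csvfile from the with open(...) block instead of the StringIO object. This throws away the whole corrupted line before any field is parsed, so I would still prefer a reader option that returns the fields and lets later processing decide.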
Help appreciated,
John