On 2016-08-04 15:45, Random832 wrote:
> On Thu, Aug 4, 2016, at 15:22, Malcolm Greene wrote:
>> Hi Chris,
>>
>> Thanks for your suggestions. I would like to capture the specific bad
>> codes *before* they get replaced. So if a line of text has 10 bad codes
>> (each one raising UnicodeError), I woul
Wow!!! A huge thank you to all who replied to this thread!
Chris: You gave me some ideas I will apply in the future.
MRAB: Thanks for exposing me to the extended attributes of the UnicodeError
object (e.start, e.end, e.object).
Mike: Cool example! I like how _cleanlines() recursively calls itse
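
The attributes MRAB mentioned (e.start, e.end, e.object) are enough to recover the offending bytes. Below is a minimal sketch of one way to use them, through a custom error handler registered with codecs.register_error. It assumes UTF-8 input; the handler name "track_and_drop", the helper make_tracking_handler, and the sample bytes are hypothetical and do not come from the thread.

```python
import codecs


def make_tracking_handler(log):
    """Return a decode error handler that records bad bytes in `log` and drops them."""
    def handler(exc):
        # Only decoding errors are handled; anything else is re-raised.
        if not isinstance(exc, UnicodeDecodeError):
            raise exc
        # exc.object is the full input; exc.start/exc.end delimit the bad run.
        log.append((exc.start, exc.object[exc.start:exc.end]))
        # Replace the bad run with nothing and resume decoding at exc.end.
        return ("", exc.end)
    return handler


bad_codes = []
# "track_and_drop" is an illustrative name, not a built-in handler.
codecs.register_error("track_and_drop", make_tracking_handler(bad_codes))

line = b"caf\xe9,na\xefve"  # hypothetical Latin-1 bytes in a supposedly UTF-8 file
text = line.decode("utf-8", errors="track_and_drop")
```

Each entry in bad_codes pairs the byte offset with the raw bytes that failed, so the codes can be tallied per file afterwards.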
On Thu, Aug 4, 2016 at 3:24 PM Malcolm Greene wrote:
> Hi Chris,
>
> Thanks for your suggestions. I would like to capture the specific bad
> codes *before* they get replaced. So if a line of text has 10 bad codes
> (each one raising UnicodeError), I would like to track each exception's
> bad code
On Fri, Aug 5, 2016 at 5:22 AM, Malcolm Greene wrote:
> Thanks for your suggestions. I would like to capture the specific bad
> codes *before* they get replaced. So if a line of text has 10 bad codes
> (each one raising UnicodeError), I would like to track each exception's
> bad code but still ret
On Thu, Aug 4, 2016, at 15:22, Malcolm Greene wrote:
> Hi Chris,
>
> Thanks for your suggestions. I would like to capture the specific bad
> codes *before* they get replaced. So if a line of text has 10 bad codes
> (each one raising UnicodeError), I would like to track each exception's
> bad code
On 2016-08-04 20:22, Malcolm Greene wrote:
> Hi Chris,
> Thanks for your suggestions. I would like to capture the specific bad
> codes *before* they get replaced. So if a line of text has 10 bad codes
> (each one raising UnicodeError), I would like to track each exception's
> bad code but still return a v
Hi Chris,
Thanks for your suggestions. I would like to capture the specific bad
codes *before* they get replaced. So if a line of text has 10 bad codes
(each one raising UnicodeError), I would like to track each exception's
bad code but still return a valid decode line when finished.
My goal is
On Fri, Aug 5, 2016 at 4:47 AM, Malcolm Greene wrote:
> I'm processing a lot of dirty CSV files and would like to track the bad
> codes that are raising UnicodeErrors. I'm struggling to figure out
> what the exact codes are so I can track them, then remove them, and then
> repeat the decoding
I'm processing a lot of dirty CSV files and would like to track the bad
codes that are raising UnicodeErrors. I'm struggling to figure out
what the exact codes are so I can track them, then remove them, and then
repeat the decoding process for the current line until the line has been
fully deco
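
The track, remove, and retry loop described above could be sketched roughly as follows. This is only an illustration, assuming the lines are read as bytes and the target encoding is UTF-8; the function name decode_tracking_errors is hypothetical. It re-slices the input on every error, which is simple but not the fastest option for very dirty lines.

```python
def decode_tracking_errors(raw, encoding="utf-8"):
    """Decode `raw`, dropping undecodable bytes.

    Returns (text, bad), where `bad` is a list of (offset, bytes) pairs,
    one per UnicodeDecodeError raised while decoding.
    """
    bad = []
    pieces = []
    pos = 0
    while pos < len(raw):
        try:
            # Try to decode everything that is left.
            pieces.append(raw[pos:].decode(encoding))
            break
        except UnicodeDecodeError as e:
            # e.start and e.end are relative to e.object, i.e. raw[pos:].
            pieces.append(raw[pos:pos + e.start].decode(encoding))
            bad.append((pos + e.start, raw[pos + e.start:pos + e.end]))
            # Skip the bad run and retry on the remainder of the line.
            pos += e.end
    return "".join(pieces), bad


# Hypothetical sample line containing two stray Latin-1 bytes.
text, bad = decode_tracking_errors(b"caf\xe9,na\xefve\n")
```

The offsets in bad refer to positions in the original byte string, so they can be logged alongside the file name and line number.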