Hi. I have files that I will be importing in at least four different plain-text formats: one is tab-delimited, a couple are token-based formats that use pipes (but are not pipe-delimited), and another is XML. There will likely be others as well, but in every case the data needs to be extracted and rewritten to a single format. The files can be fairly large (several MB), so I do not want to read a whole file into memory. What approach would you recommend for sniffing the files to tell the different text formats apart? I realize the csv module has a Sniffer, but it is more or less limited to delimited files. I have a couple of ideas on what I could do, but I am interested in hearing how others might handle something like this so I can determine the best approach to take. Many thanks.
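To make the question concrete, here is a rough sketch of the kind of sniffing I mean. It reads only the first few KB of a file and applies cheap checks in order. It is an illustration under stated assumptions, not a settled design: the names sniff_format, sniff_file and SAMPLE_SIZE are made up for this example, and the heuristics (a leading "<" means XML, a constant nonzero tab count per line means tab-delimited, a "|" on at least half the lines means one of the pipe-token formats) are guesses that would need tuning against the real files.

```python
SAMPLE_SIZE = 8192  # hypothetical sample size; at most this many characters are read


def sniff_format(sample):
    """Guess a format label from a text sample. All heuristics are assumptions."""
    # XML: first non-blank character (after an optional BOM) is "<".
    if sample.lstrip("\ufeff \t\r\n").startswith("<"):
        return "xml"

    lines = [ln for ln in sample.splitlines() if ln.strip()]
    if not lines:
        return "unknown"

    # Tab-delimited: every non-blank line has the same, nonzero number of tabs.
    tab_counts = {ln.count("\t") for ln in lines}
    if len(tab_counts) == 1 and 0 not in tab_counts:
        return "tab-delimited"

    # Pipe-token: at least half of the non-blank lines contain a pipe.
    if sum("|" in ln for ln in lines) >= len(lines) / 2:
        return "pipe-token"

    return "unknown"


def sniff_file(path, encoding="utf-8"):
    """Read only the head of the file and classify it."""
    with open(path, "r", encoding=encoding, errors="replace") as f:
        sample = f.read(SAMPLE_SIZE)
    # If the sample was cut off mid-line, drop the partial last line so it
    # does not distort the per-line counts.
    if len(sample) == SAMPLE_SIZE and "\n" in sample:
        sample = sample[: sample.rfind("\n") + 1]
    return sniff_format(sample)
```

Each label could then be mapped to its own parser that processes the file incrementally, so the whole file is never held in memory; the two pipe-token formats would still need a second check to tell them apart.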
Regards,
David