On 2017-07-06 11:47, Gregory Ewing wrote:
> The only reason I can think of to want to use tsv instead
> of csv is that you can sometimes get away without having
> to quote things that would need quoting in csv. But that's
> not an issue in Python, since the csv module takes care of
> all of that for you.

I work with thousands of CSV/TSV data files from dozens-to-hundreds of sources (clients and service providers) and have never encountered a 0x09-as-data needing to be escaped. So my big reason for preference is that people say "TSV" and I can work with it without a second thought.

On the other hand, with "CSV", sometimes it's comma-delimited as it says on the tin. But sometimes it's pipe- or semicolon-delimited while still carrying the ".csv" extension. And sometimes a subset of values are quoted. Sometimes all the values are quoted. Sometimes numeric values are quoted to distinguish between numeric-looking-string and numeric-value. Sometimes escaping is done with backslashes before the quote-as-value character. Sometimes escaping is done by doubling up the quoting character. Sometimes CR (0x0D) and/or NL (0x0A) characters are allowed within quoted values; sometimes they're invalid. Usually fields are quoted with double-quotes, but sometimes they're single-quoted values. Or sometimes they're either, depending on the data (much like Python's REPL prints string representations).

And while, yes, Python's csv module handles most of these with no issues thanks to the "dialects" concept, I still have to determine the dialect (sometimes by sniffing, sometimes by customer/vendor specification), so it's not nearly as trivial as

    import csv

    # Python 3: open in text mode with newline="" as the csv docs advise
    with open("file.txt", newline="") as fp:
        for row in csv.DictReader(fp, delimiter='\t'):
            process(row)

because there's the intermediate muddling of dialect determination or specification.

And that said, I have a particular longing for a world in which people actually used the US/RS/GS/FS (Unit/Record/Group/File separators; AKA 0x1f-0x1c) as defined in ASCII for exactly this purpose. Sigh. :-)

-tkc
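
For illustration, the "intermediate muddling of dialect determination" described above might look roughly like the sketch below. It is not code from the original post: the helper names `sniff_dialect` and `read_rows`, the list of candidate delimiters, the 4096-character sample size, and the fallback to tab-delimited are all assumptions chosen for the example. It relies only on the standard library's `csv.Sniffer`, which can guess wrong on unusual data, so a customer/vendor specification should still win when one exists.

```python
import csv
import io


# Hypothetical helper (not from the original post): guess the dialect of
# a text sample, trying only the delimiters listed in `candidates`.
def sniff_dialect(sample, candidates=",;|\t", fallback=csv.excel_tab):
    """Return a sniffed dialect, or `fallback` when sniffing fails."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # Sniffer raises csv.Error when it cannot settle on a delimiter.
        return fallback


# Hypothetical helper: expects a seekable text file opened with newline="".
def read_rows(fp, sample_size=4096):
    """Yield dict rows after sniffing the dialect from the file's start."""
    dialect = sniff_dialect(fp.read(sample_size))
    fp.seek(0)  # rewind so the header row is read again as data
    yield from csv.DictReader(fp, dialect=dialect)


if __name__ == "__main__":
    # Pipe-delimited data of the kind that may still arrive as ".csv".
    data = io.StringIO("name|city\nalice|Paris\nbob|Rome\n", newline="")
    for row in read_rows(data):
        print(row)
```

Compared with the three-line TSV loop, the extra step is small in code but adds a guess that has to be verified per source, which is the overhead being described.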