On Wed, 6 Dec 2017 04:20 am, Jason wrote:

> I ran into this:
> https://stackoverflow.com/questions/27707581/why-does-csv-dictreader-skip-empty-lines
>
> # unlike the basic reader, we prefer not to return blanks,
> # because we will typically wind up with a dict full of None
> # values
>
> while iterating over two files, which are line-by-line corresponding. The
> DictReader skipped ahead many lines breaking the line-by-line
> correspondence.
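
For anyone following along, here is a minimal sketch of the behaviour under
discussion. The sample data and the names `data`, `plain_rows` and
`dict_rows` are made up for illustration; only `csv.reader`,
`csv.DictReader` and `io.StringIO` come from the standard library.

```python
import csv
import io

# Hypothetical sample: header, a record, a blank line,
# a record of three empty fields, and a final record.
data = "a,b,c\n1,2,3\n\n,,\n4,5,6\n"

# The basic reader returns an empty list for the blank line.
plain_rows = list(csv.reader(io.StringIO(data)))

# DictReader uses the first row as field names and skips the blank line,
# but still returns the record whose fields are all empty strings.
dict_rows = list(csv.DictReader(io.StringIO(data)))

for row in plain_rows:
    print("reader:    ", row)
for row in dict_rows:
    print("DictReader:", dict(row))
```

The blank line shows up as `[]` from the basic reader and not at all from
DictReader, while the `,,` line comes back from both as a record of empty
strings.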

Um... this doesn't follow. If they are line-by-line corresponding, then they
should skip the same number of blank lines and read the same number of
non-blank lines.

Even if one file has blanks and the other does not, if you iterate over the
records themselves, they should keep their correspondence.

I'm afraid that if you want to convince me this is a buggy design, you need
to demonstrate a simple pair of CSV files where the non-blank lines are
corresponding (possibly with differing numbers of blanks in between) but the
CSV readers get out of alignment somehow.

> And I want to argue that the difference of behavior should be considered a
> bug. It should be considered as such because: 1. I need to know what's in
> the file to know what class to use.

Sure. But blank lines don't tell you what class to use.

> The file content should not break at-least-1-record-per-line.

Blank lines DO break that requirement. A blank line is not a record.

> There may be multiple lines per record in the
> case of embedded new lines, but it should never no record per line.

I disagree. A blank line is not a record. If I have (say) five fields, then:

    ,,,,\n

is a blank record with five empty fields.

    \n

alone is just a blank. The DictReader correctly returns records with blank
fields.

> 2. It's a premature optimization. If skipping blank lines is desirable,
> then have another class on top of DictReader, maybe call it
> EmptyLineSkippingDictReader.

No, that's needless ravioli code. The csv module already defines a basic
reader that doesn't skip blank lines. Having two different DictReaders, one
which doesn't work correctly because it wrongly expands blank lines to
collections of blank fields, is not helpful.

Perhaps it would be, if they were called BrokenDictReader for the one which
expands blank lines to empty records, and DictReader for the one which
correctly skips blank lines.

> 3. The intent of DictReader is to return a
> dict, nothing more, therefore the change of behavior is inappropriate.

No, if all you want is a dict, call dict() or use the dict display {}. The
intent of DictReader is to *read a CSV file and extract the records* as a
dict. Since blank lines aren't records, they should be skipped.

> Does anyone agree, or am I crazy?

I wouldn't want to guess your mental health based just on this isolated
incident, but if I had to make a diagnosis, I'd say, yes, crazy as a loon.

*wink*

--
Steve
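
P.S. A rough sketch of the correspondence argument above, in case it helps.
The two "files" are invented in-memory strings (`left` and `right` are
hypothetical names) with the same records but different numbers of blank
lines between them. This is only an illustration of iterating over records
rather than physical lines, not a claim about what the original files
looked like.

```python
import csv
import io

# Hypothetical data: the same three ids, with blank lines scattered differently.
left = "id,name\n1,ann\n\n2,bob\n\n\n3,cy\n"
right = "id,score\n1,10\n2,20\n\n3,30\n"

left_reader = csv.DictReader(io.StringIO(left))
right_reader = csv.DictReader(io.StringIO(right))

# Pair up records (not physical lines); blank lines are skipped on both sides.
pairs = list(zip(left_reader, right_reader))

# The records stay aligned despite the differing blanks.
ids_match = all(l["id"] == r["id"] for l, r in pairs)

for l, r in pairs:
    print(l["id"], l["name"], r["score"])
print("aligned:", ids_match)
```

If the physical line position really does matter, the basic `csv.reader`
keeps the blanks as empty lists, so the two files can be zipped line by line
with that instead.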