On 2015-10-06 00:51, Chris Angelico wrote:
> fn = "tmp1.csv"
> fin = open(fn, 'rb')
> rdr = csv.DictReader(fin, delimiter=',')
> # all the same down to here
> blanks = set(rdr.fieldnames)
> for row in rdr:
>     blanks = {col for col in blanks if not row[col]}
> mt = [col for col in rdr.fieldnames if col not in blanks]
> print mt

My only other modification would be to add a check that bails out of
the loop early once no blank columns remain:

    from cStringIO import StringIO
    import csv

    s = StringIO("""Name,Surname,Age,Sex
    abc,def,,M
    ,ghi,,F
    jkl,mno,,
    pqr,,,F
    """)

    dr = csv.DictReader(s)
    header_set = set(dr.fieldnames)
    for row in dr:
        # keep only the columns that are still blank in this row
        header_set = set(h for h in header_set if not row[h])
        if not header_set:
            # no candidate blank columns remain, bail early
            break
    ordered_headers = [h for h in dr.fieldnames if h in header_set]
    print(header_set)
    print(ordered_headers)

That way, if you determine by line 3 that your million-row CSV file
has no blank columns, you can get away with not processing all
million rows.

-tkc
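
For anyone on Python 3, the same early-exit idea could be wrapped up in a function, roughly as below. This is only a sketch: the name `blank_columns` is hypothetical (it appears nowhere in the thread), and it assumes the input is an iterable of text lines, such as a file opened in text mode with `newline=''` or an `io.StringIO`.

```python
import csv
from io import StringIO


def blank_columns(lines):
    """Return the columns that are empty in every data row, in header order.

    `lines` is any iterable of CSV text lines (hypothetical helper,
    not part of the csv module).
    """
    reader = csv.DictReader(lines)
    fieldnames = reader.fieldnames or []  # None when the input is empty
    candidates = set(fieldnames)
    for row in reader:
        # Keep only the columns that are still blank in this row.
        candidates = {col for col in candidates if not row[col]}
        if not candidates:
            # Every column has held a value somewhere; stop reading.
            break
    return [col for col in fieldnames if col in candidates]


sample = StringIO("""Name,Surname,Age,Sex
abc,def,,M
,ghi,,F
jkl,mno,,
pqr,,,F
""")
print(blank_columns(sample))  # prints ['Age']
```

Returning a list built from `fieldnames` keeps the original column order, which the set alone would lose.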