On Thursday, September 3, 2015 at 12:12:04 PM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:57 AM, kbtyo <ahlusar.ahluwa...@gmail.com> wrote:
> > I have used CSV and collections. For some reason when I apply this 
> > algorithm, all of my files are not added (the output is ridiculously small 
> > considering how much goes in - think KB output vs MB input):
> >
> > from glob import iglob
> > import csv
> > from collections import OrderedDict
> >
> > files = sorted(iglob('*.csv'))
> > header = OrderedDict()
> > data = []
> >
> > for filename in files:
> >     with open(filename, 'r') as fin:
> >         csvin = csv.DictReader(fin)
> >         header.update(OrderedDict.fromkeys(csvin.fieldnames))
> >         data.append(next(csvin))
> >
> > with open('output_filename_version2.csv', 'w') as fout:
> >     csvout = csv.DictWriter(fout, fieldnames=list(header))
> >     csvout.writeheader()
> >     csvout.writerows(data)
> 
> You're collecting up just one row from each file. Since you say your
> input is measured in MB (not GB or anything bigger), the simplest
> approach is probably fine: instead of "data.append(next(csvin))", just
> use "data.extend(csvin)", which should grab them all. That'll store
> all your input data in memory, which should be fine if it's only a few
> meg, and probably not a problem for anything under a few hundred meg.
> 
> ChrisA

Hmm, good point, and thank you for the tip. However, I may have to deal with 
larger files. 
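
For larger inputs, one option is to read the files twice instead of storing 
every row in memory: a first pass that only collects the headers, then a second 
pass that streams rows straight to the output. This is a rough sketch rather 
than a tested solution. It assumes Python 3, and the function name 
merge_csv_files is made up for illustration.

```python
import csv
from collections import OrderedDict


def merge_csv_files(filenames, out_path):
    """Merge CSV files row by row without holding all rows in memory."""
    filenames = list(filenames)

    # Pass 1: read only the header line of each file to build the
    # combined, ordered list of column names.
    header = OrderedDict()
    for filename in filenames:
        with open(filename, 'r', newline='') as fin:
            fieldnames = csv.DictReader(fin).fieldnames or []
            header.update(OrderedDict.fromkeys(fieldnames))

    # Pass 2: stream every row from every file straight to the output.
    # restval='' fills in columns that a given input file lacks.
    with open(out_path, 'w', newline='') as fout:
        csvout = csv.DictWriter(fout, fieldnames=list(header), restval='')
        csvout.writeheader()
        for filename in filenames:
            with open(filename, 'r', newline='') as fin:
                csvout.writerows(csv.DictReader(fin))
    return list(header)
```

It could be called as merge_csv_files(sorted(iglob('*.csv')), 
'output_filename_version2.csv'). If the output is written to the same directory, 
a second run would pick it up as an input, so a separate output directory is 
probably safer.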

Based on your point that I am "collecting up just one row from each file", I am 
also wondering whether my code actually fulfills this requirement:

"I have files that may have different headers. If they are different, they 
should be appended (along with their values) into the output. If there are 
duplicate headers, then their values should just be added sequentially."

I am also wondering whether DictReader skipping empty rows by default could be 
what is happening here, and whether that could be causing other rows to be 
dropped as well.
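
A small check of that behavior, using made-up sample data held in memory 
rather than one of the real files: DictReader skips a fully blank line and then 
carries on with the rows after it, so blank lines on their own should not 
explain the missing data. The next() call is what limits the result to one row.

```python
import csv
import io

# Hypothetical sample input: a header, two data rows, and a blank
# line between them.
sample = io.StringIO("a,b\n1,2\n\n3,4\n")

reader = csv.DictReader(sample)
first = next(reader)   # the single row that data.append(next(csvin)) collects
rest = list(reader)    # everything left over; the blank line is skipped

# Convert to plain dicts so the result prints the same on any Python 3.
rows = [dict(first)] + [dict(r) for r in rest]
print(rows)
```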
-- 
https://mail.python.org/mailman/listinfo/python-list
