Hello, I'm a fairly new Python programmer (aren't I unique!) and a somewhat longer-standing C/C++ programmer (4 classes at a city college + lots and lots of tinkering on my own).
I've started a pet project (I'm really a black sheep!); the bare bones of it is reading data from CSV files. Each CSV file is going to be between 1 and ~500 entries/lines. I can't see them being much bigger than that, though 600 entries/lines might be possible. As for the total number of CSV files I will be reading in, I can't see myself going over several hundred (200-300), though 500 isn't too much of a stretch and 1000 files *could* happen in the distant future.

So, I read the data in from the CSV file and store each line as an entry in a list, like this (I have slightly more code, but this is basically what I'm doing):

    import csv

    FilepathVar = "my/file/path/csv.txt"
    reader = csv.reader(open(FilepathVar, "rb"), delimiter=',')
    entryGrouping = []  # create a list
    for entry in reader:
        entryGrouping.append(entry)

This produces a list (entryGrouping) where I can do something like (print entryGrouping[0]) and get the first row/entry of the CSV file. I could also do (print entryGrouping[0][0]) and get the first item in the first row. All is well and good, codewise, I hope?

Then, since I wanted to be able to read in multiple CSV files (they have the same structure, the data relates to different things), I did something like this to store multiple entryGroupings:

    masterList = []  # create a list
    masterList.append(entryGrouping)
    # ...
    # load another CSV file into entryGrouping
    # ...
    masterList.append(entryGrouping)

Which lets me write code like this:

    print masterList[0]        # prints an entire entryGrouping
    print masterList[0][0]     # prints the first entry in entryGrouping
    print masterList[0][0][0]  # prints the first item in the first row of the first entryGrouping

So, my question (because I did have one!) is this: Am I doing this in a pythonic way? Is a list of lists (of lists?) a good way to handle this? As I start adding more CSV files, will my program grind to a halt?
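For reference, here is a minimal, self-contained sketch of the loading step described above. It is written for Python 3 (so print is a function and files are opened in text mode with newline=""), unlike the Python 2 snippets in this post. The helper names load_csv and load_many, the sample file names, and the choice of a dict keyed by file path instead of an outer list are all my own assumptions for illustration, not something the project already has.

```python
import csv
import os
import tempfile


def load_csv(path):
    """Return the rows of one CSV file as a list of lists of strings."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter=","))


def load_many(paths):
    """Map each file path to its own freshly built list of rows."""
    return {path: load_csv(path) for path in paths}


if __name__ == "__main__":
    # Hypothetical sample data, written to a temp directory for the demo.
    folder = tempfile.mkdtemp()
    samples = {"a.csv": "id,name\n1,apple\n", "b.csv": "id,name\n2,pear\n"}
    paths = []
    for name, text in samples.items():
        path = os.path.join(folder, name)
        with open(path, "w", newline="") as handle:
            handle.write(text)
        paths.append(path)

    master = load_many(paths)
    print(master[paths[0]][0])     # first row of the first file
    print(master[paths[0]][0][0])  # first item of that row
```

Each call to load_csv builds a fresh list, so two entries in the master structure never end up referring to the same list object by accident, and the with block closes each file once it has been read.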
To answer that, you might need some more information, so I'll try to provide a little right now about what I expect to be doing. (It's still very much in the planning phase, and a lot of it is all in my head.)

So, an example: I'll read in a CSV file (just one, for now) and store it in a list. Sometime later, I'll get another CSV file, almost identical/related to the first. However, a few values might have changed, and there might be a few new lines (entries) or maybe a few fewer. I would want to compare the CSV file I have in my list (in memory) to the new CSV file (which I would probably read into a temporary list). I would then want to track and log the differences between the two files. After I've figured out what's changed, I would either update the original CSV file with the new CSV's information, or completely discard the original and replace it with the new one (whichever involves less work).

Basically, lots of iterating through each entry of each CSV file and comparing it to other information (either hard-coded or variable).

So, to reiterate, are lists what I want to use? Should I be using something else? (Even if that 'something else' only really comes into play when storing and operating on LOTS of data, I would still love to hear about it!)

Thank you for taking the time to read this far. I apologize if I've mangled any accepted terminology in relation to Python or CSV files.

- Ira

(P.S. I've read this through twice now and tried to catch as many errors as I could. It's late (almost 4AM) so I'm sure to have missed some. If something wasn't clear, please point it out. See you in the morning! - er, more like afternoon!)
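(P.P.S. To make the comparison step concrete, here is a rough sketch of the kind of thing I have in mind, in Python 3 syntax. It rests on an assumption of mine that may not hold for the real data: that the first column of every row is a unique identifier, so rows from the old and new file can be matched by that key. The names index_rows and diff_rows and the sample rows are made up for illustration.)

```python
def index_rows(rows, key_column=0):
    """Map each row's key (assumed unique) to the whole row."""
    return {row[key_column]: row for row in rows}


def diff_rows(old_rows, new_rows, key_column=0):
    """Return (added, removed, changed) between two lists of rows."""
    old = index_rows(old_rows, key_column)
    new = index_rows(new_rows, key_column)
    # Rows whose key appears only in the new file.
    added = [new[k] for k in new if k not in old]
    # Rows whose key appears only in the old file.
    removed = [old[k] for k in old if k not in new]
    # Rows present in both files whose contents differ, as (old, new) pairs.
    changed = [(old[k], new[k]) for k in new if k in old and old[k] != new[k]]
    return added, removed, changed


if __name__ == "__main__":
    # Hypothetical data standing in for two versions of one CSV file.
    old_rows = [["1", "apple", "3"], ["2", "pear", "5"], ["3", "plum", "7"]]
    new_rows = [["1", "apple", "4"], ["3", "plum", "7"], ["4", "fig", "2"]]
    added, removed, changed = diff_rows(old_rows, new_rows)
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```

Looking rows up by key in a dict avoids scanning the whole old list for every new row. Note that a blank line in a CSV file comes back from csv.reader as an empty list, which would raise an IndexError in index_rows, so such rows would need to be skipped first.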
-- http://mail.python.org/mailman/listinfo/python-list