Hey all, I'll start by saying that I'm a complete beginner with Perl, and while I've done some shell scripting, my experience with programming and scripting is pretty limited.
I have several hundred CSV files containing a mix of comma-separated strings and numbers. Fortunately, they're all uniform, so it's easy to tell what a given column/record/value represents, at least when I look at the files in person. What I want to do is read all of these files into an array, and take a given record from each line containing a certain string. Here is an example of the type of data I have (number of hotdogs of a particular type I eat on each day):

    ,"hotdog type",monday,tuesday,wednesday,thursday,friday,saturday,sunday,total
    ,"red hot",0,0,0,1,0,0,0,1,
    ,"white hot",0,1,1,0,0,0,2,
    ,totals,0,1,2,0,0,0,3,

(I'm not actually that interested in hot dogs -- just using this as an example.) It's also worth noting that the header row appears in each file.

Now, to take it a step further, each file contains information about a different person's hotdog preferences, and the filenames follow the pattern firstname.lastname.mm-dd-yy.csv. For each week, I have a file for each person I am working with, and data was collected over a one-year period. Now that data collection has been completed, I need to find the total number of white hots as well as the total number of red hots consumed, and the total number of hot dogs consumed per person.

As someone with little to no experience programming, I think the steps to do this are to:

* Concatenate all CSV files (hopefully using some function to make each line identifiable to a particular subject in the study I'm working on -- possibly by adding the filename as the last column/record/value to each line? Maybe this should be done before concatenation, but I'm not sure how.)
* Read the file into an array, using a comma to delimit fields in the array.
* Do some math -- add all of the "totals" columns together and output that number to a text file, with a brief text description of what the number represents.
* Sort the array by the identifying string ("red hot" or "white hot") and break it into multiple arrays.
* Return the value for the "total" column in each array. Append those values to the text file from two steps ago.
* Return to the original (unsplit) array, and sort by the participant name (given in the original firstname.lastname . . . csv file). Break that array into an array for each participant.
* Return the total number of each type of hot dog the participant consumed during the study period, as well as the total (combined) number of hot dogs consumed by each participant, and append it to the aforementioned text file.
* Do a quick comparison to make sure the numbers match -- if the sum of total hot dogs eaten by each participant exceeds the total number of hotdogs consumed (in step 3), we have a problem. Exit and return a minimally descriptive error. Cross-check the other way too. If all is well, exit cleanly.

Is this right? Any suggestions for how to build something like this? I know I've probably made this out to be more complicated than it is, but as someone with very little understanding of programming, I figured it makes the most sense for me to separate out each step, and chances are I will learn more about how to build this type of program as a result.

While I'm really grateful for any help, if I get code snippets, I'm also interested in knowing how they work and why (very brief descriptions or pointers to existing documentation would be more than enough for me).

(Please forgive me for using hot dogs as an example. If you're a vegetarian or an animal rights activist, you can substitute something else -- no dogs, hot or otherwise, were harmed in this study :) ).

Thanks and best,
Graham
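
The steps listed above can be sketched in Perl roughly as follows. This is only one possible approach, not a definitive one, and it swaps in a different technique from the one described: instead of concatenating the files and then sorting and splitting arrays, it reads each file in turn and keeps running sums in hashes keyed by hot dog type and by participant. It assumes that no field contains an embedded comma, that the last non-empty field on each row is that row's total, that the summary row is labelled exactly totals, and that filenames look like firstname.lastname.mm-dd-yy.csv. The sub names (participant_from, parse_row, tally_lines, totals_match, report) are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);

# Turn "dir/firstname.lastname.mm-dd-yy.csv" into "firstname.lastname".
# Returns undef when the name does not fit that (assumed) pattern.
sub participant_from {
    my ($path) = @_;
    my $name = basename($path);
    return $name =~ /^([^.]+\.[^.]+)\.\d\d-\d\d-\d\d\.csv$/ ? $1 : undef;
}

# Split one line into (label, total). Assumes no embedded commas.
sub parse_row {
    my ($line) = @_;
    $line =~ s/\s+$//;                      # drop newline and trailing spaces
    my @fields = split /,/, $line;          # split discards trailing empty fields
    for (@fields) { s/^\s*"?//; s/"?\s*$//; }   # strip quotes and stray spaces
    my ($label) = grep { length } @fields;  # first non-empty field
    return unless defined $label && @fields > 1;
    return ($label, $fields[-1]);           # last field is assumed to be the total
}

# Add the rows of one participant's file to the running sums in %$tally.
sub tally_lines {
    my ($tally, $who, @lines) = @_;
    for my $line (@lines) {
        my ($label, $total) = parse_row($line);
        # Skips blank lines and the header row, whose last field is not a number.
        next unless defined $total && $total =~ /^\d+$/;
        if ($label eq 'totals') {
            $tally->{grand} += $total;
        }
        else {
            $tally->{by_type}{$label}         += $total;
            $tally->{by_person}{$who}{$label} += $total;
            $tally->{person_total}{$who}      += $total;
        }
    }
}

# True when the per-person sums add up to the sum of the "totals" rows.
sub totals_match {
    my ($tally) = @_;
    my $sum = 0;
    $sum += $_ for values %{ $tally->{person_total} || {} };
    return $sum == $tally->{grand};
}

# Print the summary; redirect the output to get a text file.
sub report {
    my ($tally) = @_;
    print "Total hot dogs (sum of 'totals' rows): $tally->{grand}\n";
    for my $type (sort keys %{ $tally->{by_type} || {} }) {
        print "Total $type: $tally->{by_type}{$type}\n";
    }
    for my $who (sort keys %{ $tally->{by_person} || {} }) {
        print "$who ate $tally->{person_total}{$who} in total\n";
        for my $type (sort keys %{ $tally->{by_person}{$who} }) {
            print "  $type: $tally->{by_person}{$who}{$type}\n";
        }
    }
}

sub main {
    my @files = @_;
    my %tally = (grand => 0);
    for my $file (@files) {
        my $who = participant_from($file);
        if (!defined $who) {
            warn "Skipping $file: name does not match the expected pattern\n";
            next;
        }
        open my $fh, '<', $file or die "Cannot open $file: $!\n";
        tally_lines(\%tally, $who, <$fh>);
        close $fh;
    }
    report(\%tally);
    die "Mismatch: per-person totals do not add up to the 'totals' rows\n"
        unless totals_match(\%tally);
}

main(@ARGV) if @ARGV;
```

Saved under a hypothetical name such as hotdogs.pl, it could be run as `perl hotdogs.pl *.csv > summary.txt`. The hashes do the work that sorting and splitting arrays would otherwise do: every row is added to the bucket named by its label or participant as it is read. `perldoc -f split` and `perldoc perldsc` cover the two main ideas. If the real files contain quoted fields with commas inside them, plain split is not enough and a CSV module such as Text::CSV from CPAN would be the safer choice.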