Thomas, I have been thinking about the concept of being stingy with information, as this is a fairly common occurrence when people ask for help. They often ask for what they think they want, while people like us keep asking why they want that and perhaps offer guidance on how to get closer to what they NEED, or to a better way.
In retrospect, Rich did give all the info he thought he needed. It boiled down to saying that he wants to distribute data into two files in such a way that finding an item in file A then lets him find the corresponding item in file B. He was not worried about how to make the files or what to do with the info afterward. He had those covered and was missing what he considered a central piece. And, it seems he programs in multiple languages and environments as needed and is not exactly a newbie. He just wanted a way to implement his overall design.

We threw many solutions and ideas at him but some of us (like me) also got frustrated, as some ideas were not well received due to one objection or another that had not been mentioned earlier, when it was not seen as important.

I particularly notice a disconnect some of us had. Was this supposed to be a search that read only as much as needed to find something and stopped reading, or a sort of filter that returned zero or more matches and went to the end, or perhaps something that read entire files and swallowed them into data structures in memory and then searched and found corresponding entries, or maybe something else?

All the above approaches could work but some designs not so much. For example, some files are too large to swallow whole into memory. We, as programmers, often consciously or unconsciously look at many factors to try to zoom in on what approaches we might use. To be given minimal amounts of info can be frustrating. We worry about making a silly design. But the OP may want something minimal and not worry as long as it is fairly easy to program and works.

We could have suggested something very simple like:

  Open both files A and B
  In a loop get a line from each.
  If the line from A is a match, do something with the current line from B.
  If you are getting only one, exit the loop.

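The lockstep loop just described might look roughly like this in Python. This is only a sketch under my own assumptions: the function name, the exact-match test, the rule that line N of A goes with line N of B, and the sample data are all made up for illustration and are not anything Rich specified.

```python
def find_corresponding(lines_a, lines_b, wanted, first_only=True):
    """Return entries from B whose same-numbered line in A equals `wanted`.

    lines_a and lines_b are any iterables of text lines, such as open files.
    Assumption: line N of A corresponds to line N of B.
    """
    found = []
    # zip() takes one line from each source per pass and stops at the shorter one.
    for line_a, line_b in zip(lines_a, lines_b):
        if line_a.strip() == wanted:
            found.append(line_b.strip())
            if first_only:
                break  # leave the loop as soon as we have one match
    return found


# Made-up sample data standing in for files A and B.
names = ["Alice\n", "Bob\n", "Carol\n"]
addresses = ["alice@example.com\n", "bob@example.com\n", "carol@example.com\n"]
print(find_corresponding(names, addresses, "Bob"))  # prints ['bob@example.com']
```

With real files you would pass the two open file objects instead of lists, for example inside `with open(path_a) as a, open(path_b) as b:`. Setting `first_only=False` turns the same loop into the "filter" variant that runs to the end and returns zero or more matches.
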
Or, if willing, we could have suggested another file format, such as a CSV, in which the algorithm is similar but needs only one file, as in:

  Open file A
  Read a line in a loop
  Split it in parts
  If the party of the first part matches something, use the party of the second part

Or, of course, suggest they read the entire file into a list of lines or a data frame and use some tools that search all of it and produce results.

I find I personally now often lean toward the latter approach but ages ago when memory and CPU were considerations and maybe garbage collection was not automatic, ...

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On Behalf Of Thomas Passin via Python-list
Sent: Wednesday, January 31, 2024 7:25 AM
To: python-list@python.org
Subject: Re: Extract lines from file, add to new files

On 1/30/2024 11:25 PM, avi.e.gr...@gmail.com wrote:
> Thomas, on some points we may see it differently.

I'm mostly going by what the OP originally asked for back on Jan 11.
He's been too stingy with information since then to be worth spending
much time on, IMHO.

> Some formats can be done simply but are maybe better done in somewhat
> standard ways.
>
> Some of what the OP has is already tables in a database and that can
> trivially be exported into a CSV file or other formats like your TSV file
> and more. They can also import from there. As I mentioned, many spreadsheets
> and all kinds of statistical programs tend to support some formats making it
> quite flexible.
>
> Python has all kinds of functionality, such as in the pandas module, to read
> in a CSV or write it out. And once you have the data structure in memory, all
> kinds of queries and changes can be made fairly straightforwardly. As one
> example, Rich has mentioned wanting finer control in selecting who gets some
> version of the email based on concepts like market segmentation. He already
> may have info like the STATE (as in Arizona) in his database.
> He might at some point enlarge his schema so each entry is placed in one or
> more categories and thus his CSV, once imported, can support the usual tasks
> of selecting various rows and columns or doing joins or whatever.
>
> Mind you, another architecture could place quite a bit of work completely on
> the back end and he could send SQL queries to the database from python and
> get back his results into python which would then make the email messages
> and pass them on to other functionality to deliver. This would remove any
> need for files and just rely on the DB.
>
> There are, as usual, too many choices and not necessarily one best answer. Of
> course if this was a major product that would be heavily used, sure, you
> could tweak and optimize. As it is, Rich is getting a chance to improve his
> python skills no matter which way he goes.
>
> -----Original Message-----
> From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On Behalf Of Thomas Passin via Python-list
> Sent: Tuesday, January 30, 2024 10:37 PM
> To: python-list@python.org
> Subject: Re: Extract lines from file, add to new files
>
> On 1/30/2024 12:21 PM, Rich Shepard via Python-list wrote:
>> On Tue, 30 Jan 2024, Thomas Passin via Python-list wrote:
>>
>>> Fine, my toy example will still be applicable. But, you know, you haven't
>>> told us enough to give you help. Do you want to replace text from values
>>> in a file? That's been covered. Do you want to send the messages using
>>> those libraries? You haven't said what you don't know how to do. Something
>>> else? What is it that you want to do that you don't know how?
>>
>> Thomas,
>>
>> For 30 years I've used a bash script using mailx to send messages to a list
>> of recipients. They have no salutation to personalize each one. Since I want
>> to add that personalized salutation I decided to write a python script to
>> replace the bash script.
>>
>> I have collected 11 docs explaining the smtplib and email modules and
>> providing example scripts to apply them to send multiple individual messages
>> with salutations and attachments.
>
> If I had a script that's been working for 30 years, I'd probably just
> use Python to do the personalizing and let the rest of the bash script
> do the rest, like it always has. The Python program would pipe or send
> the personalized messages to the rest of the bash program. Something in
> that ballpark, anyway.
>
>> Today I'm going to be reading these. They each recommend using .csv input
>> files for names and addresses. My first search is learning whether I can
>> write a single .csv file such as:
>> "name1","address1"
>> "name2","address2"
>> which I believe will work; and by inserting at the top of the message block
>> Hi, {yourname}
>> the name in the .csv file will replace the bracketed place holder
>
> If the file contents are going to be people's names and email addresses,
> I would just tab separate them and split each line on the tab. Names
> aren't going to include tabs so that would be safe. Email addresses
> might theoretically include a tab inside a quoted name but that would be
> extremely obscure and unlikely. No need for CSV, it would just add
> complexity.
>
> data = f.readlines()
> for line in data:
>     # Blank lines yield empty fields; drop the trailing newline before splitting.
>     name, addr = line.rstrip('\n').split('\t') if line.strip() else ('', '')
>
>> Still much to learn and the batch of downloaded PDF files should educate
>> me.
>>
>> Regards,
>>
>> Rich
>

--
https://mail.python.org/mailman/listinfo/python-list