I managed to get output from my function; thank you very much for your direction. I really appreciate the hints. I have now tried placing the statement print ("Length \t" + "Count\n") in different places in my code so that the function prints the headers only once, in this manner:

Count   Length
4       7
8       1
12      2

Code so far:

def fileProcess(filename = open('declaration.txt', 'r')):
    """Call the program with an argument,
    it should treat the argument as a filename,
    splitting it up into words, and computes the length of each word.
    print a table showing the word count for each of the word lengths
    that has been encountered."""
    freq = {} #empty dict to accumulate word count and word length
    print ("Length \t" + "Count\n")
    for line in filename:
        punc = string.punctuation + string.whitespace#use Python's built-in punctuation and whitespace
        for word in (line.replace (punc, "").lower().split()):
            if word in freq:
                freq[word] +=1 #increment current count if word already in dict
            else:
                freq[word] = 1 #if punctuation encountered, frequency=0 word length = 0
    #print ("Length \t" + "Count\n")#print header for all numbers.
    for word, count in freq.items():
        print(len(word), count)

fileProcess()

On Sat, Jun 18, 2011 at 7:09 PM, <python-list-requ...@python.org> wrote:
> Send Python-list mailing list submissions to
>        python-list@python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://mail.python.org/mailman/listinfo/python-list
> or, via email, send a message with subject or body 'help' to
>        python-list-requ...@python.org
>
> You can reach the person managing the list at
>        python-list-ow...@python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Python-list digest..."
>
> Today's Topics:
>
>   1. Re: How do you copy files from one location to another?
>      (Terry Reedy)
>   2. Re: Strategy to Verify Python Program is POST'ing to a web
>      server. (Paul Rubin)
>   3. Re: Strategy to Verify Python Program is POST'ing to a web
>      server. (Terry Reedy)
>   4. Re: debugging https connections with urllib2? (Roy Smith)
>   5. Re: Improper creating of logger instances or a Memory Leak?
>      (Chris Torek)
>   6. Re: Strategy to Verify Python Program is POST'ing to a web
>      server. (Chris Angelico)
>   7. NEED HELP-process words in a text file (Cathy James)
>   8. Re: NEED HELP-process words in a text file (Chris Rebert)
>   9. Re: NEED HELP-process words in a text file (Tim Chase)
>
>
> ---------- Forwarded message ----------
> From: Terry Reedy <tjre...@udel.edu>
> To: python-list@python.org
> Date: Sat, 18 Jun 2011 16:52:26 -0400
> Subject: Re: How do you copy files from one location to another?
> On 6/18/2011 1:13 PM, Michael Hrivnak wrote:
>>
>> Python is great for automating sysadmin tasks, but perhaps you should
>> just use rsync for this. It comes with the benefit of only copying
>> the changes instead of every file every time.
>>
>> "rsync -a C:\source E:\destination" and you're done.
>
> Perhaps 'synctree' would be a candidate for addition to shutil.
>
> If copytree did not prohibit an existing directory as destination, it could
> be used for synching with an 'ignore' function.
>
> --
> Terry Jan Reedy
>
>
>
>
> ---------- Forwarded message ----------
> From: Paul Rubin <no.email@nospam.invalid>
> To: python-list@python.org
> Date: Sat, 18 Jun 2011 14:03:19 -0700
> Subject: Re: Strategy to Verify Python Program is POST'ing to a web server.
> "mzagu...@gmail.com" <mzagu...@gmail.com> writes:
>> For example, if I create a website that tracks some sort of
>> statistical information and don't ensure that my program is the one
>> that is uploading it, the statistics can be thrown off by people
>> entering false POST data onto the data upload page. Any remedy?
>
> If you're concerned about unauthorized users posting random crap, the
> obvious solution is configure your web server to put password protection
> on the page.
>
> If you're saying AUTHORIZED users (those allowed to use the program to
> post stuff) aren't trusted to not bypass the program, you've basically
> got a DRM problem, especially if you think the users might
> reverse-engineer the program to figure out the protocol. The most
> effective approaches generally involve delivering the program in the
> form of a hardware product that's difficult to tamper with. That's what
> cable TV boxes amount to, for example.
>
> What is the application, if you can say? That might help get better
> answers.
>
>
>
> ---------- Forwarded message ----------
> From: Terry Reedy <tjre...@udel.edu>
> To: python-list@python.org
> Date: Sat, 18 Jun 2011 17:17:09 -0400
> Subject: Re: Strategy to Verify Python Program is POST'ing to a web server.
> On 6/18/2011 7:34 AM, mzagu...@gmail.com wrote:
>>
>> Hello Folks,
>>
>> I am wondering what your strategies are for ensuring that data
>> transmitted to a website via a python program is indeed from that
>> program, and not from someone submitting POST data using some other
>> means. I find it likely that there is no solution, in which case what
>> is the best solution for sending data to a remote server from a python
>> program and ensuring that it is from that program?
>>
>> For example, if I create a website that tracks some sort of
>> statistical information and don't ensure that my program is the one
>> that is uploading it, the statistics can be thrown off by people
>> entering false POST data onto the data upload page. Any remedy?
>
> You have not specified all the parameters of the problem. Are there a limited
> number of copies of your program or are they distributed freely? What about
> multiple votes from one program?
>
> Corporate proxy votes (which are a legally important type of statistical
> information) work as follows. Each shareholder is mailed or emailed a
> 'control number'. Attend stockholder meeting in person, mail proxy vote, or
> login with any browser with control number. Repeat votes by the same control
> id supersede previous vote. There should be a 'thank you for voting' response
> for each vote. I suspect IP addr. is recorded with vote too. I have not heard
> of specific problems with electronic proxy voting.
>
> --
> Terry Jan Reedy
>
>
>
>
> ---------- Forwarded message ----------
> From: Roy Smith <r...@panix.com>
> To: python-list@python.org
> Date: Sat, 18 Jun 2011 17:45:42 -0400
> Subject: Re: debugging https connections with urllib2?
> In article <4dfcff48$0$49184$e4fe5...@news.xs4all.nl>,
>  Irmen de Jong <irmen.nos...@xs4all.nl> wrote:
>
>> On 18-6-2011 20:57, Roy Smith wrote:
>> > We've got a REST call that we're making to a service provider over https
>> > using urllib2.urlopen(). Is there any way to see exactly what's getting
>> > sent and received over the network (i.e. all the HTTP headers) in plain
>> > text?
>>
>> Put a proxy between the https-service endpoint and your client app.
>> Let the proxy talk https and let your client talk http to the proxy.
>
> Clever. I like. Thanks.
>
>
>
> ---------- Forwarded message ----------
> From: Chris Torek <nos...@torek.net>
> To: python-list@python.org
> Date: 18 Jun 2011 22:28:39 GMT
> Subject: Re: Improper creating of logger instances or a Memory Leak?
> In article <ebafe7b6-aa93-4847-81d6-12d396a4f...@j28g2000vbp.googlegroups.com>
> foobar <wjship...@gmail.com> wrote:
>>I've run across a memory leak in a long running process which I can't
>>determine if its my issue or if its the logger.
>
> You do not say what version of python you are using, but on the
> other hand I do not know how much the logger code has evolved
> over time anyway. :-)
>
>> Each application thread gets a logger instance in it's init() method
>>via:
>>
>>        self.logger = logging.getLogger('ivr-'+str(self.rand))
>>
>>where self.rand is a suitably large random number to avoid collisions
>>of the log file's name.
>
> This instance will "live forever" (since the thread shares the
> main logging manager with all other threads).
> ---------
> class Manager:
>     """
>     There is [under normal circumstances] just one Manager instance, which
>     holds the hierarchy of loggers.
>     """
>     def __init__(self, rootnode):
>         """
>         Initialize the manager with the root node of the logger hierarchy.
>         """
>         [snip]
>         self.loggerDict = {}
>
>     def getLogger(self, name):
>         """
>         Get a logger with the specified name (channel name), creating it
>         if it doesn't yet exist. This name is a dot-separated hierarchical
>         name, such as "a", "a.b", "a.b.c" or similar.
>
>         If a PlaceHolder existed for the specified name [i.e. the logger
>         didn't exist but a child of it did], replace it with the created
>         logger and fix up the parent/child references which pointed to the
>         placeholder to now point to the logger.
>         """
>         [snip]
>         self.loggerDict[name] = rv
>         [snip]
> [snip]
> Logger.manager = Manager(Logger.root)
> ---------
>
> So you will find all the various ivr-* loggers in
> logging.Logger.manager.loggerDict[].
>
>>finally the last statements in the run() method are:
>>
>>        filehandler.close()
>>        self.logger.removeHandler(filehandler)
>>        del self.logger #this was added to try and force a clean up of
>>the logger instances.
>
> There appears to be no __del__ handler and nothing that allows
> removing a logger instance from the manager's loggerDict. Of
> course you could do this "manually", e.g.:
>
>     ...
>     self.logger.removeHandler(filehandler)
>     del logging.Logger.manager.loggerDict[self.logger.name]
>     del self.logger # optional
>
> I am curious as to why you create a new logger for each thread.
> The logging module has thread synchronization in it, so that you
> can share one log (or several logs) amongst all threads, which is
> more typically what one wants.
> --
> In-Real-Life: Chris Torek, Wind River Systems
> Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
> email: gmail (figure it out) http://web.torek.net/torek/index.html
>
>
>
> ---------- Forwarded message ----------
> From: Chris Angelico <ros...@gmail.com>
> To: python-list@python.org
> Date: Sun, 19 Jun 2011 09:12:13 +1000
> Subject: Re: Strategy to Verify Python Program is POST'ing to a web server.
> On Sun, Jun 19, 2011 at 6:40 AM, Michael Hrivnak <mhriv...@hrivnak.org> wrote:
>> On Sat, Jun 18, 2011 at 1:26 PM, Chris Angelico <ros...@gmail.com> wrote:
>>> SSL certificates are good, but they can be stolen (very easily if the
>>> client is open source). Anything algorithmic suffers from the same
>>> issue.
>>
>> This is only true if you distribute your app with one built-in
>> certificate, which does indeed seem like a bad idea. When you know
>> your user base though, especially if this is a situation with a small
>> number of deployments, than you can distribute a unique certificate to
>> each client, signed by your CA.
>
> That changes it from verifying the program to verifying the user. It's
> a somewhat different beast, but it still leaves the possibility of
> snagging the cert and using it in another program. Same with IP
> address checks. You can't prove that the other end is a particular
> program.
>
>>> You could go a long way toward it, though, by
>>> using something ridiculously complex, such as:
>>> ...
>>
>> An authentication process that involves the client executing code
>> supplied by the server opens up one single point of failure (server is
>> compromised or man-in-the-middle attack is happening) by which
>> arbitrary code could get executed on the client. Yikes!
>
> Yeah, hence the part of verifying the server's cert too. That one is a
> bit safer though; nobody but you will have that certificate, so it's
> not as easy to take and put into another program. But this whole
> scheme was meant from the start to be ridiculous.
>
>> If ...
>> then you'll have to accept that you cannot trust the submitted data
>> 100%, and just take measures to mitigate abuse.
>
> I still stand by my original point, namely that the "if" on here is
> superfluous, and the "then" is unconditional. But the measures you
> describe _do_ reduce the likelihood significantly.
>
> ChrisA
>
>
>
> ---------- Forwarded message ----------
> From: Cathy James <nambo...@gmail.com>
> To: python-list@python.org
> Date: Sat, 18 Jun 2011 18:21:55 -0500
> Subject: NEED HELP-process words in a text file
> Dear Python Experts,
>
> First, I'd like to convey my appreciation to you all for your support
> and contributions. I am a Python newborn and need help with my
> function. I commented on my program as to what it should do, but
> nothing is printing. I know I am off, but not sure where. Please
> help:(
>
> import string
> def fileProcess(filename):
>     """Call the program with an argument,
>     it should treat the argument as a filename,
>     splitting it up into words, and computes the length of each word.
>     print a table showing the word count for each of the word lengths
> that has been encountered.
>     Example:
>     Length Count
>     1 16
>     2 267
>     3 267
>     4 169
>     >>>"&"
>     Length Count
>     0 0
>     >>>
>     >>>"right."
>     Length Count
>     5 10
>     """
>     freq = [] #empty dict to accumulate words and word length
>     filename=open('declaration.txt, r')
>     for line in filename:
>         punc = string.punctuation + string.whitespace#use Python's
> built-in punctuation and whiitespace
>         for i, word in enumerate (line.replace (punc, "").lower().split()):
>             if word in freq:
>                 freq[word] +=1 #increment current count if word already in dict
>
>             else:
>                 freq[word] = 0 #if punctuation encountered,
> frequency=0 word length = 0
>     for word in freq.items():
>         print("Length /t"+"Count/n"+ freq[word],+'/t' +
> len(word))#print word count and length of word separated by a tab
>
>
>
>
> #Thanks in advance,
> CJ.
>
>
>
> ---------- Forwarded message ----------
> From: Chris Rebert <c...@rebertia.com>
> To: Cathy James <nambo...@gmail.com>
> Date: Sat, 18 Jun 2011 16:30:00 -0700
> Subject: Re: NEED HELP-process words in a text file
> On Sat, Jun 18, 2011 at 4:21 PM, Cathy James <nambo...@gmail.com> wrote:
>> Subject: NEED HELP-process words in a text file
>>
>> Dear Python Experts,
>>
>> First, I'd like to convey my appreciation to you all for your support
>> and contributions. I am a Python newborn and need help with my
>> function. I commented on my program as to what it should do, but
>> nothing is printing. I know I am off, but not sure where. Please
>> help:(
>
> Netiquette comment: Please avoid SHOUTING and including unnecessary
> entreaties in your subject lines in the future.
>
> Cheers,
> Chris
>
>
>
> ---------- Forwarded message ----------
> From: Tim Chase <python.l...@tim.thechases.com>
> To: Cathy James <nambo...@gmail.com>
> Date: Sat, 18 Jun 2011 19:09:18 -0500
> Subject: Re: NEED HELP-process words in a text file
> On 06/18/2011 06:21 PM, Cathy James wrote:
>
>> freq = [] #empty dict to accumulate words and word length
>
> While you say you create an empty dict, using "[]" creates an empty *list*,
> not a dict. Either your comment is wrong or your code is wrong. :) Given
> your usage, I presume you want a dict, not a list.
>
>> for line in filename:
>>     punc = string.punctuation + string.whitespace#use Python's
>> built-in punctuation and whiitespace
>
> Since you don't change "punc" in your loop, you'd get better performance by
> hoisting this outside of the loop so it's only evaluated once. Not that it
> should matter *that* greatly, but it's just a bad-code-smell.
>
>> for i, word in enumerate (line.replace (punc, "").lower().split()):
>
> .replace() doesn't operate on sets of characters, but rather strings. So
> unless your line contains the exact text in "punc" (unlikely), that
> replacement is a NOP. There are a couple ways to go about removing unwanted
> characters:
>
> - make a set of those characters and produce a resulting string from things
> not in that set:
>
>   punc_set = set(punc)
>   line = ''.join(c for c in line if c not in punc_set)
>
> - use a regexp to strip them out...something like
>
>   punc_re = re.compile("[" + re.escape(punc) + "]")
>   ...
>   line = punc_re.sub('', line)
>
> - use string translations. I'm not as familiar with these, but the following
> seemed to work for me, abusing the 2nd "deletechars" parameter for your
> particular use-case:
>
>   line = line.translate(None, punc)
>
> I don't see .translate(None) documented anywhere. My random effort seemed to
> work in 2.6, but fails in 2.5 and prior. YMMV.
>
>> if word in freq:
>>     freq[word] +=1 #increment current count if word already in
>> dict
>>
>> else:
>>     freq[word] = 0 #if punctuation encountered,
>> frequency=0 word length = 0
>
> Again, your 2nd comment disagrees with your code. As an aside, if you're
> using 2.5 or greater, I'd use collections.defaultdict(int) as the accumulator:
>
>   freq = collections.defaultdict(int)
>   ...
>   freq[word] += 1
>   # no need to check presence
>
>> for word in freq.items():
>>     print("Length /t"+"Count/n"+ freq[word],+'/t' +
>> len(word))#print word count and length of word separated by a tab
>
> Where to begin:
>
> - Your escapes are using "/" instead of "\" for <tab> and <newline> which I
> expect will mess up the formatting.
>
> - You're also labeling them "Length/Count" but printing "count/length".
>
> - you're iterating over freq.items() but that should be written as
>
>   for word, count in freq.items():
>
> or
>
>   for word in freq:
>
> - Additionally, adding the bits together makes it somewhat hard to
> understand.
>
> I'd use something like
>
>   for word, count in freq.items():
>     print("Word \tLength \tCount\n%s \t%i \t%i" % (
>         word, len(word), count))
>
> -tkc
>
>
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
--
http://mail.python.org/mailman/listinfo/python-list
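
For reference, here is a minimal sketch of one way the header could be printed only once. It assumes Python 3 and pulls together the hints quoted above: the punctuation table is built once outside the loop, and the dictionary is keyed on word length rather than on the word itself, so each length gets a single row. The names count_word_lengths and print_table are made up for this illustration, and the sample lines stand in for the contents of declaration.txt; an open file object could be passed in their place.

```python
import string
from collections import Counter


def count_word_lengths(lines):
    """Return a Counter mapping word length -> number of words of that length."""
    # Build the punctuation-deletion table once, outside the loop.
    strip_punct = str.maketrans("", "", string.punctuation)
    counts = Counter()
    for line in lines:
        # split() with no arguments already discards all whitespace.
        for word in line.translate(strip_punct).lower().split():
            counts[len(word)] += 1
    return counts


def print_table(counts):
    """Print the header once, then one row per word length."""
    print("Length\tCount")
    for length in sorted(counts):
        print(f"{length}\t{counts[length]}")


# Hypothetical sample input standing in for the real file.
sample = ["When in the Course of human events,", "it becomes necessary..."]
print_table(count_word_lengths(sample))
```

Because the header is printed in print_table, before the loop over the lengths, it appears exactly once no matter how many lines the input has.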