On Fri, Jan 19, 2018 at 12:58:10PM -0500, Bob Gailer wrote:

> On Jan 18, 2018 5:45 PM, "Devansh Rastogi" <devan...@gmail.com> wrote:
> >
> > Hello,
> >
> > I'm new to python and programming as
> >
> > from collections import Counter
> > import json
>
> I don't see any value for having a class. All you need are functions and
> global variables

Except for the simplest scripts, global variables are a good way to end up
with fragile, buggy, hard-to-debug code. If there's ever a chance that you
will need to read two or more files at the same time, a class is a much
better solution than trying to juggle global variables.

If I never have to do the:

    old_foo = foo
    calculate_foo()
    print(foo)
    foo = old_foo

dance again, it won't be too soon.

> > class Files:
> >     def __init__(self, filename):
>
> I don't see any need for a function or "with". Just write file_input_string
> = open(filename, 'r', encoding='utf-16').read().replace('\n', ' ')

Again, that doesn't scale beyond quick and dirty scripts. Best practice
(even when not strictly needed) is to use

    with open(filename) as f:
        do_something_with(f.read())

in order to guarantee that even if an error occurs while reading, the file
will be closed. Otherwise, you run the risk of running out of file handles
in a long-running program.

> >         with open(filename, 'r', encoding='utf-16') as file_input:
> >             self.file_input_string = file_input.read().replace('\n', ' ')
>
> You are assuming that all words are separated by blanks which is rarely the
> case in natural language.

Surelyyoumeanthatitisusuallythecasethatwordsareseparatedbyblanksinmostnaturallanguages?

I think that Thai is one of the few exceptions to the rule that most
languages separate words with a blank space.

In English, there are a small number of compound words that contain spaces
(as opposed to the far more common hyphen), such as "ice cream" (neither a
form of ice, nor cream) or "attorney general", but most people don't bother
distinguishing such compound words and just treat them as a pair of regular
words. But I can't think of any English grammatical construct where words
are run together while still being treated as separate words (apart from
simple mistakes, e.g. accidentally writing "runtogether" as a typo).

> Your program is creating lists of ones.
> Rather than counting them all you
> need to do is take the length of each list, e.g.: lowercase_letters =
> len(1 for c in self.file_input_string if c.islower())

That won't work. You are trying to take the length of a generator
expression:

    py> len(1 for c in "abcDe" if c.islower())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: object of type 'generator' has no len()

> However there is a much better way to do the counting: translate the text
> using the string translate method into various characters that identify the
> class of each letter in the file. Then count the occurrences of each of
> those characters. Example: counting Upper, Lower, Number, and punctuation
> (Single, Double stroke):
>
> txt= "THIS is 123 ,./ :*(" # input file text
>
> transtable =
> str.maketrans("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
> ,./:*(",
> "L"*26 + "U"*26 + "N"*10 + "S"*4 + "D"*3) # maps input characters to
> corresponding class characters

That lists only 68 out of the many, many thousands of characters supported
by Python. It doesn't scale very well beyond ASCII.

--
Steve
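
A minimal sketch of the points made in this reply, not the original poster's code: the class name FileStats and the helper classify() are made up for illustration. It assumes the input is UTF-16 text, as in the quoted snippet. State is kept on an instance instead of in globals, the file is read inside a with block, counting uses sum() over a generator rather than len(), and characters are classified with the Unicode-aware str methods instead of a hand-built translation table.

```python
from collections import Counter


def classify(ch):
    """Return a one-letter class code for a single character.

    Hypothetical helper: the codes are arbitrary labels chosen for this
    sketch. The str methods used here work on any Unicode character,
    not just ASCII.
    """
    if ch.islower():
        return "L"  # lowercase letter
    if ch.isupper():
        return "U"  # uppercase letter
    if ch.isdigit():
        return "N"  # digit
    if ch.isspace():
        return "W"  # whitespace
    return "O"      # anything else (punctuation, symbols, ...)


class FileStats:
    """Hypothetical per-file statistics; each instance holds its own text."""

    def __init__(self, filename, encoding="utf-16"):
        # The with block closes the file even if read() raises.
        with open(filename, "r", encoding=encoding) as f:
            self.text = f.read().replace("\n", " ")

    def lowercase_count(self):
        # A generator has no len(), but sum() can consume it.
        return sum(1 for c in self.text if c.islower())

    def class_counts(self):
        # Counter tallies how many characters fall into each class.
        return Counter(classify(c) for c in self.text)
```

Because each FileStats instance carries its own text, two files can be examined side by side without saving and restoring any shared variable. Applied to the quoted sample string, Counter(classify(c) for c in "THIS is 123 ,./ :*(") reports 4 uppercase letters, 2 lowercase letters and 3 digits.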