Shiva wrote:

> Hi All,
>
> I have written a function that
> -reads a file
> -splits the words and stores it in a dictionary as word(key) and the total
> count of word in file (value).
>
> I want to print the words with top 20 occurrences in the file in reverse
> order - but can't figure it out. Here is my function:
>
> def print_top(filename):
>
>     #Open a file
>     path = '/home/BCA/Documents/LearnPython/ic/'
>     fname = path + filename
>     print ('filename: ',fname)
>     filetext = open(fname)
>
>     #Read the file
>     textstorage={}
>
>     #print(type(textstorage))
>     readall = filetext.read().lower()
>     eachword = set(readall.split())
>
>     #store split words as keys in dictionary
>     for w in eachword:
>         textstorage[w] = readall.count(w)

Using count() here is very inefficient. A better approach is to increment
the dict value:

    for w in readall.split():
        textstorage[w] = textstorage.get(w, 0) + 1

>     #print top 20 items in dictionary by descending order of val
>     # This bit is what I can't figure out.
>
>     for dkey in (textstorage.keys()):
>         print(dkey,sorted(textstorage[dkey]))??

Apart from the fact that sorted() is applied here to a single count (an int,
which is not iterable, so the call raises a TypeError), at that point the
sorting effort is already too late: you need to sort the dict keys by the
corresponding dict values. It is possible to write a get_value() function
such that

    sorted(textstorage, key=get_value, reverse=True)

gives the keys in the right order, but perhaps it is simpler to convert
textstorage into a list of (count, word) pairs first, something like

    pairs = [(42, "blue"), (17, "red"), (77, "black"), ...]

When you sort that list

    most_common_words = sorted(pairs, reverse=True)

you automatically get (count, word) pairs in the right order and can print
the first 20 with

    for count, word in most_common_words[:20]:
        print(word, count)

PS: Once you have it all working have a look at collections.Counter...
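
Putting the pieces above together, a minimal sketch might look like the
following. The function names count_words() and top_words() and the sample
text are assumptions made for illustration; they are not from the original
post, and the file reading is left out so the example runs on its own. One
more reason to avoid str.count() shows up in the sample: it counts substring
matches, so "the" is also found inside "other".

```python
from collections import Counter


def count_words(text):
    """Count whole words by incrementing a dict value per occurrence."""
    counts = {}
    for w in text.lower().split():
        counts[w] = counts.get(w, 0) + 1
    return counts


def top_words(counts, n=20):
    """Return the n most frequent (count, word) pairs, highest count first."""
    pairs = [(count, word) for word, count in counts.items()]
    return sorted(pairs, reverse=True)[:n]


# Hypothetical sample text standing in for the file contents.
sample = "the cat saw the other cat and the dog"
counts = count_words(sample)

# str.count() matches substrings, so it over-counts "the" here.
substring_count = sample.count("the")

# Print the three most frequent words with their counts.
for count, word in top_words(counts, 3):
    print(word, count)

# The key-function variant: sort the dict keys by their values.
keys_by_count = sorted(counts, key=counts.get, reverse=True)

# collections.Counter does the counting and the ranking in one place.
counter = Counter(sample.lower().split())
```

Note that sorting (count, word) pairs breaks ties between equal counts by
the word itself, in reverse alphabetical order when reverse=True is used.
Counter.most_common(20) would be the shortest way to get the same ranking,
with ties kept in the order the words were first seen.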