On Monday, 5 August 2019 07:21:52 UTC+8, MRAB wrote:
> On 2019-08-05 00:10, A S wrote:
> > Oh... By set did you mean by using python function set(variable) as
> > something?
> >
> > So sorry for bothering you..
> >
> Make it a set (outside the loop):
>
>     dictionary = set()
>
> and then add the words to it (inside the loop):
>
>     dictionary.add(cell_range.value)
>
> (Maybe also rename the variable to, say, "words_wanted", because calling
> it "dictionary" when it's not a dictionary (dict) could be confusing...)
>
> > On Mon, 5 Aug 2019, 6:52 am A S, <aishan0...@gmail.com> wrote:
> >
> >     Previously I had tried many methods and using set was one of them
> >     but it didn't work out either.. I even tried to append it to a
> >     list but it's not working out..
> >
> >     On Mon, 5 Aug 2019, 2:29 am MRAB, <pyt...@mrabarnett.plus.com> wrote:
> >
> >         On 2019-08-04 18:53, A S wrote:
> >         > Hi Mrab,
> >         >
> >         > Thank you so much for your detailed response, I really really
> >         > appreciate it as I have been constantly trying to seek help
> >         > regarding this issue.
> >         >
> >         > Yes, I figured that the dictionary is only capturing the last
> >         > value :( I've been trying to get it to capture and store all
> >         > the values to memory in python but it's not working..
> >         >
> >         > Are there any improvements that I could make to allow my code
> >         > to work?
> >         >
> >         > I would be truly grateful if you could provide further
> >         > insights on this..
> >         >
> >         > Thank you so much.
> >         >
> >         Make it a set and then add the words to it.
> >         >
> >         > On Mon, 5 Aug 2019, 1:45 am MRAB, <pyt...@mrabarnett.plus.com> wrote:
> >         >
> >         >     On 2019-08-04 09:29, aishan0...@gmail.com wrote:
> >         >     > I want to compare the common words from multiple .txt files
> >         >     > based on the words in multiple .xlsx files.
> >         >     >
> >         >     > Could anyone kindly help with my code? I have been stuck for
> >         >     > weeks and really need help..
> >         >     >
> >         >     > Please refer to this link:
> >         >     >
> >         >     > https://stackoverflow.com/questions/57319707/how-to-compare-words-from-txt-file-against-words-in-xlsx-file-via-python-i-wi
> >         >     >
> >         >     > Any help is greatly appreciated really!!
> >         >     >
> >         >     First of all, in this line:
> >         >
> >         >         folder_path1 = os.chdir("C:/Users/xxx/Documents/xxxx/Test python dict")
> >         >
> >         >     it changes the current working directory (not a problem), but
> >         >     'chdir' returns None, so from that point 'folder_path1' has the
> >         >     value None.
> >         >
> >         >     Then in this line:
> >         >
> >         >         for file in os.listdir(folder_path1):
> >         >
> >         >     it's actually doing:
> >         >
> >         >         for file in os.listdir(None):
> >         >
> >         >     which happens to work because passing it None means to return
> >         >     the names in the current directory.
> >         >
> >         >     Now to your problem.
> >         >
> >         >     This line:
> >         >
> >         >         dictionary = cell_range.value
> >         >
> >         >     sets 'dictionary' to the value in the spreadsheet cell, and
> >         >     you're doing it each time around the loop. At the end of the
> >         >     loop, 'dictionary' will be set to the _last_ such value. You're
> >         >     not collecting the value, but merely remembering the last value.
> >         >
> >         >     Looking further on, there's this line:
> >         >
> >         >         if txtwords in dictionary:
> >         >
> >         >     Remember, 'dictionary' is the last value (a string), so that'll
> >         >     be True only if 'txtwords' is a substring of the string in
> >         >     'dictionary'.
> >         >
> >         >     That's why you're seeing only one match.
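
The points in the quoted replies can be reproduced without any spreadsheet. What follows is only a minimal sketch: `cell_values` and the sample words are made up for illustration and do not come from the original files, and `words_wanted` is simply the name suggested in the reply. It shows that `os.chdir` returns None, that plain assignment in a loop keeps only the last value, and that `in` behaves differently on a string and on a set.

```python
import os

# os.chdir() returns None; os.listdir(None) then lists the current directory.
chdir_result = os.chdir(os.getcwd())  # "change" to the directory we are already in
same_listing = sorted(os.listdir(chdir_result)) == sorted(os.listdir("."))

# Hypothetical stand-in for the A1 values read from each .xlsx file.
cell_values = ["apples", "orange", "pear"]

# Plain assignment inside the loop: each pass replaces the previous value.
dictionary = None
for value in cell_values:
    dictionary = value
# 'dictionary' is now only the last string.

# With a str on the right-hand side, 'in' is a substring test.
substring_hit = "ear" in dictionary        # part of the last word matches
earlier_word_hit = "apples" in dictionary  # earlier values were overwritten

# A set created outside the loop keeps every value added inside it.
words_wanted = set()
for value in cell_values:
    words_wanted.add(value)

# With a set on the right-hand side, 'in' is an exact-membership test.
set_hit = "apples" in words_wanted
set_partial = "ear" in words_wanted

print(chdir_result, same_listing)
print(dictionary, substring_hit, earlier_word_hit)
print(sorted(words_wanted), set_hit, set_partial)
```

The set version is the one that matches the advice in the thread: every word is kept, and a lookup succeeds only for a whole word.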
My latest reply to Mrab in case anybody needs it (and p.s. I'm so sorry for spamming you Mrab):

Mrab! Thank you so much for your constant replies! I'm able to print out the words now!! Using this code:

    import os, sys
    import xlrd
    from xlrd import open_workbook
    import openpyxl
    from openpyxl.reader.excel import load_workbook
    import xlwt
    from xlwt import Workbook

    #The filepath that I will be saving my .xls file to:
    filepath = ('C:/Users/Ai Shan/Documents/CPFB Work/LAN SAS MONTHLY.xls')

    #The .xls file:
    wb2 = xlrd.open_workbook('C:\\Users\\Ai Shan\\Documents\\CPFB Work\\LAN SAS MONTHLY.xls', on_demand= True)
    wb2 = Workbook()
    sheet2 = wb2.add_sheet("LAN SAS", cell_overwrite_ok=True)

    #The .xlsx file that contains the words I want to compare with the .txt files:
    folder_path1 = os.chdir("C:/Users/Ai Shan/Documents/CPFB Work/Test python dict")
    words= set()
    for file in os.listdir(folder_path1):
        if file.endswith(".xlsx"):
            wb = load_workbook(file, data_only=True)
            ws = wb.active
            words.add(str(ws['A1'].value))
            #cell_range = ws['A1']
            #with open('copy.txt','w+') as f:
            #    f.write(str(cell_range.value))

    # Me writing the name of each .txt file to the .xls file:
    for r, dir in enumerate(os.listdir("C:/Users/Ai Shan/Documents/CPFB Work/txt test python")):
        sheet2.write(r+1,1,dir)

    #Reading .txt file and trying to make the sentence into words instead of lines
    #so that I can compare the .txt individual words with the .xlsx file:
    path = os.chdir("C:/Users/Ai Shan/Documents/CPFB Work/txt test python")
    for name in os.listdir(path):
        if name.endswith(".txt"):
            with open(name, 'r') as texts:
                s = texts.read()
                import re
                m = re.match(r'(?:.*?\n)(?P<word>\w+?)\b', s)
                if m:
                    word = m.group('word')
                    if word in words:
                        print(word)
                        sheet2.write(r+1,2,word)
    wb2.save(filepath)

But I'm not able to write the printed values to my Excel workbook.. it's only printing "pear" again..

I want to get this outcome:

    apples
    orange
    pear

But I'm only getting the last value again, am I writing the code wrongly..?
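
One possible cause, assuming `sheet2.write(r+1,2,word)` runs inside the second loop as posted: `r` is left over from the earlier `enumerate` loop and never changes afterwards, so every match is written to the same cell and the last one wins. The sheet was created with `cell_overwrite_ok=True`, which would let those repeated writes pass silently. The sketch below is not the real script: it uses a plain dict (`cells`, hypothetical) in place of the xlwt sheet, plus made-up file names and words, so that it runs without any files.

```python
# Hypothetical stand-ins: the word extracted from each .txt file, and the wanted words.
found_in_file = {"a.txt": "apples", "b.txt": "orange", "c.txt": "pear"}
words = {"apples", "orange", "pear"}


def write(cells, row, col, value):
    """Mimic sheet.write(row, col, value): a later write to the same cell wins."""
    cells[(row, col)] = value


# As in the post: 'r' comes from an earlier loop and is not updated afterwards.
stale = {}
for r, name in enumerate(found_in_file):
    write(stale, r + 1, 1, name)      # column 1: one file name per row
for name in found_in_file:
    word = found_in_file[name]
    if word in words:
        write(stale, r + 1, 2, word)  # r keeps its final value, so always the same row

# Possible fix: take the row index from the loop that does the matching.
fixed = {}
for r, name in enumerate(found_in_file):
    write(fixed, r + 1, 1, name)
    word = found_in_file[name]
    if word in words:
        write(fixed, r + 1, 2, word)

stale_words = [v for (row, col), v in sorted(stale.items()) if col == 2]
fixed_words = [v for (row, col), v in sorted(fixed.items()) if col == 2]
print(stale_words)
print(fixed_words)
```

In the real script the equivalent change would be to use `enumerate` on the loop over the .txt files, so that the row for each word follows the file it came from.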
-- https://mail.python.org/mailman/listinfo/python-list