On 20/06/18 20:32, Daniel Bosah wrote:

> reg = pattern.findall(str(soup))
>
> for i in reg:
>     if i in reg and paul: # this loop checks to see if elements are in
> both the regexed parsed list and the list.

No it doesn't. It checks if i is in reg and if paul is non-empty, which
it always is. So this if test is really just testing if i is in reg.
This is also always true, since the for loop is iterating over reg.

So you are effectively saying

if True and True

or

if True

What you really wanted was something like

if i in reg and i in paul:

But since you know i is in reg you can drop that bit to get

if i in paul:

>         sets.append(str(i))

Because the if is always true, you always add i to sets.

>         with open('sets.txt', 'w') as f:
>             f.write(str(sets))
>             f.close()

Why not just wait until the end? Writing the entire sets structure to a
file each time is very wasteful. Alternatively, use append mode and
just write the new item to the file. Also, you don't need f.close() if
you use a with statement.

> However, every time I run the current code, I get all the
> textfile(sets.txt) from the previous ( regex ) function, even though all I
> want are words and phrases shared between the textfile from regex and the
> monum list from regexparse. How can I fix this?

I think that's due to the incorrect if expression above.
But I didn't check the rest of the code...

However, I do wonder about your use of soup as your search string.
Isn't soup the parsed html structure? Is that really what you
want to search with your regex? But I'm no BS expert, so there
might be some magic at work there.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
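
PS. To illustrate the point about the if test, here is a minimal sketch.
The sample lists are made up for the demonstration; only the names reg,
paul and sets.txt come from the original post, and save_shared is a
hypothetical helper that writes one item per line rather than str(sets).

```python
# Made-up sample data; in the original post reg comes from
# pattern.findall(...) and paul is a list built elsewhere.
reg = ["alpha", "beta", "gamma"]
paul = ["beta", "delta"]

# Original test: Python parses it as (i in reg) and (paul).
# A non-empty list is truthy, so every item of reg passes.
always = [i for i in reg if i in reg and paul]

# Intended test: keep only the items that are also in paul.
shared = [i for i in reg if i in paul]

print(always)  # ['alpha', 'beta', 'gamma']
print(shared)  # ['beta']


def save_shared(items, path="sets.txt"):
    """Write the items once, after the loop has finished.

    The with statement closes the file on exit, so no f.close() is needed.
    """
    with open(path, "w") as f:
        f.write("\n".join(items))
```

Calling save_shared(shared) once at the end replaces the repeated
rewrite of the file inside the loop.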