On 9 March 2016 at 12:06, Matt Wheeler <m...@funkyhat.org> wrote:
> But we can still do better. A list is a poor choice for this kind of
> lookup, as Python has no way to find elements other than by checking
> them one after another. (given (one of the) name(s) you've given it
> sounds a bit like "dictionary" I assume it contains rather a lot of
> items)

Sorry, I've just read your original code properly and see that you're
looking up the next item in the list. That means a set is not suitable,
as it doesn't preserve order. (Your original code is, however, open to
an IndexError if the last element in your list is matched.)

If you could provide a sample of the NewTotalTag.txt file data that
would be helpful, but working with the information I've got we can
still get a comparable speedup by constructing a dict up front that
maps each word to the next one[1]:

    dict_word = dict_read.split()
    # Assuming that 'N/A' is a reasonable output if the last word in
    # your list is matched. This works around the IndexError your
    # current code is exposed to.
    dict_word.append('N/A')
    # The slice ([:-1]) means we don't try to add the last item to the
    # new a4 dict.
    a4 = {}
    for index, word in enumerate(dict_word[:-1]):
        a4[word] = dict_word[index + 1]

This creates a dict where each key maps to the corresponding next word,
which you can use later in your lookup instead of fetching by index,
i.e. a4[word] instead of a4[windex+1].

This means you're saving yet *another* scan through the entire list for
the positive matches (`a4.index(word)` has to scan it yet again).

[1] though I suspect if we get to see a sample of your data file there
may be a better way

-- 
Matt Wheeler
http://funkyh.at
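
P.S. For anyone who wants to try the next-word mapping described above without the real data, here is a minimal self-contained sketch. The sample string is a made-up stand-in for the contents of NewTotalTag.txt (whose format hasn't been posted), and the names `next_word` and `lookup` are only illustrative. It uses `setdefault` so that a repeated word keeps its first occurrence, which is what `list.index()` would have found; plain assignment would keep the last one instead.

```python
# Hypothetical stand-in for the contents of NewTotalTag.txt; the real
# file format has not been posted, so this is only an assumption.
dict_read = "cat NN sat VB mat NN"

dict_word = dict_read.split()
dict_word.append('N/A')  # placeholder "next word" for the final entry

# Map each word to the word that follows it. setdefault keeps the first
# occurrence of a repeated word, matching what list.index() would find.
next_word = {}
for index, word in enumerate(dict_word[:-1]):
    next_word.setdefault(word, dict_word[index + 1])


def lookup(word):
    """Return the word following `word`, or None if it is not in the list."""
    return next_word.get(word)


print(lookup('cat'))  # NN
print(lookup('NN'))   # sat
print(lookup('dog'))  # None
```

Building the dict is a single pass over the list; after that each lookup is a hash lookup rather than a scan, and `.get()` covers the "not found" case without a separate membership test.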