Today I Learned (the hard way): The difference between dict and list in Python
A dict lookup by key takes O(1) time on average, no matter how many elements it holds, while finding a value in a list takes O(n) time, because the list has to be scanned element by element.
Given a list of elements in a file, this piece of code:
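# Read the words once and map each word to its line position.
with open('dict.txt') as f:
    lists = [line.strip() for line in f]
dicts = dict(zip(lists, range(len(lists))))
# w is the word being looked up; this is a hash lookup, O(1) on average.
index = dicts[w]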
is about n times faster per lookup, once the dict has been built, than this piece of code:
dicts = [line.strip() for line in open('dict.txt')]
# index() scans the list from the start, so each lookup is O(n).
index = dicts.index(w)
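To see that the two approaches return the same position, here is a minimal sketch on a tiny in-memory list. The sample words and the names `words` and `position_of` are made up for illustration and stand in for the stripped lines of dict.txt. It assumes every word is unique: with duplicates, the dict keeps the last position of a word, while `list.index` returns the first.

```python
# Hypothetical sample data standing in for the stripped lines of dict.txt.
words = ["apple", "banana", "cherry", "date"]

# One pass over the list builds the word -> position mapping.
position_of = {word: i for i, word in enumerate(words)}

w = "cherry"  # the word to look up

index_from_list = words.index(w)  # scans the list from the start
index_from_dict = position_of[w]  # hashes the key and goes straight to it

print(index_from_list, index_from_dict)  # prints "2 2"
```

Building the dict costs one O(n) pass, so the switch pays off when many lookups are done against the same data.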
I learned that the hard way. My dict has 5,000,000 elements, which means each lookup in the list version can be up to 5 million times slower than it should be. I ran that version on 20 machines, each with 64 cores, and it still had not finished after ~10 hours. After changing from list to dict, it took about 900 seconds on a single machine (with 64 cores, of course).
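If you want to check the gap on your own machine, a rough sketch with the standard `timeit` module could look like this. The data is synthetic and much smaller than my 5,000,000 entries, the names are again made up for illustration, and the actual timings will vary from machine to machine.

```python
import timeit

n = 200000  # far smaller than the 5,000,000 entries above, so this runs quickly
words = ["word%d" % i for i in range(n)]  # synthetic, unique entries
position_of = {word: i for i, word in enumerate(words)}

w = words[-1]  # worst case for the list: the match is the last element

# Time 20 lookups with each structure.
list_seconds = timeit.timeit(lambda: words.index(w), number=20)
dict_seconds = timeit.timeit(lambda: position_of[w], number=20)

print("list.index: %.6f s for 20 lookups" % list_seconds)
print("dict[key]:  %.6f s for 20 lookups" % dict_seconds)
```

The list time should grow roughly in proportion to `n`, while the dict time should stay about the same.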
Next time you need to do lookups in Python, use a dict. Happy coding!