Hello, I found the pybloom module at http://www.imperialviolet.org/pybloom.html and tried to use it in my crawler :) I want to use it to store the URLs that have already been crawled. But when I insert a URL string, I always get a warning and a wrong result.
My testing code is quite simple:

    from pybloom import CountedBloom

    cb = CountedBloom(800000, 4)
    cb.insert("AAA")
    print cb.__contains__("BBB")

Warning:

    E:\EclipseWorkspace\demo\src\pybloom.py:74: DeprecationWarning: 'I' format requires 0 <= number <= 4294967295
      b = [ord(x) for x in struct.pack ('I', val)]

I get this warning every time I run the code above. The output is "1", which means "BBB" is reported as being in the set, but it was never inserted. When I test with integers instead of strings, the results seem right. I am not familiar with the arithmetic involved, and I don't know whether I wrote something wrong. Can anyone help me? Thanks!

-- 
Zhang Xiao
Junior engineer, Web development
Ethos Tech.
http://www.ethos.com.cn
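
P.S. To narrow the warning down, here is a minimal sketch of what the message itself says: the 'I' format of struct.pack only accepts values from 0 to 4294967295. I have not verified how pybloom computes the value it packs on line 74, so the sample values and the helper names pack_u32 and fits_u32 below are my own assumptions, not pybloom code. The sketch only shows that reducing a value to 32 bits before packing keeps it inside the accepted range.

```python
import struct

MASK_32 = 0xFFFFFFFF  # 4294967295, the largest value the 'I' format accepts


def fits_u32(val):
    """Return True if val can be passed to struct.pack('I', ...) unchanged."""
    return 0 <= val <= MASK_32


def pack_u32(val):
    """Pack val as an unsigned int after reducing it into the range 0..MASK_32."""
    return struct.pack('I', val & MASK_32)


# Hypothetical values: one negative, one too large, one already in range.
for val in (-5, 2 ** 32 + 7, 42):
    packed = pack_u32(val)
    print("%d -> in range: %s, packed length: %d" % (val, fits_u32(val), len(packed)))
```

If the value pybloom packs for a string is negative or larger than 32 bits, that might explain why strings trigger the warning while small integers do not, but this is only a guess on my side.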