> "Kerry, Richard" <[EMAIL PROTECTED]> (KR) wrote:
>KR> The hash is not expected to be unique, it just provides a starting point
>KR> for another search (usually linear ?).
>KR> See http://en.wikipedia.org/wiki/Hash_function
That only contains a definition of a hash function. I know what a
On 2006-07-12, Qiangning Hong <[EMAIL PROTECTED]> wrote:
> Grant Edwards wrote:
>> On 2006-07-11, Qiangning Hong <[EMAIL PROTECTED]> wrote:
>> > However, when I come to Python's builtin hash() function, I
>> > found it produces different values in my two computers! In a
>> > pentium4, hash('a') ->
On 2006-07-12, Carl Banks <[EMAIL PROTECTED]> wrote:
> Grant Edwards wrote:
>> On 2006-07-11, Qiangning Hong <[EMAIL PROTECTED]> wrote:
>>
>> > I'm writing a spider. I have millions of urls in a table (mysql) to
>> > check if a url has already been fetched. To check fast, I am
>> > considering to a
"Kerry, Richard" <[EMAIL PROTECTED]> writes:
> The hash is not expected to be unique, it just provides a starting point
> for another search (usually linear ?).
The database is good at organizing indexes and searching in them. Why
not let the database do what it's good at?
--
http://mail.pytho
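A minimal sketch of "letting the database do it", assuming SQLite (via the standard sqlite3 module) as a stand-in for the MySQL table mentioned in the thread. The table name `fetched` and the helper `mark_fetched` are hypothetical, not from the original posts. The URL itself is the primary key, so the database's own index answers "already fetched?" without any hand-made hash column.

```python
import sqlite3

# In-memory database used only for illustration; the thread's table lives in MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fetched (url TEXT PRIMARY KEY)")


def mark_fetched(connection: sqlite3.Connection, url: str) -> bool:
    """Record url; return True if it was new, False if it was already stored."""
    cursor = connection.execute(
        "INSERT OR IGNORE INTO fetched (url) VALUES (?)", (url,)
    )
    connection.commit()
    # rowcount is 1 when the row was inserted and 0 when the insert was ignored.
    return cursor.rowcount == 1


first = mark_fetched(conn, "http://example.com/a")
second = mark_fetched(conn, "http://example.com/a")
print(first, second)
```

The primary-key index keeps the lookup fast, and uniqueness is enforced on the real value rather than on a hash that may collide.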
Oostrum
Sent: 12 July 2006 10:56
To: python-list@python.org
Subject: Re: hash() yields different results for different platforms
>>>>> Grant Edwards <[EMAIL PROTECTED]> (GE) wrote:
>GE> The low 32 bits match, so perhaps you should just use that
>GE> portion of
> Grant Edwards <[EMAIL PROTECTED]> (GE) wrote:
>GE> The low 32 bits match, so perhaps you should just use that
>GE> portion of the returned hash?
If the hashes should be unique, 32 bits is far too few if you have
millions of entries.
--
Piet van Oostrum <[EMAIL PROTECTED]>
URL: http://www.
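The point about 32 bits can be checked with the usual birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). The sketch below assumes an idealized, uniformly distributed hash; the function name `collision_probability` and the entry counts are illustrative only.

```python
import math


def collision_probability(entries: int, bits: int) -> float:
    """Approximate chance that at least two of `entries` uniform hashes collide."""
    pairs = entries * (entries - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs / 2.0 ** bits)


for n in (10_000, 100_000, 1_000_000):
    print(n, collision_probability(n, 32), collision_probability(n, 128))
```

Under this approximation a collision among a million 32-bit values is practically certain, while a 128-bit digest leaves the probability negligible.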
Using Python's hash as a column in the table might not be a good idea;
you just found out why. Instead, you could index on the base URL, so
that next time you quickly fetch all URLs from the same base address
and then do a linear search for a specific one, or
even easier, impleme
Qiangning Hong wrote:
> /.../ add a "hash" column in the table, make it a unique key
at this point, you should have slapped yourself on the forehead, and gone
back to the drawing board.
--
http://mail.python.org/mailman/listinfo/python-list
[Grant Edwards]
>> ...
>> The low 32 bits match, so perhaps you should just use that
>> portion of the returned hash?
>>
>> >>> hex(12416037344)
>> '0x2E40DB1E0L'
>> >>> hex(-468864544 & 0xffffffff)
>> '0xE40DB1E0L'
>>
>> >>> hex(12416037344 & 0xffffffff)
>> '0xE40DB1E0L'
>> >>> hex
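The masking idea quoted above can be re-run in current Python, where integers no longer carry the `L` suffix and hex() prints lowercase. The two values are the ones reported in the thread; the helper name `low32` is hypothetical.

```python
MASK_32 = 0xFFFFFFFF  # keeps only the low 32 bits


def low32(value: int) -> int:
    """Return the low 32 bits of value as a non-negative integer."""
    return value & MASK_32


pentium4_hash = -468864544  # hash('a') reported on the 32-bit machine
amd64_hash = 12416037344    # hash('a') reported on the 64-bit machine

print(hex(low32(pentium4_hash)))  # 0xe40db1e0
print(hex(low32(amd64_hash)))     # 0xe40db1e0
```

This only shows that the two reported values agree in their low 32 bits; it does not make 32 bits sufficient for uniqueness.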
Grant Edwards wrote:
> On 2006-07-11, Qiangning Hong <[EMAIL PROTECTED]> wrote:
> > However, when I come to Python's builtin hash() function, I
> > found it produces different values in my two computers! In a
> > pentium4, hash('a') -> -468864544; in a amd64, hash('a') ->
> > 12416037344. Does ha
Grant Edwards wrote:
> On 2006-07-11, Qiangning Hong <[EMAIL PROTECTED]> wrote:
>
> > I'm writing a spider. I have millions of urls in a table (mysql) to
> > check if a url has already been fetched. To check fast, I am
> > considering to add a "hash" column in the table, make it a unique key,
> > a
On 2006-07-11, Qiangning Hong <[EMAIL PROTECTED]> wrote:
> I'm writing a spider. I have millions of urls in a table (mysql) to
> check if a url has already been fetched. To check fast, I am
> considering to add a "hash" column in the table, make it a unique key,
> and use the following sql stateme
"Qiangning Hong" <[EMAIL PROTECTED]> writes:
> However, when I come to Python's builtin hash() function, I found it
> produces different values in my two computers! In a pentium4,
> hash('a') -> -468864544; in a amd64, hash('a') -> 12416037344. Does
> hash function depend on machine's word length
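If a value has to be the same on every machine, one option is a cryptographic digest instead of the built-in hash(). The sketch below assumes SHA-256 from the standard hashlib module and UTF-8 encoded URLs; neither choice comes from the original posts, and `url_digest` is a hypothetical helper. Note that in current Python the built-in hash() of a string also varies between interpreter runs unless PYTHONHASHSEED is fixed.

```python
import hashlib


def url_digest(url: str) -> str:
    """Return a hex digest of url that does not depend on platform or word size."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


digest = url_digest("http://example.com/")
print(len(digest), digest)
```

The digest is 64 hex characters (256 bits), so it could be stored in a fixed-width indexed column if hashing the URL is still wanted.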