On Wed, Jan 23, 2013 at 12:57 AM, Ferrous Cranus <nikos.gr...@gmail.com> wrote:
> On Tuesday, 22 January 2013 at 3:04:41 PM UTC+2, Steven D'Aprano wrote:
> > What do you expect int("my-web-page.html") to return? Should it return 23
> > or 794 or 109432985462940911485 or 42?
> I expected a unique number to be produced from the given string, so I could have a
> (number <=> string) relation. What is int(somestring) really returning?
> I don't have IDLE to test.
Just run python without any args, and you'll get interactive mode. You
can try things out there.
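
For instance, here is a minimal sketch of what you'd see. The helper name try_int is made up for illustration; the point is that int() only parses strings that already look like integers, and raises ValueError for anything else rather than inventing a number:

```python
def try_int(text):
    """Return int(text), or None if text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        # int() does not hash or encode arbitrary strings; it just refuses them.
        return None

print(try_int("42"))                # 42
print(try_int("my-web-page.html"))  # None
```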
> This counter.py will run in a shared hosting environment, so absolute paths
> are BIG, and expected to look like this:
> /home/nikos/public_html/varsa.gr/articles/html/files/index.html
That's not big. Trust me, modern databases work just fine with unique
indexes like that. The most common way to organize the index is as a
tree (typically a B-tree), so the database only has to look through
on the order of log(N) entries to find a key.
That's like figuring out if the two numbers 142857 and 857142 are the
same; you don't need to look through 1,000,000 possibilities, you just
need to look through the six digits each number has.
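
To get a feel for the log(N) part, here is a rough sketch using binary search over a sorted Python list. This is only a stand-in for a real database index, and the page names are made up; it just shows that finding one path among a million takes about 20 comparisons, not a million:

```python
import bisect
import math

# Hypothetical data: one million made-up page paths, generated in sorted order.
paths = ["/home/nikos/public_html/page%07d.html" % n for n in range(1000000)]

def contains(sorted_paths, path):
    """Binary search: True if path is present in the sorted list."""
    i = bisect.bisect_left(sorted_paths, path)
    return i < len(sorted_paths) and sorted_paths[i] == path

print(contains(paths, "/home/nikos/public_html/page0142857.html"))  # True
print(contains(paths, "/home/nikos/public_html/missing.html"))      # False

# Upper bound on the number of halving steps a binary search needs here.
print(math.ceil(math.log2(len(paths))))  # 20
```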
> 'pin' has to be a number because if I used the column 'page' instead, just
> imagine the capacity the database would need to hold detailed information for each and
> every .html page requested by visitors!
Not that bad, actually. I've happily used keys at least that long, and
expected the database to enforce uniqueness without a performance
cost.
> So I really, really need to associate a (4-digit integer <=> HTML page's
> absolute path).
Is there any chance that you'll have more than 10,000 pages? If so, a
four-digit number is *guaranteed* to have duplicates. And if you
research the Birthday Paradox, you'll find that any sort of hashing
algorithm is likely to produce collisions a lot sooner than that.
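
If you want to see the numbers, here is a quick sketch of the birthday calculation. It assumes an idealized hash that assigns each page one of the 10,000 four-digit values uniformly at random (real hashes only approximate that), and the function name is made up:

```python
def collision_probability(pages, slots=10000):
    """Chance that at least two of `pages` random IDs out of `slots` coincide."""
    if pages > slots:
        return 1.0  # pigeonhole: more pages than IDs guarantees a duplicate
    p_unique = 1.0
    for i in range(pages):
        # The (i+1)-th page must avoid the i IDs already taken.
        p_unique *= (slots - i) / float(slots)
    return 1.0 - p_unique

for n in (50, 100, 150, 500):
    print(n, round(collision_probability(n), 3))
```

Under that assumption, a collision becomes more likely than not somewhere around 120 pages, nowhere near 10,000.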
> Maybe it can be done by creating a MySQL association between the two columns,
> but I don't know how such a thing can be done (if it can).
> So that's why I need to get a "unique" number out of a string. Please help.
Ultimately, that unique number would end up being a foreign key into a
table of URLs and IDs. So just skip that table and use the URLs
directly - much easier. In this instance, there's no value in
normalizing.
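
Something along these lines, as a rough sketch. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere; the table and column names (counters, page, hits) and the record_hit helper are assumptions, not your actual schema. In MySQL, a single INSERT ... ON DUPLICATE KEY UPDATE statement would play the role of the two statements below:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The page path itself is the primary key, so uniqueness is enforced for free.
conn.execute(
    "CREATE TABLE counters (page TEXT PRIMARY KEY, hits INTEGER NOT NULL DEFAULT 0)"
)

def record_hit(conn, page):
    """Count one visit to `page` and return its new hit total."""
    # Create the row on first sight; do nothing if it already exists.
    conn.execute("INSERT OR IGNORE INTO counters (page, hits) VALUES (?, 0)", (page,))
    conn.execute("UPDATE counters SET hits = hits + 1 WHERE page = ?", (page,))
    conn.commit()
    row = conn.execute("SELECT hits FROM counters WHERE page = ?", (page,)).fetchone()
    return row[0]

page = "/home/nikos/public_html/varsa.gr/articles/html/files/index.html"
print(record_hit(conn, page))  # 1
print(record_hit(conn, page))  # 2
```

No 'pin' column, no hashing, and no collisions to worry about.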
ChrisA