On Nov 9, 6:10 pm, "Rajat Gogri" <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I was asked this question for MSFT interview.
>
> You are given Symbols for stock ticker application,
>
> All are 4 character , Capital letters only so min is AAAA and max is ZZZZ.
> for example MSFT, GOOG etcetc
> How will you hash it???
....
> But I think space is huge here...
There are 26 choices for the first character, 26 choices for the
second character etc... 26^4 = 456976 permutations. You knew this?
Each entry in the table holds a pointer to a record or a null.
To implement a table in memory on an 8/32 bit machine (8 bit byte, 32
bit addressing) you need:
456976 * 4 = 1827904 bytes
= say 1.3/4 MB.
The table is large but not unreasonable.
For an entry say a, b, c, d the relationship between the key
AAAA..ZZZZ and table location is:
(Ascii(a)-Ascii('A'))* 26^3 + (Ascii(b)-Ascii('A'))* 26^2 + (Ascii(c)-
Ascii('A'))* 26 + (Ascii(d)-Ascii('A')); i.e. 0..(26^4 - 1)
This is working radix 26 and shifting character values left. Use the
inverse to get the key for any table location.
> Any better answers????????????
Your answer looks reasonable but maybe lacks acuracy, scope, don't you
think?
A record will be a lot bigger than 4 bytes, so perhaps we want to keep
records on disk, only load them when required and maybe unload them if
memory gets short - table also stores file offsets. Using key and
database: might as well get local bit of DBMS to manage table and
forget all this stuff. Saving table to disk: only store non-nulls
complete with table index.
Personally I would call above a vector table not a hash table. To me
hashing involves a table that is smaller than the range of key values,
hash function, collision handling. Google for Hash Function, Table
etc. is quite good.
Hashing downside is that an in-order traversal requires iterating
through key values to see what comes out or additional provision that
is going to slow down insertion, deletion.
Alternatives: 1) Use sparse array techniques to reduce vector table
size. 2) A multi-branch tree, also known as a trie. Each node has 26
way vector table. Full population gets same total table but only
create nodes and/or tables when required. 3) B-Tree - google it.
Hope these thoughts help.
- Martin
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Algorithm Geeks" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~----------~----~----~----~------~----~------~--~---