If you can tolerate errors, a simple idea is to generate a random
number in the range 0 ... 2^n and use that as the key. If the number
of lines is small relative to 2^n (more precisely, relative to
2^(n/2), by the birthday bound), then with high probability you
won't get the same key twice.
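
To make the trade-off concrete, here is a minimal sketch in Python rather
than Hadoop code. The function names are made up for illustration, and the
collision estimate uses the standard birthday approximation, which is my
assumption about how to quantify "high probability".

```python
import math
import secrets


def random_key(n_bits: int) -> int:
    """Return a uniformly random integer in [0, 2**n_bits)."""
    return secrets.randbits(n_bits)


def collision_probability(num_lines: int, n_bits: int) -> float:
    """Approximate P(at least two lines get the same key).

    Birthday approximation: 1 - exp(-k(k-1) / (2N)),
    with k = num_lines and N = 2**n_bits possible keys.
    """
    space = 2.0 ** n_bits
    exponent = num_lines * (num_lines - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for bits in (32, 64, 128):
        p = collision_probability(10**9, bits)
        print(f"{bits}-bit keys, 1e9 lines: collision probability ~ {p:.3g}")
```

With a billion lines, 64-bit keys still leave a collision chance of
roughly 2.7% under this approximation, so whether that counts as
tolerable depends on the job; 128-bit keys make it negligible.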

Miles

2009/5/4 Rares Vernica <[email protected]>:
> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to have keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique id for each file? If yes, maybe I
> can concatenate them if I can access the file id from the map function.
>
> Thanks,
> Rares
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
