On 10/11/2013 03:04 PM, Adrien Grand wrote:
> On Fri, Oct 11, 2013 at 7:03 PM, Michael Sokolov
> <msoko...@safaribooksonline.com> wrote:
>> I've been running some tests comparing storing large fields (documents,
>> say 100 KB to 10 MB) as files vs. storing them in Lucene as stored
>> fields. Initial results seem to indicate that storing them externally is
>> a win (at least for binary docs, which don't compress, and presumably we
>> can also compress the external files if we want), which seems to make
>> sense. There will be some issues with huge directories, but those might
>> be worth solving.
>> So I'm wondering if there is a codec that does that? I haven't seen one
>> talked about anywhere.
> I don't know of any codec that works this way, but such a codec
> would quickly exceed the number of available file descriptors.
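Just to make sure we're talking about the same thing: the external-files
variant I have in mind looks roughly like the sketch below, i.e. write the
large value to its own file and index only a small pointer to it. This is
plain application code, not a codec, and the directory layout and field
names ("blobDir", "blobPath") are invented for illustration:

import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;

public class ExternalBlobIndexer {

    private final IndexWriter writer; // ordinary Lucene IndexWriter
    private final Path blobDir;       // invented: directory holding the large values

    public ExternalBlobIndexer(IndexWriter writer, Path blobDir) {
        this.writer = writer;
        this.blobDir = blobDir;
    }

    // Write the large value to its own file; the index stores only a small pointer.
    public void add(String id, byte[] largeValue) throws Exception {
        Path blob = blobDir.resolve(id + ".bin");
        Files.write(blob, largeValue); // the descriptor is closed before write() returns

        Document doc = new Document();
        doc.add(new StringField("id", id, Field.Store.YES));
        doc.add(new StringField("blobPath", blob.toString(), Field.Store.YES));
        writer.addDocument(doc);
    }
}

The codec I was asking about would do something equivalent under the hood,
so the application could keep treating the value as an ordinary stored
field.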
I'm not sure I understand the file descriptor concern, though. I was
thinking that the stored fields would be accessed infrequently (only when
writing or reading a particular stored field value), and that a file
descriptor would only be in use during that read/write operation; it
wouldn't be held open afterwards. During query scoring, for example, one
wouldn't need to visit these fields at all, I think? But I may have a
fundamental misunderstanding of how Lucene uses its codecs: this is new
to me.
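To put it in code, the read path I'm imagining opens the file, reads the
value, and closes it again before returning, so no descriptor outlives the
call. Again this is only a sketch of the access pattern, not of the codec
API, and the names are made up:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExternalBlobReader {

    private final Path blobDir; // same invented layout as on the writing side

    public ExternalBlobReader(Path blobDir) {
        this.blobDir = blobDir;
    }

    // Fetch the externally stored value for one document. The file is opened
    // and closed entirely within this call, so no descriptor is held between
    // accesses.
    public byte[] read(String id) throws IOException {
        Path blob = blobDir.resolve(id + ".bin");
        return Files.readAllBytes(blob); // opens, reads, and closes the file
    }
}

So at any moment there would only be as many descriptors open as there are
values actively being read, as far as I can tell.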
-Mike