[
https://issues.apache.org/jira/browse/LUCENE-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117637#comment-14117637
]
Erick Erickson commented on LUCENE-5914:
----------------------------------------
These were some users' list discussions; I can't lay my hands on them right now.
Whether there were work-arounds or not, I've always been uncomfortable with not
having the option to turn off compression. There are just too many places where
people do "non standard" things like store not-very-many huge documents to take
that option away from them unless we can _guarantee_ that in all configurations
trading I/O for CPU is A Good Thing. And I don't believe we can make that
guarantee.
That said, I'm not the one doing the coding so I don't know the ins and outs
here. If it's easy to turn compression off then I think it's worth doing. If
it's major surgery, OTOH, I don't have any hot complaints to point to, so
insisting that you do the work because of a vaguely remembered user-list
discussion (for which I can't prove there was no work-around) is just not in
the cards ;). The response has been "write your own codec", so maybe the right
option is to provide a non-compressing codec? Here's where you tell me there
already is one ;)
Andrzej points out an interesting case though.
> More options for stored fields compression
> ------------------------------------------
>
> Key: LUCENE-5914
> URL: https://issues.apache.org/jira/browse/LUCENE-5914
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Fix For: 4.11
>
> Attachments: LUCENE-5914.patch
>
>
> Since we added codec-level compression in Lucene 4.1, I think I have had about
> as many users complaining that compression was too aggressive as users
> complaining that it was too light.
> I think this is because we have users who are doing very different things
> with Lucene. For example, if you have a small index that fits in the
> filesystem cache (or nearly fits), then you might never pay for actual disk
> seeks, and in that case the fact that the current stored fields format needs
> to over-decompress data can noticeably slow search down on cheap queries.
> On the other hand, it is more and more common to use Lucene for things like
> log analytics, and in that case you have huge amounts of data for which you
> don't care much about stored fields performance. However, it is very
> frustrating to notice that the data you store takes several times less space
> when you gzip it than it does in your index, even though Lucene claims to
> compress stored fields.
> For that reason, I think it would be nice to have some kind of option that
> would allow trading speed for compression in the default codec.
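
To make the speed-versus-size trade-off discussed above concrete, here is a minimal sketch using only the JDK's java.util.zip classes. It is not Lucene's stored fields format or any Lucene API; the class name CompressionTradeoff, its methods, and the sample log data are all hypothetical and exist only for illustration. It compresses the same log-like payload with no compression, a fast setting, and a high-compression setting, so the resulting sizes can be compared.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration class; not part of Lucene.
class CompressionTradeoff {

    // Compress the input with the given Deflater level
    // (Deflater.NO_COMPRESSION, BEST_SPEED, BEST_COMPRESSION, ...).
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Restore the original bytes; fails loudly on truncated or corrupt input.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalStateException("truncated input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt input", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Made-up, highly repetitive log-like payload (an assumption for the demo).
    static byte[] sampleLogData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("2014-09-01 12:00:00 INFO request id=").append(i)
              .append(" status=200 path=/search\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleLogData();
        System.out.println("raw bytes:        " + data.length);
        System.out.println("no compression:   " + compress(data, Deflater.NO_COMPRESSION).length);
        System.out.println("best speed:       " + compress(data, Deflater.BEST_SPEED).length);
        System.out.println("best compression: " + compress(data, Deflater.BEST_COMPRESSION).length);
    }
}
```

Running main prints the raw size and the three compressed sizes. Timing the compress and decompress calls on representative data would be the natural next step for deciding which setting suits a given workload, since the sizes alone do not show the CPU cost.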