I think you can get the most bang for your buck by using the high-level controls to disable the parts of the index you don't need; all codecs respect these.
E.g., index your fields with omitNorms=true, so that no boost or length normalization data is stored in the index or loaded at search time.

Index with IndexOptions.DOCS_ONLY, so that neither positions nor term frequencies are stored in the postings lists. This means you cannot run positional queries, and scoring will not reflect how many times a term occurs in each doc.

Set the CONSTANT_SCORE_AUTO_REWRITE_DEFAULT rewrite method on your MultiTermQuery instances (this is already the default for most of them, except FuzzyQuery): this avoids all scoring for those queries at search time.

Don't turn on stored fields or term vectors.

Then, if these steps are insufficient, consider making a custom codec that specializes how things are encoded.

Mike McCandless

http://blog.mikemccandless.com

On Tue, Mar 26, 2013 at 3:00 PM, Vitaly Funstein <vfunst...@gmail.com> wrote:
> This is probably a pretty general inquiry, but I'm just exploring this as
> an option at the moment.
>
> It seems that Lucene 4 adds some freedom to define how data is actually
> written to underlying storage by exposing the codec API. However, I find
> the learning curve for understanding what bits to change quite steep, i.e.
> one really needs to get into the guts of storage formats and how data in
> these formats is actually consumed by search queries.
>
> Is there some type of tutorial, possibly with code samples, that would
> guide me through what needs to be done for specific use cases? Basically,
> what I am looking for is the ability to "turn off" certain features of the
> engine, creating a "lite" version of Lucene's codec that would both cut
> down on the amount of data to persist while indexing, and on query
> execution time. To be a bit more specific, the queries in my case do not go
> beyond NumericRangeQuery, WildCardQuery and TermQuery types, so things like
> similarities, boosts and scoring are not used. So obviously I want to
> preserve the existing functionality while removing support for features I'm
> not using (yet).
>
> Thanks.
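
To make the docs-only trade-off described in the reply a bit more concrete, here is a toy sketch in plain Java. It is not Lucene code and says nothing about Lucene's actual postings format; the class ToyPostings and both method names are made up for illustration, and whitespace tokenization is an assumption. It only shows what information is kept when postings record doc IDs alone versus doc IDs plus positions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy model of postings lists; NOT Lucene's implementation.
public class ToyPostings {

    // Docs-only postings: each term maps to the set of doc IDs containing it.
    // Term frequency and positions are discarded.
    public static Map<String, TreeSet<Integer>> indexDocsOnly(List<String> docs) {
        Map<String, TreeSet<Integer>> postings = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String term : docs.get(docId).split("\\s+")) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return postings;
    }

    // Full postings: each term maps to doc ID -> list of token positions.
    // The length of each position list is the term frequency in that doc.
    public static Map<String, TreeMap<Integer, List<Integer>>> indexWithPositions(
            List<String> docs) {
        Map<String, TreeMap<Integer, List<Integer>>> postings = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            String[] terms = docs.get(docId).split("\\s+");
            for (int pos = 0; pos < terms.length; pos++) {
                postings.computeIfAbsent(terms[pos], t -> new TreeMap<>())
                        .computeIfAbsent(docId, d -> new ArrayList<>())
                        .add(pos);
            }
        }
        return postings;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("red red blue", "blue green");

        // Enough to answer "which docs contain this term?"
        System.out.println(indexDocsOnly(docs));
        // {blue=[0, 1], green=[1], red=[0]}

        // Also answers "how often?" and "where?", at the cost of more data.
        System.out.println(indexWithPositions(docs));
        // {blue={0=[2], 1=[0]}, green={1=[1]}, red={0=[0, 1]}}
    }
}
```

In the docs-only map, "red" occurring twice in doc 0 is indistinguishable from it occurring once, which is why frequency-based scoring and positional queries are not possible with that representation.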