That makes no sense at all; it would make it slow as shit. I am tired of repeating this:

Don't use BINARY docvalues
Don't use BINARY docvalues
Don't use BINARY docvalues

Use types like SORTED/SORTED_SET, which will compress the term dictionary, and make use of ordinals in your application instead.

On Sat, Aug 8, 2015 at 10:19 AM, Olivier Binda <olivier.bi...@wanadoo.fr> wrote:
> Greetings
>
> are there any plans to implement compression of the variable length bytes[]
> binary doc values,
> say in blocks of 16k like for stored values?
>
> my .cfs file goes from 2MB to like 400k when I zip it
>
> Best regards,
> Olivier
>
> On 08/08/2015 02:32 PM, jamie wrote:
>>
>> Greetings
>>
>> Our app primarily uses Lucene for its intended purpose, i.e. to search
>> across large amounts of unstructured text. However, recently our requirement
>> expanded to perform look-ups on specific documents in the index based on
>> associated custom defined unique keys. For our purposes, a unique key is the
>> string representation of a 128 bit murmur hash, stored in a Lucene field
>> named uid. We are currently using the TermsFilter to look up Documents in
>> the Lucene index as follows:
>>
>> List<Term> terms = new LinkedList<>();
>> for (String id : ids) {
>>     terms.add(new Term("uid", id));
>> }
>> TermsFilter idFilter = new TermsFilter(terms);
>> ... search logic...
>>
>> At any time we may need to look up say a couple of thousand documents. Our
>> problem is one of performance. On very large indexes with 30 million records
>> or more, the lookup can be excruciatingly slow. At this stage, it's not
>> practical for us to move the data over to a fit-for-purpose database, nor
>> change the uid field to a numeric type. I fully appreciate the fact that
>> Lucene is not designed to be a database, however, is there anything we can
>> do to improve the performance of these look-ups?
>>
>> Much appreciated
>>
>> Jamie

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
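
As a rough illustration of the SORTED-plus-ordinals advice at the top of this thread, here is a toy Java model. It is not Lucene code: the class SortedColumnSketch and all of its methods are hypothetical names made up for this sketch. It stores each distinct value once in a sorted dictionary and keeps only a small integer ordinal per document, so per-document comparisons work on ints instead of byte arrays. The term dictionary compression mentioned in the reply is not modelled here, and Java's String ordering stands in for whatever byte ordering a real index would use.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Toy model of a SORTED-style column (hypothetical, not Lucene's implementation). */
final class SortedColumnSketch {
    final String[] dictionary; // each distinct value stored once, in sorted order
    final int[] ordByDoc;      // ordByDoc[docId] = index of that doc's value in dictionary

    private SortedColumnSketch(String[] dictionary, int[] ordByDoc) {
        this.dictionary = dictionary;
        this.ordByDoc = ordByDoc;
    }

    /** Builds the column from one value per document (array index = doc id). */
    static SortedColumnSketch build(String[] valueByDoc) {
        // TreeSet removes duplicates and sorts by natural String order.
        String[] dict = new TreeSet<>(Arrays.asList(valueByDoc)).toArray(new String[0]);
        int[] ords = new int[valueByDoc.length];
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            // Every value is present in dict, so binarySearch returns its index.
            ords[doc] = Arrays.binarySearch(dict, valueByDoc[doc]);
        }
        return new SortedColumnSketch(dict, ords);
    }

    /** Returns the ordinal of the value, or -1 if no document has it. */
    int lookupOrd(String value) {
        int ord = Arrays.binarySearch(dictionary, value);
        return ord >= 0 ? ord : -1;
    }

    /** Counts matching documents by comparing ordinals (ints), not strings. */
    int countDocsWithValue(String value) {
        int target = lookupOrd(value); // one dictionary lookup up front
        if (target < 0) {
            return 0;
        }
        int count = 0;
        for (int ord : ordByDoc) {
            if (ord == target) {
                count++;
            }
        }
        return count;
    }

    /** Resolves a document's value back from its ordinal. */
    String valueOf(int docId) {
        return dictionary[ordByDoc[docId]];
    }

    public static void main(String[] args) {
        SortedColumnSketch col =
            build(new String[] {"red", "blue", "red", "green", "blue", "red"});
        System.out.println(Arrays.toString(col.dictionary)); // [blue, green, red]
        System.out.println(Arrays.toString(col.ordByDoc));   // [2, 0, 2, 1, 0, 2]
        System.out.println(col.countDocsWithValue("red"));   // 3
    }
}
```

The point of the layout is that the application resolves a value to its ordinal once and then works with ints per document. With a unique key per document, as in the uid case above, the dictionary has as many entries as there are documents, so any space saving would have to come from compressing the dictionary itself rather than from de-duplication.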