If these are multi-valued fields then you're screwed. You'd need a non-deterministic automaton or a different regexp implementation (like in re2) to efficiently apply such expressions. If you have access to the code and can add your own Query class then this would be possible. If not, I don't know, sorry.
Dawid

On Fri, Mar 29, 2013 at 3:37 AM, ko-mizutani <[email protected]> wrote:

> Hi Dawid,
>
> Thanks for your suggestions.
>
> >How about if you index the position of the bit and the value separately? So:
> >
> >bit_0: x
> >bit_1: x
> >bit_2: x
> >
> >then you can query for a specific combination of bits using a boolean
> >query (bit_0: 0 and bit_1: 1). Just a thought.
>
> I thought same thing. Since my application is searching multivalued field,
> this does not resolve my issue.
>
> For example, assuming 2 docs(docA and docB) are indexed. Each doc has
> multiple bit stream as shown below.
>
> docA
> bit : 010...; 101...;
>
> docB
> bit : 000...; 111...;
>
> In Lucene v3.5, my old query (bit:01?...*) matches only docA. If we index
> the position of each bit like this:
>
> docA
> bit_0: 0; 1;
> bit_1: 1; 0;
> bit_2: 0; 1;
> ...
>
> docB
> bit_0: 0; 1;
> bit_1: 0; 1;
> bit_2: 0; 1;
> ...
>
> docA and docB can be returned by query (bit_0: 0 and bit_1: 1).
>
> Unfortunately, the data format in my application has been fixed and it is
> difficult to change for now. I wish I could find the way to identify the
> specific bit stream.
>
> Thanks,
> Kou Mizutani
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/OutOfMemoryError-occured-by-WildcardQuery-tp4051924p4052278.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
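
To make the docA/docB example from the quoted message concrete, here is a rough sketch in plain Python (no Lucene involved, so it only imitates the matching semantics). It assumes the bit streams are shortened to three bits and the wildcard is shortened to `01?*`, since the original values and query are elided with "...". The helper names `per_position_index`, `boolean_match` and `wildcard_match` are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical data mirroring the quoted example, shortened to 3 bits per value.
# Each document has a multi-valued "bit" field.
docs = {
    "docA": ["010", "101"],
    "docB": ["000", "111"],
}


def per_position_index(values):
    """Map each bit position to the set of bits seen there across ALL values."""
    index = {}
    for value in values:
        for pos, bit in enumerate(value):
            index.setdefault(f"bit_{pos}", set()).add(bit)
    return index


def boolean_match(values, required):
    """Imitate (bit_0:0 AND bit_1:1): every clause is checked independently,
    so the bits may come from different values of the same document."""
    index = per_position_index(values)
    return all(bit in index.get(field, set()) for field, bit in required.items())


def wildcard_match(values, pattern):
    """Imitate bit:01?* : the whole pattern must match one single value."""
    return any(fnmatchcase(value, pattern) for value in values)


boolean_hits = sorted(
    name for name, values in docs.items()
    if boolean_match(values, {"bit_0": "0", "bit_1": "1"})
)
wildcard_hits = sorted(
    name for name, values in docs.items()
    if wildcard_match(values, "01?*")
)

print(boolean_hits)   # both documents: the per-position fields lose the pairing
print(wildcard_hits)  # only docA: "010" matches the pattern as a whole
```

The per-position fields forget which bits belonged to the same stream, which is why the boolean query over-matches on multi-valued data while a pattern applied to each whole value does not.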
