Re: Query Parser, Unary Operators and Multi-Field Query

2011-05-21 Thread Renaud Delbru
" in the query under one of the following conditions: preceded either by "(" or " ", or at the beginning of the string, e.g. using a regex like /(?:^|[\s(])[+-]/, and if you find a match, use default OR operator, and
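The check suggested above can be sketched in plain Java using only java.util.regex. The class and method names (UnaryOperatorCheck, hasUnaryOperator) are hypothetical, and the sketch only detects the operators; switching the query parser's default operator when a match is found is left to the caller.

```java
import java.util.regex.Pattern;

public class UnaryOperatorCheck {
    // A '+' or '-' is treated as a unary operator only at the start of the
    // query string or directly after whitespace or an opening parenthesis.
    private static final Pattern UNARY = Pattern.compile("(?:^|[\\s(])[+-]");

    /** Returns true if the raw query string contains a unary +/- operator. */
    public static boolean hasUnaryOperator(String query) {
        return UNARY.matcher(query).find();
    }

    public static void main(String[] args) {
        String[] samples = { "+lucene -solr", "(-draft) report", "state-of-the-art" };
        for (String q : samples) {
            System.out.println(q + " -> " + hasUnaryOperator(q));
        }
    }
}
```

A hyphen inside a word, as in the third sample, is not preceded by whitespace or "(" and is therefore not reported as an operator.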

Re: Query Parser, Unary Operators and Multi-Field Query

2011-05-20 Thread Renaud Delbru
to extend myself the queryparser contrib ? [1] http://lucidworks.lucidimagination.com/display/LWEUG/Boolean+Operators Thanks -- Renaud Delbru On 20/05/11 13:21, Steven A Rowe wrote: Hi Renaud, That's normal behavior, since you have AND as default operator. This is equivalent to placing a "+"

Query Parser, Unary Operators and Multi-Field Query

2011-05-20 Thread Renaud Delbru
normal behaviour ? A Bug ? Am I doing something wrong ? Thanks in advance for your help, -- Renaud Delbru - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h

Re: Flex API - Debugging Segment Merge

2010-03-26 Thread Renaud Delbru
rare cases. I'll start the query benchmark this weekend. Let's hope I'll have something to share during the next week. Cheers -- Renaud Delbru

Re: Flex API - Debugging Segment Merge

2010-03-25 Thread Renaud Delbru
ty segment, and the faulty term (or even better, the index of the faulty block), I will be able to display the content of the blocks, and see if there are some problems in the PFor encoding. Cheers, -- Renaud Delbru

Flex API - Debugging Segment Merge

2010-03-25 Thread Renaud Delbru
easy way to get this information, so I will be able to check these segments and their encoded blocks in order to find and understand the problem ? Thanks in advance, -- Renaud Delbru

Re: Question on number of fields in a document

2010-03-12 Thread Renaud Delbru
aybe SIREn will be more suitable. [1] http://siren.sindice.com/ -- Renaud Delbru On 12/03/10 13:43, Erick Erickson wrote: There's no requirement that all documents have the same fields, Lucene is fine with different docs having different fields. There's no limit on the number of diff

Flex & Segment Merging

2010-02-16 Thread Renaud Delbru
Codec interface ? How is it working currently ? Are there some restrictions on how segments can be merged ? Is there a way to easily extend the mechanism by which segments are merged ? Cheers, -- Renaud Delbru

Re: Flex & Docs/AndPositionsEnum

2010-02-10 Thread Renaud Delbru
way I am testing the postings (using termPositionsEnum on the top-level reader) was not really the proper way to test it, and that the correct way would instead be to use a TermQuery directly. Thanks for the clarification. -- Renaud Delbru

Re: Flex & Docs/AndPositionsEnum

2010-02-10 Thread Renaud Delbru
using the DocsEnum interface, and therefore do not know if it manipulates a segment-level enum or a Multi*Enum. Which searches (or query operators) in Lucene use segment-level enums ? Cheers -- Renaud Delbru

Re: Flex & Docs/AndPositionsEnum

2010-02-10 Thread Renaud Delbru
But what you were suggesting is to create my own "MultiReader" that is optimised for my codec. Is that right ? A MultiReader that just iterates over the subreaders, checks if they are using my codec (and therefore associated fields), and uses them to iterate

Re: Flex & Docs/AndPositionsEnum

2010-02-09 Thread Renaud Delbru
On 09/02/10 16:04, Michael McCandless wrote: On Tue, Feb 9, 2010 at 9:08 AM, Renaud Delbru wrote: So, does it mean that the codec interface is likely to change ? Do I need to be prepared to change again all my code ;o) ? This particular patch doesn't change the Codecs API

Re: Flex & Docs/AndPositionsEnum

2010-02-09 Thread Renaud Delbru
the information that has been stored in the new index data structure is correctly retrieved. In that case, I got the previous errors (a MultiDocsAndPositionsEnum is returned). However, when I am indexing only one or two documents, the original DocsAndPositionsEnum is returne

Re: Flex & Docs/AndPositionsEnum

2010-02-09 Thread Renaud Delbru
all. Ok, it works like a charm except the problem related to MultiReaders. Thanks -- Renaud Delbru

Flex & Docs/AndPositionsEnum

2010-02-09 Thread Renaud Delbru
her way to extend StandardCodec without having to deal with these classes ? Cheers -- Renaud Delbru

Re: IndexingChain and TermHash

2010-01-07 Thread Renaud Delbru
Hi Michael, I have started to look at the PFOR codec. However, when I include the codec files inside the flex_1458 branch, it misses the org.apache.lucene.util.pfor.PFor class which is the core of the codec. Where can I find this class ? Thanks, Regards -- Renaud Delbru On 16/11/09 14:01

Re: Question about many fields within a single index

2009-12-30 Thread Renaud Delbru
where (correct me if I am wrong) that this new version includes some optimisations for dictionary lookups, which should minimize the overhead. -- Renaud Delbru On 30/12/09 16:18, Jason Tesser wrote: I have a situation where I might have 1000 different types of Lucene Documents each with 10 or so f

Re: IndexingChain and TermHash

2009-12-11 Thread Renaud Delbru
a medium term period. I will continue to follow the progress of 1458, test it, and continue to report my feedback and experiences with it to you. Thanks, Best Regards [1] http://siren.sindice.com -- Renaud Delbru On 16/11/09 13:01, Michael McCandless wrote: Yes, the branch is he

Re: IndexingChain and TermHash

2009-11-16 Thread Renaud Delbru
e the experience! I will. -- Renaud Delbru

Re: IndexingChain and TermHash

2009-11-16 Thread Renaud Delbru
, in order to be able to plug in my own chain, but I have the impression that you've done something similar already (with the codec abstraction). It would be a pity to lose my time doing something less convenient than your approach. Thanks. -- Renaud Delbru On 14/11/09 13:22, Michael McCan

Re: querying multi-value fields

2009-10-15 Thread Renaud Delbru
Hi, there is also the SIREn plugin [1] that allows you to index multi-valued fields, with values of variable length, and to query them individually. [1] http://siren.sindice.com -- Renaud Delbru On 12/10/09 21:31, Angel, Eric wrote: I need to analyze these values since I also want the benefits

Re: Querying across object relationships

2009-07-30 Thread Renaud Delbru
If you need some help, feel free to ask your questions in our mailing list. [1] http://siren.sindice.com [2] https://dev.deri.ie/confluence/display/SIREn/Indexing+and+Searching+Tabular+Data Best Regards, -- Renaud Delbru Donal Murtagh wrote: Hi, I'm trying to use Lucene to qu

[ANN] SIREn 0.1 Release

2009-07-23 Thread Renaud Delbru
nputs to make this project happen ... but also to the Data Intensive Infrastructure Group and DERI. [1] http://di2.deri.ie/ -- Renaud Delbru

Posting List Encoding: Group Varint Encoding

2009-05-05 Thread Renaud Delbru
rowse/LUCENE-1410 [2] http://videolectures.net/wsdm09_dean_cblirs/ -- Renaud Delbru

Re: Modification of positional information encoding

2008-10-14 Thread Renaud Delbru
define how to serialise positions and payloads I think other parts of the FreqProxTermsWriter can stay generic. What do you think ? Regards. -- Renaud Delbru

Re: Sorting posting lists before intersection

2008-10-13 Thread Renaud Delbru
so bad predictor in general. Regards. -- Renaud Delbru

Re: Sorting posting lists before intersection

2008-10-13 Thread Renaud Delbru
Andrzej Bialecki wrote: Renaud Delbru wrote: Hi Andrzej, sorry for the late reply. I have looked at the code. As far as I understand, you sort the posting lists based on the first doc skip. The first posting list will be the one that has the first biggest document skip. Does the sparseness of

Re: Modification of positional information encoding

2008-10-13 Thread Renaud Delbru
ou could then create your own indexing chain for indexing? If you take that approach, please report back so we can learn how to improve Lucene for these very advanced customizations! Ok, thanks for the reference. I will try this solution, and will report to you any proble

Re: Sorting posting lists before intersection

2008-10-13 Thread Renaud Delbru
ConjunctiveScorer). This will require a call to IndexReader.docFreq(term) for each of the term queries. Does a docFreq call mean another IO access ? Thanks for the clarification, Regards. -- Renaud Delbru Andrzej Bialecki wrote: Renaud Delbru wrote: > Hi all, > > I am wondering if Lucene implements

Modification of positional information encoding

2008-10-13 Thread Renaud Delbru
modifications ? Make a branch of lucene, and add my new classes to the lucene package org.apache.lucene.index ? Or is a more elegant solution possible ? Thanks in advance, Regards. -- Renaud Delbru

Re: triplet store

2008-09-29 Thread Renaud Delbru
Yes, I know two research projects that have implemented a triple store on top of Lucene: - Semplore [1] - Sindice [2] [1] http://apex.sjtu.edu.cn/apex_wiki/Demos/Semplore [2] http://www.sindice.com -- Renaud Delbru Cam Bazz wrote: Has anyone tried to implement a triplet store with lucene? Best

Sorting posting lists before intersection

2008-09-17 Thread Renaud Delbru
Hi all, I am wondering if Lucene implements the query optimisation that consists of ordering the posting lists based on the term frequency before intersection ? If yes, could somebody point me to the java class / method that implements such strategy ? Thanks in advance, Regards. -- Renaud
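The optimisation asked about here can be illustrated with a small standalone Java sketch. It is not Lucene code: the class PostingIntersection and its methods are hypothetical, posting lists are modelled as sorted int arrays of document ids, and the length of each list stands in for the term's document frequency. The lists are ordered from shortest to longest so that the running intersection stays as small as possible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PostingIntersection {
    /** Intersects sorted doc-id lists, processing the shortest list first. */
    public static int[] intersect(List<int[]> postings) {
        if (postings.isEmpty()) {
            return new int[0];
        }
        // Order by list length (a stand-in for document frequency), ascending.
        List<int[]> ordered = new ArrayList<>(postings);
        ordered.sort(Comparator.comparingInt((int[] p) -> p.length));
        int[] result = ordered.get(0);
        // Stop early once the running intersection is empty.
        for (int i = 1; i < ordered.size() && result.length > 0; i++) {
            result = intersectTwo(result, ordered.get(i));
        }
        return result;
    }

    // Linear merge of two sorted arrays, keeping only the common doc ids.
    private static int[] intersectTwo(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        List<int[]> postings = Arrays.asList(
            new int[] {1, 2, 3, 5, 8, 13},
            new int[] {2, 8},
            new int[] {2, 3, 8, 9});
        System.out.println(Arrays.toString(intersect(postings)));
    }
}
```

Starting from the rarest term bounds the size of every intermediate result by the shortest list, which is the usual motivation for this ordering.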

Update of stored and non-indexed binary fields

2008-04-08 Thread Renaud Delbru
the field data (.fdt) file. Then, could it be possible to overwrite the old float value by a new float value ? Thanks, -- Renaud Delbru

Re: Incorrect Token Offset when using multiple fieldable instance

2008-03-05 Thread Renaud Delbru
r can know (because tokenStream doesn't return them). It's as if we need the ability to query a tokenStream for its "final" offset or something. One workaround might be to insert an "end marker" token, with the true end offset, which is a term you would never sear

Incorrect Token Offset when using multiple fieldable instance

2008-03-04 Thread Renaud Delbru
instances will have their offset shifted back. Is it a bug ? Or is it a desired behavior (in this case, why ?) ? Regards. -- Renaud Delbru, E.C.S., Ph.D. Student, Semantic Information Systems and Language Engineering Group (SmILE), Digital Enterprise Research Institute, National University of