nation.com/search/document/deda4dd3f9041bee/the_order_of_fields_in_document_fields#bb26d84091aebcaa
>
> -Grant
>
>
> On Mar 31, 2009, at 8:44 AM, Grant Ingersoll wrote:
>
> I'm going to bring this over to java-dev.
>>
>> -Grant
>>
>> On Mar 30, 2009, at 11:34 AM, Raymond Balmès
Lucene 2.4.0
On Mon, Mar 30, 2009 at 2:18 PM, Grant Ingersoll wrote:
>
> On Mar 30, 2009, at 4:42 AM, Raymond Balmès wrote:
>
>>
>>
>> I found out that the fields are processed in alphabetical order... and not in
>> creation order. Is there any reason for that?
>>
IndexWriter every time? Do you ever close it?
>
> It'd help to see the surrounding code...
>
> Best
> Erick
>
> On Sat, Mar 28, 2009 at 1:36 PM, Raymond Balmès wrote:
>
> > Hi guys,
> >
> > I'm using a SinkTokenizer to collect some terms of
Hi guys,
I'm using a SinkTokenizer to collect some terms of the documents while doing
the main document indexing.
I attached it to a specific field (tokenized, indexed).
writer = new IndexWriter(index, my_analyzer, create,
        new IndexWriter.MaxFieldLength(100));
doc.add(new Field("cont
field name. You can use
> this for both indexing and query.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Raymond Balmès [mailto:raymond
I was looking for a way to use a different analyzer for each field of a
document... it looks like that is not possible.
Do I have it right?
-Ray-
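
For readers of this thread: Lucene does ship a PerFieldAnalyzerWrapper class that picks an analyzer by field name, which is presumably what the reply quoted in this thread refers to. The snippet below is only a plain-Java sketch of that dispatch idea and does not use the Lucene API; the class name PerFieldDispatchSketch, the field names and the two toy "analyzers" are all invented for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for per-field analysis; NOT the Lucene API.
class PerFieldDispatchSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldDispatchSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a dedicated analyzer for one field name.
    void addAnalyzer(String fieldName, Function<String, List<String>> analyzer) {
        perField.put(fieldName, analyzer);
    }

    // Use the field's own analyzer if one is registered, else the default.
    List<String> analyze(String fieldName, String text) {
        return perField.getOrDefault(fieldName, defaultAnalyzer).apply(text);
    }

    // Toy analyzer (assumption): lowercase and split on whitespace.
    static List<String> lowercaseWords(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Toy analyzer (assumption): keep the whole value as a single token.
    static List<String> wholeValue(String text) {
        return Arrays.asList(text);
    }

    public static void main(String[] args) {
        PerFieldDispatchSketch analyzers =
                new PerFieldDispatchSketch(PerFieldDispatchSketch::lowercaseWords);
        analyzers.addAnalyzer("id", PerFieldDispatchSketch::wholeValue);
        System.out.println(analyzers.analyze("content", "Hello Lucene World"));
        System.out.println(analyzers.analyze("id", "Doc 42"));
    }
}
```

The same wrapper object would have to be used at query time as well, so that each field's query text is tokenized the same way as its indexed text.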
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: Raymond Balmès [mailto:raymond.bal...@gmail.com]
> > Sent: Sunday, March 08, 2009 4:06 PM
> > To: j
I get the problem below when trying to build Lucene 2_4.
I'm using Eclipse and just run Ant on the top build.xml.
It is pretty weird because the core is indeed built, but for some reason the
build stops there and I don't get any of the demos built, etc.
Any idea what this "svnversion" program is?
ators for the normal
term search that allow for this... but I'm new as you can see.
http://www.jdocs.com/lucene/2.0.0/org/apache/lucene/search/RangeFilter.html
Thx,
-Raymond-
On Tue, Mar 3, 2009 at 10:10 PM, Steven A Rowe wrote:
> Hi Raymond,
>
> On 3/3/2009 at 1:19 PM, Raymond B
order. But this could get clumsy with large
> numbers of terms.
>
> If you mean "at least one of index04...08 in the field"
> that's just an OR clause.
>
> Best
> Erick
>
>
> On Tue, Mar 3, 2009 at 1:18 PM, Raymond Balmès wrote:
>
> > so
sorry [index04 TO index08]
On Tue, Mar 3, 2009 at 7:18 PM, Raymond Balmès wrote:
> Just a simplified view of my problem :
>
> A document contains the terms "index01 blabla index02 xxx yyy index03 ...
> index10". I have the terms indexed in the collection.
> I now w
3, 2009 at 6:33 PM, Steven A Rowe wrote:
> Hi Raymond,
>
> On 3/3/2009 at 12:04 PM, Raymond Balmès wrote:
> > The range query only works on fields (using a string compare)... is
> > there any reason why it is not possible on the words of the document?
> >
> > The
Hi all,
The range query only works on fields (using a string compare)... is there
any reason why it is not possible on the words of the document?
The following query [stringa TO stringb] would just give the list of
documents which contain words between those two strings.
-RB-
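
To make the request above concrete, here is a minimal sketch of what [stringa TO stringb] would mean when applied to the words of a single document, using plain string comparison with both bounds inclusive. It uses only java.util and no Lucene classes; the class name TermRangeSketch and the sample text are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustration only: a string-compare range over the words of one text.
class TermRangeSketch {
    // Distinct words of `text` that fall in [lower, upper], compared as
    // plain strings, both bounds inclusive. Requires lower <= upper.
    static SortedSet<String> termsInRange(String text, String lower, String upper) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(text.trim().split("\\s+")));
        return terms.subSet(lower, true, upper, true);
    }

    // A document "matches" if at least one of its words is in the range.
    static boolean matches(String text, String lower, String upper) {
        return !termsInRange(text, lower, upper).isEmpty();
    }

    public static void main(String[] args) {
        String doc = "index01 blabla index02 xxx yyy index03 index10";
        System.out.println(termsInRange(doc, "index02", "index03"));
        System.out.println(matches(doc, "index04", "index08"));
    }
}
```

Because the comparison is lexicographic, numeric suffixes only sort as expected when they are zero-padded to the same width: "index9" would sort after "index10".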
ay need to normalize the phrases for the search phase,
so it may not work.
Keep in touch,
-RB-
On Mon, Mar 2, 2009 at 5:23 PM, Steven A Rowe wrote:
> Hi Raymond,
>
> On 3/2/2009 at 10:09 AM, Raymond Balmès wrote:
> > suppose I have a tri-gram, what I want to do is index the tri-gram
that using regex for
instance.
My documents look like regular HTML or PDF pages, although some of them
contain those specific tri-grams.
Thx,
-RB-
On Mon, Mar 2, 2009 at 2:37 PM, Steven A Rowe wrote:
> Hi Raymond,
>
> On 3/1/2009, Raymond Balmès wrote:
> > I'm
Hi,
I'm trying to index (& search later) documents that contain tri-grams;
however, they have the following form:
<2 digit> <2 digit>
Does the ShingleFilter work with numbers in the match?
Another complication: in future features I'd like to add optional digits,
like
[<1 digit>] <2 digit> <2 d
Well, that is well explained in "Lucene in Action": if you want to search
files you have to build a file parser, and there is a good example given. So
that is not really my problem.
But I thought I could go through the token stream only once, whereas I have to go
twice: 1. for detecting my triplets, 2. for indexi
I think I'm getting you. But the files I'm going to parse have many formats:
PDF, HTML, Word.
They don't have a particular structure; memos, if you will. But the ones I'm
interested in will have the triplets I described.
Yes, building a TokenFilter as you suggest should do the job.
I guess my initi
I understand your point. I did not say it was a Lucene problem; I was
rather checking whether my intended design was correct... basically not.
Since I thought that I would first break my stream into tokens to do my special
filter, I thought I could do it in one step...
Interesting if you are not going
OK, not clear enough.
I have documents in which I'm looking for 3 consecutive elements:
<#1> <#2> (string1 is a predefined list)
I want to disregard those without this sequence and reverse index those with
these markers... it looks to me that parsing won't do the job since my
documents are unst
I'm getting plenty of messages, but do you receive mine? Please, someone give
me a reply.
On Tue, Sep 2, 2008 at 11:36 AM, Leonid Maslov <[EMAIL PROTECTED]> wrote:
> -- Forwarded message --
> From: Sankari Palanisamy <[EMAIL PROTECTED]>
> Date: Tue, Sep 2, 2008 at 12:32 PM
> Subject:
Is my subscription working? I got no reply to my previous question.
Sorry for the disturbance.
On Mon, Sep 1, 2008 at 10:29 PM, Markus Lux <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Assume I have a String "z-4". That would be properly indexed by my
> Analyzer,
> so I'd find the belonging document if I se
Hi guys,
Fairly new to Lucene, and just finished reading Lucene in Action.
My problem is the following: I need to index only the documents that contain
the following pattern(s) in a mass of documents:
<#1> <#2>
is a fixed list of words
<#x> are small numbers <100
My idea is to simply build a
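
One way to picture the pre-filtering step described in this message is a regular-expression check run on the extracted text before a document is handed to the indexer. The sketch below is plain Java with no Lucene dependency and rests on loud assumptions: the pattern is taken to be a word from the fixed list followed by two numbers below 100 (the archive has garbled the exact layout), and the class name TripletFilterSketch and the words "alpha" and "beta" are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustration only: decide whether a text contains the triplet pattern.
class TripletFilterSketch {
    private final Pattern triplet;

    TripletFilterSketch(List<String> words) {
        // Quote each word so it is matched literally inside the regex.
        String alternation = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // ASSUMED order: listed word, then two numbers of one or two digits.
        this.triplet = Pattern.compile(
                "\\b(?:" + alternation + ")\\s+\\d{1,2}\\s+\\d{1,2}\\b");
    }

    // All triplets found in the text, in order of appearance.
    List<String> findTriplets(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = triplet.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Only documents with at least one triplet would be indexed.
    boolean shouldIndex(String text) {
        return !findTriplets(text).isEmpty();
    }

    public static void main(String[] args) {
        TripletFilterSketch filter = new TripletFilterSketch(Arrays.asList("alpha", "beta"));
        System.out.println(filter.findTriplets("memo about alpha 12 34 and beta 7 99."));
        System.out.println(filter.shouldIndex("gamma 12 34"));
    }
}
```

This scans the raw text once for filtering; the indexer would still tokenize the accepted documents separately, which is the two-pass cost discussed elsewhere in the thread.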