Not sure if I'm going about this the right way, but I want to use Query
instances as keys in a HashMap to cache BitSet instances from filtering
operations. They are all for the same reader.
That means equals() for any instance of the same generic Query would have to
return true if the terms,
Mohammad Norouzi wrote:
Hi
you know, actually we don't index this field as a Date; we always use a
String instead of the Date type, because we use both Hijri and Gregorian
dates, and DateField does not work properly if we put in a Hijri date.
That is why we index this field as a String.
What DateFi
Hi!
I am new to Lucene and I am trying to customise the query parser to default
to wildcard searches.
For example, if the user types in "fenc", it should find "fence" and
"fencing" and "fences" and "fenced".
Looks like stemming to me! Maybe you should consider using a stemming
analyzer instead
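If stemming fits the requirement, one low-effort route — assuming the Snowball analyzer from Lucene's contrib area is on the classpath, and with "body" as a hypothetical field name — is to parse queries with the same stemming analyzer used at index time, so "fence", "fencing" and "fenced" all collapse to the same term:

```java
import org.apache.lucene.analysis.snowball.SnowballAnalyzer; // contrib/snowball jar
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class StemmedParse {
    // The same analyzer must also be used at index time, or the stemmed
    // query terms will not match anything in the index.
    public static Query parse(String userInput) throws ParseException {
        QueryParser parser = new QueryParser("body", new SnowballAnalyzer("English"));
        return parser.parse(userInput);
    }
}
```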
Hello,
I am new to Lucene and I am trying to customise the query parser to default
to wildcard searches.
For example, if the user types in "fenc", it should find "fence" and
"fencing" and "fences" and "fenced".
I cannot find a way to modify / extend the QueryParser to automatically
create wildcard queries
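One common approach — sketched here against the 2.x-era QueryParser API, with a hypothetical class name — is to subclass QueryParser and turn each plain single-term query into a prefix (trailing-wildcard) query, leaving phrases and other query types untouched:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Hypothetical subclass: "fenc" is parsed as if the user had typed "fenc*".
public class PrefixingQueryParser extends QueryParser {
    public PrefixingQueryParser(String field, Analyzer analyzer) {
        super(field, analyzer);
    }

    protected Query getFieldQuery(String field, String queryText)
            throws ParseException {
        Query q = super.getFieldQuery(field, queryText);
        // Only rewrite simple single-term queries; phrases etc. pass through.
        if (q instanceof TermQuery) {
            Term t = ((TermQuery) q).getTerm();
            return new PrefixQuery(t);
        }
        return q;
    }
}
```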
Hello folks,
Maybe one of you can help me with this (sorry, long read).
I have implemented a FuzzyPhraseQuery that works similar to Lucene's
native PhraseQuery.
I.e. it can retrieve phrases for a query, with respect to insertions
and term order.
But in addition it can also find matches with term
: initialized... I tried to create a searcher every time but that led me to
: the Too-Many-Files-Open exception. So no matter what I do I face a show
: stopper.
were you closing the old searcher before opening the new one?
even if that was the cause of your problem, i still wouldn't recommend
reop
My two cents:
Lucene often offers the least common denominator...for example: out of
the box, Lucene best handles either a single user / single thread
experience or a mostly 'read only' experience. I believe that the main
reason for this is that it serves the greatest variety of uses. You can,
a
I agree those are benefits when you batch process the indexes once or once in a
while.
The beauty of AOP is that I can intercept writes and change the index on the
spot. At that point I'd need to let the search side know, or drop it. If I do
that, it will face issues on the search side since this
Indeed, having to re-open a searcher/reader in order for searches
to reflect index modifications can sometimes be a poor fit for the
logic of a certain application.
But see the features made possible with this design:
(+) searches do not see index modifications until desired.
(++) no need to sync
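A sketch of that trade-off in code, using a hypothetical helper against the 2.x-era API: keep one reader per index and reopen it only when the index version changes, so searches see a stable view until the application decides to refresh. The old reader is only safe to close once no other thread is still searching with it:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;

// Hypothetical helper: refresh-on-demand instead of reopen-per-query,
// which avoids both stale results and the Too-Many-Files-Open failure mode.
public class RefreshingSearcher {
    private final Directory dir;
    private IndexReader reader;
    private IndexSearcher searcher;

    public RefreshingSearcher(Directory dir) throws Exception {
        this.dir = dir;
        this.reader = IndexReader.open(dir);
        this.searcher = new IndexSearcher(reader);
    }

    public synchronized IndexSearcher getSearcher() throws Exception {
        // Cheap version check: only pay the reopen/warm-up cost when
        // the index has actually changed.
        if (IndexReader.getCurrentVersion(dir) != reader.getVersion()) {
            IndexReader newReader = IndexReader.open(dir);
            searcher.close();
            reader.close(); // caution: unsafe if other threads still use it
            reader = newReader;
            searcher = new IndexSearcher(reader);
        }
        return searcher;
    }
}
```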
Hi Mathieu,
You can't add TokenFilters to an existing Analyzer. However,
implementing an Analyzer that acts just like the StandardAnalyzer
plus your Stemmer is pretty straightforward.
StandardAnalyzer.tokenStream() looks like:
/** Constructs a {@link StandardTokenizer} filtered
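A sketch of such an analyzer against the 2.x-era analysis API: the filter chain mirrors StandardAnalyzer's pipeline (tokenizer, standard filter, lowercasing, stop words), with a Porter stemmer appended at the end. The class name is hypothetical:

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Acts like StandardAnalyzer, plus stemming as the last filter stage.
public class StemmingStandardAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, StopAnalyzer.ENGLISH_STOP_WORDS);
        result = new PorterStemFilter(result); // the added stemming stage
        return result;
    }
}
```

Remember to use the same analyzer at both index and query time, or stemmed and unstemmed terms will not line up.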
I too cannot think of an indexing configuration that would help this.
However it seems that all the required information exists at search time,
more precisely at hits collection time:
- the doc-id and doc-score are known, and used when hits are collected.
- The value of that certain field of interest
: I overrode that method and just removed the try/catch block in which you
: put the Date-handling code, and now it works fine
:
: my overridden method only returns new RangeQuery(...);
subclassing QueryParser to override getRangeQuery and eliminate the
special Date code sounds like it would work just fine
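A sketch of that subclass, using a hypothetical class name and the 2.x-era getRangeQuery signature; it bypasses the locale-sensitive date parsing entirely and treats both endpoints as the plain strings that were indexed (note it does not lowercase the endpoints, which stock QueryParser would):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.RangeQuery;

// Ranges over string-encoded dates (Hijri or Gregorian) compare
// lexicographically, so no Date parsing is needed or wanted.
public class StringRangeQueryParser extends QueryParser {
    public StringRangeQueryParser(String field, Analyzer analyzer) {
        super(field, analyzer);
    }

    protected Query getRangeQuery(String field, String part1, String part2,
                                  boolean inclusive) throws ParseException {
        return new RangeQuery(new Term(field, part1),
                              new Term(field, part2), inclusive);
    }
}
```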
there are also the static IndexReader.isLocked(Directory) and
IndexReader.unlock(Directory) methods that encapsulate this logic for you
... they've been around since at least 1.4.3.
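Usage is a short check-then-clear, sketched here with a placeholder index path; only clear a lock you know is stale, e.g. one left behind by a crashed process:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class UnlockIndex {
    public static void main(String[] args) throws Exception {
        // placeholder path; point this at the real index directory
        Directory dir = FSDirectory.getDirectory("/path/to/index");
        if (IndexReader.isLocked(dir)) {
            // only safe if no live writer actually holds the lock
            IndexReader.unlock(dir);
        }
        dir.close();
    }
}
```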
: Date: Sun, 4 Mar 2007 21:34:52 -0800
: From: Chris Lu <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
But, letting it stay in the text stream and not putting it in a
separate date field would give you some trouble with ranges, because
things that weren't dates could mess you up.
This is why Chris suggested putting a prefix on the token. For example,
leading underscore
I'm not sure how, but in moving an index over from 2.0 to 2.1 and
changing my own code one of the .tii files got deleted. I still have
the .tis file though, can I rebuild the missing file so I can open my
index? Luke won't open it now and I just want to make sure everything
is ok before opening
That part is self-evident. However, as I described the problem initially - and
the use case is a very practical way of dealing with documents in real life -
they change, we edit them, and I don't want to run a batch re-indexing job every
night... I just want it done on the spot. One instance Index
Hi,
This is a very simple question, but I just can't find the resources I
need ...
I am using the StandardAnalyzer:
StandardAnalyzer stdAnalyzer;
if ((stopWordList != null) && (stopWordList.length != 0)) {
    stdAnalyzer = new StandardAnalyzer(stopWordList);
} else {
    stdAnalyzer = new StandardAnalyzer();
}
If add() is tokenizing up front, then three calls would be the logical
equivalent, and I wouldn't need to artificially add separators while
doing a looping construct.
document.add( Field.Text("interestingdates", "17760704" ) );
document.add( Field.Text("interestingdates", "20010911" ) );
<1>. Every time you close/open a reader, you pay a significant penalty
to warm up caches, etc. You may have to do some tricky dancing
to coordinate among the sessions to be able to close/reopen
the reader to allow updates to show up though.
Erick
On 3/5/07, Mohammad Norouzi <[EMAIL PROTECTED]>
cal
equivalent, and I wouldn't need to artificially add separators while doing a
looping construct.
document.add( Field.Text("interestingdates", "17760704" ) );
document.add( Field.Text("interestingdates", "20010911" ) );
document.add( Field.Text("inter
Little shameless self promotion here:
If you're not aware, there will be several Lucene related talks at
ApacheCon Europe (in Amsterdam) the first week in May. ApacheCon
info is available at http://www.eu.apachecon.com
Here is the current schedule for the talks:
* May 1: Lucene Boot Camp
Hi Erick,
I took a look at your source code and saw that in the getRangeQuery() method
you put a DateFormat like this:
DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT, locale);
...
Date d1 = df.parse(part1);
the last line of code doesn't work with our locale's format. for example I put
the
Hi Erick
I am completely confused about this IndexReader.
In my case, I have to keep the reader open because of pagination of the
results, so I have to have a reader per session. The thing that baffles me is:
can only one reader service all the sessions at the same time?
I mean
1- having one reader
On Mon, 2007-03-05 at 07:52 -0500, Erick Erickson wrote:
> Why not just call IndexReader.document(idx) where idx ranges
> from 0 to IndexReader.maxDoc()? I believe if your index has some
> deleted documents you'll have to handle null returns though
That was exactly what I was looking for. Thanks
Or MatchAllDocsQuery.
/Ronnie
Erick Erickson wrote:
Why not just call IndexReader.document(idx) where idx ranges
from 0 to IndexReader.maxDoc()? I believe if your index has some
deleted documents you'll have to handle null returns though
Sorry to lose you to the dark side ...
Best
Erick
Why not just call IndexReader.document(idx) where idx ranges
from 0 to IndexReader.maxDoc()? I believe if your index has some
deleted documents you'll have to handle null returns though
Sorry to lose you to the dark side ...
Best
Erick
On 3/5/07, Morten Simonsen <[EMAIL PROTECTED]> wrote:
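The loop Erick describes can be sketched as follows against the 2.x-era IndexReader API; rather than relying on null returns, it checks isDeleted() explicitly for slots freed by deletions ("id" is a hypothetical field name):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;

// Walk every document slot from 0 to maxDoc(), skipping deleted ones.
public class IndexDump {
    public static void dump(IndexReader reader) throws Exception {
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (reader.isDeleted(i)) {
                continue; // deleted slot, nothing to read here
            }
            Document doc = reader.document(i);
            // process the document, e.g. insert it into the database
            System.out.println(doc.get("id"));
        }
    }
}
```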
There was a discussion recently where someone pointed me to
FieldSortedHitQueue; you might try searching for that. Also, try
"buckets", which was the header of that discussion.
You can also think about clever indexing schemes with fields that
allow you to sort however you really need to, although I
I think you should search the archive for DateTools. There have been
very extensive discussions of this topic that will give you answers
far more quickly.
Dates are strings in Lucene. There's no magic here. You don't need
to override anything to get them to work, all you need to do is make
sure t
There was quite a long discussion thread on this topic relatively
recently; try searching the archive for concurrency, perhaps
IndexReader, etc.
The short take-away is that you should share a single instance
of the reader, since opening one is an expensive operation, and
the first searches you perform
"Chris Lu" <[EMAIL PROTECTED]> wrote:
> They are not really unique. Here is my code to unlock the directory.
> Notice there are two locks.
>
> public static void unlockDirectory(Directory dir) {
> Lock dirLock = dir.makeLock(IndexWriter.WRITE_LOCK_NAME);
> if (dirLock.isLocked()) {
Hi
I'm about to convert from Lucene index files into a MySQL database (sorry
about that :). I thought I would run a "SELECT *" on the index file, then read
through all the "rows" (hits?) and process each of them into my new
database.
So I wrote this code:
WildcardQuery query = new WildcardQuery(new
I think the solution is fairly simple.
Pass the "metadata" fieldname to the QueryTermExtractor - not the fieldname
"author". QueryTermExtractor effectively provides just a list of strings (no
fieldnames) which are then matched against strings found in the tokenStream
which represents your content
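A sketch of that field-scoped extraction, using the highlighter contrib's QueryTermExtractor (the class name here is hypothetical): passing the "metadata" field name keeps terms from other fields, like "author", out of the result:

```java
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.QueryTermExtractor;
import org.apache.lucene.search.highlight.WeightedTerm;

public class MetadataTerms {
    // Returns just the query terms targeted at the "metadata" field,
    // ready to be matched against that field's token stream.
    public static String[] terms(Query query) {
        WeightedTerm[] weighted =
                QueryTermExtractor.getTerms(query, false, "metadata");
        String[] out = new String[weighted.length];
        for (int i = 0; i < weighted.length; i++) {
            out[i] = weighted[i].getTerm();
        }
        return out;
    }
}
```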
This will then be a big hassle. The results are in the 100s and sometimes in
the 1000s.
Hmm... no better way?
Jelda
> -Original Message-
> From: Mordo, Aviran (EXP N-NANNATEK) [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 02, 2007 8:02 PM
> To: java-user@lucene.apache.org
> Subject: RE: H