>> Any case where it would break?
If a query uses multiple fields, it would break. That is, usually all the
fields need to be in the doc in index 2, not just the modified one.
On Fri, Oct 15, 2010 at 2:35 PM, Erick Erickson wrote:
> This seems like far too much work if I'm reading things right. You
This seems like far too much work if I'm reading things right. You can't
update a field, but you #can# update a document, which actually re-indexes
that document under the covers (you have to have a way to uniquely identify
the doc). Then, when you reopen your index reader, you'll only see the new val
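
The update-then-reopen behaviour described above can be illustrated with a small toy model. This is only a sketch in plain Java, not Lucene code: the class `UpdateByKeyDemo` and its methods are hypothetical names invented for illustration, the "reader" is simply a snapshot copy, and the unique key stands in for the term you would pass to Lucene's `IndexWriter.updateDocument(Term, Document)`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of "update = delete + re-add by unique key". Not Lucene's implementation. */
public class UpdateByKeyDemo {
    // unique key -> (field name -> value); hypothetical stand-in for an index
    private final Map<String, Map<String, String>> docs = new LinkedHashMap<>();

    /** Replaces the whole document stored under the key (delete, then add). */
    public void updateDocument(String key, Map<String, String> newDoc) {
        docs.remove(key);                       // drop the old version entirely
        docs.put(key, new HashMap<>(newDoc));   // re-add the full new document
    }

    /** Returns a point-in-time snapshot; later updates are not visible in it. */
    public Map<String, Map<String, String>> openReader() {
        Map<String, Map<String, String>> snapshot = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> e : docs.entrySet()) {
            snapshot.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        return snapshot;
    }

    public static void main(String[] args) {
        UpdateByKeyDemo index = new UpdateByKeyDemo();
        Map<String, String> v1 = new HashMap<>();
        v1.put("Field1", "Value 1");
        index.updateDocument("doc1", v1);

        Map<String, Map<String, String>> oldReader = index.openReader();

        Map<String, String> v2 = new HashMap<>();
        v2.put("Field1", "Modified_Value 1");
        index.updateDocument("doc1", v2);

        // The reader opened before the update still holds the old value;
        // a reader opened afterwards holds only the new one.
        System.out.println(oldReader.get("doc1").get("Field1"));
        System.out.println(index.openReader().get("doc1").get("Field1"));
    }
}
```

Because the whole document is replaced, every field that should remain searchable has to be supplied again in the new version, not just the modified one.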
Hey Erick, Sure.
*What I am trying to achieve:*
A) Update a field in Index A
B) When searching for that old field, it should be a miss.
*How I achieved it*
*Index 1 *
Doc 1 - Field1, Value 1
Doc 2 - Field1, Value 1
*Index 2*
Doc 1 - Field1, Modified_Value 1
Doc 2 - EMPTY
Add index 2 before
No. And you don't even want to try... Document IDs are NOT invariant.
Particularly when you delete a document and optimize an index, all the
documents that come after the deleted one get new doc IDs. Trying to keep
these two indexes in synch will be a nightmare.
Perhaps you could explain what you'
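
The renumbering described above can be seen in a toy model. This is a sketch in plain Java that assumes doc IDs behave like positions in a list that is compacted on optimize; `DocIdShiftDemo` and its methods are hypothetical names for illustration and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of doc IDs as list positions. Not Lucene's actual implementation. */
public class DocIdShiftDemo {
    // The position in this list plays the role of the internal doc ID.
    private final List<String> docs = new ArrayList<>();
    // Deletions are only marked until optimize() compacts the list.
    private final List<Boolean> deleted = new ArrayList<>();

    /** Adds a document identified by an application-level key; returns its doc ID. */
    public int add(String key) {
        docs.add(key);
        deleted.add(false);
        return docs.size() - 1;
    }

    /** Marks a document as deleted without renumbering anything yet. */
    public void delete(int docId) {
        deleted.set(docId, true);
    }

    /** Drops deleted documents; every later document moves down to a new ID. */
    public void optimize() {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                kept.add(docs.get(i));
            }
        }
        docs.clear();
        docs.addAll(kept);
        deleted.clear();
        for (int i = 0; i < docs.size(); i++) {
            deleted.add(false);
        }
    }

    /** Returns the current doc ID for a key, or -1 if it is absent or deleted. */
    public int docIdOf(String key) {
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        DocIdShiftDemo index = new DocIdShiftDemo();
        index.add("a");
        index.add("b");
        index.add("c");
        System.out.println(index.docIdOf("c")); // 2
        index.delete(1);                        // delete "b"
        index.optimize();
        System.out.println(index.docIdOf("c")); // 1: "c" has been renumbered
    }
}
```

A second index that paired its documents with the first by doc ID would now point at the wrong entries, which is why a stable application-level key is the safer way to relate documents.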
Background: I've been trying to enable hit highlighting of XML documents
in such a way that the highlighting preserves the well-formedness of the
XML.
I thought I could get this to work by implementing a CharFilter that
extracts text from XML (somewhat like HTMLStripCharFilter, except I am
us
Hey Grant,
Fair point on the next(). In this case I'm iterating through the terms returned
from a PrefixTermEnum so I know they're in the index.
The analyser I'm using looks like this:
public class TypeSavingAnalyzer extends StandardAnalyzer {
public TypeSavingAnalyzer(Version version) {
I have two indexes, A and B. Can two documents, doc1 [in index A] and doc2 [in
index B], have a common field? doc1 and doc2 have the same document IDs.
On Oct 14, 2010, at 10:17 AM, app...@dsl.pipex.com wrote:
> Hello
>
> I would like to store data retrieved hourly from RSS feeds in a database or
> in Lucene so that the text can be easily
> indexed for word frequencies.
>
> I need to get the text from the title and description elements of RSS
On Oct 13, 2010, at 11:37 AM, Sykes, Derek wrote:
> Hi there,
>
> I'm currently trying to work out how I can determine the type
> (string/number/date/etc.) of a term. I've not seen any off-the-shelf way to do
> it, so I am trying to store a payload against each term that records the type.
>
> I'm
Hello
I would like to store data retrieved hourly from RSS feeds in a database or in
Lucene so that the text can be easily
indexed for word frequencies.
I need to get the text from the title and description elements of RSS items.
Ideally, for each hourly retrieval from a given feed, I would add
OK, I read the wiki page on improving search speed and followed some of its
advice. Some of the slow queries are quite simple. Here are a few:
plaintext:guid
107.0 ms
resultSet.totalHits = 1
plaintext:allianc
51.0 ms
resultSet.totalHits = 1
plaintext:engin
46.0 ms
resultSet.totalHits = 1
plain
On Thursday, 14 October 2010, 12:29:43, you wrote:
Hello,
> > is there a way to store additional metadata with fields?
> > Example:
> > I have the following content:
> >
> > This is a very
> > interesting text.
> > This is boring text
> >
> > Is there any way to include the page,x,y val
Hey Guys
Whenever I try to view open issues in Hudson, it doesn't display any information.
Does anyone know why this is the case or how I could fix it?
Thanks in advance
-Dave Clarke
OK, so it looks like we're down to a more general "why is searching
slow" question.
The number of docs is not very large by lucene standards.
Work through http://wiki.apache.org/lucene-java/ImproveSearchingSpeed.
If that still doesn't help, pick a slow query and post again with:
. the output of
Payload!!
2010/10/14 Christoph Hermann
> Hi,
>
> is there a way to store additional metadata with fields?
>
> My Problem is as follows:
> I'm extracting extended html with tika. This extended html contains
> references
> to pages, x,y values of the text etc. I want to be able to retrieve those
>
Many times when you run a search for the first time it has to load all field
values IF the field is being sorted on. Subsequent searches use that cache
and are faster. Does that happen in your case? From your description it
doesn't look like you are sorting, although this kind of performance
degrad
Hi,
is there a way to store additional metadata with fields?
My Problem is as follows:
I'm extracting extended html with tika. This extended html contains references
to pages, x,y values of the text etc. I want to be able to retrieve those
values when text was found while searching.
So when cr
Hi Ian,
thank you for your quick response. I am running Lucene on Ubuntu 10.04, 64
bit. I switched from MMapDirectory to NIOFSDirectory without any significant
changes in performance. The Lucene version running is 3.0.2. I followed your
advice and opened the IndexSearcher after I added all docume
Do the fast searches that you get while the app is running use the
searcher you create before you add all the docs to the index? Surely
that won't see the added docs.
There are general tips on speeding up searches at
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed. There are
some gotcha
Hi,
I am facing some problems using Lucene. The index I am using is
constructed like this:
try {
Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
Directory dir = MMapDirectory.open(index);
IndexWriter writer = new IndexWriter(dir, analyzer,
MaxFieldLength.LIMITED)
20 matches