I use RAMDirectory and the error often shows a low number. Last time it
happened with the message "7<=7". Next time it happens, I will try to capture
the stacktrace.
Michael McCandless-2 wrote:
>
>
> "testn" <[EMAIL PROTECTED]> wrote:
>>
>> Using Lucene 2.2.0, I still sporadically got doc out
I actually know from experience. Around 20% +/- 5% of emails will have
attachments. If that helps. Again, I say index as much info as you
can. Store what you think is necessary.
Erick Erickson wrote:
Rather than use efficiency arguments to drive the behavior of the
app, I'd recommend that
Rather than use efficiency arguments to drive the behavior of the
app, I'd recommend that you define the expected behavior and
make that behavior happen as necessary.
What would you estimate is the ratio of meta-data to attachments?
And what is the ratio of documents that have multiple attachments
OK, what worked? Using a RAMDir?
Erick
On 8/15/07, John Paul Sondag <[EMAIL PROTECTED]> wrote:
>
> It worked! My indexing time went from over 6 hours to 592 seconds! Thank
> you guys so much!
>
> --JP
>
> On 8/14/07, karl wettin <[EMAIL PROTECTED]> wrote:
> >
> >
> > 14 aug 2007 kl. 21.34 skrev
"testn" <[EMAIL PROTECTED]> wrote:
>
> Using Lucene 2.2.0, I still sporadically got a doc out of order error. I
> indexed all of my stuff in one thread. Do you have any idea why it
> happens?
Hm, that is not good. I thought we had finally fixed this with
LUCENE-140. Though un-corrected disk
Using Lucene 2.2.0, I still sporadically got a doc out of order error. I
indexed all of my stuff in one thread. Do you have any idea why it happens?
Thanks!
--
View this message in context:
http://www.nabble.com/out-of-order-tf4276385.html#a12172277
Sent from the Lucene - Java Users mailing list
Hi Chew,
with Lucene you could try the following:
Make one query for each single value in each category (each Term):
1Q - Gender:M
2Q - Department:Accounting
3Q - Department:R&D
4Q - ...
with a custom HitCollector like the following example taken from
org.apache.lucene.search.HitCollector AP
Donna,
I have been investigating highlighters in Lucene a bit recently. What I've
learned so far is that highlighting is a completely different task from the
indexing/searching tandem. This simple fact is not obvious to a
lot of people. In your particular case it would be helpful if yo
Well, in my case the highlighting was returning nothing because of (my
favorite acronym) PBCAK--
I don't store the text in the index, so I have to retrieve it separately
(from a database) for the highlighting, and my database was not in sync
with the index, so in a few cases the document in the
Hey Michael,
Are you writing this software for yourself or for reselling? We built
an email archiving service and we use Lucene as our search engine. We
approach this a little differently.
BUT, I don't think it is wasteful to index the header information with
the attachment. Just don't st
Could someone who understands Lucene internals help me port
https://issues.apache.org/jira/browse/LUCENE-423 to Lucene 2.0? I have beefy
hardware (32 cores) and want to try this out, but it won't compile.
There are 2 issues:
1- maxScore
On line 412 TopFieldDocs constructor now needs a maxScore.
We are writing a mail archiving program. Each piece of the message (e.g. each
attachment) is stored separately.
I'll try to keep this short and sweet :)
Currently we index the main header fields, like
subject
sender
recipients (space delimited)
etc.
This stuff is really only needed once per e-m
I'm working on refining my stopwords by looking at the highest scoring
document returned for each search, and using the highlighter to show which
terms were significant in choosing that document. This has been extremely
helpful in improving my searches. I've noticed though that sometimes the
hi
Hey,
I think you can try:
MultiFieldQueryParser.parse(String[] queries, String[] fields,
BooleanClause.Occur[] flags,
Analyzer analyzer)
The flags array will give you ORs and ANDs in the places you need them
- Sagar Naik
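
To make the role of the flags array concrete, here is a minimal plain-Java sketch. It does not call Lucene: the FlagSketch class, its Occur enum, and the combine method are hypothetical names invented for illustration, and Occur only mirrors the three BooleanClause.Occur values. It pairs each query with its field and renders the result in Lucene query syntax, where "+" marks a required clause, "-" a prohibited one, and no prefix an optional one.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model only (hypothetical names): nothing here calls Lucene.
class FlagSketch {
    // Mirrors the three BooleanClause.Occur values.
    enum Occur { MUST, SHOULD, MUST_NOT }

    // Pair queries[i] with fields[i] and prefix the clause according to flags[i].
    static String combine(String[] queries, String[] fields, Occur[] flags) {
        if (queries.length != fields.length || fields.length != flags.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < queries.length; i++) {
            String prefix;
            switch (flags[i]) {
                case MUST:     prefix = "+"; break; // clause is required (AND-like)
                case MUST_NOT: prefix = "-"; break; // clause is prohibited (NOT)
                default:       prefix = "";  break; // clause is optional (OR-like)
            }
            clauses.add(prefix + fields[i] + ":(" + queries[i] + ")");
        }
        return String.join(" ", clauses);
    }
}
```

With a MUST flag on a "subject" field and a MUST_NOT flag on a "body" field, the sketch yields a string of the form "+subject:(...) -body:(...)", which is the kind of combination the flags argument is meant to control.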
Abu Abdulla alhanbali wrote:
Thanks for the help,
please provide the code to
It worked! My indexing time went from over 6 hours to 592 seconds! Thank
you guys so much!
--JP
On 8/14/07, karl wettin <[EMAIL PROTECTED]> wrote:
>
>
> 14 aug 2007 kl. 21.34 skrev John Paul Sondag:
>
> > What exactly is a RAMDirectory, I didn't see it mentioned on that
> > page. Is there
15 aug 2007 kl. 07.18 skrev Mohammad Norouzi:
I am using WhitespaceAnalyzer and the query is "icdCode:H*", but there is
no result, although I know that there are many documents with this field
value, such as H20, H20.5 etc. This field is tokenized and indexed. What is
wrong with this?
wh
Copying all fields to a single searchable field is quite reasonable,
and won't double your index size if you set the new field to be
unstored.
Erik
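
A minimal plain-Java sketch of the copying step, under stated assumptions: the CatchAllSketch class and buildAllText method are hypothetical names, the field name "all" is taken from the quoted code, and nothing here calls Lucene. It only shows how the per-field values could be joined into the text that goes into the catch-all field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model only (hypothetical names): builds the text for a catch-all field.
class CatchAllSketch {
    // Field name assumed from the quoted code's defaultSearchField.
    static final String ALL_FIELD = "all";

    // Join every non-empty field value, in insertion order, separated by single spaces.
    static String buildAllText(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (String value : fields.values()) {
            if (value == null || value.isEmpty()) {
                continue; // skip missing values so no stray separators appear
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(value);
        }
        return sb.toString();
    }
}
```

In Lucene 2.x the resulting text would then, as far as I know, be added to the document as a field that is indexed but not stored (for example with Field.Store.NO and Field.Index.TOKENIZED), which is what keeps the index from doubling in size.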
On Aug 15, 2007, at 5:38 AM, Ridwan Habbal wrote:
Hello all,
When we search over indexed docs we use code such as:
Analyzer analyzer
Hello all,
When we search over indexed docs we use code such as:
Analyzer analyzer = new StandardAnalyzer();
String defaultSearchField = "all";
QueryParser parser = new QueryParser(defaultSearchField, analyzer);
IndexSearcher indexSearcher = new IndexSearcher(this.indexDirectory);
Hits hits = in
Greetings,
I have tested with MySQL. The grouping is OK when there are not many records in
the table, but when I performed grouping on a table that has 3 million
records, it took a very long time to finish. Thus, I'm looking at Lucene and
hope it can help.
Thank you
e