Hello All,
We've been using Lucene here and like it, but we've been asked to look
into another engine also (Dieselpoint). Has anyone used both Dieselpoint and
Lucene? Any comments? We have a lot of documents (50 million+); each document
contains many small fields (maybe hundreds). Important features
wrote:
>
> The Hits class collects the document ids from the query in batches. If you
> iterate beyond what was collected, the query is re-executed to collect
> more
> ids.
>
> You can use the expert level search methods on IndexSearcher if this isn't
> what you want.
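(For reference, the expert-level call being suggested looks roughly like
this; a sketch against the 1.4-era API, with searcher and query assumed
from the thread and the result count made up:
import org.apache.lucene.document.Document;
import org.apache.lucene.search.TopDocs;
// Collect the top N ids in one pass instead of letting Hits re-run
// the query each time iteration passes its internal batch size.
TopDocs top = searcher.search(query, null, 500); // null = no filter
for (int i = 0; i < top.scoreDocs.length; i++) {
    Document doc = searcher.doc(top.scoreDocs[i].doc);
}
)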
I made the change, and here are the results:
Query (default field is COMP_PART_NUMBER): 2444*
Query: COMP_PART_NUMBER:2444*
Query Time: 328 ms - time for query to run.
383 total matching documents.
Cycle Time: 141 ms - time to run through hits.
Query (default field is COMP_PART_NUMBER): *91822*
Qu
I understand that for the query, but why does it matter once you have the
Hits object? That is the part I'm baffled by. The query with the wildcard at
the front takes a lot longer, but we expected that.
On 9/8/05, Jeremy Meyer <[EMAIL PROTECTED]> wrote:
>
> The issue isn't with multiple wildcards
Hello All,
I am getting some weird timing results when retrieving documents from a
Hits object. I am just timing this bit of code:
Hits hits = searcher.search(query);
long startTime = System.currentTimeMillis();
for (int i = 0; i < hits.length(); i++) {
    Document doc = hits.doc(i);
    String field = doc.get("COMP_PART_NUMBER");
}
long cycleTime = System.currentTimeMillis() - startTime;
Try doing that in reverse order:
rs.close();
rs = null;
stmnt.close();
stmnt = null;
conn.close();
conn = null;
I usually do one more step also, just to be safe.
try {rs.close();} catch (Exception ignore) {}
rs = null;
try {stmnt.close();} catch (Exception ignore) {}
stmnt = null;
try {conn.close();} catch (Exception ignore) {}
conn = null;
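The same pattern usually ends up in a finally block so the handles are
released on the error path too (a sketch; variable names taken from the
snippet above):
} finally {
    try { if (rs != null) rs.close(); } catch (Exception ignore) {}
    rs = null;
    try { if (stmnt != null) stmnt.close(); } catch (Exception ignore) {}
    stmnt = null;
    try { if (conn != null) conn.close(); } catch (Exception ignore) {}
    conn = null;
}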
To add to this option, you may want to use this patch
http://issues.apache.org/bugzilla/show_bug.cgi?id=27743
This way, instead of pulling the entire document back each time, you just
pull back your host field. Then do your check and only pull back the
rest of the document if you need to. This will help
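The shape of the resulting code is roughly this (a sketch only:
readHostField() stands in for whatever single-field accessor the patch
adds, so treat the names and signatures here as hypothetical and check
the patch itself for the real API):
// Hypothetical single-field read via the patched reader.
String host = readHostField(reader, docId);
if (needsFullDocument(host)) {             // your check; also hypothetical
    Document doc = reader.document(docId); // full load only when needed
}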
Here is a list of special characters that must be escaped in a query.
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
Query q = QueryParser.parse("great\\!", "all", new StandardAnalyzer());
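If you are escaping arbitrary user input, a small helper along these
lines does the job (a sketch; note that escaping the single characters
& and | also covers the two-character operators && and ||):
public static String escapeQuery(String s) {
    String special = "+-!(){}[]^\"~*?:\\&|";
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (special.indexOf(c) >= 0) sb.append('\\'); // backslash-escape
        sb.append(c);
    }
    return sb.toString();
}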
On 6/9/05, Zhang, Lisheng <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We are using lucene 1.4.3, we indexed a string
>
>
Corey,
I have one off-the-wall approach that may or may not work for you.
If you convert your scanned images to PDF, you can then use something like
Acrobat to convert those PDFs into PDFs with hidden text (the OCR
data). You can then tell Acrobat Reader via XML what to highlight when
your user opens the
decide which route is acceptable for our needs.
Thanks again
On 5/17/05, Richard Krenek <[EMAIL PROTECTED]> wrote:
> I was wondering about Lucene and NFS. The issue is with locking,
> correct? In Lucene in Action it mentions:
> ... issues with lock files and NFS, choose a directory t
I was wondering about Lucene and NFS. The issue is with locking,
correct? In Lucene in Action it mentions:
... issues with lock files and NFS, choose a directory that doesn't
reside on an NFS volume. If you have the book, flip to page 62. Does
it mean don't use NFS at all, or just ensure you point your lock
directory somewhere off NFS?
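(If it is just the lock files, I believe the 1.4 line lets you move them
off NFS with the org.apache.lucene.lockDir system property while the
index itself stays put; a sketch, with an illustrative path:
// Point Lucene's lock files at a local, non-NFS directory.
System.setProperty("org.apache.lucene.lockDir", "/var/tmp/lucene-locks");
)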
Unfortunately our indexes will be performance-sensitive. Is Lucene
still a good choice? What kind of hardware are you using?
Also, what are the performance implications of having the additional
80 fields in the index just for display purposes?
Thanks,
Richard Krenek
On 5/13/05, Vince
Hypothetically, I have 100 million records. Each record has 100+
fields. Only 20 of those fields need to be searched on; the rest
(including the 20) are just for display purposes.
Would it be best to just add the 20 fields to the index and keep the
rest in a relational database? What effect does all
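(The hybrid layout being described would look something like this; a
sketch using the 1.4.3 Field factory methods, with field names and
variables made up:
Document doc = new Document();
// Stored key for fetching the 80 display-only fields from the database.
doc.add(Field.Keyword("db_pk", primaryKey));
// One of the 20 searchable fields: indexed but not stored, since
// display comes from the database.
doc.add(Field.UnStored("part_number", partNumber));
// ... the other 19 searchable fields the same way ...
writer.addDocument(doc);
)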