Hi,
Or Lucene is more like Google in this sense, meaning that the time
doesn't depend on the size of the matched result
I found that it takes a long time if the result set is bigger (up to 25 sec for
29 M results). But for a smaller result set of approx. 10,000 it takes
approx. 200 ms.
On 6/2
Dear Chris,
Thanks for your reply, explain is a good friend indeed :-)
Actually, the problem was that the documents were weighted in the
indexing phase using the default similarity, and this was cached (as
documented). So switching the indexing to the HitCountSimilarity solves
the problem of 'st
Hello, Chris.
I have tried to CO it with HTTPS, and it works. Very weird... I don't use
any proxies or other firewalls. Also, CO from other sites works fine...
Can anyone explain this weird behavior of TortoiseSVN 1.3.3, Build
6219? :)
CH> : i am trying to CO or update it for 2 hours... can you perfor
On Jun 27, 2006, at 3:17 PM, Beady Geraghty wrote:
I wasn't very clear in my original note. I just want to make sure that I
can merge indexes created from different platforms/different OSes without
problem. So I understand from the response that this can be done.
Yes, this can be done.
Thank you for the response.
I wasn't very clear in my original note. I just want to make sure that I
can merge indexes created from different platforms/different OSes without
problem. So I understand from the response that this can be done.
Thanks
On 6/27/06, Erik Hatcher <[EMAIL PROTECT
On 06/27/06 at 1:00 PM, Yura Smolsky wrote:
> svn co -r 417135
> http://svn.apache.org/repos/asf/lucene/java/trunk
> lucene-java-2.0.0-417135
I successfully ran this exact command line just now -- no errors.
It is strange that the revision number given with the checkout command
(417135) does not
: i am trying to CO or update it for 2 hours... can you perform updates or COs?
both. I tried the exact revision checkout you specified in your original
email...
svn co -r 417135 http://svn.apache.org/repos/asf/lucene/java/trunk
lucene-java-2.0.0-417135
...and had no problems.
perhaps there
Yura Smolsky wrote:
Hello, Chris.
CH> : svn co -r 417135
CH> http://svn.apache.org/repos/asf/lucene/java/trunk lucene-java-2.0.0-417135
CH> : svn: REPORT request failed on
CH> '/repos/asf/!svn/bc/417505/lucene/java/trunk'
CH> : svn: REPORT of '/repos/asf/!svn/bc/417505/lucene/java/trunk':
CH> 40
Thanks, Mike. This info is actually quite helpful. What is the 'times 10
rule' you are referring to?
Also, I wonder how Lucene is handling the growth of the result set
returned by the query? In the various search engine implementations I
did myself for several projects that was one of the things which
Hello, Chris.
CH> : svn co -r 417135
CH> http://svn.apache.org/repos/asf/lucene/java/trunk lucene-java-2.0.0-417135
CH> : svn: REPORT request failed on
CH> '/repos/asf/!svn/bc/417505/lucene/java/trunk'
CH> : svn: REPORT of '/repos/asf/!svn/bc/417505/lucene/java/trunk':
CH> 400 Bad Request (http://
On Jun 27, 2006, at 2:02 PM, Daniel Naber wrote:
On Tuesday 27 June 2006 17:23, Beady Geraghty wrote:
I tried to look at the segments file, thinking that it points to the
various other files in the index directory,
Use IndexWriter.addIndexes() to merge two or more indexes.
Or use the Ind
: svn co -r 417135 http://svn.apache.org/repos/asf/lucene/java/trunk
lucene-java-2.0.0-417135
: svn: REPORT request failed on '/repos/asf/!svn/bc/417505/lucene/java/trunk'
: svn: REPORT of '/repos/asf/!svn/bc/417505/lucene/java/trunk': 400 Bad Request
(http://svn.apache.org
: )
: make: *** [luce
: Similarity that simply returns the number of matched terms per document
: as the score. I tried making one that returns freq as tf and 1.0f as
: anything else, but that gives strange results; same for something that
: really returns 1.0f whatever.
That's because when your tf function always retu
Hello.
I have encountered a weird error with CO of Lucene when I try to build
PyLucene:
[EMAIL PROTECTED] /cygdrive/d/workshop/PyLucene
$ make
svn co -r 417135 http://svn.apache.org/repos/asf/lucene/java/trunk
lucene-java-2.0.0-417135
svn: REPORT request failed on '/repos/asf/!svn/bc/417505/lucen
On Tuesday 27 June 2006 17:23, Beady Geraghty wrote:
> I tried to look at the segments file, thinking that it points to the
> various other files in the index directory,
Use IndexWriter.addIndexes() to merge two or more indexes.
Regards
Daniel
--
http://www.danielnaber.de
Yup, this is pretty much how I do it for lucenebook.com (though quite
admittedly it's got a minuscule amount of data behind it, which
rarely changes). I don't use a servlet initialization to put the
searcher into application scope, though, as I'm using blojsom for the
blogging system and i
I used Java libraries for RTF file formats. Refer to Manning's Lucene
in Action book. It is helpful and gives pointers to where you can access
different libraries.
suba suresh.
mcarcelen wrote:
Hi,
Do you know another library for indexing RDF?
Thanks a lot for your help
Teresa
-Mensaje ori
Erik:
I commend you for giving all the information that's relevant. For the sake
of simplicity, and because it is the vast majority of use cases, could you
endorse the following as the simplest, most correct way (i.e. a best
practice) to implement Lucene for Web applications.
1- create an In
It depends on the document type; look at the setOmitNorms method in the Field class.
heritrix.lucene wrote:
Hi,
I have processed approx. 50 million up to now. I kept the maxMergeFactor and
maxBufferedDoc values at 1000. I got this value after several rounds of test
runs.
Indexing rate for each document in 50 M, is
Hi,
Do you know another library for indexing RDF?
Thanks a lot for your help
Teresa
-Original Message-
From: Suba Suresh [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 27, 2006 17:38
To: java-user@lucene.apache.org
Subject: Re: Lucene indexing pdf
I used PDFBox library as menti
I used PDFBox library as mentioned in Lucene in Action. It works for me.
You can access it from www.pdfbox.org
suba suresh
mcarcelen wrote:
Hi,
I'm new to Lucene and I'm trying to index a PDF, but when I query
everything it returns nothing. Can anyone help me?
Thanks a lot
Teresa
---
Hi Teresa
You need to convert the pdf file into text format before adding the
text to the Lucene index.
You may like to look at http://www.pdfbox.org/ for a library to
convert pdf files to text format.
Patrick
On 27/06/06, mcarcelen <[EMAIL PROTECTED]> wrote:
Hi,
I´m new with Lucene and I´m t
Hi,
I'm new to Lucene and I'm trying to index a PDF, but when I query
everything it returns nothing. Can anyone help me?
Thanks a lot
Teresa
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROT
Hi,
I am trying to merge in an index from a different node and probably a different
platform.
I tried some simple cases by copying an index created on a Windows machine
and bringing it to a Linux server. I seem to be able to search from this index that
is copied over. I would therefore assume that I c
Dear Ziv, List,
I am probably doing something stupid... I was trying to create a
Similarity that simply returns the number of matched terms per document
as the score. I tried making one that returns freq as tf and 1.0f as
anything else, but that gives strange results; same for something that
reall
On Jun 27, 2006, at 10:32 AM, Fabrice Robini wrote:
That's also my case...
I create a new IndexSearcher at each query, but with a static and
instantiated Directory.
new IndexSearcher(myDirectory)
It seems to be OK... am I wrong?
You may be "ok" given your query patterns, but you won't benef
Michael - you're absolutely right in your thinking. As long as
IndexReader is long-lived you'll be fine. All caches internal to
Lucene are based off the IndexReader, which is implicitly constructed
under the covers of IndexSearcher if not specified directly.
Erik
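The point about keeping the reader long-lived can be sketched in plain Java. This is only an illustration under assumptions: `SharedHolder` is a hypothetical name, not a Lucene class, and the `Supplier` stands in for whatever actually opens the IndexReader or IndexSearcher. It shows one way to open an expensive object once and hand the same instance to every request thread.

```java
import java.util.function.Supplier;

// Hypothetical holder (not part of Lucene): opens an expensive resource once
// and shares that single instance across all calling threads.
final class SharedHolder<T> {
    private final Supplier<T> opener;   // assumption: wraps the real "open index" call
    private volatile T instance;        // volatile so other threads see the published value

    SharedHolder(Supplier<T> opener) {
        this.opener = opener;
    }

    // Returns the shared instance, opening it on first use only.
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = opener.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

With a holder like this, each request would call `get()` and run its search against the shared instance, while keeping the per-request results in local variables rather than in shared fields.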
On Jun 27, 2006
That's also my case...
I create a new IndexSearcher at each query, but with a static and
instantiated Directory.
new IndexSearcher(myDirectory)
It seems to be OK... am I wrong?
-Original Message-
From: Crump, Michael [mailto:[EMAIL PROTECTED]
Sent: mardi 27 juin 2006 16:04
To: java-us
The singleton pattern is better. Then you can extend it to the proxy pattern.
existing IndexReader really isn't that expensive and does get around
Hello,
I have another question along this line. One of the points made in this
thread was to never create a new IndexSearcher for each query. Is this
true even in the case that an IndexSearcher is being created with a
static or cached IndexReader using the IndexSearcher(IndexReader reader)
const
You can initialize your IndexSearcher in a Servlet Listener, and even warm it up
with a few queries. That way, when the user sends the first query, it won't take a
long time to load the index into RAM.
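The warm-up idea above could look roughly like the following plain-Java sketch. Everything here is an assumption for illustration: `Warmer` is a hypothetical helper, and the `Function` stands in for the real search call on the shared searcher. In a web application it would be invoked once from the startup listener.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical warm-up helper (not a Lucene or servlet API class).
final class Warmer {
    // Runs each warm-up query once against the given search function and
    // returns how many completed. Failures are ignored so that a bad
    // warm-up query cannot block application startup.
    static int warmUp(Function<String, ?> search, List<String> queries) {
        int completed = 0;
        for (String query : queries) {
            try {
                search.apply(query);
                completed++;
            } catch (RuntimeException e) {
                // Warming is best-effort; skip this query and continue.
            }
        }
        return completed;
    }
}
```

The warm-up queries are best chosen to resemble real traffic, so that the caches filled at startup are the ones the first users will actually hit.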
> -Original Message-
> From: Fabrice Robini [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 2
Erik,
Thank you for your reply.
I'm going to use the static IndexSearcher in my Servlet (my index is static).
Thanks :-)
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: mardi 27 juin 2006 12:49
To: java-user@lucene.apache.org
Subject: Re: IndexSearcher in Servlet
If you want OR, you need to make all clauses SHOULD, no MUSTs.
Erik
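To make the SHOULD/MUST distinction concrete, here is a small plain-Java model of the matching rule. It is a sketch only: `ClauseModel` and its `Occur` enum are hypothetical stand-ins written for this example and are not the Lucene classes, and a document is modelled as a simple set of "field:value" strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Plain-Java model of MUST/SHOULD clause matching; NOT the Lucene API.
class ClauseModel {
    enum Occur { MUST, SHOULD }

    // One clause: a "field:value" term plus how it has to occur.
    static final class Clause {
        final String term;
        final Occur occur;
        Clause(String term, Occur occur) {
            this.term = term;
            this.occur = occur;
        }
    }

    private final List<Clause> clauses = new ArrayList<>();

    ClauseModel add(String term, Occur occur) {
        clauses.add(new Clause(term, occur));
        return this;
    }

    // A document matches when every MUST clause is present and, if there
    // are no MUST clauses at all, at least one SHOULD clause is present.
    boolean matches(Set<String> docTerms) {
        boolean hasMust = false;
        boolean anyShould = false;
        for (Clause c : clauses) {
            boolean present = docTerms.contains(c.term);
            if (c.occur == Occur.MUST) {
                hasMust = true;
                if (!present) {
                    return false;
                }
            } else if (present) {
                anyShould = true;
            }
        }
        return hasMust || anyShould;
    }
}
```

In this model, two SHOULD clauses for `subject:cs` and `author:ritchie` behave like OR, while marking both as MUST behaves like AND.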
On Jun 27, 2006, at 4:50 AM, heritrix.lucene wrote:
Hi, I am using Lucene 1.9.1.
My query is :
(subject:cs OR author:ritchie)
I am creating one Boolean query for two TermQueries.
t1 = new Term("subject", "cs")
t2 = n
Hi All,
I am sorry for my mistake. Now I agree with you.
I had a mistake in my code: I was sharing the Hits object in the servlet, and
that was my foolish mistake. Since I changed it and ran the test case again,
there has been no problem.
I am using a single static IndexSearcher now :)
Thanks
On Jun 27, 2006, at 5:47 AM, Fabrice Robini wrote:
What is your advice for webApplication ?
It all depends :)
- IndexSearcher pool ?
No point in that. A single IndexSearcher for searches is all that is
ever needed. Having a warming IndexSearcher, as Solr implements,
makes sense in so
Thank you very much Erik!
I will definitely check into this. I'm currently using xfire in my
implementation. I guess the big issue was/is that the indexing is done by
one application, and the searching from several different applications.
(obviously)
I have been bitten by the lucene-virus;)
Aleksander - if you're wrapping Lucene with a web service, you'd do
well to investigate Solr - http://incubator.apache.org/solr - as it
handles all of the index management in a very elegant fashion. It
currently does not support a SOAP interface, but rather a RESTful
light-weight custom XM
Hi Erik,
What is your advice for webApplication ?
- IndexSearcher pool ?
- New IndexSearcher for each query ?
- Something else ?
Thanks a lot,
Fab
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: mardi 27 juin 2006 11:41
To: java-user@lucene.apache.org
Subject: Re
On Jun 27, 2006, at 5:11 AM, heritrix.lucene wrote:
Hi,
I also had the same confusion. But today when I did the testing I found that
it will merge your results. Therefore I believe that IndexSearcher is not
thread safe. I tried this on 10,000 requests per second.
You must have something e
Hi,
I also had the same confusion. But today when I did the testing I found that
it will merge your results. Therefore I believe that IndexSearcher is not
thread safe. I tried this on 10,000 requests per second.
With Regards
On 6/27/06, Ramana Jelda <[EMAIL PROTECTED]> wrote:
Hi,
You are wrong
Hi,
You are wrong.
In your case (if I ignore any updates to the index), one IndexSearcher object is
enough.
IndexSearcher is thread safe.
Jelda
> -Original Message-
> From: heritrix.lucene [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 27, 2006 10:58 AM
> To: java-user@lucene.apache.org
> S
Hi,
Thanks a lot for your reply :-)
I totally agree with you, I'm going to use a pool.
Are there "design patterns" in a Lucene Sandbox for an IndexSearcher pool?
What are the best practices?
Thanks a lot,
Fab
-Original Message-
From: heritrix.lucene [mailto:[EMAIL PROTECTED]
Sent: mar
Hi,
The same question I asked yesterday. :-)
And now I know the answer :0
Creating a new searcher for each query will make your application very, very
slow... (leave this idea)
You cannot have a static IndexSearcher object. It will merge all results and
the user will get the result of their que
No. I am not sorting the data...
On 6/27/06, Martin Braun <[EMAIL PROTECTED]> wrote:
Hi Chris,
> searching every time using a new searcher was taking time. So for testing, I
> made it a static one and reused the same. This gave me a lot of
> improvement.
> Previously my query was taking approx
Hi, I am using Lucene 1.9.1.
My query is :
(subject:cs OR author:ritchie)
I am creating one Boolean query for two TermQueries.
t1 = new Term("subject", "cs")
t2 = new Term("author","ritchie")
For this, the BooleanQuery I created is:
BooleanQuery mergedQuery = new BooleanQuery();
mergedQuery.add(n
Hello,
I have a question about the IndexSearcher().
I have a Servlet that has a searchDocument(String theQuery) method.
This method instantiates a new IndexSearcher for each query:
searchDocument(String theQuery)
{
Searcher searcher = new IndexSearcher(indexPath);
Hi Chris,
> searching every time using a new searcher was taking time. So for testing, I
> made it a static one and reused the same. This gave me a lot of
> improvement.
> Previously my query was taking approx. 25 sec. But now most of the queries
> are taking between 100 and 800 ms.
Do you
On Tuesday 27 June 2006 09:23, heritrix.lucene wrote:
> Hi,
> First of all, thanks for your attention...
> I think I've got the solution.
> Actually, earlier I was creating a different searcher object for each
> query. Creating the searcher object was not taking a lot. But
> searching everyti
Hi,
First of all, thanks for your attention...
I think I've got the solution.
Actually, earlier I was creating a different searcher object for each
query. Creating the searcher object was not taking a lot. But
searching every time using a new searcher was taking time. So for testing, I
made i