I put a new Jar there. Please send email if things still don't work with Maven.
-bash-2.05b$ ls -al
total 434
drwxrwxr-x 2 martinc apcvs 512 Jun 29 22:43 .
drwxrwxr-x 4 martinc apcvs 512 Jun 11 09:01 ..
-rw-rw-r-- 1 martinc apcvs 2336 Jun 11 09:01 lucene-core-2.0.0.jar
-rw-r--r--
Yeah, you are correct. My idea will not work when there are lots of documents
in the index and also lots of hits for that page.
I am going with your approach :-)
Thanks...
On 6/29/06, James Pine <[EMAIL PROTECTED]> wrote:
Hey,
I'm not a performance guru, but it seems to me that if
you've got
hi
your attachment is empty; there is no Java source code in it.
Liao Xuefeng <[EMAIL PROTECTED]> wrote: hi, all,
I wrote my own HTML parser because it just meets my requirements and does not
depend on third-party libraries, and I'd like to share it (in the attachment).
This class provides some static methods to
Hey,
I've looked at the documentation for:
org.apache.lucene.search.Searchable
org.apache.lucene.search.Searcher
org.apache.lucene.search.IndexSearcher
and it struck me that there are no search methods with
these signatures:
void search(Query query, Filter filter, HitCollector results, Sort sort)
I have a clustered environment, with a load-balancer in the front
assigning connections. Is it better to have one of the cluster running
a searcher as a webservice (to be accessed by the other machines in the
cluster) or to have a IndexReader/Searcher for each machine in the
cluster?
Jeff
-O
Thanks for the prompt reply. I will look for something similar for the .NET
version; I posted in this group as it is more active.
--
View this message in context:
http://www.nabble.com/Lucene-Dynamic-http-Web-Page-Search-tf1867987.html#a5111083
Sent from the Lucene - Java Users forum at Nabble.com.
What are the conditions that cause corruption? If there is just one
writer and multiple readers, is that safe?
The cases are well spelled out in Lucene in Action, section 2.9.
Generally, even one writer and multiple readers is not safe if locking is
disabled.
For example, the IndexReader, when
Otis Gospodnetic wrote:
Try using HitCollector and break out of it when you collect enough documents.
My guess is that if you are not doing anything crazy with Hits (like looping
through them all) this won't be that much faster than using Hits.
Well, in practice it does help - see the way
Lucene uses this lock to ensure the index does not become
corrupt when IndexReaders and IndexWriters are working on the same index.
What are the conditions that cause corruption? If there is just one
writer and multiple readers, is that safe?
---
Try using HitCollector and break out of it when you collect enough documents.
My guess is that if you are not doing anything crazy with Hits (like looping
through them all) this won't be that much faster than using Hits.
Otis
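Otis's suggestion can be sketched in plain Java (no Lucene dependency), with hypothetical names. In real code you would subclass org.apache.lucene.search.HitCollector and let Searcher.search(query, collector) drive collect():

```java
import java.util.ArrayList;
import java.util.List;

public class LimitedCollector {
    private final int limit;
    private final List<Integer> docIds = new ArrayList<Integer>();

    public LimitedCollector(int limit) { this.limit = limit; }

    // Called once per matching document; ignore everything past the limit.
    public void collect(int docId, float score) {
        if (docIds.size() < limit) {
            docIds.add(docId);
        }
        // HitCollector.collect returns void, so you cannot "break" cleanly;
        // a common trick is to throw a private RuntimeException once the
        // limit is reached and catch it around the search() call.
    }

    public List<Integer> getDocIds() { return docIds; }
}
```

Note that matches arrive in doc-id order, not score order, so this caps work per query rather than returning the best N hits.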
- Original Message
From: Dominik Bruhn <[EMAIL PROTECTED]>
T
> When I create an index with the class IndexModifier in Lucene 1.9.1, there
> is a lock file created in a temp folder.
> My question is: Is it possible to disable this option?
> If yes, how do I proceed?
Yes, there is.
You can call the static FSDirectory.setDisableLocks(true) to disable
locking entirely.
Hi,
Would you please send me your parser too?
Thanks!
Malcolm
- Original Message
From: Liao Xuefeng <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, June 23, 2006 12:54:29 AM
Subject: RE: HTML text extraction
hi, all,
I wrote my own html parser because it just meets
Hi Clive,
Lucene is a general purpose search engine. If you need crawling
capabilities on top of Lucene take a look at Nutch:
http://lucene.apache.org/nutch/
On 6/29/06, Clive. <[EMAIL PROTECTED]> wrote:
Hi,
I am working on adding a search feature to a web site that uses single
database dri
When I create an index with the class IndexModifier in Lucene 1.9.1, there is
a lock file created in a temp folder.
My question is: Is it possible to disable this option?
If yes, how do I proceed?
Hi,
how can I limit the result count of a query in order to save time? I searched
the web but didn't find a solution.
Thanks
Dominik
Hello gentlemen,
I am a novice to Lucene and Carrot2, but I have an urgent requirement to build
a prototype using Lucene and Carrot2. Please help me with a working web
application demo along with code.
Thanks
Arun
Hey,
I'm not a performance guru, but it seems to me that if
you've got millions of results coming back then you
probably don't want to call ArrayList.add() each time,
as it will have to grow itself a bunch of times. Also,
even ints take up space in memory, so if you only need
20 of them, then stor
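James's point about not growing an ArrayList per hit can be sketched as a fixed-size buffer that keeps only the best N scores. This is a hypothetical stand-alone sketch using simple sorted insertion; Lucene itself does the same job with a heap (its PriorityQueue):

```java
public class TopN {
    private final float[] scores; // best scores so far, descending
    private int size = 0;

    public TopN(int n) { scores = new float[n]; }

    public void insert(float score) {
        if (size == scores.length && score <= scores[size - 1]) {
            return; // not good enough to enter the top N, no allocation at all
        }
        // overwrite the current worst slot (or append while not yet full)
        int pos = (size < scores.length) ? size++ : size - 1;
        scores[pos] = score;
        // bubble the new score up to keep the array sorted descending
        for (int i = pos; i > 0 && scores[i] > scores[i - 1]; i--) {
            float tmp = scores[i]; scores[i] = scores[i - 1]; scores[i - 1] = tmp;
        }
    }

    public float[] top() {
        float[] out = new float[size];
        System.arraycopy(scores, 0, out, 0, size);
        return out;
    }
}
```

With millions of hits but only 20 slots needed, this never allocates beyond the initial array.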
Hi,
I am working on adding a search feature to a web site that uses single
database driven aspx pages and would like to know if Lucene can search using
the http url address or database to index from.
At present I can only see Lucene being able to search physical files in a
Windows folder.
Any
What about the scoring worries you?
I would say this is the best approach, and also the suggested approach over
at the Lucene WordNet page:
http://www.tropo.com/techno/java/lucene/wordnet.html
Of course, you could say that matches on the original search should return
with a higher score. You
Yes, that is a good idea, and thanks for the suggestion.
But isn't that painful?
Then the scoring really worries us. Hence, would we have to prefer boosting
the original content?
Can you find or suggest a better solution?
Thanks... Ramesh.S
On Thu, 2006-06-29 at 16:10 +0200, Aleksander M. Stensby w
No... I don't think that's the idea.
I think that you would make use of the WordNet index after a user has
entered a search. You take each term of the search, look up those terms in
the WordNet index, then use the results you get to search your index for
all those aggregated terms along with the
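The expansion step described above can be sketched in self-contained form. The Map here is a hypothetical stand-in for lookups against the WordNet synonym index; in the real setup each get() would be a search against that index instead:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SynonymExpander {
    private final Map<String, List<String>> synonyms;

    public SynonymExpander(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    // Expand a whitespace-separated query into original terms plus synonyms.
    public Set<String> expand(String query) {
        Set<String> terms = new LinkedHashSet<String>();
        for (String t : query.toLowerCase().split("\\s+")) {
            terms.add(t); // always keep the original term
            List<String> syns = synonyms.get(t);
            if (syns != null) terms.addAll(syns); // aggregate the synonyms
        }
        return terms;
    }
}
```

The aggregated term set would then be turned into an OR query against the main index, which is why boosting the original terms over the synonyms often comes up next.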
Hi everybody,
I'm searching a test collection for an academic
digital library (with relevant/judgement file like
TREC collections). Requirement: documents are
scientific articles, with full references.
I've heard about collection of INEX with scientific
articles from IEEE journals. Are there an
Hi,
it seems like I'm awestruck.
My Index is working fine.
Now, have got the WordNet synonym-index.
How do I make use of this index to get synonym-supported search results?
Do I have to merge these 2 indexes using the Merge class? Will that
work?
or
Do I have to inject the field "word" values
If your database table looks like this:
ID - Content - Subject - Author
you get the fields from your DB and presumably store them in some bean, or
directly in strings like this:
String id, content, subject, author.
You can create a Lucene document in this fashion:
final Document doc = new Doc
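The row-to-document mapping described above can be sketched in self-contained form. To keep this snippet runnable without Lucene on the classpath it builds a plain field map; in the real Lucene 1.9/2.0 code each put() would instead be doc.add(new Field(name, value, Field.Store.YES, Field.Index.TOKENIZED)), with UN_TOKENIZED for the ID:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowMapper {
    // One "document" per DB row: each column becomes a named field.
    public static Map<String, String> rowToFields(String id, String content,
                                                  String subject, String author) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("id", id);           // keyword field; untokenized in real code
        fields.put("content", content); // the main searchable text
        fields.put("subject", subject);
        fields.put("author", author);
        return fields;
    }
}
```

Iterating a JDBC ResultSet and calling this once per row, then adding each resulting document with IndexWriter.addDocument(), is the usual shape of the loop.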
hi martin,
the thing is that I am new to Lucene and I am not sure how to use it.
The connection through JDBC and the select statements are all done.
I just want to know how I can create a Lucene document per
row? If you could provide some pseudo-code kind of thing...
As in the demo, the indexing is done on files.
amit ku
[EMAIL PROTECTED] schrieb:
> hi,
>
> my problem is that I am using a MySQL DB in which one table is
> present, and I want to index each row in the table and then search.
>
> Please reply: how can this be done?
http://wiki.apache.org/jakarta-lucene/LuceneFAQ
How can I use Lucene to index a database?
Co
Hi Chris,
I find this incredibly interesting!
Thank you for your full explanation. I was aware of the components, but not
the implementation.
... to provide a means to query both document full-text and metadata using
an RDF model
Is there anything I can read about how you have come to this approach?
hi,
my problem is that I am using a MySQL DB in which one table is
present, and I want to index each row in the table and then search.
Please reply: how can this be done?
amit kumar
That is exactly what I did when I started to realize the effects of using
Lucene sorting with millions of documents in the index. I used STORED fields
and sorted the results with a generic Comparator, which is configured for a
field and a search order. I only do this if the query did not return
Perhaps that's not what you meant; perhaps you aren't iterating over any
results, in which case using a HitCollector instead isn't necessarily going
to bring that 17 sec down.
As I told earlier, for the same query the minimum time is 2-3 sec, and this
time is after several attempts (so I think up to th
This will break performance. It is better to first collect all the document
numbers (code without the proper declarations):
public void collect(int id, float score) {
  if (docCount >= startDoc && docCount < endDoc) {
    docNrs.add(id); // or use int[] docNrs when possible.
  }
  docCount++;
}
Why
On Thursday 29 June 2006 06:17, James Pine wrote:
> A HitCollector object invokes its collect method on
> every document which matches the query/filter
> submitted to the Searcher.search method. I think all
> you would need to do is pass in the page number and
> results per page to your HitCollector
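James's paging idea can be sketched in plain Java (no Lucene dependency), with hypothetical names; in real code this logic would live in a HitCollector subclass driven by Searcher.search(query, collector):

```java
import java.util.ArrayList;
import java.util.List;

public class PagingCollector {
    private final int startDoc; // inclusive: page * resultsPerPage
    private final int endDoc;   // exclusive
    private int docCount = 0;
    private final List<Integer> docNrs = new ArrayList<Integer>();

    public PagingCollector(int page, int resultsPerPage) {
        this.startDoc = page * resultsPerPage;
        this.endDoc = startDoc + resultsPerPage;
    }

    // Called once per matching document; record only ids inside the window.
    public void collect(int id, float score) {
        if (docCount >= startDoc && docCount < endDoc) {
            docNrs.add(id);
        }
        docCount++;
    }

    public List<Integer> page() { return docNrs; }
}
```

Note that collect() sees matches in doc-id order, not relevance order, so this pages through unsorted matches; for score-ordered pages you would still need a top-N structure.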