Hi Doug,
that's exactly what I'm looking for!
Thanks very much,
Daniel
Doug Cutting wrote:
Daniel Rabus wrote:
I've created a Semantic Desktop application using Lucene. For a
presentation I'd like to create a poster. Unfortunately I haven't
found any high resolution version (or vector gr
Thanks for the nice summary.
A couple of other things that come to my mind are:
1. Not all columns are indexed in a database; searching on
a non-indexed field would be very expensive.
2. Some dbs do support free form indexes but it is an
offline index. Good thing is that regular sql can be
used to tak
I would like to store large source documents (>10MB) in the index in their
original form, i.e. as text for text documents or as byte[] for binary
documents.
I have no difficulty adding the source document as a field to the Lucene
index document, but when I write the index document to the index I
Bill,
I don't know of any automatic method to get notified
of index changes.
One option is that the process that updates the index
sends a signal to the search daemon.
Or, the search daemon may have a thread that
periodically checks for new indexes (I think of two
possible checks: check the versio
Daniel,
Thanks for the note. But I think you misunderstand a bit (or I do :-).
These are two separate processes. The updater (in Java) runs and
exits, flushing its buffers, over and over again, as new info comes
in.
The query server (in Python), however, runs continuously, doing
searches and s
Bill Janssen wrote:
I've got a daemon process which keeps an IndexSearcher open on an
index and responds to query requests by sending back document
identifiers. I've also got other processes updating the index by
re-indexing existing documents, deleting obsolete documents, and
adding new documen
I've got a daemon process which keeps an IndexSearcher open on an
index and responds to query requests by sending back document
identifiers. I've also got other processes updating the index by
re-indexing existing documents, deleting obsolete documents, and
adding new documents. Is there any way
Hello,
For the first problem (indexing different types of documents), you can use the
mini-framework for doing just that. Just get the source code that comes with
Lucene in Action, and play - http://www.lucenebook.com/
For the Analyzers, look what Snowball provides (do a search at lucenebook.co
On Thursday 19 January 2006 14:51, Ranjan K. Baisak wrote:
> I have different categories of indexes in different
> directory. When I am searching for a category of
> "ALL", lucene should search using indexes from all
> directories.
> So is it possible?
Use this class:
http://lucene.apache.org/ja
Hi Jose,
There are several papers on that topic. But the one which
particularly interested me was
An information-theoretic approach to automatic query expansion
Claudio Carpineto, Renato de Mori, Giovanni Romano, Brigitte Bigi
ACM Transactions on Information Systems (TOIS)
It
I agree, it's definitely not what one wants. But to answer your question: Yes,
I do use RemoteSearchable on the server side.
-Original Message-
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Thu 2006-01-19 18:44
To: java-user@lucene.apache.org
Subject: Re: Limiting hits?
I'm certai
Hi,
I am starting to work with Lucene and need a few explanations to do what I want;
thanks for your helpful answers.
I have to add Lucene to a Java application and I have two targets:
- To enable search through different types of files, like MS Word, PDF or
Excel files.
I read that each type of docume
Which articles have you read? I work on automatic query expansion and automatic
thesaurus generation, and I use Lucene for my tests, but so far I don't have
excellent results. In a few days I will have results based on this method:
Concept Based Query Expansion (1993)
Yonggang Qiu Department o
On Jan 19, 2006, at 10:01 AM, [EMAIL PROTECTED] wrote:
I've a question about the lucene search method. What is the
difference between a search with the class
lucene.queryParser.QueryParser and the class lucene.search.Query
and their subclasses?
QueryParser is a Query "factory". It takes
Hi,
Has anyone experimented with information-theory-based expanded query
weight boosting for Lucene?
When the user query is small, there are several ways to expand
the query terms by synonym terms, morph terms etc. I read several
articles on how different boosting levels affect the
I'm certain how the Hits class works, but I've never used Lucene with
RMI before.
I suppose if one does it incorrectly, every hit could end up going
across the network (definitely not what you want).
Are you using RemoteSearchable on the server side?
-Yonik
On 1/19/06, Daniel Pfeifer <[EMAIL PRO
Lucene is aimed at ~10M-document indexes on a single CPU.
Anyway, I tried up to 20 GB and, believe me, Lucene holds up pretty well.
Manish Chowdhary
[EMAIL PROTECTED]
-Original Message-
From: z shalev [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 19, 2006 10:52 PM
To: java-user@lucene
Are you certain? I am quite sure we retrieve a huge amount of data if there are
thousands of matches to one query.
-Original Message-
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Thu 2006-01-19 16:45
To: java-user@lucene.apache.org
Subject: Re: Limiting hits?
Hits doesn't keep tr
Daniel Rabus wrote:
I've created a Semantic Desktop application using Lucene. For a
presentation I'd like to create a poster. Unfortunately I haven't found
any high resolution version (or vector graphic) of the Lucene logo. At
http://svn.apache.org/repos/asf/lucene/java/trunk/docs/images/ only
Hey,
is there a maximum amount of data (in gigabytes) at which Lucene's performance
starts to deteriorate?
I tested with about 2 GB on two instances (2 RAM dirs using the
ParallelMultiSearcher) and performance was great;
however, I think I will need to support about 10-15 times as much
On 1/20/06, Klaus <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> >Actually, my problem is that, for instance, for a document d, Its feature
> >vector may be keywords and concepts.
>
> What exactly do you mean by feature vector? You are referring to the
> predicate - object pairs, connected to one subject
Yonik,
You're right, it's unnecessary unless you want to search on both (so
index both), as you said in your other message
SL
Yonik Seeley wrote:
On 1/19/06, Stéphane Lagraulet <[EMAIL PROTECTED]> wrote:
Hi,
You'd better use 2 fields, one analysed and not stored, and the other
one only
On 1/19/06, Stéphane Lagraulet <[EMAIL PROTECTED]> wrote:
> Hi,
> You'd better use 2 fields, one analysed and not stored, and the other
> one only stored.
There is no need for that. A single field that is both indexed and
stored will give you the same thing.
-Yonik
--
Hi,
You'd better use 2 fields, one analysed and not stored, and the other
one only stored.
So you perform the query on the analysed field and present the other
field (not stemmed) in the result.
Stephan Lagraulet
Klaus wrote:
Hi,
Is there a way to get the unstemmed term out of the lucene
Do you want to search for the unstemmed term, or just be able to retrieve it?
When you retrieve a document, you get the un-analyzed original fields.
If you want to index both the stemmed and unstemmed terms, the
easiest way is to add the field twice (the second time using a
different field name)
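
A rough illustration of the "add the field twice" idea follows. It is a plain-Java toy, not Lucene code: the class name, the field names "body" and "body_exact", and the crude suffix-stripping stemmer are all made up for the sketch, and a real setup would use a proper stemming analyzer for one field and a non-stemming analyzer for the other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index: every token is stored under two field names,
// once stemmed ("body") and once in its original form ("body_exact").
class TwoFieldIndexSketch {
    // field name -> term -> ids of the documents containing that term
    private final Map<String, Map<String, List<Integer>>> postings = new HashMap<>();

    /** Toy stemmer (an assumption for this sketch): strips a trailing "ing" or "s". */
    static String stem(String term) {
        if (term.endsWith("ing") && term.length() > 5) {
            return term.substring(0, term.length() - 3);
        }
        if (term.endsWith("s") && term.length() > 3) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /** Indexes the same text twice, under a stemmed and an unstemmed field. */
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            add("body", stem(token), docId);
            add("body_exact", token, docId);
        }
    }

    private void add(String field, String term, int docId) {
        List<Integer> docs = postings
                .computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term, t -> new ArrayList<>());
        if (!docs.contains(docId)) {
            docs.add(docId);
        }
    }

    /** Returns the ids of documents whose given field contains the exact term. */
    List<Integer> search(String field, String term) {
        Map<String, List<Integer>> terms = postings.get(field);
        if (terms == null) {
            return new ArrayList<Integer>();
        }
        List<Integer> docs = terms.get(term);
        return docs == null ? new ArrayList<Integer>() : docs;
    }
}
```

Searching "body" for the stem finds every inflected form, while searching "body_exact" only matches the word exactly as it was written.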
Hi,
Is there a way to get the unstemmed term out of the lucene index, or do I
have to change the analyzer, to save the original term and the stemmed one?
Thanks,
Klaus
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional
Hi,
>Actually, my problem is that, for instance, for a document d, Its feature
>vector may be keywords and concepts.
What exactly do you mean by feature vector? You are referring to the
predicate - object pairs, connected to one subject node, aren't you?
>I don't know how to weight the two
>ite
Hits doesn't keep track of all 100,000 matches, only the first 100.
It dynamically collects more matches if it needs to.
-Yonik
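
The behaviour described above (keep only a first batch of matches, collect more on demand) can be pictured roughly as follows. This is a plain-Java sketch of the general idea only, not the Hits source: the class name, the `topDocs` callback and the doubling strategy are assumptions made for illustration; only the initial batch of 100 comes from the message above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical lazy result list: caches a small batch of matches and
// re-runs the search with a larger limit only when a later hit is requested.
class LazyHitsSketch {
    // Assumed callback: returns the ids of the best n matching documents.
    private final IntFunction<List<Integer>> topDocs;
    private final List<Integer> cache = new ArrayList<>();
    private int fetches = 0;

    LazyHitsSketch(IntFunction<List<Integer>> topDocs) {
        this.topDocs = topDocs;
        fetch(100); // only the first 100 matches are collected up front
    }

    private void fetch(int n) {
        cache.clear();
        cache.addAll(topDocs.apply(n));
        fetches++;
    }

    /**
     * Returns the document id at rank i, collecting a larger batch first
     * if i lies beyond what is cached. Throws IndexOutOfBoundsException
     * if there are fewer than i + 1 matches in total.
     */
    int doc(int i) {
        if (i >= cache.size()) {
            fetch((i + 1) * 2); // assumption: double the requested size
        }
        return cache.get(i);
    }

    int cached() {
        return cache.size();
    }

    int fetches() {
        return fetches;
    }
}
```

With a query that has 100,000 matches, reading only the first page of results never materialises more than the first batch.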
On 1/19/06, Daniel Pfeifer <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am currently looking for a way to limit the amount of Hits which are
> returned by a Query.
>
> What I
Check out minNrShouldMatch in BooleanQuery in the latest lucene
version (1.9 dev version in subversion).
-Yonik
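
To make the "at least N of the query terms" requirement concrete, here is a minimal plain-Java sketch of the matching rule. It does not use the Lucene API; the class and method names are hypothetical, and it only shows the acceptance test for a single document, with no scoring.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper showing the "minimum number of terms must match" rule.
class MinShouldMatchSketch {
    /** Counts how many distinct query terms occur among the document's terms. */
    static int countMatches(Set<String> docTerms, List<String> queryTerms) {
        int matches = 0;
        for (String term : new HashSet<>(queryTerms)) {
            if (docTerms.contains(term)) {
                matches++;
            }
        }
        return matches;
    }

    /** Accepts a document only if at least minShouldMatch query terms occur in it. */
    static boolean accepts(Set<String> docTerms, List<String> queryTerms, int minShouldMatch) {
        return countMatches(docTerms, queryTerms) >= minShouldMatch;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "index", "search", "java"));
        List<String> query = Arrays.asList("lucene", "search", "python", "ruby");
        // Two of the four query terms occur in the document.
        System.out.println(accepts(doc, query, 2));
        System.out.println(accepts(doc, query, 3));
    }
}
```

For the case in the question below, the query would hold 20 terms and the threshold would be 5.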
On 1/19/06, Anton Potehin <[EMAIL PROTECTED]> wrote:
> Suppose that the search query contains 20 terms. It is necessary to find
> all documents which contain at least 5 terms from sear
Hi all!
I've a question about the lucene search method. What is the difference between a
search with the class lucene.queryParser.QueryParser and the class
lucene.search.Query and their subclasses?
For example: With the WildcardQuery I can search for * at the beginning of a
term, but not with the
I've the following problem:
I've a large number of documents indexed.
Suppose that the search query contains 20 terms. It is necessary to find
all documents which contain at least 5 terms from the search query.
Is it possible to implement? If yes, what problems may arise during the
solving of thi
I have different categories of indexes in different
directory. When I am searching for a category of
"ALL", lucene should search using indexes from all
directories.
So is it possible?
regards,
Ranjan
On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote:
>
>
> > Actually, my problem is that, for instance, for a document d,
> > Its feature
> > vector may be keywords and concepts. I don't know how to
> > weight the two
> > items. Right now, i used a stupid method, given a document d,
> > i can obtain
> Actually, my problem is that, for instance, for a document d,
> Its feature
> vector may be keywords and concepts. I don't know how to
> weight the two
> items. Right now, i used a stupid method, given a document d,
> i can obtain a
> rank D based on keyword method. Also, it is annotated wit
Hi,
I am currently looking for a way to limit the amount of Hits which are
returned by a Query.
What I am doing is the following:
Searcher searcher = ...;
Query query = QueryParser.parse("...", "...", new StandardAnalyzer());
Hits hits = searcher.search(query);
We have approximately 10 million products in our Index and o
On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote:
>
>
>
> > -Original Message-
> > From: xing jiang [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, 19 January 2006 13:11
> > To: java-user@lucene.apache.org
> > Subject: Re: Use the lucene for searching in the Semantic Web.
> >
> >
> -Original Message-
> From: xing jiang [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 19 January 2006 13:11
> To: java-user@lucene.apache.org
> Subject: Re: Use the lucene for searching in the Semantic Web.
>
> Hi,
>
> I am not sure whether my understanding is correct.
>
>
Hi,
I am not sure whether my understanding is correct.
In your application, A concept "document" first should be defined as a class
in the ontology? Then, each document is an instance of this class. It uses
its contents as its features. Also, the related concepts will be added into
the feature ve
It's for both, ontology + contents (Word, PDF, PPT; always the same candidates).
The main disadvantage of this approach is that "main" nodes in the ontology
have to be defined.
Imagine following use case:
An ontology describes a company's content and knowledge management system.
Persons, hierarc
Hello,
I've created a Semantic Desktop application using Lucene. For a
presentation I'd like to create a poster. Unfortunately I haven't found
any high resolution version (or vector graphic) of the Lucene logo. At
http://svn.apache.org/repos/asf/lucene/java/trunk/docs/images/ only a
few GIFs
Hi Mathias,
Can you give more details? Is your application for text + ontology, or
ontology only?
regards
jiang xing
On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote:
>
> Hi!
>
> (1) I'm working on a similar problem, but based on MPEG-7 Semantic
> Description Graphs. I've already a prototype
For some semweb + full-text searching real-world examples, also look
to the SIMILE project - http://simile.mit.edu/
They have integrated Lucene into PiggyBank and Longwell.
Erik
On Jan 18, 2006, at 9:30 PM, xing jiang wrote:
Hi,
I have done some surveys about the information retr
On Jan 19, 2006, at 2:57 AM, Ravi wrote:
Can you please tell me how to use this query in a loop? The user can
refine
the search any number of times, so how do I maintain all the queries in
QueryFilter and make use of them? Please help me, I need this very urgently.
If you're continually refining queries, I
Thanks for your valuable suggestions. I am very grateful for
this response. Now I understand where I was going wrong, so I will try to use the
first solution you gave.
Thanks
Ravi Kumar Jaladanki
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Beha
Hi!
(1) I'm working on a similar problem, but based on MPEG-7 Semantic
Description Graphs. I've already a prototype for path based matching
within Lucene integrated in my sf project Caliph & Emir
(http://caliph-emir.sf.net). I've already adapted the approach to an
ontology, which had to be search
45 matches