date:20060119

Re: Lucene Logo? (high resolution)

2006-01-19 Thread Daniel Rabus

Hi Doug, that's exactly what I'm looking for! Thanks very much, Daniel Doug Cutting wrote: Daniel Rabus wrote: I've created a Semantic Desktop application using Lucene. For a presentation I'd like to create a poster. Unfortunately I haven't found any high resolution version (or vector gr

Re: Why we use Lucene for Database search like Oracle / Sybase ?

2006-01-19 Thread Chandramohan

Thanks for the nice summary. A couple of other things that come to mind: 1. Not all columns are indexed in a database; searching on a non-indexed field would be very expensive. 2. Some DBs do support free-form indexes, but it is an offline index. The good thing is that regular SQL can be used to tak

Storing large text or binary source documents in the index and memory usage

2006-01-19 Thread George Washington

I would like to store large source documents (>10MB) in the index in their original form, i.e. as text for text documents or as byte[] for binary documents. I have no difficulty adding the source document as a field to the Lucene index document, but when I write the index document to the index I

Re: notification of active IndexSearchers when index is modified?

2006-01-19 Thread Alejandro Rusell

Bill, I don't know of any automatic method to get notified of index changes. One option is for the process that updates the index to send a signal to the search daemon. Or the search daemon may have a thread that periodically checks for new indexes (I think of two possible checks: check the versio
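
The polling option described above can be sketched as follows. This is a minimal illustration, not code from the thread: `ReopeningHolder`, `currentVersion`, and `opener` are hypothetical names, and how the version number is read from the index is left as an assumption (Lucene's `IndexReader.getCurrentVersion` is one candidate, but that wiring is not shown here).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: hands out a searcher and reopens it when the index version changes. */
final class ReopeningHolder<S> {
    private final LongSupplier currentVersion; // assumed: reads the current index version
    private final Supplier<S> opener;          // assumed: opens a fresh searcher
    private long seenVersion;
    private S searcher;

    ReopeningHolder(LongSupplier currentVersion, Supplier<S> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
        this.seenVersion = currentVersion.getAsLong();
        this.searcher = opener.get();
    }

    /** Call periodically (for example from a timer thread) or before each query. */
    synchronized S get() {
        long version = currentVersion.getAsLong();
        if (version != seenVersion) {
            // The index changed since the searcher was opened: replace it.
            searcher = opener.get();
            seenVersion = version;
        }
        return searcher;
    }
}
```

The sketch does not close the old searcher; a real daemon would need to do that once in-flight queries against it have finished.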

Re: notification of active IndexSearchers when index is modified?

2006-01-19 Thread Bill Janssen

Daniel, Thanks for the note. But I think you misunderstand a bit (or I do :-). These are two separate processes. The updater (in Java) runs and exits, flushing its buffers, over and over again, as new info comes in. The query server (in Python), however, runs continuously, doing searches and s

Re: notification of active IndexSearchers when index is modified?

2006-01-19 Thread Daniel Noll

Bill Janssen wrote: I've got a daemon process which keeps an IndexSearcher open on an index and responds to query requests by sending back document identifiers. I've also got other processes updating the index by re-indexing existing documents, deleting obsolete documents, and adding new documen

notification of active IndexSearchers when index is modified?

2006-01-19 Thread Bill Janssen

I've got a daemon process which keeps an IndexSearcher open on an index and responds to query requests by sending back document identifiers. I've also got other processes updating the index by re-indexing existing documents, deleting obsolete documents, and adding new documents. Is there any way

Re: languages & files

2006-01-19 Thread Otis Gospodnetic

Hello, For the first problem (indexing different types of documents), you can use the mini-framework for doing just that. Just get the source code that comes with Lucene in Action, and play - http://www.lucenebook.com/ For the Analyzers, look at what Snowball provides (do a search at lucenebook.co

Re: Search more than one index directory

2006-01-19 Thread Daniel Naber

On Thursday 19 January 2006 14:51, Ranjan K. Baisak wrote: > I have different categories of indexes in different > directory. When I am searching for a category of > "ALL", lucene should search using indexes from all > directories. > So is it possible? Use this class: http://lucene.apache.org/ja
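
The class Daniel links to is cut off in this listing, so the following is not that class. It is only a sketch of the general idea behind searching several indexes at once: query each index separately, then merge the per-index hits by score. `ScoredDoc` and `MultiIndexMerge` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical hit record: which index it came from, its doc id there, and its score. */
final class ScoredDoc {
    final String index;
    final int doc;
    final float score;

    ScoredDoc(String index, int doc, float score) {
        this.index = index;
        this.doc = doc;
        this.score = score;
    }
}

final class MultiIndexMerge {
    /** Merges per-index hit lists and returns the n highest-scoring hits overall. */
    static List<ScoredDoc> topN(List<List<ScoredDoc>> perIndex, int n) {
        List<ScoredDoc> all = new ArrayList<>();
        for (List<ScoredDoc> hits : perIndex) {
            all.addAll(hits);
        }
        // Highest score first.
        all.sort(Comparator.comparingDouble((ScoredDoc d) -> d.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}
```

One caveat with any merge like this: scores computed against separate indexes depend on each index's own term statistics, so they are only roughly comparable.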

Re: information theory based expanded query term boosting

2006-01-19 Thread Rajesh Munavalli

Hi Jose, There are several papers on that topic. But the one which particularly interested me was An information-theoretic approach to automatic query expansion Claudio Carpineto, Renato de Mori, Giovanni Romano, Brigitte Bigi ACM Transactions on Information Systems (TOIS) It

RE: Limiting hits?

2006-01-19 Thread Daniel Pfeifer

I agree, it's definitely not what one wants. But to answer your question: Yes, I do use RemoteSearchable on the server side. -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Thu 2006-01-19 18:44 To: java-user@lucene.apache.org Subject: Re: Limiting hits? I'm certai

languages & files

2006-01-19 Thread arnaudbuffet

Hi, I am beginning to work with Lucene and need a few explanations to do what I want; thanks for your helpful answers. I have to add Lucene to a Java application and I have two goals: - To enable search through different types of files, like MS Word, PDF or Excel files. I read that each type of docume

Re: information theory based expanded query term boosting

2006-01-19 Thread José Ramón Pérez Agüera

Which articles have you read? I work on automatic query expansion and automatic thesaurus generation, and I use Lucene for my tests, but so far I don't have excellent results. In a few days I will have results based on this method: Concept Based Query Expansion (1993) Yonggang Qiu Department o

Re: What's the differents between QueryParser and Query

2006-01-19 Thread Erik Hatcher

On Jan 19, 2006, at 10:01 AM, [EMAIL PROTECTED] wrote: I've a question about the lucene search method. What is the difference between a search with the class lucene.queryParser.QueryParser and the class lucene.search.Query and their subclasses? QueryParser is a Query "factory". It takes

information theory based expanded query term boosting

2006-01-19 Thread Rajesh Munavalli

Hi, Has anyone experimented with information-theory-based expanded query weight boosting for Lucene? When the user query is small, there are several ways to expand the query terms by synonym terms, morph terms etc. I read several articles on how different boosting levels affect the

Re: Limiting hits?

2006-01-19 Thread Yonik Seeley

I'm certain how the Hits class works, but I've never used Lucene with RMI before. I suppose if one does it incorrectly, every hit could end up going across the network (definitely not what you want). Are you using RemoteSearchable on the server side? -Yonik On 1/19/06, Daniel Pfeifer <[EMAIL PRO

RE: data size limitation?

2006-01-19 Thread M å n i s h

Lucene is aimed at ~10M-document indexes on a single CPU. Anyway, I tried up to 20 GB and, believe me, Lucene holds up pretty well. Manish Chowdhary [EMAIL PROTECTED] -Original Message- From: z shalev [mailto:[EMAIL PROTECTED] Sent: Thursday, January 19, 2006 10:52 PM To: java-user@lucene

RE: Limiting hits?

2006-01-19 Thread Daniel Pfeifer

Are you certain? I am quite sure we retrieve a huge amount of data if there are thousands of matches to one query. -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Thu 2006-01-19 16:45 To: java-user@lucene.apache.org Subject: Re: Limiting hits? Hits doesn't keep tr

Re: Lucene Logo? (high resolution)

2006-01-19 Thread Doug Cutting

Daniel Rabus wrote: I've created a Semantic Desktop application using Lucene. For a presentation I'd like to create a poster. Unfortunately I haven't found any high resolution version (or vector graphic) of the Lucene logo. At http://svn.apache.org/repos/asf/lucene/java/trunk/docs/images/ only

data size limitation?

2006-01-19 Thread zzzzz shalev

Hey, is there a maximum amount of data (in gigabytes) at which Lucene's performance starts to deteriorate? I tested with about 2 GB on two instances (2 RAM dirs using the ParallelMultiSearcher) and performance was great; however, I think I will need to support about 10-15 times as much

Re: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread xing jiang

On 1/20/06, Klaus <[EMAIL PROTECTED]> wrote: > > Hi, > > >Actually, my problem is that, for instance, for a document d, Its feature > >vector may be keywords and concepts. > > What do you exactly mean by features vector? You are referring to the > predicate - object pairs, connected to one subject

Re: Analyzer

2006-01-19 Thread Stéphane Lagraulet

Yonik, You're right, it's unnecessary unless you want to search on both (so index both), as you said in your other message SL Yonik Seeley wrote: On 1/19/06, Stéphane Lagraulet <[EMAIL PROTECTED]> wrote: Hi, You'd better use 2 fields, one analysed and not stored, and the other one only

Re: Analyzer

2006-01-19 Thread Yonik Seeley

On 1/19/06, Stéphane Lagraulet <[EMAIL PROTECTED]> wrote: > Hi, > You'd better use 2 fields, one analysed and not stored, and the other > one only stored. There is no need for that. A single field that is both indexed and stored will give you the same thing. -Yonik --

Re: Analyzer

2006-01-19 Thread Stéphane Lagraulet

Hi, You'd better use 2 fields, one analysed and not stored, and the other one only stored. So you perform the query on the analysed field and present the other field (not stemmed) in the result. Stephan Lagraulet Klaus wrote: Hi, Is there a way to get the unstemmed term out of the lucene

Re: Analyzer

2006-01-19 Thread Yonik Seeley

Do you want to search for the unstemmed term, or just be able to retrieve it? When you retrieve a document, you get the un-analyzed original fields. If you want to index both the stemmed and unstemmed terms, the easiest way is to add the field twice (the second time using a different field name)

Analyzer

2006-01-19 Thread Klaus

Hi, Is there a way to get the unstemmed term out of the lucene index, or do I have to change the analyzer, to save the original term and the stemmed one? Thanks, Klaus

AW: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread Klaus

Hi, >Actually, my problem is that, for instance, for a document d, Its feature >vector may be keywords and concepts. What exactly do you mean by feature vector? You are referring to the predicate - object pairs, connected to one subject node, aren't you? >I don't know how to weight the two >ite

Re: Limiting hits?

2006-01-19 Thread Yonik Seeley

Hits doesn't keep track of all 100,000 matches, only the first 100. It dynamically collects more matches if it needs to. -Yonik On 1/19/06, Daniel Pfeifer <[EMAIL PROTECTED]> wrote: > Hi, > > I am currently looking for a way to limit the amount of Hits which are > returned by a Query. > > What I
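
The behaviour Yonik describes (hold only the first batch of matches, fetch more on demand) can be sketched in plain Java. This is not Lucene's `Hits` class: `LazyHits` and its `topDocs` function are hypothetical stand-ins, and the doubling strategy is an assumption chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical lazily growing result list; a sketch of the idea only. */
final class LazyHits {
    private final IntFunction<List<Integer>> topDocs; // assumed: returns the best n doc ids
    private List<Integer> cached = new ArrayList<>();
    private int fetchSize;
    int searches = 0; // how many times the underlying search was run

    LazyHits(IntFunction<List<Integer>> topDocs, int initialSize) {
        if (initialSize < 1) {
            throw new IllegalArgumentException("initialSize must be at least 1");
        }
        this.topDocs = topDocs;
        this.fetchSize = initialSize;
        fetch();
    }

    private void fetch() {
        cached = topDocs.apply(fetchSize);
        searches++;
    }

    /** Returns the i-th hit, re-running the search with a larger limit if needed. */
    Integer doc(int i) {
        // A full batch means more matches may exist beyond what is cached.
        while (i >= cached.size() && cached.size() == fetchSize) {
            fetchSize *= 2;
            fetch();
        }
        if (i >= cached.size()) {
            throw new IndexOutOfBoundsException("no hit at position " + i);
        }
        return cached.get(i);
    }
}
```

With this shape, a caller that only displays the first page of results never pays for the remaining matches.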

Re: non-standard query

2006-01-19 Thread Yonik Seeley

Check out minNrShouldMatch in BooleanQuery in the latest lucene version (1.9 dev version in subversion). -Yonik On 1/19/06, Anton Potehin <[EMAIL PROTECTED]> wrote: > Suppose that the search query contains 20 terms. It is necessary to find > all documents which contains at least 5 terms from sear
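
To make the requirement concrete, here is a small plain-Java sketch of the matching rule itself: a document matches when it contains at least `minMatch` of the distinct query terms. It does not use Lucene and is not how `BooleanQuery` is implemented; `MinShouldMatch` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MinShouldMatch {
    /** True if docTerms contains at least minMatch of the distinct query terms. */
    static boolean matches(Set<String> docTerms, List<String> queryTerms, int minMatch) {
        int found = 0;
        // Count each distinct query term at most once.
        for (String term : new HashSet<>(queryTerms)) {
            if (docTerms.contains(term) && ++found >= minMatch) {
                return true;
            }
        }
        return minMatch <= 0;
    }
}
```

For Anton's case the call would be `matches(docTerms, twentyQueryTerms, 5)`. Scanning every document this way is linear in the collection size, which is why pushing the constraint into the query, as Yonik suggests, is preferable.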

What's the differents between QueryParser and Query

2006-01-19 Thread der_grosse_hui

Hi all! I've a question about the lucene search method. What is the difference between a search with the class lucene.queryParser.QueryParser and the class lucene.search.Query and their subclasses? For example: With the WildcardQuery I can search for * at the beginning of a term, but not with the

non-standard query

2006-01-19 Thread Anton Potehin

I've the following problem: I've a big number of documents indexed. Suppose that the search query contains 20 terms. It is necessary to find all documents which contain at least 5 terms from the search query. Is it possible to implement? If yes, what problems may arise during the solving of thi

Search more than one index directory

2006-01-19 Thread Ranjan K. Baisak

I have different categories of indexes in different directories. When I am searching for a category of "ALL", lucene should search using indexes from all directories. So is it possible? regards, Ranjan

Re: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread xing jiang

On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote: > > > > Actually, my problem is that, for instance, for a document d, > > Its feature > > vector may be keywords and concepts. I don't know how to > > weight the two > > items. Right now, i used a stupid method, given a document d, > > i can obtain

AW: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread Mathias Lux

> Actually, my problem is that, for instance, for a document d, > Its feature > vector may be keywords and concepts. I don't know how to > weight the two > items. Right now, i used a stupid method, given a document d, > i can obtain a > rank D based on keyword method. Also, it is annotated wit

Limiting hits?

2006-01-19 Thread Daniel Pfeifer

Hi, I am currently looking for a way to limit the amount of Hits which are returned by a Query. What I am doing is the following: Searcher s = ...; Query q = QueryParser.parse("...", "...", new StandardAnalyzer()); s.search(q); We have approximately 10 million products in our Index and o

Re: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread xing jiang

On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote: > > > > > -Original Message- > > From: xing jiang [mailto:[EMAIL PROTECTED] > > Sent: Thursday, 19 January 2006 13:11 > > To: java-user@lucene.apache.org > > Subject: Re: Use the lucene for searching in the Semantic Web. > > > >

AW: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread Mathias Lux

> -Original Message- > From: xing jiang [mailto:[EMAIL PROTECTED] > Sent: Thursday, 19 January 2006 13:11 > To: java-user@lucene.apache.org > Subject: Re: Use the lucene for searching in the Semantic Web. > > Hi, > > I am not sure whether my understanding is correct. > >

Re: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread xing jiang

Hi, I am not sure whether my understanding is correct. In your application, A concept "document" first should be defined as a class in the ontology? Then, each document is an instance of this class. It uses its contents as its features. Also, the related concepts will be added into the feature ve

AW: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread Mathias Lux

It's for both, ontology + contents (Word, PDF, PPT, always the same candidates). The main disadvantage of this approach is that "main" nodes in the ontology have to be defined. Imagine the following use case: An ontology describes a company's content and knowledge management system. Persons, hierarc

Lucene Logo? (high resolution)

2006-01-19 Thread Daniel Rabus

Hello, I've created a Semantic Desktop application using Lucene. For a presentation I'd like to create a poster. Unfortunately I haven't found any high resolution version (or vector graphic) of the Lucene logo. At http://svn.apache.org/repos/asf/lucene/java/trunk/docs/images/ only a few GIFs

Re: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread xing jiang

Hi Mathias, Can you give more details? Is your application for text + ontology, or ontology only? regards jiang xing On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote: > > Hi! > > (1) I'm working on a similar problem, but based on MPEG-7 Semantic > Description Graphs. I've already a prototype

Re: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread Erik Hatcher

For some semweb + full-text searching real-world examples, also look to the SIMILE project - http://simile.mit.edu/ They have integrated Lucene into PiggyBank and Longwell. Erik On Jan 18, 2006, at 9:30 PM, xing jiang wrote: Hi, I have done some surveys about the information retr

Re: :intersection of two hits objects:

2006-01-19 Thread Erik Hatcher

On Jan 19, 2006, at 2:57 AM, Ravi wrote: Can you please tell me how to use this query in a loop, because he can refine the search any number of times, so how do I maintain all the queries in QueryFilter and use them? Please help me, I need this very urgently. If you're continually refining queries, I
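
Erik's answer is cut off in this listing, so the following is not his suggestion. It only illustrates what "intersection of two hits objects" amounts to when each refinement step is represented as a set of matching document ids: the refined result is the bitwise AND of all the sets. `Refinement` is a hypothetical name, and representing matches as `java.util.BitSet` is an assumption made for the sketch.

```java
import java.util.BitSet;
import java.util.List;

final class Refinement {
    /** Intersects the doc-id sets matched by each successive refinement. */
    static BitSet intersect(List<BitSet> matchesPerQuery, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc); // start with every document id in [0, maxDoc)
        for (BitSet matches : matchesPerQuery) {
            result.and(matches); // keep only ids matched by this refinement too
        }
        return result;
    }
}
```

Each new refinement then costs one more `and` over the running result, however many times the search has been narrowed.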

RE: :intersection of two hits objects:

2006-01-19 Thread Ravi

Thanks for your valuable suggestions. I am very grateful to you for this response. Now I understand where I was going wrong, so I will try to use the first solution you gave. Thanks Ravi Kumar Jaladanki -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Beha

RE: Use the lucene for searching in the Semantic Web.

2006-01-19 Thread Mathias Lux

Hi! (1) I'm working on a similar problem, but based on MPEG-7 Semantic Description Graphs. I already have a prototype for path-based matching within Lucene integrated in my sf project Caliph & Emir (http://caliph-emir.sf.net). I've already adapted the approach to an ontology, which had to be search

45 matches
