Re: Anyone know which way lucene read index file

2006-09-04 Thread karl wettin
On Tue, 2006-09-05 at 09:21 +0800, James liu wrote: > which way lucene read index file? http://lucene.apache.org/java/docs/fileformats.html It is not too far-fetched to compare it with a Berkeley DB.

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Philip Brown
So, if I do as you suggest below (using PerFieldAnalyzerWrapper with StandardAnalyzer) then I still need to enclose in quotes the phrases (keywords with spaces) when I issue the search, and they are only returned in the results if the case is identical to how it was added? (This seems to be what

Re: word frequency list?

2006-09-04 Thread Dave Kor
If you scrolled down the page, there is a download link to the data files. There's no need to use the search form. On 9/4/06, Dejan Nenov <[EMAIL PROTECTED]> wrote: Unfortunately the term search at the site is down - gives 500 internal server error. -Original Message- From: Dave Kor [ma

Anyone know which way lucene read index file

2006-09-04 Thread James liu
After reading 'Lucene in Action', I know the format of the index file. In which way does Lucene read the index file? Line by line? I am very interested.

Re: QueryParser returns all documents

2006-09-04 Thread Erick Erickson
Also, as I remember, there's explicitly no support for a query consisting just of "not x". On 9/4/06, lude <[EMAIL PROTECTED]> wrote: Hello, a simple, stupid question to the friendly mailing list: How do you have to define the query-string to get all documents of an index be returned by using t
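A rough way to picture why a query made only of a NOT clause comes back empty is sketched below. It is a toy model in plain Java, not Lucene code: the class NegativeOnlyQueryDemo and its evaluate method are hypothetical names, and the sketch assumes that a prohibited clause can only remove documents from whatever the positive clauses matched.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model; none of these names come from Lucene.
class NegativeOnlyQueryDemo {

    // Documents matched by the positive clauses, minus those matched
    // by the prohibited (NOT) clauses.
    static Set<Integer> evaluate(Set<Integer> required, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>(required);
        result.removeAll(prohibited);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> allDocs = new TreeSet<>(Arrays.asList(0, 1, 2, 3));
        Set<Integer> nothing = new TreeSet<>();

        // "NOT word_not_in_index" alone: no positive clause, so there is
        // nothing to subtract from and the result is empty.
        System.out.println(evaluate(nothing, nothing));

        // Starting from an explicit "all documents" set, the same NOT
        // clause leaves every document in place.
        System.out.println(evaluate(allDocs, nothing));
    }
}
```

In this model the fix is to supply a positive clause that matches everything. Lucene has a MatchAllDocsQuery class that can play that role when a query is built programmatically; whether a given QueryParser version can express it in a query string is not covered here.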

Re: QueryParser returns all documents

2006-09-04 Thread karl wettin
On Mon, 2006-09-04 at 22:32 +0200, lude wrote: > How do you have to define the query-string to get all documents of an index > be returned by using the QueryParser? > In theory a query like 'NOT word_not_in_index' should find and return all > documents. In practice this doesn't work (no documents

QueryParser returns all documents

2006-09-04 Thread lude
Hello, a simple, stupid question to the friendly mailing list: How do you have to define the query-string to get all documents of an index be returned by using the QueryParser? In theory a query like 'NOT word_not_in_index' should find and return all documents. In practice this doesn't work (no d

Re: How to combine multiple fields to a single field for indexing

2006-09-04 Thread Chris Hostetter
: How do you set the position increment gap between each addition to the It doesn't have an explicit setter; you just subclass the Analyzer of your choosing and override getPositionIncrementGap to return the value of your choosing -- it could be a fixed value, or your Analyzer could be sophistica
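The effect of the gap mentioned above can be illustrated with a small stand-alone sketch. It does not call Lucene: the class PositionGapDemo and its positions method are hypothetical, tokenizing on whitespace is a simplification, and the gap parameter only stands in for the value an overridden getPositionIncrementGap would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of token positions; not Lucene code.
class PositionGapDemo {

    // Assigns a position to every whitespace-separated token across several
    // values added under the same field name. Before each value after the
    // first, the running position is advanced by `gap`.
    static List<Integer> positions(List<String> values, int gap) {
        List<Integer> out = new ArrayList<>();
        int pos = -1;          // the first token ends up at position 0
        boolean first = true;
        for (String value : values) {
            if (!first) {
                pos += gap;    // extra distance between consecutive values
            }
            first = false;
            for (String token : value.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;  // skip the empty token produced by a blank value
                }
                pos += 1;      // each token advances the position by one
                out.add(pos);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("red car", "blue bike");
        // With no gap, "car" and "blue" sit on adjacent positions, so a
        // phrase or proximity match could span the two values.
        System.out.println(positions(values, 0));
        // With a large gap, the second value starts far from the first.
        System.out.println(positions(values, 100));
    }
}
```

Under this model the gap only needs to exceed the largest slop used in phrase or proximity queries for matches to stop crossing value boundaries; it does not have to be as high as possible.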

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Chris Hostetter
: Yeah, they are more complex than the "exactish" match -- basically, there are : more fields involved -- combined sometimes with AND and sometimes with OR, : and sometimes negated field values, sometimes groupings, etc. These other : field values are all single words (no spaces), and a search mi

indexing and searching semantic documents using lucene

2006-09-04 Thread khgcutg hsowhj
Hi all, I want to know how I can index and search a semantic document, such as an RDF/OWL document, using Lucene. Does Lucene support indexing of semantic documents? Please provide a sample. For example, if I have a small ontology of faculty.owl as

Re: How to combine multiple fields to a single field for indexing

2006-09-04 Thread Mark Miller
How do you set the position increment gap between each addition to the same field name? Should you set it as high as possible to prevent proximity queries from crossing it? I have been looking for the code to find out how to put a gap between each same name field addition, but I have been unabl

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Mark Miller
More to consider: perhaps there is some way to get what you want by overriding getFieldQuery(String, String) instead. I have not been able to come up with anything simple off the top of my head, but overriding getFieldQuery would free you from having to make that line change on every Lucene up

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Philip Brown
Yeah, they are more complex than the "exactish" match -- basically, there are more fields involved -- combined sometimes with AND and sometimes with OR, and sometimes negated field values, sometimes groupings, etc. These other field values are all single words (no spaces), and a search might invo

Re: Phrase search using quotes -- special Tokenizer

2006-09-04 Thread Mark Miller
Keeping in mind that Hoss's input is much more valuable than mine... It sounds like you want what I originally gave you. You want to be able to perform complex queries with the QueryParser and you want '-' and '_' to not break words, and you want quoted words to be tokenized as one token with

Re: Stop words in index

2006-09-04 Thread Jason Polites
Right, OK. I may have been tinkering with the analyzer between searches. Thanks On 9/4/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: : In the default StandardAnalyzer, the stop word list contains the word "on". : If I have a document which contains the phrase "Disney on Ice", the index : will