Search results include results with excluded terms

2010-08-16 Thread Christoph Hermann
avl" is included. So the question is, how do I *exclude* documents? I.e., score the exclusion so low that these results won't appear at all? regards Christoph -- Christoph Hermann Institut für Informatik Tel: +49 761-203-8171 Fax: +49

Re: Search results include results with excluded terms

2010-08-16 Thread Christoph Hermann
BooleanQuery finalQuery = new BooleanQuery(); finalQuery.add(q, Occur.MUST); finalQuery.add(q2, Occur.MUST_NOT); it works as expected. Thanks a lot for the hint! I think I'll recreate my index with a LowerCaseFilter; that should fix it, shouldn't it? regards Christoph Herma

Re: Search results include results with excluded terms

2010-08-16 Thread Christoph Hermann
On Monday, 16 August 2010, 20:48:49, Christoph Hermann wrote: Hello, > I think i'll recreate my index with a LowerCaseFilter, that should fix it, > shouldn't it? It does. At least, I just recreated my index and I'm now using the same Analyzer for the QueryParser which
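Why the LowerCaseFilter fix helps can be sketched without Lucene. An exclusion clause can only drop a document if the excluded term matches an indexed term exactly, so index-time and query-time analysis have to agree on case. The class below is a plain-Java model under that assumption, not Lucene API; `ExclusionSketch`, `analyze`, `buildIndex`, `search` and the two sample documents are hypothetical names and data chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model of the problem; none of these names are Lucene API.
class ExclusionSketch {

    // Hypothetical stand-in for an analyzer: split on whitespace,
    // optionally lowercasing first (the role a LowerCaseFilter plays).
    static Set<String> analyze(String text, boolean lowercase) {
        String normalized = lowercase ? text.toLowerCase(Locale.ROOT) : text;
        return new HashSet<>(Arrays.asList(normalized.trim().split("\\s+")));
    }

    // One term set per document; the list position serves as the document id.
    static List<Set<String>> buildIndex(List<String> docs, boolean lowercase) {
        List<Set<String>> index = new ArrayList<>();
        for (String doc : docs) {
            index.add(analyze(doc, lowercase));
        }
        return index;
    }

    // Ids of documents that contain `must` and do not contain `mustNot`.
    static List<Integer> search(List<Set<String>> index, String must, String mustNot) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < index.size(); id++) {
            Set<String> terms = index.get(id);
            if (terms.contains(must) && !terms.contains(mustNot)) {
                hits.add(id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("binary tree", "AVL tree");
        // Query terms are lowercase, as a lowercasing query analyzer would produce.
        System.out.println(search(buildIndex(docs, false), "tree", "avl"));
        System.out.println(search(buildIndex(docs, true), "tree", "avl"));
    }
}
```

With the case-preserving index, the indexed term `AVL` never equals the query term `avl`, so the second document survives the exclusion. With the lowercased index it is dropped, which matches the behaviour reported after the index was rebuilt.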

Re: Why Lucene in Action, 2nd Edition isn't available on sale (new) on Amazon UK?

2010-10-12 Thread Christoph Hermann
know where else I could buy it in UK? You can buy it directly from Manning: http://www.manning.com/hatcher2/ And you might look for coupon codes on the web before buying. rgds Christoph Hermann

Re: Why Lucene in Action, 2nd Edition isn't available on sale (new) on Amazon UK?

2010-10-12 Thread Christoph Hermann
On Tuesday, 12 October 2010, 10:05:47, Christoph Hermann wrote: Hi, > you can directly buy it from manning: > http://www.manning.com/hatcher2/ *arg*, wrong URL. I meant the second edition of course: http://www.manning.com/hatcher3/ rgds Christoph Hermann

Storing additional Metadata with Fields

2010-10-14 Thread Christoph Hermann
ield.Index.YES)); Is there any way to include the page, x, y values in there? I'd like to display the page when retrieving the results. I thought about storing the same field twice and adding the page, x, y values at the beginning of the Field and then when retrieving the field extract those

Re: Storing additional Metadata with Fields

2010-10-14 Thread Christoph Hermann
in my case I would store the x, y and page values for every word and increase the index much more than I'd need. Is there any approach for preventing this? And when searching, how can I access the payloads when displaying the result? I haven't found information on that so far. regards Christoph He

Writing an Analyzer for storing and retrieving a payload (was: Storing additional Metadata with Fields)

2010-10-15 Thread Christoph Hermann
On Thursday, 14 October 2010, 14:43:41, Christoph Hermann wrote: Hello, > It seems Payload gets added to > every term in the index, so in my case i would store the x,y and page > values for every word and increase the index much more than i'd need. > Any approach fo

Tokenizing XML

2010-10-15 Thread Christoph Hermann
Hi, is there a Tokenizer in Lucene that tokenizes XML correctly? I.e. that one gets from the following XML: this is exampletext. Tokens (or similar): | this | is | | example | | text. | Or would I need to write such a Tokenizer myself? regards Christoph Hermann

Re: Writing an Analyzer for storing and retrieving a payload (was: Storing additional Metadata with Fields)

2010-10-15 Thread Christoph Hermann
", payload, ...)); doc.add(new Field("contents", "this is the value", ...)); Then in the Analyzer I can identify the payload (e.g. by the first one or two bytes), decode the payload and use it for further tokens. Is anything wrong with that approach? regards Christoph Hermann

scorePayload does not get called

2010-10-16 Thread Christoph Hermann
return boost; } else { return 1.0f; } } } ----- regards Christoph Hermann -- Christoph Hermann Institut für Informatik Tel: +49 761-203-8171 Fax: +49 761-203-8162 e-mail: herm...@informatik.uni-freiburg.de

Copying Payload from one Token to the next

2010-10-16 Thread Christoph Hermann
this approach is that the token automatically gets consumed, so on the next run I only get the third token (and so on). What would be the best way to copy a payload from the current token to the following ones? regards Christoph Hermann PS: Thanks Uwe for the hint regarding scorePayload. That

Re: Copying Payload from one Token to the next

2010-10-17 Thread Christoph Hermann
he next. That also solved my problem with scorePayload earlier, since now every token has its payload. regards Christoph Hermann -- Christoph Hermann Institut für Informatik Tel: +49 761-203-8171 Fax: +49 761-203-8162 e-mail: herm...@informatik.uni-freiburg.de --
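The solution itself is truncated in this snippet. One common way to carry a payload forward, sketched here in plain Java rather than against Lucene's TokenFilter API, is to remember the most recent payload while streaming through the tokens and attach it to every later token that has none. `PayloadCarry`, `Token` and `carryForward` are hypothetical names, and the payload strings are made-up sample values; this is an assumption about the general technique, not the author's code.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of "copy the payload to the following tokens".
// None of these names are Lucene API.
class PayloadCarry {

    // Hypothetical token model: a term plus an optional payload (null when absent).
    static final class Token {
        final String term;
        final String payload;

        Token(String term, String payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    // Streams through the tokens once, remembering the last payload seen.
    // Tokens that carry their own payload update the remembered value;
    // tokens without one inherit it. Tokens before the first payload stay null.
    static List<Token> carryForward(List<Token> input) {
        List<Token> output = new ArrayList<>();
        String lastPayload = null;
        for (Token token : input) {
            if (token.payload != null) {
                lastPayload = token.payload;
            }
            output.add(new Token(token.term, lastPayload));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Token> input = new ArrayList<>();
        input.add(new Token("marker", "page=3;x=10;y=20")); // sample values only
        input.add(new Token("this", null));
        input.add(new Token("is", null));
        for (Token token : carryForward(input)) {
            System.out.println(token.term + " -> " + token.payload);
        }
    }
}
```

Because the state lives outside the loop body, no token has to be read ahead or consumed twice, which avoids the skipped-token problem described in the question. In a streaming filter the same state would be held in an instance field and reset together with the stream.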

Re: Analyzer

2010-12-02 Thread Christoph Hermann
On Thursday, 2 December 2010, 11:11:03, Sean wrote: Hi, > By the way, is there an analyzer which splits each letter of a word? > e.g. > hello world => h/e/l/l/o/w/o/r/l/d There is a CharTokenizer; that should help you. regards Christoph Hermann
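The requested output can be illustrated in plain Java, independent of any Lucene tokenizer: keep only letters and emit each one as its own token. `LetterSplit` and `split` are hypothetical names; this shows the target behaviour, not how CharTokenizer is implemented or configured.

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java illustration of per-letter tokens; not Lucene API.
class LetterSplit {

    // Emits every letter of the input as a separate one-character token,
    // dropping whitespace, digits and punctuation.
    static List<String> split(String text) {
        return text.codePoints()
                .filter(Character::isLetter)
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Joins the tokens with "/" to mirror the notation used in the question.
        System.out.println(String.join("/", split("hello world")));
    }
}
```

For the input from the question, `split("hello world")` yields the ten letters h, e, l, l, o, w, o, r, l, d in order.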