Re: Lucene search in URL

2009-09-20 Thread AHMET ARSLAN
> Thanks for all the help.
>
> I've now implemented a modified version of Ahmet Arslan's
> idea and it works.

Great to hear that! Iterating over the queries programmatically will be faster than doing it with ShingleFilter. Since you don't care about scores, you can enhance your search time complex
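For readers of the archive, here is a minimal sketch of what "query iteration" could look like, assuming scheme-less URLs split on `/` as in the examples from this thread. It is not the code from the original post: the class and method names are hypothetical, and a plain `HashSet` stands in for the Lucene index. In a real setup each candidate prefix would presumably be checked as an exact term against an untokenized field instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; the HashSet is a stand-in for an index lookup.
class UrlBlacklistSketch {

    // Returns every hierarchical prefix of the URL, shortest first,
    // e.g. "a/b/c" -> ["a", "a/b", "a/b/c"].
    static List<String> prefixes(String url) {
        List<String> out = new ArrayList<>();
        int slash = url.indexOf('/');
        while (slash >= 0) {
            out.add(url.substring(0, slash));
            slash = url.indexOf('/', slash + 1);
        }
        out.add(url); // the full URL is its own last prefix
        return out;
    }

    // A URL counts as blocked if any of its prefixes is an exact blacklist entry.
    static boolean isBlocked(Set<String> blacklist, String url) {
        for (String candidate : prefixes(url)) {
            if (blacklist.contains(candidate)) {
                return true; // no scoring needed, so stop at the first hit
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> blacklist = new HashSet<>();
        blacklist.add("en.wikipedia.org/wiki/production_code");

        System.out.println(isBlocked(blacklist, "en.wikipedia.org/wiki/production"));           // false
        System.out.println(isBlocked(blacklist, "en.wikipedia.org/wiki/production_code"));      // true
        System.out.println(isBlocked(blacklist, "en.wikipedia.org/wiki/production_code/test")); // true
    }
}
```

Because scores don't matter here, the loop can return on the first exact hit, so a lookup costs at most one exact check per path segment.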

Re: Lucene search in URL

2009-09-20 Thread Florian Klingler
ts.totalHits>0; }

Florian Klingler

- Original Message -
From: "Florian Klingler"
To: java-user@lucene.apache.org
Sent: Monday, 21 September 2009 00:14:25
Subject: Re: Lucene search in URL

Thanks for all the help. I've now implemented a modified version

Re: Lucene search in URL

2009-09-20 Thread Florian Klingler
cked.
>
> So my question is: is it possible to specify a query that searches only
> for exact document matches?
>
>
> Thanks very much,
> Florian Klingler
>
> - Original Message -
> From: "Anshum"
> To: java-user@lucene.apache.org
>

Re: Lucene search in URL

2009-09-20 Thread Anshum
does not match
> "en.wikipedia.org/wiki/production" -> does not match
> * "en.wikipedia.org/wiki/production_code" -> matches, so the URL and all
> sub-URLs are blocked.
>
> So my question is: is it possible to specify a query that searches only
> for ex

Re: Lucene search in URL

2009-09-20 Thread AHMET ARSLAN
> Is it possible in Lucene to do an exact search on
> tokenized text?
>
> For example, "en.wikipedia.org/wiki/production_code" is tokenized
> into
> "en.wikipedia.org"
> "wiki"
> "production"
> "code"
> by StandardAnalyzer.
>
> And a search will match if and only if all the tokens
> match?
>

Re: Lucene search in URL

2009-09-20 Thread Florian Klingler
" -> matches, so the URL and all sub-URLs are blocked.

So my question is: is it possible to specify a query that searches only for exact document matches?

Thanks very much,
Florian Klingler

- Original Message -
From: "Anshum"
To: java-user@lucene.apache.org
Sent

Re: Lucene search in URL

2009-09-19 Thread Anshum
Hi Florian,

You might run into issues using an n-gram. As I see it, you need tokenized URLs and need to run an exact search using a keyword tokenizer on the search string. You could try this; I am assuming it'll work. So something like en.wikipedia.org/wiki/production_code/test

Re: Lucene search in URL

2009-09-19 Thread AHMET ARSLAN
> Dear list,
>
> I'm working on a project where I have to check a blacklist
> of URLs (about 500,000) with Lucene.
> Is it possible to search for a URL in a hierarchical
> context?
>
> For example:
> Blacklist entry: "en.wikipedia.org/wiki/production_code"
>
> "en.wikipedia.org/wiki/production_