Re: Need help to do simple line by line indexing and search

2014-09-17 Thread atawfik
Hi, Can you share the implementation of your analyzer? It might be the problem. It would also be helpful to share a sample of your indexed documents. Regards Ameer -- View this message in context: http://lucene.472066.n3.nabble.com/Need-help-to-do-simple-line-by-line-indexing-and-search-tp415

Re: Doubt Lucene

2014-09-17 Thread atawfik
I tried to replicate your search scenario using the code below: Indexer ind = new Indexer(); IndexWriter indW; List listData = new LinkedList<>(); listData.add("Name:PeterRooney"); indW = ind.CreateIndexDir(listData);

Re: Doubt Lucene

2014-09-16 Thread atawfik
Hi, Can you elaborate on the confusion or doubt you have? Can you provide a sample document and a query that give you trouble? I was not able to deduce what the problem is. Regards Ameer -- View this message in context: http://lucene.472066.n3.nabble.com/Doubt-Lucene-tp415906

Re: Arabic Stemmer problem

2014-09-09 Thread atawfik
Hi Suleman, It is not a bug; it is the intended behavior. In fact, your examples are correct. It is just that the everyday usage of these words has changed recently. For instance, "سيار" actually means something that moves or walks. Since people use mobile everywhere, the word now means mobile. That

Re: KeywordAnalyzer still getting tokenized on spaces

2014-09-09 Thread atawfik
The result of QueryParser is confusing. The problem is that you assume the query parser uses the analyzer to parse your query. However, that is not the case. The query parser first parses the query string, then applies the analyzer. In other words, the query parser will split the query string usin

RE: How to properly correlate relevance in a search across multiple collections

2014-09-08 Thread atawfik
Hi David, It seems that MultiSearcher is deprecated in favor of MultiReader. Have a look here. Regarding the meta search approach, you can normalize the raw scores of documents. There are many ways to do that. Just search for "normalization scor
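
One common way to normalize raw scores before merging results from several collections is min-max scaling. The sketch below is only an illustration and not code from the thread: it uses plain Java maps of document id to raw score rather than any Lucene API, and the class and method names (ScoreMerger, normalize, merge) are hypothetical. Keeping the best normalized score for a document that appears in more than one collection is also just one assumed merge policy among several.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: merges per-collection hits by normalized score.
final class ScoreMerger {

    /** Rescales the raw scores of one collection into [0, 1] using min-max normalization. */
    static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double score : rawScores.values()) {
            min = Math.min(min, score);
            max = Math.max(max, score);
        }
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : rawScores.entrySet()) {
            // If every score is identical the range is zero, so map each hit to 1.0.
            double value = (max > min) ? (e.getValue() - min) / (max - min) : 1.0;
            normalized.put(e.getKey(), value);
        }
        return normalized;
    }

    /** Normalizes each collection, keeps the best score per document id, and sorts descending. */
    static List<Map.Entry<String, Double>> merge(List<Map<String, Double>> collections) {
        Map<String, Double> best = new LinkedHashMap<>();
        for (Map<String, Double> raw : collections) {
            for (Map.Entry<String, Double> e : normalize(raw).entrySet()) {
                best.merge(e.getKey(), e.getValue(), (a, b) -> Math.max(a, b));
            }
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(best.entrySet());
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Made-up scores on two different scales, standing in for two collections.
        Map<String, Double> first = new LinkedHashMap<>();
        first.put("doc1", 2.0);
        first.put("doc2", 1.0);
        Map<String, Double> second = new LinkedHashMap<>();
        second.put("doc2", 30.0);
        second.put("doc3", 10.0);

        List<Map<String, Double>> collections = new ArrayList<>();
        collections.add(first);
        collections.add(second);
        for (Map.Entry<String, Double> hit : merge(collections)) {
            System.out.println(hit.getKey() + " " + hit.getValue());
        }
    }
}
```

Min-max scaling makes the top hit of every collection score 1.0, which can overstate weak collections; other normalization schemes trade that off differently.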

Re: How to properly correlate relevance in a search across multiple collections

2014-09-07 Thread atawfik
Hi, if you have documents that might exist in multiple collections, then you can use techniques from meta search. That is, combining multiple search results from different collections. In this case, you can retrieve the top 100 or 1000 documents from each collection and merge them. You then rank do

Re: In need of some guidance

2014-09-05 Thread atawfik
Hi, Try the Lucene tutorial. If you follow the materials there, you should be able to grasp what Lucene is and how to use it. After that, you should read the Lucene in Action book. It is a very good book that tells you everything about Lucene. Regards Ameer --

Re: Snowball filter - Error instantiating stemmer for a language

2014-09-04 Thread atawfik
Hi Chris, Thanks for the reply. I am sure I have lucene-analyzers-common.jar added to my Eclipse project. However, I have figured out a way to run it. I am not sure whether this is a bug or an undocumented workflow for Snowball filters. Map args = new HashMap<>(); TokenStream tokenStream = new Sta

Snowball filter - Error instantiating stemmer for a language

2014-09-04 Thread atawfik
I am trying to use some filters from the snowball package. However, when I run the following code: Map args = new HashMap<>(); TokenStream tokenStream = new StandardTokenizer(Version.LUCENE_46, new StringReader("Some text")); args.put("luceneMatchVersion", "4.6"); args.put("language", "Catalan");