Lucene indexing for pdf files

2007-08-30 Thread Madhu
le reading pdf files. Regards, Madhu - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Lucene indexing

2007-08-30 Thread Madhu
Hi all, I am trying to index a 5 MB Excel file, but indexing it with POI 3 throws an out-of-memory exception. Does anyone know how to index large Excel files?

RE: Single Analyzer for multiple European languages

2005-09-27 Thread Madhu Satyanarayana Panitini
Hi all, One more idea would be to use cryptograms to differentiate between languages; you can then delete stopwords and apply stemming for the particular language. Regards, Madhu -Original Message- From: Endre Stølsvik [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 27, 2005 4

preserving document attributes

2005-09-14 Thread Madhu Satyanarayana Panitini
there any possibility to index the author and publisher data and use it in the search? Please tell me how I can index it and use it in search. Also, please point me to methods or references in IR that use document attributes for search purposes. Thanks in adv

RE: Splitting of words

2005-09-13 Thread Madhu Satyanarayana Panitini
;,?, etc. Then we remove the stopwords, and then stemming is applied. To make my question clear: how does Lucene split the text? Only where it encounters a space between words, or does it consider non-alphabetic characters as well? What is the whole grammar the StandardAnalyzer h
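
The pipeline described in this message (split, remove stopwords, stem) can be sketched in a few lines. This is only an illustration in plain Python, not Lucene code and not the StandardAnalyzer grammar: the stopword list, the function names, and the crude suffix-stripping "stemmer" are all assumptions made for the example. It contrasts splitting on whitespace only with splitting on any non-alphanumeric character.

```python
import re

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "on"}


def split_on_whitespace(text):
    """Split only where whitespace occurs; punctuation stays attached."""
    return text.split()


def split_on_non_alphanumeric(text):
    """Split on every run of characters that are not ASCII letters or digits."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def naive_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer, not Porter."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Split, lowercase, drop stopwords, then stem."""
    tokens = [t.lower() for t in split_on_non_alphanumeric(text)]
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [naive_stem(t) for t in tokens]
```

With whitespace splitting, a token such as `world;` keeps its punctuation, while the second splitter yields `world`. Which behaviour an actual Lucene analyzer shows depends on the analyzer chosen.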

Splitting of words

2005-09-13 Thread Madhu Satyanarayana Panitini
Hi all, I want to know how Lucene splits text before indexing. Does it split wherever there is a space between words, or is there some other pattern for splitting the words of a text document? In which program can I find the code that splits the words? Madhu Madhu Satyanarayana

multi word synonym

2005-04-26 Thread Madhu Sasidhar, MD
So, in this sentence: "Lab results for alpha 1 antitrypsin level", I would like to index 'alpha-1-antitrypsin', 'antitrypsin', 'antitrypsin, alpha 1', 'A1AT' as synonyms for the phrase alpha-1-antitrypsin in the sentence. Thanks in advance... madhu
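
One way to picture the request is to emit every synonym at the same token position as the first word of the matched phrase, so that a search on any variant finds the sentence. The sketch below is plain Python rather than Lucene code; the `SYNONYMS` table and the `expand_synonyms` function are hypothetical names introduced for illustration, and the matching is a simple case-insensitive comparison of consecutive tokens.

```python
# Hypothetical synonym table: a phrase (as a tuple of lowercase tokens)
# mapped to the extra terms that should be indexed alongside it.
SYNONYMS = {
    ("alpha", "1", "antitrypsin"): [
        "alpha-1-antitrypsin",
        "antitrypsin",
        "antitrypsin, alpha 1",
        "A1AT",
    ],
}


def expand_synonyms(tokens, synonyms=SYNONYMS):
    """Return (position, term) pairs.

    Original tokens keep their own positions; each synonym is stacked
    at the position of the first token of the phrase it stands for.
    """
    lowered = [t.lower() for t in tokens]
    out = []
    for i, token in enumerate(tokens):
        out.append((i, token))
        for phrase, alternatives in synonyms.items():
            if tuple(lowered[i : i + len(phrase)]) == phrase:
                out.extend((i, alt) for alt in alternatives)
    return out
```

For example, `expand_synonyms("Lab results for alpha 1 antitrypsin level".split())` keeps the seven original tokens and stacks the four synonyms at position 3, where the phrase starts. In Lucene itself, the comparable idea is a token filter that emits the extra terms with a position increment of zero.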