from:"Liaqat Ali"

lucene for Arabic and Urdu

2007-09-18 Thread Liaqat Ali

the scratch using Lucene. Liaqat Ali - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

setting up lucene

2007-09-25 Thread Liaqat Ali

Hi all, I'm facing problems setting up Lucene. Could someone kindly guide me in this

Integration of Lucene

2007-10-24 Thread Liaqat Ali

Hi all, I'm developing a search engine for the Urdu language, and I want to use Lucene for that purpose. The situation is this: --- I have a corpus of 2000 Urdu (a variant of Persian and Arabic) documents in XML form; how do I build an index of them using Lucene? --- There will also be a need for some stemming

Corpus interpretation

2007-10-24 Thread Liaqat Ali

I want to index the Urdu language corpus (200 documents in CES XML DTD format). Is it necessary to break the XML file into 200 different files, or can it be indexed in its original form using Lucene? Kindly guide me in this regard.

Lucene setting

2007-11-19 Thread Liaqat Ali

Hi all, can someone explain this line to me? I encountered it while setting up Lucene: "Connect to the top-level of your Lucene installation". Kindly guide me in this regard. Liaqat Ali

Lucene Setting

2007-11-19 Thread Liaqat Ali

I'm new to Lucene and want to clear up some questions. I unpacked Lucene, which I downloaded from the Apache site, and went through the BUILD.txt file, which lists five steps to set up Lucene. Lucene Build Instructions $Id: BUILD.txt 476955 2006-11-19 22:28:41Z hossman $ Basic steps: 0) Instal

Problem in Running Lucene Demo

2007-11-19 Thread Liaqat Ali

his regard. Liaqat Ali

Help needed

2007-11-23 Thread Liaqat Ali

de me in this regard. Liaqat Ali

Problem with indexing

2007-11-24 Thread Liaqat Ali

situation. Kindly guide me in this regard. Liaqat Ali

LIA example problem

2007-11-25 Thread Liaqat Ali

Hello, I'm studying Lucene in Action. In chapter 2, the first example is generating errors in this part of the code: doc.add(Field.Keyword("id", keywords[i])); doc.add(Field.UnIndexed("country", unindexed[i])); doc.add(Field.UnStored("contents", unstored[i])); doc.add(Field.Text("cit

Problem with Add method

2007-11-29 Thread Liaqat Ali

This code generates an error. Kindly tell me which parameters should be used when we use the constructors. Document doc = new Document(); doc.add(Field.Keyword("id", keywords[i])); doc.add(Field.UnIndexed("country", unindexed[i])); doc.add(Field.UnStored("contents", unstored[i]));

Deprecated API

2007-11-29 Thread Liaqat Ali

I'm studying LIA, but there is a problem with the code. When I run the code I get errors. The errors are related to the use of deprecated APIs. Kindly suggest the right APIs, and also how to handle this situation with other code. package lia.indexing; import org.apache.lucene.stor

FSDirectory Again

2007-11-30 Thread Liaqat Ali

No, you are not getting me. I have this original code. What should I use instead of it to create a directory, given that dir = FSDirectory.getDirectory(indexDir, true) is deprecated? import org.apache.lucene.store.Directory; import org.apache.lucene.store.FSDirectory; protected Directo

FSDirectory

2007-11-30 Thread Liaqat Ali

I'm facing a problem with this code: dir = new FSDirectory(); dir.getDirectory(indexDir, true); I get an error that FSDirectory has protected access. So what should I use instead? Liaqat

Indexing Non-English text

2007-12-04 Thread Liaqat Ali

Hi, I'm facing a problem while indexing a small .txt file with Lucene. The file I want to index is in the Urdu language (a variant of Arabic and Persian). But the index I get is in Unicode form, not in the real form (the original Urdu text). This program works well for a file in Englis
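
One common cause of this kind of symptom is reading the file with the platform default charset instead of the encoding it was saved in. The snippet is truncated, so whether that is the cause here is an assumption. The sketch below is plain JDK code, not part of Lucene, and the class name `Utf8Decoding` is hypothetical. It shows the same bytes decoded correctly as UTF-8 and incorrectly as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Decoding {
    /** Decodes raw file bytes as UTF-8 instead of relying on the platform default charset. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The word "Urdu" in Urdu script, written as Unicode escapes so this source stays ASCII.
        String urdu = "\u0627\u0631\u062F\u0648";
        byte[] onDisk = urdu.getBytes(StandardCharsets.UTF_8);

        String right = decodeUtf8(onDisk);
        String wrong = new String(onDisk, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(urdu)); // true
        System.out.println(wrong.equals(urdu)); // false: every byte became its own character
    }
}
```

When reading from disk, the same idea applies by wrapping the stream in `new InputStreamReader(new FileInputStream(path), "UTF-8")` before handing the text to the indexer.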

Indexing XML document

2007-12-04 Thread Liaqat Ali

Hi all, I want to index an XML file containing 200 Urdu-language (a variant of Arabic and Persian) documents. This corpus is in CES format and includes information about the author and much more. I just want to extract the textual data of each document and the relative doc number and title in each documen
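
A minimal sketch of the extraction step described above, using only the JDK's DOM parser. The element and attribute names (`doc`, `id`, `title`, `body`) and the class name `CorpusSplitter` are assumptions for illustration; the real CES markup of the corpus is not shown in the message and will differ. Each extracted record could then be turned into one Lucene document, so the single XML file does not need to be split on disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CorpusSplitter {
    /** One extracted record: document number, title and body text. */
    static final class CorpusDoc {
        final String id;
        final String title;
        final String text;

        CorpusDoc(String id, String title, String text) {
            this.id = id;
            this.title = title;
            this.text = text;
        }
    }

    /** Parses the corpus XML and returns one record per (hypothetical) doc element. */
    static List<CorpusDoc> split(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(new InputSource(new StringReader(xml)));
            NodeList docs = dom.getElementsByTagName("doc");
            List<CorpusDoc> out = new ArrayList<>();
            for (int i = 0; i < docs.getLength(); i++) {
                Element d = (Element) docs.item(i);
                out.add(new CorpusDoc(d.getAttribute("id"), childText(d, "title"), childText(d, "body")));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse corpus XML", e);
        }
    }

    /** Returns the trimmed text of the first descendant with the given tag, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<corpus><doc id=\"1\"><title>First</title><body>some text</body></doc></corpus>";
        for (CorpusDoc d : split(xml)) {
            System.out.println(d.id + " | " + d.title + " | " + d.text);
        }
    }
}
```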

Errors while running LIA code.

2007-12-06 Thread Liaqat Ali

Hi, I am trying to run code from Lucene in Action, but it generates some errors. There is one warning at compile time, and the errors occur at run time. The code and errors are given below. Kindly give me some clue. Thanks... Code: ///package lia.handlingtypes.xml; import lia.handl

Re: Errors while running LIA code.

2007-12-06 Thread Liaqat Ali

Michael McCandless wrote: See this thread for one suggestion: http://www.gossamer-threads.com/lists/lucene/java-user/55465 Mike "Liaqat Ali" <[EMAIL PROTECTED]> wrote: Hi, I am trying to run code from Lucene in Action, but it generates some errors. There is on

problem in indexing documents

2007-12-25 Thread Liaqat Ali

Hello, I am trying to build an index of 191 documents stored in 191 text files. I developed a program that works well with files containing a single line, but files with multiple lines pose a problem. So I added a while loop to extract all the data from each document, but it has some logical er
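
The follow-up in the "Modifying StopAnalyzer" entry reports that declaring the buffer inside the for loop fixed this. A minimal sketch of that pattern follows, assuming the bug was a buffer shared across files; the original program is not shown, and `MultiLineReader` and `readDocuments` are hypothetical names. Each document gets a fresh `StringBuilder`, and the inner while loop reads every line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MultiLineReader {
    /** Reads each source completely and returns one string per document. */
    static List<String> readDocuments(List<? extends Reader> sources) {
        List<String> contents = new ArrayList<>();
        for (Reader source : sources) {
            // Declared inside the loop so text from earlier documents does not carry over.
            StringBuilder sb = new StringBuilder();
            try (BufferedReader in = new BufferedReader(source)) {
                String line;
                while ((line = in.readLine()) != null) {
                    sb.append(line).append('\n');
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            contents.add(sb.toString());
        }
        return contents;
    }

    public static void main(String[] args) {
        List<String> docs = readDocuments(Arrays.asList(
                new StringReader("line one\nline two"),
                new StringReader("single line")));
        System.out.println(docs.size()); // 2
    }
}
```

With real files, each `Reader` would be an `InputStreamReader` over the file with an explicit charset.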

Modifying StopAnalyzer

2007-12-26 Thread Liaqat Ali

Hi Erick, thanks for your suggestion; putting the declaration of the StringBuffer variable sb inside the for loop works well. I want to ask another question: can we modify the StopAnalyzer to insert stop words of another language instead of English, like the Urdu ones given below? public stati
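
To illustrate what a custom stop-word list does, here is a small JDK-only sketch. It is not Lucene's StopAnalyzer or StopFilter; the class `StopWordSketch` and its whitespace tokenization are assumptions made for the example. The two stop words are written as Unicode escapes and assume the standard Urdu code points for the words quoted later in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StopWordSketch {
    /** Splits on whitespace and drops every token found in the stop-word set. */
    static List<String> removeStopWords(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Two Urdu stop words as escapes, so the source file can be saved as plain ASCII.
        Set<String> urduStopWords = new HashSet<>(Arrays.asList("\u06A9\u06D2", "\u06C1\u06D2"));
        System.out.println(removeStopWords("\u06A9\u06D2 regular text", urduStopWords)); // [regular, text]
    }
}
```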

StopWords problem

2007-12-26 Thread Liaqat Ali

Hi Doron Cohen, thanks for your reply, but I am facing a small problem here. As I am using Notepad for coding, in which format should the file be saved? public static final String[] URDU_STOP_WORDS = { "کے" ,"کی" ,"سے" ,"کا" ,"کو" ,"ہے" }; Analyzer analyzer = new StandardAnalyzer(

Re: StopWords problem

2007-12-26 Thread Liaqat Ali

李晓峰 wrote: "javac" has an option "-encoding", which tells the compiler the encoding the input source file is using; this will probably solve the problem. Or you can try the unicode escape: \u, then you can save it in ANSI, though it is hard for a human to read. Or use an IDE, eclipse is a good choic
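
The Unicode-escape route can be sketched with a small helper that rewrites every non-ASCII character as a backslash-u escape, so the resulting literal can be pasted into a source file saved in any ASCII-compatible encoding. The class name `UnicodeEscaper` is hypothetical and the code uses only the JDK.

```java
final class UnicodeEscaper {
    /** Returns the input with every non-ASCII UTF-16 unit replaced by a four-digit hex escape. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the escaped form of a two-letter Urdu word.
        System.out.println(escape("\u06A9\u06D2"));
    }
}
```

The escaped output is harder to read, as the reply notes, but it makes the source file independent of the editor's encoding.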

Re: StopWords problem

2007-12-26 Thread Liaqat Ali

Doron Cohen wrote: On Dec 26, 2007 10:33 PM, Liaqat Ali <[EMAIL PROTECTED]> wrote: Using javac -encoding UTF-8 still raises the following error. urduIndexer.java : illegal character: \65279 ? ^ 1 error What am I doing wrong? If you have the stop-words in a file, say one wor
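
The value 65279 in that error is decimal for U+FEFF, the byte order mark (BOM) that Notepad writes at the start of files saved as UTF-8. A likely fix, assuming the BOM is the only problem, is to save the source without a BOM. The sketch below shows the equivalent step in code, for example when stop words are read from a file that may start with a BOM. The class name `BomStripper` is hypothetical.

```java
final class BomStripper {
    /** The byte order mark, U+FEFF, which is 65279 in decimal. */
    static final char BOM = '\uFEFF';

    /** Removes a single leading BOM if present; otherwise returns the text unchanged. */
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        String withBom = BOM + "first stop word";
        System.out.println((int) BOM);                  // 65279
        System.out.println(stripBom(withBom).length()); // 15
    }
}
```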

Re: StopWords problem

2007-12-26 Thread Liaqat Ali

Grant Ingersoll wrote: Are you altering (stemming) the token before it gets to the StopFilter? On Dec 26, 2007, at 5:08 PM, Liaqat Ali wrote: Doron Cohen wrote: On Dec 26, 2007 10:33 PM, Liaqat Ali <[EMAIL PROTECTED]> wrote: Using javac -encoding UTF-8 still raises the following

Re: StopWords problem

2007-12-26 Thread Liaqat Ali

Grant Ingersoll wrote: On Dec 26, 2007, at 5:24 PM, Liaqat Ali wrote: No, at this level I am not using any stemming technique. I

Re: StopWords problem

2007-12-27 Thread Liaqat Ali

"text", URDU_STOP_WORDS[0] + " regular text", Store.YES, Index.TOKENIZED)); indexWriter.addDocument(doc); Now URDU_STOP_WORDS[0] should not appear within the index terms. You can easily verify this by iterating IndexReader.terms(); Regards, Doron On Dec 27, 2007 9:36 AM

Re: StopWords problem

2007-12-27 Thread Liaqat Ali

Doron Cohen wrote: This is not a self-contained program - it is incomplete, and it depends on files on *your* disk... Still, can you show why you're saying it indexes stopwords? Can you print here a few samples of IndexReader.terms().term()? BR, Doron On Dec 27, 2007 10:22 AM, Liaqa

Re: StopWords problem

2007-12-27 Thread Liaqat Ali

Doron Cohen wrote: On Dec 27, 2007 11:49 AM, Liaqat Ali <[EMAIL PROTECTED]> wrote: I got your point. The program given does not give any error during compilation and it runs without errors, but it does not create any index. When StandardAnalyzer() is called without Sto

Calculating Precision and Recall

2007-12-29 Thread Liaqat Ali

Hello all, I want to calculate the precision and recall of the current system, which is based on Lucene. What should the procedure be, and are there any tools available for this purpose? Kindly guide me. Regards, Liaqat
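
For a single query, the two measures can be computed directly from the retrieved document IDs and the set of judged-relevant IDs: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The sketch below uses only the JDK; the class name, method names and document IDs are illustrative, and it assumes the retrieved list contains no duplicates.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PrecisionRecall {
    /** Counts how many retrieved IDs are in the relevant set. */
    private static int hits(List<String> retrieved, Set<String> relevant) {
        int count = 0;
        for (String id : retrieved) {
            if (relevant.contains(id)) {
                count++;
            }
        }
        return count;
    }

    /** Relevant retrieved / total retrieved; defined as 0 when nothing was retrieved. */
    static double precision(List<String> retrieved, Set<String> relevant) {
        return retrieved.isEmpty() ? 0.0 : (double) hits(retrieved, relevant) / retrieved.size();
    }

    /** Relevant retrieved / total relevant; defined as 0 when nothing is relevant. */
    static double recall(List<String> retrieved, Set<String> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) hits(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        List<String> retrieved = Arrays.asList("d1", "d2", "d3", "d4");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3", "d5"));
        System.out.println(precision(retrieved, relevant)); // 0.5
        System.out.println(recall(retrieved, relevant));    // 2 of 3 relevant documents found
    }
}
```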

Re: Calculating Precision and Recall

2007-12-29 Thread Liaqat Ali

, Liaqat Ali wrote: Hello all, I want to calculate the precision and recall of the current system, which is based on Lucene. What should the procedure be, and are there any tools available for this purpose? Kindly guide me. Regards, Liaqat

Scoring in Lucene (for Precision and Recall)

2008-01-02 Thread Liaqat Ali

Hello, I am using trec_eval for precision and recall calculation. trec_eval takes the relevance judgments and a results file as arguments to calculate precision and recall. There is a similarity parameter in the results file. The score which is calculated by Lucene is equal to that similarity paramet
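
A results file for trec_eval conventionally has one whitespace-separated line per retrieved document: query ID, the literal `Q0`, document ID, rank, similarity score, and a run tag. The sketch below formats such a line, assuming the Lucene hit score is written into the similarity column. The class name `TrecResultLine` and the sample values are hypothetical, and no Lucene calls are made.

```java
import java.util.Locale;

final class TrecResultLine {
    /** Formats one results-file line: query ID, Q0, document ID, rank, score, run tag. */
    static String format(String queryId, String docId, int rank, double score, String runTag) {
        // Locale.ROOT keeps the decimal separator a dot regardless of the system locale.
        return String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s", queryId, docId, rank, score, runTag);
    }

    public static void main(String[] args) {
        System.out.println(format("1", "doc7", 1, 0.8312, "lucene-urdu")); // 1 Q0 doc7 1 0.8312 lucene-urdu
    }
}
```

As far as the trec_eval documentation describes, documents are ordered by the similarity column rather than the rank column, so only the relative order of the scores within a query should matter.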

Re: Scoring in Lucene (for Precision and Recall)

2008-01-02 Thread Liaqat Ali

hed each conference, my guess is one of them will explain it in more detail, or perhaps any docs for the trec_eval program will. -Grant On Jan 2, 2008, at 3:07 PM, Liaqat Ali wrote: Hello, I am using trec_eval for precision and recall calculation. trec_eval takes the relevance judgments and a results fi

Open source Arabic stemmer

2008-01-16 Thread Liaqat Ali

Hi, kindly tell me about an open-source Arabic stemmer that can be used with Lucene. Regards, Liaqat Ali

33 matches
