the scratch using Lucene.
Liaqat Ali
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Hi All
I am facing problems setting up Lucene. Could someone kindly guide me in this
Hi All,
I am developing a search engine for the Urdu language, and I want to use Lucene
for that purpose. The situation is as follows:
---I have a corpus of 2000 Urdu (a variant of Persian and Arabic) documents
in XML form. How can I build an index of them using Lucene?
---There will also be a need for some stemming
I want to index the Urdu language corpus (200 documents in CES XML DTD
format). Is it necessary to break the XML file into 200 different files,
or can it be indexed in its original form using Lucene? Kindly guide me in
this regard.
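
One way to approach the question above, sketched under assumptions: Lucene indexes Document objects that the application builds, so a single XML file can stay in one piece on disk as long as the text of each document element is extracted separately. The sketch below uses only the JDK's DOM parser. The class name CorpusSplitter and the element name passed as docTag are hypothetical; the real element names depend on the corpus DTD, which is not shown in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: splits one corpus file into per-document text strings.
final class CorpusSplitter {

    /** Returns the text content of every element named docTag (an assumed tag name). */
    static List<String> splitDocuments(String xml, String docTag)
            throws ParserConfigurationException, SAXException, IOException {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = dom.getElementsByTagName(docTag);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getTextContent() concatenates the text of all nested elements.
            texts.add(nodes.item(i).getTextContent().trim());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java CorpusSplitter.java <corpus.xml> <docTag>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<String> docs = splitDocuments(xml, args[1]);
        System.out.println(docs.size() + " documents extracted");
    }
}
```

Each returned string could then be added to the index as its own Lucene document, so the corpus file itself would not need to be split.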
---
Hi All,
Can someone explain this line to me? I encountered it while setting up
Lucene...
Connect to the top-level of your Lucene installation
Kindly guide me in this regard.
Liaqat Ali
I am new to Lucene and want to clear up some questions.
I unpacked Lucene, which I downloaded from the Apache site.
I opened the BUILD.txt file, and there are five steps to set up Lucene.
Lucene Build Instructions
$Id: BUILD.txt 476955 2006-11-19 22:28:41Z hossman $
Basic steps:
0) Instal
his Regard
Liaqat Ali
de me in this regard
Liaqat Ali
situation. Kindly guide me in this regard..
Liaqat Ali
Hello
I am studying Lucene in Action. In chapter 2, the first example is
generating errors in this part of the code.
doc.add(Field.Keyword("id", keywords[i]));
doc.add(Field.UnIndexed("country", unindexed[i]));
doc.add(Field.UnStored("contents", unstored[i]));
doc.add(Field.Text("cit
This code generates an error. Kindly tell me what parameters should be
used when we use the constructors.
Document doc = new Document();
doc.add( Field.Keyword("id", keywords[i]));
doc.add( Field.UnIndexed("country", unindexed[i]));
doc.add(Field.UnStored("contents", unstored[i]));
I am studying LIA, but there is a problem with the code. When I run the code,
I get errors. The errors are related to the use of deprecated
APIs. Kindly suggest the right APIs, and also how to
handle this situation with other code.
package lia.indexing;
import org.apache.lucene.stor
No, you are not getting me. I have this original code. What should I use
instead of this code to create a directory, given that
dir = FSDirectory.getDirectory(indexDir, true) is deprecated?
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
protected Directo
I am facing a problem with this code:
dir = new FSDirectory();
dir.getDirectory(indexDir, true);
I get an error that FSDirectory has protected access. What should I use
instead?
Liaqat
Hi,
I am facing a problem while indexing a small .txt file with Lucene. The
file I want to index is in the Urdu language (a variant of
Arabic and Persian). But the index I get is in Unicode form, not in the
real form (original Urdu text). This program works well for a file in
Englis
Hi all,
I want to index an XML file containing 200 Urdu language (a variant of
Arabic and Persian) documents. This corpus is in CES format, consisting
of information about the author and much more. I just want to extract the
textual data of each document and the related doc number and title in each
documen
Hi
I am trying to run some code from Lucene in Action, but it generates some
errors. There is one warning at compilation time, and the errors
occur at run time. The code and errors are given below. Kindly give me
some clue. Thanks...
Code:
///package lia.handlingtypes.xml;
import lia.handl
Michael McCandless wrote:
See this thread for one suggestion:
http://www.gossamer-threads.com/lists/lucene/java-user/55465
Mike
"Liaqat Ali" <[EMAIL PROTECTED]> wrote:
Hi
I am trying to run some code from Lucene in Action, but it generates some
errors. There is on
hello,
I am trying to build an index of 191 documents stored in 191 text files. I
developed a program that works well with files containing a single line,
but files with multiple lines pose a problem. So I added a while loop to
completely extract the data from each document. But it has some logical
er
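
The follow-up message reports that declaring the buffer inside the for loop resolved this. A minimal sketch of that pattern, using only the JDK, is given here. It assumes the files are UTF-8 text, and it uses StringBuilder where the thread mentions StringBuffer. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads every line of every file, one string per file.
final class MultiLineReader {

    static List<String> readAll(List<Path> files) throws IOException {
        List<String> contents = new ArrayList<>();
        for (Path file : files) {
            // A fresh buffer per file, so text from earlier files cannot leak in.
            StringBuilder sb = new StringBuilder();
            try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    sb.append(line).append('\n');
                }
            }
            contents.add(sb.toString());
        }
        return contents;
    }

    public static void main(String[] args) throws IOException {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        for (String text : readAll(files)) {
            System.out.println(text.length() + " characters read");
        }
    }
}
```

If the buffer were declared once outside the loop, each file's text would be appended to the text of all files read before it.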
Hi, Erick
Thanks for your suggestion; putting the declaration of the StringBuffer
variable sb inside the for loop works well. I want to ask another
question: can we modify the StopAnalyzer to use stop words of
another language instead of English, such as the Urdu ones given below:
public stati
Hi, Doron Cohen
Thanks for your reply, but I am facing a small problem here. As I
am using Notepad for coding, in which format should the file be saved?
public static final String[] URDU_STOP_WORDS = { "کے" ,"کی" ,"سے" ,"کا"
,"کو" ,"ہے" };
Analyzer analyzer = new StandardAnalyzer(
李晓峰 wrote:
"javac" has an option "-encoding", which tells the compiler the
encoding the input source file is using; this will probably solve the
problem.
Or you can try the Unicode escape, \u; then you can save it in
ANSI, though it is hard for humans to read.
Or use an IDE; Eclipse is a good choic
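
The Unicode-escape option mentioned in this reply can be sketched as a small converter that rewrites every non-ASCII character as a \uXXXX escape, so the resulting source can be saved in a plain ASCII or ANSI file. This is only an illustration: the class and method names are hypothetical, and it handles characters in the Basic Multilingual Plane one UTF-16 unit at a time, which covers Urdu text.

```java
// Hypothetical helper: turns non-ASCII characters into Java \\uXXXX escapes.
final class UnicodeEscaper {

    static String toUnicodeEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c); // plain ASCII is copied unchanged
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the escaped form of each command-line argument.
        for (String arg : args) {
            System.out.println(toUnicodeEscapes(arg));
        }
    }
}
```

The escaped strings can be pasted into the stop-word array in place of the Urdu literals. The JDK also shipped a native2ascii tool for this kind of conversion in older releases.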
Doron Cohen wrote:
On Dec 26, 2007 10:33 PM, Liaqat Ali <[EMAIL PROTECTED]> wrote:
Using javac -encoding UTF-8 still raises the following error.
urduIndexer.java : illegal character: \65279
?
^
1 error
What am I doing wrong?
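
A likely cause, offered as an assumption rather than a diagnosis: character 65279 is U+FEFF, the byte order mark, which Notepad writes at the start of files saved as UTF-8 and which javac reports as an illegal character. The sketch below removes a leading mark from a file in place. The class name BomStripper is hypothetical, and the input is assumed to be UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: removes a leading byte order mark from a UTF-8 file.
final class BomStripper {

    /** Returns the text without a leading U+FEFF (decimal 65279), if one is present. */
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomStripper.java <source-file>");
            return;
        }
        Path source = Paths.get(args[0]);
        String text = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        // Rewrites the file in place without the mark.
        Files.write(source, stripBom(text).getBytes(StandardCharsets.UTF_8));
    }
}
```

After the mark is removed, compiling with javac -encoding UTF-8 should no longer trip over the first character. Saving from an editor that can write UTF-8 without a byte order mark avoids the problem at the source.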
If you have the stop-words in a file, say one wor
Grant Ingersoll wrote:
Are you altering (stemming) the token before it gets to the StopFilter?
On Dec 26, 2007, at 5:08 PM, Liaqat Ali wrote:
Doron Cohen wrote:
On Dec 26, 2007 10:33 PM, Liaqat Ali <[EMAIL PROTECTED]> wrote:
Using javac -encoding UTF-8 still raises the following
Grant Ingersoll wrote:
On Dec 26, 2007, at 5:24 PM, Liaqat Ali wrote:
No, at this level I am not using any stemming technique. I
;text",URDU_STOP_WORDS[0] +
" regular text",Store.YES, Index.TOKENIZED));
indexWriter.addDocument(doc);
Now URDU_STOP_WORDS[0] should not appear within the index terms.
You can easily verify this by iterating IndexReader.terms();
Regards, Doron
On Dec 27, 2007 9:36 AM
Doron Cohen wrote:
This is not a self contained program - it is incomplete, and it depends
on files on *your* disk...
Still, can you show why you're saying it indexes stopwords?
Can you print a few samples of IndexReader.terms().term() here?
BR, Doron
On Dec 27, 2007 10:22 AM, Liaqa
Doron Cohen wrote:
On Dec 27, 2007 11:49 AM, Liaqat Ali <[EMAIL PROTECTED]> wrote:
I got your point. The program given does not produce any error during
compilation, and it runs well. But it does not create any
index. When the StandardAnalyzer() is called without Sto
Hello All,
I want to calculate the precision and recall of the current system,
which is based on Lucene. What should the procedure be, and are there any
tools available for this purpose?
Kindly guide me.
Regards,
Liaqat
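
For reference, the two measures asked about here can be computed directly once the retrieved document ids and the relevance judgments for a query are available as sets: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The sketch below is a plain-Java illustration with hypothetical names and made-up document ids; it is not part of Lucene.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: set-based precision and recall for a single query.
final class PrecisionRecall {

    /** Number of retrieved documents that are also judged relevant. */
    private static int hits(Set<String> retrieved, Set<String> relevant) {
        Set<String> common = new HashSet<>(retrieved);
        common.retainAll(relevant);
        return common.size();
    }

    static double precision(Set<String> retrieved, Set<String> relevant) {
        if (retrieved.isEmpty()) {
            return 0.0; // convention chosen here for an empty result list
        }
        return (double) hits(retrieved, relevant) / retrieved.size();
    }

    static double recall(Set<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // convention chosen here when nothing is judged relevant
        }
        return (double) hits(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up ids, for illustration only.
        Set<String> retrieved = new HashSet<>(Arrays.asList("d1", "d2", "d3", "d4"));
        Set<String> relevant = new HashSet<>(Arrays.asList("d2", "d4", "d7"));
        System.out.println("precision = " + precision(retrieved, relevant));
        System.out.println("recall    = " + recall(retrieved, relevant));
    }
}
```

Averaging these values over a set of queries gives system-level figures; tools such as trec_eval, discussed later in the thread, compute these and related measures from files.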
, Liaqat Ali wrote:
Hello All,
I want to calculate the precision and recall of the current system,
which is based on Lucene. What should the procedure be, and are there
any tools available for this purpose?
Kindly guide me.
Regards,
Liaqat
Hello,
I am using trec_eval for precision and recall calculation. trec_eval takes
the relevance judgments and a result file as arguments to calculate
precision and recall. There is a similarity parameter in the result file.
The score calculated by Lucene is equal to that similarity
paramet
hed each conference, my guess is
one of them will explain it in more detail, or perhaps any docs for
the trec_eval program will.
-Grant
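
For context on the result file discussed here: as far as I know, each line in the format trec_eval expects has six whitespace-separated columns, namely the query id, an iteration field (conventionally Q0), the document number, the rank, the similarity score, and a run tag. The sketch below formats one such line in plain Java. The class name, the sample ids, and the choice of four decimal places are assumptions for illustration.

```java
import java.util.Locale;

// Hypothetical helper: formats one line of a TREC-style results file.
final class TrecRunLine {

    static String format(String queryId, String docNo, int rank, double score, String runTag) {
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        return String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                queryId, docNo, rank, score, runTag);
    }

    public static void main(String[] args) {
        // Made-up values; in practice the score would come from the search hit.
        System.out.println(format("1", "doc42", 1, 0.5, "lucene"));
    }
}
```

Under this reading, the score a search engine assigns to a hit is what goes into the similarity column.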
On Jan 2, 2008, at 3:07 PM, Liaqat Ali wrote:
Hello,
I am using trec_eval for precision and recall calculation. trec_eval takes
the relevance judgments and a result fi
Hi
Kindly tell me about an open-source Arabic stemmer that can be used with
Lucene.
Regards,
Liaqat Ali