Hi, I'm trying to index a big set of plain text files (8,104,467 of them)
that are all under the same directory,
/media/MAFALDA/yohasebewp2txt/Archivos, and I want to create my index
under /media/MAFALDA/LuceneIndex using the IndexFiles.java program from the
documentation.
I'm using the NetBeans IDE, and I
Is it possible to index Wikipedia on a 2-core machine with 3 GB of
RAM? I have had the same problem trying to index it.
I've tried with a dump from April 2011.
Thanks
Reyna
CIC-IPN
Mexico
2012/6/19 Michael McCandless
> Likely the bottleneck is pulling content from the database? Maybe
>
Thanks to everyone who replied to my question.
Regards,
Reyna
2012/1/11 Michael Wechner
> Maybe Tika is also of help to you
>
> http://tika.apache.org/
>
> HTH
>
> Michael
>
> Am 11.01.12 20:13, schrieb Reyna Melara:
>
>> Hi, my name is Reyna
Hi, my name is Reyna Melara. I'm a PhD student from Mexico, and I have a set
of 11,051,447 files with the .txt extension, but the content of each file is
in fact in wiki format. I need to index them, but I don't know
if I have to convert this content to plain text. I have been r