Lucene Indexing performance issue

2014-10-22 Thread Jason Wu
Hi Team, I am a new user of Lucene 4.8.1. I encountered a Lucene indexing performance issue which slows down my application greatly. I tried several approaches found through Google searches but still couldn't resolve it. Any suggestions from your experts would help me a lot. One of my applications uses the l

Re: indexing performance issue

2006-11-30 Thread Antony Bowesman
spinergywmy wrote: I have posted this question before, and this time I found that it could be a pdfbox problem; the pdfbox I downloaded doesn't use the log4j.jar. Indexing the approx. 2.13 MB PDF file took me 17 s, and the total time to upload a file is 18 s. Re: PDFBox. I have a 2.5 MB test file that

Re: indexing performance issue

2006-11-30 Thread Antony Bowesman
Grant Ingersoll wrote: On Nov 30, 2006, at 10:54 AM, spinergywmy wrote: In my scenario, every time a user uploads a single file, I need to index that particular file. Previously, it was because the previous version of pdfbox integrated with the log4j.jar file, and I believe it is the log4j.j

Re: indexing performance issue

2006-11-30 Thread Grant Ingersoll
On Nov 30, 2006, at 10:54 AM, spinergywmy wrote: Hi Grant, Thanks for the tips. I will take your advice and look into the link that you sent me. In my scenario, every time a user uploads a single file, I need to index that particular file. Previously, it was because the

Re: indexing performance issue

2006-11-30 Thread spinergywmy
f I'm wrong. Thanks regards, Wooi Meng -- View this message in context: http://www.nabble.com/indexing-performance-issue-tf2730895.html#a7621903 Sent from the Lucene - Java Users mailing list archive at Nabble.com. --

Re: indexing performance issue

2006-11-30 Thread Grant Ingersoll
re any way or other software than pdfbox to solve the performance issue. Thanks. Regards, Wooi Meng

indexing performance issue

2006-11-30 Thread spinergywmy
than pdfbox to solve the performance issue. Thanks. Regards, Wooi Meng

Re: Indexing Performance issue

2006-11-16 Thread Antony Bowesman
spinergywmy wrote: Hi, I am having a performance issue indexing PDF files. It took me more than 10 seconds to index a PDF file of about 200 KB. Is it because I only have one segment file? How can I make the indexing performance better? If you're using the log4j PDFBox jar file, you must make sure

Re: Indexing Performance issue

2006-11-10 Thread Ioan Cocan
You may want to use something like pdftotext, part of XPDF (http://www.foolabs.com/xpdf/download.html). It will produce a text extract for a PDF. Indexing will work like a breeze, without the memory consumption of PDFBox. Regards, Ioan spinergywmy wrote: Hi, I having this indexing the pdf file
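
A minimal sketch of the pdftotext approach suggested here, assuming the `pdftotext` binary from XPDF is installed and on the PATH. The class and method names (`PdfToTextSketch`, `command`, `extract`) are hypothetical and not part of Lucene or XPDF. It runs the tool as an external process and reads the extracted text from standard output, so the PDF is parsed outside the JVM; the resulting string can then be passed to whatever indexing code is already in place.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: extracts plain text from a PDF by running pdftotext.
final class PdfToTextSketch {

    // Builds the command line. "-enc UTF-8" selects the output encoding and
    // "-" as the output file tells pdftotext to write the text to stdout.
    static List<String> command(String pdfPath) {
        return List.of("pdftotext", "-enc", "UTF-8", pdfPath, "-");
    }

    // Runs pdftotext and returns the extracted text.
    // Assumes pdftotext is installed; fails with IOException otherwise.
    static String extract(String pdfPath) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(pdfPath))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show tool errors on our stderr
                .start();
        // Read all of stdout before waiting, so the child cannot block on a full pipe.
        String text = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftotext exited with status " + exitCode);
        }
        return text;
    }
}
```

Starting a process per file has its own cost, so this is worth measuring against the in-process parser before switching.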

Re: Indexing Performance issue

2006-11-10 Thread Erick Erickson
Have you measured to see how much of your time is spent indexing and how much is just parsing the file? You need to do this before having a clue what you need to make faster. Erick On 11/10/06, Daniel Naber <[EMAIL PROTECTED]> wrote: On Friday 10 November 2006 12:18, spinergywmy wrote: > I
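
The measurement Erick describes can be sketched roughly as follows: time the text-extraction step and the indexing step separately with `System.nanoTime()`. The names `PhaseTimer`, `Timings`, and `measure` are hypothetical, and the two lambdas in `main` are placeholders standing in for the real PDF parser call and the real index-writing call.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper: times text extraction and indexing as two separate phases.
final class PhaseTimer {

    // Elapsed wall-clock time of each phase, in nanoseconds.
    static final class Timings {
        final long parseNanos;
        final long indexNanos;

        Timings(long parseNanos, long indexNanos) {
            this.parseNanos = parseNanos;
            this.indexNanos = indexNanos;
        }
    }

    // Runs extractText, then passes its result to indexText, timing each call.
    static Timings measure(Supplier<String> extractText, Consumer<String> indexText) {
        long start = System.nanoTime();
        String text = extractText.get();   // placeholder for the PDF parsing step
        long afterParse = System.nanoTime();
        indexText.accept(text);            // placeholder for the index-writing step
        long afterIndex = System.nanoTime();
        return new Timings(afterParse - start, afterIndex - afterParse);
    }

    public static void main(String[] args) {
        // Stand-in lambdas; replace them with the real parser and indexer calls.
        Timings t = measure(() -> "placeholder text", text -> { });
        System.out.printf("parse: %d ms, index: %d ms%n",
                t.parseNanos / 1_000_000, t.indexNanos / 1_000_000);
    }
}
```

A single run is noisy on the JVM, so repeating the measurement over several files gives a more trustworthy split between the two phases.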

Re: Indexing Performance issue

2006-11-10 Thread Daniel Naber
On Friday 10 November 2006 12:18, spinergywmy wrote: > I am having a performance issue indexing PDF files. It took me more > than 10 seconds to index a PDF file of about 200 KB. Is it because I only have one > segment file? How can I make the indexing performance better? PDFBox (which I assume you are

Indexing Performance issue

2006-11-10 Thread spinergywmy