Re: Lucene: If I have a picture, table, or something else in the PDF

Alexander Aristov Sat, 19 Feb 2011 22:36:33 -0800

Your search engine would extract only the text content from a PDF file; all
markup, pictures, etc. would be lost. So when you search, you would get back
only text, highlighted or not.
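
One possible way around this, offered only as a sketch, is to index the
extracted text one page at a time and keep the page number with it, so that a
hit can be mapped back to the original PDF page (pictures and all) instead of
to the bare text. The class below is a toy stand-in written with the Java
standard library only; it is not Lucene or PDFBox code. The name PageIndex,
its methods, and the occurrence-count ranking are all assumptions made for
illustration. In a real setup the per-page text would come from PDFBox (its
PDFTextStripper can be limited to a page range with setStartPage and
setEndPage) and each page would be stored as its own Lucene document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a per-page index. In a real engine each entry
// would be one Lucene document holding the page text and the page number.
class PageIndex {
    // 1-based page number -> plain text extracted from that page
    private final Map<Integer, String> pages = new LinkedHashMap<>();

    void addPage(int pageNumber, String text) {
        pages.put(pageNumber, text);
    }

    // Count case-insensitive whole-token occurrences of an already
    // lower-cased keyword in the page text.
    private static int count(String text, String keyword) {
        int n = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.equals(keyword)) {
                n++;
            }
        }
        return n;
    }

    // Return the numbers of the pages that contain the keyword, ordered by
    // occurrence count (highest first). This crude ranking is only a
    // placeholder for Lucene's relevance scoring.
    List<Integer> search(String keyword) {
        String k = keyword.toLowerCase(Locale.ROOT);
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, String> e : pages.entrySet()) {
            if (count(e.getValue(), k) > 0) {
                hits.add(e.getKey());
            }
        }
        // List.sort is stable, so pages with equal counts stay in page order.
        hits.sort((a, b) -> Integer.compare(count(pages.get(b), k),
                                            count(pages.get(a), k)));
        return hits;
    }
}
```

With the page number in hand, the application can open the original PDF at
that page for the user rather than showing the extracted text, so nothing on
the page is lost at display time.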



Best Regards
Alexander Aristov


On 18 February 2011 21:29, Gong Li <ee07b...@gmail.com> wrote:

> Hi,
>
> I am developing a local PDF search engine using the PDFBox and Lucene
> APIs.
>
> I must show the user the PDF page containing the keywords (highlighted, if
> possible) and sort the results by relevance (the default in Lucene). How?
>
> Also, if there are pictures on the PDF page, how can the page be displayed
> to the user after indexing and searching the extracted text?
>
> Thanks
>
