Your search engine would extract only the text content from a PDF file; all markup, pictures, etc. would be lost. So when you search, you would get back only text, highlighted or not.
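As a rough illustration of what a text-only result looks like, here is a minimal sketch in plain Java. It assumes the text of each page has already been extracted into a list, one string per page (with PDFBox this could be done by setting `setStartPage` and `setEndPage` on `PDFTextStripper`). The class `PageTextSearch`, its `Hit` type and the `[[...]]` highlight markers are hypothetical names made up for this example; they are not part of PDFBox or Lucene, and the simple scan below stands in for a real Lucene index and does no relevance ranking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a PDFBox or Lucene class.
final class PageTextSearch {

    /** One search hit: the 1-based page number and the page text with matches marked. */
    static final class Hit {
        final int pageNumber;
        final String highlighted;

        Hit(int pageNumber, String highlighted) {
            this.pageNumber = pageNumber;
            this.highlighted = highlighted;
        }
    }

    /**
     * Scans the extracted page texts for a keyword, ignoring case.
     * Assumption: pageTexts.get(i) holds the plain text of page i + 1.
     */
    static List<Hit> search(List<String> pageTexts, String keyword) {
        List<Hit> hits = new ArrayList<>();
        if (keyword == null || keyword.isEmpty()) {
            return hits; // nothing to search for
        }
        // Pattern.quote treats the keyword as literal text, not as a regex.
        Pattern pattern = Pattern.compile(
                Pattern.quote(keyword),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        for (int i = 0; i < pageTexts.size(); i++) {
            Matcher matcher = pattern.matcher(pageTexts.get(i));
            if (matcher.find()) {
                // replaceAll resets the matcher and wraps every match in [[ ]].
                hits.add(new Hit(i + 1, matcher.replaceAll("[[$0]]")));
            }
        }
        return hits;
    }
}
```

Because the page number is kept next to the extracted text, a hit can at least tell you which page of the original PDF to open. The pictures themselves are not in the extracted text, so they would have to come from the original file.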
Best Regards
Alexander Aristov

On 18 February 2011 21:29, Gong Li <ee07b...@gmail.com> wrote:
> Hi,
>
> I am developing a PDF search engine, locally. I have used API: pdfbox and
> lucene.
>
> I must show the user the PDF page containing the keywords (if highlight,
> it's great) and sort by relevance (default in lucene). HOW???
>
> Maybe, if there are some pictures in the PDF page, how could it display to
> the user after index and search the extracted text???
>
> Thanks