Hello Tika Gurus,
I am trying to extract the main text content using the Boilerpipe
interfaces provided in Tika.

When I use the following constructor, it works fine:

    ContentHandler handler1 = new BoilerpipeContentHandler(textBuffer);

I believe this constructor uses the DefaultExtractor, but I want to use the
ArticleExtractor instead. I tried something like this:

    ContentHandler handler1 = new BoilerpipeContentHandler(textBuffer);
    ContentHandler handler2 = new BoilerpipeContentHandler(handler1,
            ArticleExtractor.getInstance());

However, this produced strange errors about nested <A> elements.

Could you please let me know the right way to invoke the
ArticleExtractor?

Thanks
Shyam
