Hello,

I'm using Apache POI to extract heading (not header/footer) content
from a document. Essentially, I need to:

1. Determine if it has headings
2. Get the headings in order as they appear in the document (i.e.,
reading order)
3. Get the "heading levels" if possible

The point of this is to check whether the document is set up in such a
way that it can be navigated using accessibility tools.

I've read a bit about how to do this on StackOverflow, but I was
wondering whether there is a more direct way to get the headings than
inspecting the style names.
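
For reference, the style-name approach I mean looks roughly like the
sketch below. It assumes the built-in English style names of the form
"Heading N" (or the ID form "HeadingN"), so it would miss localized or
custom heading styles. The class and method names are hypothetical, and
it deliberately makes no POI calls: the style name is assumed to have
been read from the paragraph already.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeadingStyleSketch {
    // Assumption: headings use names like "Heading 1", "heading 2" or "Heading3".
    // Levels 1-9 only; anything else is treated as "not a heading".
    private static final Pattern HEADING =
            Pattern.compile("(?i)^heading\\s*([1-9])$");

    /** Returns the heading level implied by a style name or ID, or empty if none. */
    public static Optional<Integer> headingLevel(String styleName) {
        if (styleName == null) {
            return Optional.empty();
        }
        Matcher m = HEADING.matcher(styleName.trim());
        return m.matches()
                ? Optional.of(Integer.parseInt(m.group(1)))
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Print the detected level (if any) for a few sample style names.
        for (String s : new String[] {"Heading 1", "Heading2", "Normal", "Title"}) {
            System.out.println(s + " -> " + headingLevel(s));
        }
    }
}
```

This is exactly the kind of string matching I'd like to avoid if the
formats expose something more reliable.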

I've seen that, at least in the OLE2 document format, the paragraph
model has a getLvl() method that returns its outline level. Is there an
"outline" model available that can be navigated to find the headings (if
any) in the document? Is that what the "Bookmarks" are, or am I going
down the wrong path?

Note that I'll be writing an interface that can do this for both the
XML and OLE2 variations of this content, so information for both would
be extra helpful.

If there is anything that is built into the underlying document
formats but just not exposed in the POI APIs, I'd certainly consider
having a look and contributing some additions if the existing
approaches turn out to be unreliable.

Thanks,
Branden
