[O] Extract document structure from Org file

Oleg Sivokon Fri, 03 Jul 2015 01:45:09 -0700

Hello list!

Suppose I wanted to extract the structure of an Org document, with its
content categorically divided into headers, paragraphs of text,
technical information, and inclusions of other documents (code
snippets).  How would I do it?
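
To make the question concrete, here is a rough sketch of the kind of
division I have in mind.  It is a naive line-based classifier in Python,
not Org's own parser, and the category names ("heading", "paragraph",
"meta", "include", "code") are only assumptions for illustration.  It
ignores drawers, tables, lists and most other Org syntax.

```python
import re

# Deliberately simplified patterns; real Org syntax has many more cases.
HEADING = re.compile(r"^(\*+)\s+(.*)$")
KEYWORD = re.compile(r"^\s*#\+(\w+):\s*(.*)$")
BEGIN_SRC = re.compile(r"^\s*#\+begin_src\b", re.IGNORECASE)
END_SRC = re.compile(r"^\s*#\+end_src\b", re.IGNORECASE)


def classify(org_text):
    """Split Org text into a list of (kind, text) pairs.

    kind is one of 'heading', 'paragraph', 'meta', 'include', 'code'
    (hypothetical labels chosen for this sketch).
    """
    parts = []
    para = []    # lines of the paragraph being collected
    code = None  # lines of the source block being collected, if any

    def flush():
        # Emit the pending paragraph, if there is one.
        if para:
            parts.append(("paragraph", " ".join(para)))
            para.clear()

    for line in org_text.splitlines():
        if code is not None:
            # Inside a source block: collect lines until #+END_SRC.
            if END_SRC.match(line):
                parts.append(("code", "\n".join(code)))
                code = None
            else:
                code.append(line)
            continue
        heading = HEADING.match(line)
        keyword = KEYWORD.match(line)
        if BEGIN_SRC.match(line):
            flush()
            code = []
        elif heading:
            flush()
            parts.append(("heading", heading.group(2)))
        elif keyword:
            flush()
            kind = "include" if keyword.group(1).lower() == "include" else "meta"
            parts.append((kind, line.strip()))
        elif line.strip():
            para.append(line.strip())
        else:
            flush()  # a blank line ends the current paragraph
    flush()
    if code is not None:
        # Unterminated source block: keep what was collected.
        parts.append(("code", "\n".join(code)))
    return parts
```

What I'm really after is something like this, but built on a proper
parse of the document rather than on regular expressions.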


The reason I'm asking is that I have a small project in which I'm
trying to improve document search by combining indexing with queries
based on things like the distance between words, the frequency of a
word in a document, and so on.  (I'm using Sphinx for it.)  I've tried
this with Info pages and liked the results.  However, in order to do
this more intelligently, I'd like to index the documents with finer
granularity (i.e. so that later on I could search while assigning
different weights to words appearing in headers and words appearing in
comments).
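
As an illustration of the weighting idea only (this is not how Sphinx
ranks results, and the weights and category names below are made up), a
toy Python scorer over (kind, text) pairs could look like this:

```python
from collections import Counter

# Hypothetical weights per part of the document; the values are arbitrary.
WEIGHTS = {"heading": 5.0, "paragraph": 1.0, "comment": 0.5}


def weighted_score(term, parts, weights=WEIGHTS):
    """Sum the occurrences of term in each part, scaled by the part's weight.

    parts is a list of (kind, text) pairs; kinds without a weight count as 0.
    Matching is case-insensitive and splits on whitespace only.
    """
    term = term.lower()
    score = 0.0
    for kind, text in parts:
        count = Counter(text.lower().split())[term]
        score += weights.get(kind, 0.0) * count
    return score
```

With such a scheme, a word that occurs once in a header would count for
more than the same word occurring a few times in comments, which is the
kind of control over ranking I'm hoping to get.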

Best.

Oleg
