Hello list!

Suppose I want to extract the structure of an Org document, divided categorically into headers, paragraphs of text, technical information, and inclusions of other documents (code snippets). How would I do it?
The reason I'm asking is that I have a small project in which I'm trying to improve document search by combining indexing with queries based on things like the distance between words, the frequency of a word in a document, and so on. (I'm using Sphinx for it.) I've tried this with Info pages and liked the results. However, to do it more intelligently, I'd like to index the documents with finer granularity, so that later I could search while assigning different weights to words appearing in headers and words appearing in comments.

Best,
Oleg