On 26/02/2014 10:42 p.m., Tommaso Cucinotta wrote:
Hi, advanced find and replace is a feature that is already integrated
within the widely distributed LyX. However, it was realized as a proof
of concept, starting from a basic, minimally invasive approach: the
plain-text and LaTeX export features of LyX are re-used without any
change, and the complex text-based matching happens from the outside,
by processing the exported strings. This is CPU-intensive and
error-prone, and indeed the feature embeds plenty of heuristic and
hacky string-processing tricks that work for the most common cases but
fail miserably in a huge number of corner cases. The project is about
re-engineering the implementation to get a robust feature, possibly
with enhanced search options. First of all, let me advise you to try
hard to use the feature as it is now, and to see whether you find
use-cases in which its behaviour doesn't meet expectations (many of
these are on the bug tracker, though).
To implement a
regular expression search I know a basic approach used by compilers,
that is, to form tokens and parse them using a grammar. I can do that
using two tools, flex and yacc. I do not know whether they'll interface
well with LyX or not. Any help regarding this is welcome.
I can tell you what ideas have been around so far (but you can check out
the lyx-devel list archives):
a) introduce a search-specific export method within the already available
export machinery of LyX, customizing the export according to the selected
search options; this is a great chance to reuse what is already there for
the plain-text and LaTeX exports, as only additional if() checks have to be
added to skip constructs whenever the search options don't need them; then,
from the outside, we would have a simple, straightforward and fast string
comparison, with no more corner cases nor complex regexps all over the
place (some of which have 8 '\' chars in a row :-) ) etc.
b) introduce some custom visitor-like interface to all insets, to realize a
visit of the whole object graph of the LyX document model, where search
options are taken into consideration as needed, and the comparison is made
directly on the contents of the document object model, rather than on some
string exported out of them, as in a)
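
As a rough illustration of how a) and b) differ, here is a minimal Python
sketch on a toy document tree. Node, SearchOptions and all the function
names are hypothetical and are not part of LyX, whose real insets and export
code are C++ and far richer; the two search options are made up as well.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Toy stand-in for an inset; NOT LyX's real document model."""
    kind: str                       # e.g. "doc", "text", "emph", "note"
    text: str = ""
    children: List["Node"] = field(default_factory=list)


@dataclass
class SearchOptions:
    ignore_format: bool = True      # hypothetical: drop emphasis markup
    skip_notes: bool = True         # hypothetical: do not look inside notes


# Option a): search-specific export, then a plain string comparison.
def export_for_search(node: Node, opts: SearchOptions) -> str:
    if node.kind == "text":
        return node.text
    if node.kind == "note" and opts.skip_notes:
        return ""                   # the extra if() that skips a construct
    inner = "".join(export_for_search(c, opts) for c in node.children)
    if node.kind == "emph" and not opts.ignore_format:
        return "\\emph{" + inner + "}"
    return inner


def found_by_export(doc: Node, needle: str, opts: SearchOptions) -> bool:
    return needle in export_for_search(doc, opts)


# Option b): visit the object graph and compare node contents directly.
def visit(node: Node, opts: SearchOptions,
          path: Tuple[int, ...] = ()) -> Iterator[Tuple[Tuple[int, ...], str]]:
    if node.kind == "note" and opts.skip_notes:
        return
    if node.kind == "text":
        yield path, node.text
    for i, child in enumerate(node.children):
        yield from visit(child, opts, path + (i,))


def found_by_visit(doc: Node, needle: str, opts: SearchOptions):
    # Naive: matches only inside a single text node, never across nodes.
    return [path for path, text in visit(doc, opts) if needle in text]


doc = Node("doc", children=[
    Node("text", "hello "),
    Node("emph", children=[Node("text", "world")]),
    Node("note", children=[Node("text", "secret")]),
])
```

With the default options, found_by_export(doc, "hello world", ...) matches
across the emphasis boundary, because the markup is dropped at export time.
The naive visitor only reports "world" at path (1, 0); a real version of b)
would need extra logic to match text that spans several insets.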
In both cases, one challenging feature is the regular expression one, which
for now just kind of works in common use-cases, but doesn't even have a
clear specification of what is supposed to work and what is not.
Between a) and b) above, I see option a) as a bit more realistic and
possibly easier, in that it may leverage the interfaces and visitor
infrastructure already in the code base for export. Option b) might require
more work, but this is a pure guess, and I'm quite sure that, in both cases,
to have things working properly, one has to dig into the specifics of each
and every inset.
Hope this helps, bye.
T.
This is belated but I find myself intrigued by your description of the
underlying process involving LaTeX export and searching the strings
resulting from that. Did you consider exporting to LyX's own native
format? That would remove one level of processing (translating LyX into
LaTeX and back). And for tasks like replacing one mathematical symbol
by another, I think it would align more closely with what a user wants.
At present, a symbol can be replaced in all inline formulas, then as a
separate task in all display formulas, then as a further find-&-replace
in all numbered formulas, then as yet further tasks in all the separate
AMS environments. Using LyX's native format means all these searches can
be combined into just the one, finding-&-replacing what is between
"\begin_inset Formula", "\end_inset" pairs.
I hasten to add that I know no C++. This question is spurred by my
efforts at writing a (find-&-)replace script for the pLyX system.
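
To make the Formula inset idea concrete, here is a minimal Python sketch. It
assumes that every formula in a .lyx file sits between "\begin_inset Formula"
and the next "\end_inset", and that formula insets contain no nested insets.
The function name replace_in_formulas and the sample text are made up for
illustration; this is not pLyX code and has not been run against real
documents.

```python
import re

# A formula inset: from "\begin_inset Formula" to the next "\end_inset".
# Assumption: no other inset is nested inside a formula inset.
FORMULA = re.compile(r"(\\begin_inset Formula)(.*?)(\\end_inset)", re.DOTALL)


def replace_in_formulas(lyx_text: str, old: str, new: str) -> str:
    """Replace the LaTeX macro `old` by `new` inside formula insets only."""
    # Do not match when a letter follows, so that "\alpha" leaves a longer
    # macro name such as "\alphabar" alone.
    symbol = re.compile(re.escape(old) + r"(?![A-Za-z])")

    def fix(match: "re.Match[str]") -> str:
        # A function replacement keeps backslashes in `new` literal.
        body = symbol.sub(lambda _: new, match.group(2))
        return match.group(1) + body + match.group(3)

    return FORMULA.sub(fix, lyx_text)


sample = (
    "\\begin_layout Standard\n"
    "Inline \n"
    "\\begin_inset Formula $\\alpha+\\alphabar$\n"
    "\\end_inset\n"
    "\n"
    " and display \n"
    "\\begin_inset Formula \\[\n"
    "\\alpha^{2}\n"
    "\\]\n"
    "\n"
    "\\end_inset\n"
    "\n"
    "\\end_layout\n"
)

result = replace_in_formulas(sample, "\\alpha", "\\beta")
```

Inline and display formulas are handled in the same pass, since both use the
same inset delimiters; text outside the formula insets is left untouched.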
Andrew