On Wed, Dec 23, 2020 at 9:31 PM Tim Chase <[email protected]> wrote:
> On 2020-12-23 20:39, John Cordes wrote:
> >> I'd start with this ugly monstrosity:
> >>
> >> :%s/^2 \u\{3,} \zs\(.*\n\(\%(\D\|3 CONC \).*\n\)\+\)/\='<div
> >> class="xxx">'.substitute(substitute(submatch(1), '\n3 CONC ', '',
> >> 'g'), '\n', '', 'g')."<\/div>\n"
> >
> > I will attempt to deconstruct your 'monstrosity' somewhat later,
>
> Tweaking it so that it only does NOTE items, not generic
> continuations:
>
> :%s/^2 NOTE \zs\(.*\n\%(\%(\D\|3 CONC \).*\n\)\+\)/\='<div
> class="xxx">'.substitute(substitute(submatch(1), '\n3 CONC ', '',
> 'g'), '\n', '', 'g')."<\/div>\n"
>
> Breaking it down so hopefully you can swap parts as you see fit:
>
> :%s/^2 NOTE \zs       On every line starting with "2 NOTE "
>                       start our replacement here (\zs)
> \(                    start capturing the note
>                       this will be submatch(1) later
>  .*                   everything else on that line
>  \n                   and the newline
>  \%(                  a non-capturing group for another line that
>   \%(\D               starts with either a non-digit
>   \|                  or
>   3 CONC              a literal "3 CONC "
>   \)                  (end of this OR of things marking a continuation)
>   .*\n                followed by the rest of the line
>  \)                   (end of this continuation-line)
>  \+                   we can have 1 or more continuation lines
> \)                    end the capturing
> /                     replace it with
> \=                    the result of evaluating this expression
> '<div class="xxx">'   the literal opening tag
> .                     and then the results of
> substitute(           remove all the newlines from the results of
>  substitute(          removing from
>   submatch(1),        the whole set of continuation stuff
>   '\n3 CONC ',        the literal newline-followed-by-"3 CONC "
>   '',                 and replace them with nothing
>   'g'                 everywhere
>  ),                   and in that "\n3 CONC "-less text, replace
>  '\n',                newlines with
>  '',                  nothing
>  'g')                 everywhere
> .                     and then tack on
> "<\/div>\n"           the literal closing </div> followed by a newline
>
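For anyone who wants to check the join logic outside Vim, here is a minimal Python sketch of the same idea. It is an analogue rather than a translation: Python's `re` syntax differs from Vim's, the sample text is a shortened, hypothetical version of the note quoted in this thread, and the names `NOTE_RE`, `wrap_note`, and `join_notes` are made up for illustration.

```python
import re

# Hypothetical sample, shortened from the note quoted in this thread.
SAMPLE = (
    "1 EVEN\n"
    "2 TYPE tngnote\n"
    "2 _SDATE 1802\n"
    "2 NOTE these children were born while th\n"
    "3 CONC e family was in London.\n"
    "1 EVEN\n"
)

# "body" plays the role of submatch(1): the rest of the NOTE line plus one or
# more continuation lines (starting with a non-digit or with "3 CONC ").
# [^\d\n] is used instead of \D because Python's \D would also match a newline.
NOTE_RE = re.compile(
    r"^(?P<tag>2 NOTE )(?P<body>.*\n(?:(?:[^\d\n]|3 CONC ).*\n)+)",
    re.MULTILINE,
)


def wrap_note(match):
    """Join the continuation lines and wrap the result in a div."""
    body = match.group("body")
    body = body.replace("\n3 CONC ", "")  # inner substitute(): rejoin split words
    body = body.replace("\n", "")         # outer substitute(): drop remaining newlines
    # The tag is re-emitted unchanged, which is what \zs achieves in Vim.
    return match.group("tag") + '<div class="xxx">' + body + "</div>\n"


def join_notes(text):
    """Apply wrap_note to every NOTE that has at least one continuation line."""
    return NOTE_RE.sub(wrap_note, text)


print(join_notes(SAMPLE))
```

A NOTE with no continuation line is left alone, because the pattern requires one or more continuation lines, just as the `\+` does in the Vim command.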
> > It's a bit more complicated than I first explained. Two aspects:
> > a) I *do* need to search on the "2 NOTE" lines, since there are
> > various other chunks of lines with the CONC lines; and
> > b) Sometimes the line "2 TYPE tngnote" has a line between it and
> > the "2 NOTE". The intervening line can look like this
> >
> > 2 DATE 18 AUG 1776
> > or this
> > 2 _SDATE 1802
>
> Given the substitution command above, it should only touch "2 NOTE"
> lines with subsequent "3 CONC" lines. It does *every* "2 NOTE" so if
> you need to limit them to just those that immediately follow "2 TYPE
> tngnote" (assuming there aren't any "2 TYPE tngnote" that *don't*
> have a NOTE immediately following them), you can tweak that command,
> changing that initial "%" to
>
> :g/^2 TYPE tngnote//2 NOTE /s/^2 NOTE \zs…
>
> This looks for all the "2 TYPE tngnote" lines, searches forward
> (skipping over any DATE/_SDATE lines or other intervening stuff) for
> the "2 NOTE " line following it, and then only performs the
> substitution on those particular lines.
>
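The line selection described above can be sketched in Python to make the "find the marker, then search forward" step concrete. This is only an illustration under assumptions: the function name `note_lines_after_tngnote` and the sample lines are hypothetical, and unlike a Vim search address this sketch never wraps around the end of the file.

```python
def note_lines_after_tngnote(lines):
    """Return the index of the first "2 NOTE " line after each "2 TYPE tngnote" line."""
    targets = []
    for i, line in enumerate(lines):
        if line.startswith("2 TYPE tngnote"):
            # Search forward from the next line, skipping DATE/_SDATE
            # or any other intervening lines.
            for j in range(i + 1, len(lines)):
                if "2 NOTE " in lines[j]:
                    targets.append(j)
                    break
    return targets


# Hypothetical sample: one NOTE that does not belong to a tngnote event.
LINES = [
    "1 EVEN",
    "2 TYPE tngnote",
    "2 DATE 18 AUG 1776",
    "2 NOTE first note",
    "1 BIRT",
    "2 NOTE unrelated note",
    "1 EVEN",
    "2 TYPE tngnote",
    "2 NOTE second note",
]

print(note_lines_after_tngnote(LINES))
```

Only the selected lines would then get the substitution. As noted in the message, a "2 TYPE tngnote" with no NOTE of its own would make the forward search land on some later, unrelated NOTE.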
> > So the lines to change could look like this:
> >
> > ===================
> > 1 EVEN
> > 2 TYPE tngnote
> > 2 _SDATE 1802
> > 2 NOTE The surname of John's wife is not positively established.
> > However, it is certain that her given name is Elizabeth; evidence
> > for this comes first from the baptismal records for Rebecca and
> > Eliza Catherine; these children were born while th
> > 3 CONC e family was in London so the records are available in the
> > London Metropolitan Archives (the other two children were born in
> > Sheffield). Henry's baptismal record in Sheffield also has his
> > parents being John (a skinner) and Elizabeth. The id
> > 3 CONC entification of John's wife specifically with Elizabeth
> > Coxsey is somewhat tentative, however.
> > 1 EVEN
> > ===================
> >
> > This search pattern
> > /^2 TYPE tngnote.*\n*\(\_^2 .*DATE.*\)*\n\_^2 NOTE
> >
> > works to find all 3 possibilities: no DATE line, an _SDATE line
> > or a DATE line.
> >
> > I thought I would be able to combine that with your pattern like
> > so:
> >
> > :%s/^2 TYPE tngnote.*\n*\(\_^2 .*DATE.*\)*\n\_^2 NOTE
> > \zs\(.*\n\(\%(\D\|3 CONC \).*\n\)\+\)/\='<div
> > class="xxx">'.substitute(substitute(submatch(1), '\n3 CONC ', '',
> > 'g'), '\n', '', 'g')."<\/div>\n"
> >
> > but that is not working.
>
> I suspect that the problem snuck in by using \(…\) in your added
> conditions which captured that as submatch(1). So you can either
> make it non-capturing by adding that "%" before the open-paren:
>
> \%(\_^2 .*DATE.*\)
>
> or change the "submatch(1)" to "submatch(2)"
>
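The group-numbering effect is not specific to Vim, so it can be shown with a small Python sketch. The patterns here are simplified, hypothetical stand-ins for the ones in this thread (Python writes a non-capturing group as `(?:...)` where Vim uses `\%(...\)`), and the sample text is invented.

```python
import re

# Hypothetical sample with an intervening DATE line.
TEXT = "2 TYPE tngnote\n2 DATE 18 AUG 1776\n2 NOTE The surname is uncertain.\n"

# The optional DATE line is a capturing group, so it becomes group 1
# and the note text is pushed to group 2.
capturing = re.search(
    r"^2 TYPE tngnote\n(2 .*DATE.*\n)?2 NOTE (.*)", TEXT, re.MULTILINE
)

# Same pattern with the DATE line made non-capturing:
# the note text is group 1 again.
non_capturing = re.search(
    r"^2 TYPE tngnote\n(?:2 .*DATE.*\n)?2 NOTE (.*)", TEXT, re.MULTILINE
)

print(repr(capturing.group(1)))
print(repr(capturing.group(2)))
print(repr(non_capturing.group(1)))
```

In the capturing version, group 1 holds the DATE line rather than the note, which matches the symptom in the transformed output quoted just below.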
> > Here's an example of one small chunk of
> > lines which were transformed by that command:
> >
> > 1 EVEN
> > 2 TYPE tngnote
> > 2 DATE 18 AUG 1776
> > 2 NOTE <div class="xxx">2 DATE 18 AUG 1776</div>
> > 1 EVEN
>
> Note that the content here is what you captured in the first group.
> :-)
>
> Hope this helps get you on the right path,
>
> -tim
>
>
This is amazing looking, Tim -- thanks so much! There is a lot for a
nearly 80-year-old to unpack here -- it's going to take me a while. :)
It looks as though you have covered all the bases I want to deal with.
Thank you again,
John
--
--
You received this message from the "vim_use" maillist.