Thanks to all.  I learn a little more each day ;)

On May 22, 3:03 am, "Ralf S. Engelschall" <rse+jquery-
[EMAIL PROTECTED]> wrote:
> On Tue, May 22, 2007, Jörn Zaefferer wrote:
> >  Dan G. Switzer, II wrote:
> > >> This is a little off-topic, but when doing a regex search and replace
> > >> within a text editor, how can I replace one character within a
> > >> specific pattern?
>
> > >> I want to get rid of newlines within <td> tags.  This finds them:
> > >> <td>[^<]+(\r\n).+</td>
>
> > >> How do I specify that I only want to replace the matched set?
>
> > > You group all the contents, and then the replacement string is all the
> > > matched sets pieced back together:
>
> > > sHtml.replace(/(<td>[^<]+)(\r\n)(.+</td>)/gi, "$1$3")
>
> >  If I got that right, you could even mark the second group to be skipped by
> >  adding a colon:
>
> >  sHtml.replace(/(<td>[^<]+)(:\r\n)(.+</td>)/gi, "$1$2")
>
> The syntax requires a question mark: (?:...)
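
To illustrate the point about the non-capturing group, here is a minimal sketch. The sample string is made up, and the </td> is written escaped so the regex literal parses. With (?:...) the line break is grouped but not captured, so the tail of the cell becomes $2; the (:...) form just looks for a literal colon and leaves this input untouched.

```javascript
// Made-up sample input: one cell with a Windows line break inside it.
const sHtml = "<td>foo\r\nbar</td>";

// (?:...) groups without capturing, so the second capture is the cell's tail.
const result = sHtml.replace(/(<td>[^<]+)(?:\r\n)(.+<\/td>)/gi, "$1$2");

// (:...) is an ordinary capturing group that starts with a literal ":",
// so nothing matches here and the string comes back unchanged.
const unchanged = sHtml.replace(/(<td>[^<]+)(:\r\n)(.+<\/td>)/gi, "$1$2");

console.log(JSON.stringify(result));    // "<td>foobar</td>"
console.log(unchanged === sHtml);       // true
```
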
>
> >  Or just skip the parentheses?
>
> >  sHtml.replace(/(<td>[^<]+)\r\n(.+</td>)/gi, "$1$2")
>
> Yes, but this IMHO is still too weak because...
>
> 1. the ".+" in this regex is greedy and matches too much, so you
>    would only remove newlines from every _second_ <td>...</td>
>    construct. So one has to use at least .+? to fix this.
>
> 2. Additionally, I recommend using \r?\n to support both Windows
>    CR-LF and Unix LF-only line endings.
>
> 3. I do not understand the [^<]+, as it would NOT allow removing the
>    newlines when additional markup precedes the newline in the <td>
>    container, as in "<td>...<span>...</span>...\n...</td>". I recommend
>    replacing it with just ".*?".
>
> 4. The "+" quantifier should actually be "*", as it might be fully valid
>    to have a "<td>\r\n</td>" container ;-)
>
> 5. The </td> has to be written escaped, as <\/td>, within the regex
>    literal, since an unescaped "/" would end the literal.
>
> 6. As the "." regex character in JavaScript does NOT match newline
>    characters, one has to use "(.|\r?\n)*".
>
> So, I recommend the following stronger version:
>
> sHtml.replace(/(<td>.*?)\r?\n((?:.|\r?\n)*?<\/td>)/gi, "$1$2")
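
For what it's worth, here is a small sketch of that version run against a few made-up cells. The wrapper name stripFirstBreak is hypothetical, chosen only for this example.

```javascript
// Hypothetical wrapper around the regex from the thread.
function stripFirstBreak(sHtml) {
    return sHtml.replace(/(<td>.*?)\r?\n((?:.|\r?\n)*?<\/td>)/gi, "$1$2");
}

const windowsCell = stripFirstBreak("<td>foo\r\nbar</td>");
const unixCell = stripFirstBreak("<td>foo\nbar</td>");
const markupCell = stripFirstBreak("<td>x <span>y</span>\nz</td>");
const emptyCell = stripFirstBreak("<td>\r\n</td>");

console.log(JSON.stringify(windowsCell));  // "<td>foobar</td>"
console.log(JSON.stringify(unixCell));     // "<td>foobar</td>"
console.log(JSON.stringify(markupCell));   // "<td>x <span>y</span>z</td>"
console.log(JSON.stringify(emptyCell));    // "<td></td>"
```
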
>
> But even this still has the problem that it is unable to remove MULTIPLE
> occurrences of newlines in the SAME <td> container. If this should
> also be handled, one has to use a little more trickery:
>
> sHtml = sHtml.replace(
>     /(<td>)(.*\r?\n(?:.|\r?\n)*)(<\/td>)/gi,
>     function ($0, $1, $2, $3) {
>         return $1 + $2.replace(/\r?\n/g, "") + $3;
>     }
> );
>
> This should now be a strong enough version that finally
> does what was requested...
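
A quick sketch of the callback version on made-up input. One thing to be aware of: the inner group is greedy, so a single match can run from the first <td> to the last </td>, and line breaks between cells get stripped too. Whether that matters depends on the input. The second function is a hypothetical variant of mine (not from the thread) that matches each cell lazily up to its nearest </td>; it assumes plain <td> tags without attributes and no nested tables.

```javascript
// The callback version from the thread, wrapped in a function for reuse.
function stripBreaksGreedy(sHtml) {
    return sHtml.replace(
        /(<td>)(.*\r?\n(?:.|\r?\n)*)(<\/td>)/gi,
        function ($0, $1, $2, $3) {
            return $1 + $2.replace(/\r?\n/g, "") + $3;
        }
    );
}

// Hypothetical variant: match each cell up to its nearest </td>, then strip
// the line breaks inside that cell only.
function stripBreaksPerCell(sHtml) {
    return sHtml.replace(
        /(<td>)([\s\S]*?)(<\/td>)/gi,
        function ($0, $1, $2, $3) {
            return $1 + $2.replace(/\r?\n/g, "") + $3;
        }
    );
}

const cell = "<td>a\r\nb\nc</td>";
const row = "<tr>\n<td>a\nb</td>\n<td>c</td>\n</tr>";

console.log(JSON.stringify(stripBreaksGreedy(cell)));   // "<td>abc</td>"
console.log(JSON.stringify(stripBreaksGreedy(row)));    // "<tr>\n<td>ab</td><td>c</td>\n</tr>"
console.log(JSON.stringify(stripBreaksPerCell(row)));   // "<tr>\n<td>ab</td>\n<td>c</td>\n</tr>"
```
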
>
>                                        Ralf S. Engelschall
>                                        [EMAIL PROTECTED]
>                                        www.engelschall.com
