Hi,

<target name="depends">
    <echo file="Y:/test.html">
        <![CDATA[
        <html>
        <head>
        <title>summary</title>
        <link rel="stylesheet" href="summary.css" type="text/css">
        </head>
        <body>
        <a name="overview"></a>
        <center>
        <table class="summary"> was wrong </table>
        </center>
        </body>
        </html>
        ]]>
    </echo>
</target>

<target name="main" depends="depends">

    <loadfile srcfile="Y:/test.html" property="summary">
        <filterchain>
            <containsregex
              pattern='&lt;table[^&lt;/]*&gt;(.*?)&lt;/table&gt;'
              replace="\1"
              byline="true"
              />
            <tokenfilter>
                <!-- to get rid of whitespace in ${summary} -->
                <trim/>
            </tokenfilter>
        </filterchain>
    </loadfile>

    <echo>Summary == ${summary}</echo>

</target>

This gives only the text inside the table:

depends:
main:
     [echo] Summary == was wrong
BUILD SUCCESSFUL
Total time: 407 milliseconds


You have to use replace="\1" and byline="true".
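
If the table spans more than one line, the by-line filter above cannot match it, because every line is filtered on its own. A minimal sketch of one way around that, assuming a single, non-nested table in the file: read the whole file as one token with filetokenizer and let replaceregex swallow everything around the table, so only group 1 is left. The project name "summary-sketch" is made up for illustration, and I have not run this against your real summary file:

```xml
<project name="summary-sketch" default="main">
    <target name="main">
        <loadfile srcfile="Y:/test.html" property="summary">
            <filterchain>
                <tokenfilter>
                    <!-- treat the whole file as one token so a match can span lines -->
                    <filetokenizer/>
                    <!-- flags="s" lets '.' match line breaks too; the leading .*?
                         and trailing .* consume everything outside the first table -->
                    <replaceregex
                      pattern=".*?&lt;table[^&gt;]*&gt;(.*?)&lt;/table&gt;.*"
                      replace="\1"
                      flags="s"
                      />
                    <!-- strip leading/trailing whitespace from the result -->
                    <trim/>
                </tokenfilter>
            </filterchain>
        </loadfile>
        <echo>Summary == ${summary}</echo>
    </target>
</project>
```

If nothing matches, replaceregex leaves the token alone and ${summary} ends up holding the whole (trimmed) file, so it is worth checking for that case.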

Regards, Gilbert 

-----Original Message-----
From: George Bills [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 28, 2006 6:14 AM
To: Ant Users List
Subject: Re: containsregex and concat

Thanks: the regular expression works now, which is progress. 
Unfortunately I'm getting all of the concatenated text, not just the 
matching text. If I use replace:
<filterchain>
  <!--<tokenfilter><filetokenizer />-->
    <!-- byline="false" implies filetokenizer -->
    <containsregex flags="isg"
      pattern="${summary.regex}"
      replace="SUMMARYTABLE"
      byline="false"
      />
    <!-- </tokenfilter>-->
</filterchain>

I end up getting something like:
[concat] <html>
[concat] <head>
[concat] <title>summary</title>
[concat] <link rel="stylesheet" href="summary.css" type="text/css">
[concat] </head>
[concat] <body>
[concat] <a name="overview"></a>
[concat] <center>
[concat] SUMMARYTABLE
[concat] </center>
[concat] ...more HTML here...
[concat] </html>

I'm assuming it's because the file is just one big token - but if I use 
a line tokenizer, will I be able to match regular expressions over 
multiple lines?

Thanks for the help.

Rebhan, Gilbert wrote:
> Hi,
>
> <table[^>/]*>(.*?)</table>
>
> should match :
>
> <table class="summary">foobar</table>
>
> also with more than one attribute
>
> <table class="summary" foo="bar">foobar</table>
>
>
> foobar is \1 (group 1)
>
>
> Regards, Gilbert
>  
>
> -----Original Message-----
> From: George Bills [mailto:[EMAIL PROTECTED] 
> Sent: Monday, November 27, 2006 6:41 AM
> To: Ant Users List
> Subject: Re: containsregex and concat
>
> Hrm, it probably isn't since advanced regexs are still black magic to 
> me. The "." was supposed to match any character, including a newline 
> (with the s flag), the * to say match 0-n of them and the ? to say be 
> lazy, match as little as possible (so that I don't pull in 
> <table>...</table><table>...</table> in one match).
>
> I just tried [^<], but it doesn't seem to work - I think because of such
> things as "<table><tr>...</tr></table>" - the opening bracket of <tr>
> conflicts. I tried [.&lt;&gt]*? to make sure that the "regex.body" part
> was matching the brackets, but that didn't work either.
>
> Also, <table class="summary"> was wrong - <table class="summary"(.*?)>
> is a little better since the tables can have more than the class
> attribute (in fact, all of them do). But after changing that I'm
> matching the entire document - <html> through to </html>. That might
> just be because I'm using filetokenizer - if I make one match within
> filetokenizer, do I end up getting the entire document? If so, how do I
> get only the matching text?
>
> Regex is now: <table class="summary".*?>.*?</table>
>
> Thanks for the help, I appreciate it.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>   

