Hrm, it probably isn't right, since advanced regexes are still black magic to me. The "." was supposed to match any character, including a newline (with the s flag), the "*" to match zero or more of them, and the "?" to make the match lazy, i.e. as short as possible (so that I don't pull in <table>...</table><table>...</table> in one match).
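
A minimal sketch of that intent, written in Python purely for illustration (Ant runs on a Java regex engine, but these constructs should behave the same way in both). The sample HTML string is made up, and re.DOTALL stands in for the "s" flag:

```python
import re

# Hypothetical sample: two tables back to back, each spanning several lines.
html = (
    '<table class="summary">\n<tr><td>a</td></tr>\n</table>'
    '<table class="summary">\n<tr><td>b</td></tr>\n</table>'
)

# re.DOTALL plays the role of the "s" flag: "." also matches newlines.
lazy = re.compile(r'<table class="summary">.*?</table>', re.DOTALL)
greedy = re.compile(r'<table class="summary">.*</table>', re.DOTALL)

lazy_matches = lazy.findall(html)      # one entry per table
greedy_matches = greedy.findall(html)  # a single entry spanning both tables

# Without DOTALL, "." stops at newlines, so nothing matches here.
no_flag_matches = re.findall(r'<table class="summary">.*?</table>', html)

print(len(lazy_matches), len(greedy_matches), len(no_flag_matches))
```

So the lazy ".*?" with the "s" flag does what was intended on this sample: each table is matched separately instead of everything between the first <table> and the last </table>.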

I just tried [^<], but it doesn't seem to work. I think that's because of nesting such as "<table><tr>...</tr></table>": the opening bracket of <tr> stops the match before </table> is reached. I tried [.&lt;&gt;]*? to make sure that the summary.body part was matching the brackets, but that didn't work either.
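
Both failures can be reproduced with a small Python sketch (illustration only; the sample string is the one quoted above, and the entities are written as plain "<" and ">" because there is no XML escaping in Python). One likely cause for the second failure is that "." inside a character class is a literal dot rather than "any character":

```python
import re

# Sample with a nested tag, as in the report tables.
html = "<table><tr>...</tr></table>"

# [^<]* cannot cross the "<" of <tr>, so </table> is never reached.
negated = re.search(r"<table>[^<]*</table>", html)

# Inside a character class "." is a literal dot, not "any character",
# so [.<>]*? only matches runs of dots and angle brackets.
dot_class = re.search(r"<table>[.<>]*?</table>", html)

# With DOTALL (the "s" flag), a lazy .*? does cross nested tags.
lazy = re.search(r"<table>.*?</table>", html, re.DOTALL)

print(negated, dot_class, lazy.group(0))
```

Only the last pattern matches; the first two return no match at all on this input.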

Also, <table class="summary"> was wrong - <table class="summary"(.*?)> is a little better since the tables can have more than the class attribute (in fact, all of them do). But after changing that I'm matching the entire document - <html> through to </html>. That might just be because I'm using filetokenizer - if I make one match within filetokenizer, do I end up getting the entire document? If so, how do I get only the matching text?

Regex is now: <table class="summary".*?>.*?</table>
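
One possible explanation for getting the whole document, to be checked against the Ant manual: filetokenizer appears to hand the entire file over as a single token, and a filter like containsregex keeps or drops whole tokens rather than cutting out the matched part. The Python sketch below models that difference with made-up helper names (keep_matching_tokens, extract_matches) and a made-up report; it is not Ant's implementation.

```python
import re

# The regex from the message, with "i" and "s" flags as in the build file.
SUMMARY = re.compile(r'<table class="summary".*?>.*?</table>',
                     re.IGNORECASE | re.DOTALL)

def keep_matching_tokens(tokens):
    """Hypothetical model of a token filter: keep each token that contains a match."""
    return [t for t in tokens if SUMMARY.search(t)]

def extract_matches(tokens):
    """Alternative: return only the matched text from each token."""
    return [m.group(0) for t in tokens for m in SUMMARY.finditer(t)]

# Made-up report; with one token per file, the token is the whole document.
document = (
    '<html><body><h1>Report</h1>\n'
    '<table class="summary" border="1">\n<tr><td>3 tests</td></tr>\n</table>\n'
    '<p>details</p></body></html>'
)
tokens = [document]

filtered = keep_matching_tokens(tokens)   # the whole document comes back
extracted = extract_matches(tokens)       # only the summary table comes back
print(filtered[0] == document)
print(extracted[0])
```

If that model is right, the regex itself is fine and the fix lies in how the match is extracted. The Ant manual lists a replace attribute on containsregex, which may be one way to keep only a captured group; that is untested here.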

Thanks for the help, I appreciate it.

Dave Brosius wrote:
.*?

doesn't seem right to me.

What's that supposed to do?

probably something like [^<]*



----- Original Message ----- From: "George Bills" <[EMAIL PROTECTED]>
To: <user@ant.apache.org>
Sent: Sunday, November 26, 2006 11:47 PM
Subject: containsregex and concat


I've been trying to use a regular expression and the concat task to pull summary tables (<table class="summary">...</table>) out of a set of test reports. The reports are all HTML files sitting in ${report.path}. The task works fine up until I start trying to select output from it with <containsregex>. Is there something wrong with my regular expression? Is there an easier way to do this? Any help would be appreciated.

The code is:
====================
<target name="summary"> <!-- make a report summary -->
   <property name="summary.start" value="&lt;table class=&quot;summary&quot;&gt;" />
   <property name="summary.body"  value=".*?" /> <!-- enable "s" for newline matches -->
   <property name="summary.end"   value="&lt;/table&gt;" />
   <property name="summary.regex" value="${summary.start}${summary.body}${summary.end}" />
   <echo>${summary.regex}</echo>
   <concat>
       <header>HEADER</header>
       <fileset dir="${report.path}"
           includes="*.html"
           excludes="${summary.file}" />
       <filterchain>
           <tokenfilter>
               <filetokenizer />
               <containsregex flags="is"
                              pattern="${summary.regex}" />
           </tokenfilter>
       </filterchain>
       <footer>FOOTER</footer>
   </concat>
</target>
====================

The regular expression echoes as:
====================
<table class="summary">.*?</table>
====================

I've done some testing of the expression at http://www.fileformat.info/tool/regex.htm, and it seems to work there.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

