containsregex and concat 2006-11-27 - By George Bills
Hrm, it probably isn't, since advanced regexes are still black magic to me. The "." was supposed to match any character, including a newline (with the s flag), the * to match 0-n of them, and the ? to be lazy and match as little as possible (so that I don't pull in <table>...</table><table>...</table> in one match).
I just tried [^<], but it doesn't seem to work - I think because of such things as "<table><tr>...</tr></table>" - the opening bracket of <tr> conflicts. I tried [.<>]*? to make sure that the "summary.body" part was matching the brackets, but that didn't work either.
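The behaviour described above can be checked outside Ant. The sketch below is plain Java and rests on some assumptions: that Ant hands the pattern to java.util.regex (the usual case on JDK 1.4 and later), and that the "i" and "s" flags correspond to Pattern.CASE_INSENSITIVE and Pattern.DOTALL. The class name, the helper method and the sample HTML are made up for illustration. It also suggests why [.<>]*? fails: inside a character class the dot is a literal dot, so that class only matches the three characters ".", "<" and ">".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ant: compares the candidate patterns.
class LazyDotDemo {
    // Sample input invented for this sketch: two summary tables on two lines.
    static final String HTML =
        "<table class=\"summary\"><tr><td>a</td></tr></table>\n"
      + "<table class=\"summary\"><tr><td>b</td></tr></table>";

    // Returns every substring the regex finds, with the "i" and "s" flags assumed.
    static List<String> findAll(String regex, String input) {
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Lazy body: each match stops at the first </table> it reaches.
        System.out.println(findAll("<table class=\"summary\".*?>.*?</table>", HTML).size());
        // Greedy body: one match running from the first <table to the last </table>.
        System.out.println(findAll("<table class=\"summary\".*>.*</table>", HTML).size());
        // [^<]* cannot cross the "<" of the nested <tr>, so nothing matches.
        System.out.println(findAll("<table class=\"summary\">[^<]*</table>", HTML).size());
        // [.<>]*? only accepts the characters '.', '<' and '>', so nothing matches.
        System.out.println(findAll("<table class=\"summary\">[.<>]*?</table>", HTML).size());
    }
}
```

On this sample the lazy pattern finds each table separately, the greedy one swallows both in a single match, and the two character-class variants find nothing.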
Also, <table class="summary"> was wrong - <table class="summary"(.*?)> is a little better since the tables can have more than the class attribute (in fact, all of them do). But after changing that I'm matching the entire document - <html> through to </html>. That might just be because I'm using filetokenizer - if I make one match within filetokenizer, do I end up getting the entire document? If so, how do I get only the matching text?
Regex is now: <table class="summary".*?>.*?</table>
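A likely explanation for getting the entire document back, going by the Ant manual's description of these filters: filetokenizer turns the whole file into a single token, and containsregex without a replace attribute passes a matching token through unchanged. The plain-Java sketch below illustrates the difference between "the token contains a match" and "extract the matched text". It is not Ant's implementation, and the class and method names are hypothetical. Inside Ant, the analogous route would presumably be a capturing group plus the replace attribute of containsregex, which is not tested here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; these names do not exist in Ant.
class TokenVersusMatch {
    // The regex from the message, with the "i" and "s" flags assumed.
    static final Pattern SUMMARY = Pattern.compile(
        "<table class=\"summary\".*?>.*?</table>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // "Contains" style filter: returns the whole token if the pattern is found, else null.
    static String keepTokenIfContains(String token) {
        return SUMMARY.matcher(token).find() ? token : null;
    }

    // Extraction: returns only the first matched text, or null if there is none.
    static String extractFirst(String token) {
        Matcher m = SUMMARY.matcher(token);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Invented sample standing in for one report file read as a single token.
        String file = "<html><body>\n"
            + "<table class=\"summary\" border=\"1\">\n<tr><td>ok</td></tr>\n</table>\n"
            + "</body></html>";
        System.out.println(keepTokenIfContains(file)); // the whole document
        System.out.println(extractFirst(file));        // only the summary table
    }
}
```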
Thanks for the help, I appreciate it.
Dave Brosius wrote:
> .*?
>
> doesn't seem right to me.
>
> what's that supposed to do?
>
> probably something like [^<]*
>
> ----- Original Message -----
> From: "George Bills" <gbills@(protected)>
> To: <user@(protected)>
> Sent: Sunday, November 26, 2006 11:47 PM
> Subject: containsregex and concat
>
>> I've been trying to use a regular expression and the concat task to
>> pull summary tables (<table class="summary">...</table>) out of a set
>> of test reports. The reports are all HTML files sitting in
>> ${report.path}. The task works fine up until I start trying to select
>> output from it with <containsregex>. Is there something wrong with my
>> regular expression? Is there an easier way to do this? Any help would
>> be appreciated.
>>
>> The code is:
>> ====================
>> <target name="summary"> <!-- make a report summary -->
>>   <property name="summary.start"
>>             value="&lt;table class=&quot;summary&quot;&gt;" />
>>   <property name="summary.body" value=".*?" /> <!-- enable "s" for newline matches -->
>>   <property name="summary.end" value="&lt;/table&gt;" />
>>   <property name="summary.regex"
>>             value="${summary.start}${summary.body}${summary.end}" />
>>   <echo>${summary.regex}</echo>
>>   <concat>
>>     <header>HEADER</header>
>>     <fileset dir="${report.path}"
>>              includes="*.html"
>>              excludes="${summary.file}" />
>>     <filterchain>
>>       <tokenfilter>
>>         <filetokenizer />
>>         <containsregex flags="is"
>>                        pattern="${summary.regex}" />
>>       </tokenfilter>
>>     </filterchain>
>>     <footer>FOOTER</footer>
>>   </concat>
>> </target>
>> ====================
>>
>> The regular expression echoes as:
>> ====================
>> <table class="summary">.*?</table>
>> ====================
>>
>> I've done some testing of the expression at
>> http://www.fileformat.info/tool/regex.htm, and it seems to work there.
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscribe@(protected) For additional commands, e-mail: user-help@(protected)