date:20120414

Re: Regex again..

2012-04-14 Thread Somu

Okay, okay, okay. Let me smile a bit first. I'm a newbie, a kid. I'm still new to the intricacies of regular expressions. Maybe some day I will succeed in crossing 3000 km on foot in 24 hours. Hahahaha. Thank you all. *I finished the job using HTML::Strip much earlier.*

Re: Regex again..

2012-04-14 Thread Michael Rasmussen

I hate it when I post something and then find a bit of information I should have included. http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg The poster lists four valid HTML constructs that regexes are ill-equipped to handle.

Re: Regex again..

2012-04-14 Thread Michael Rasmussen

On Sat, Apr 14, 2012 at 07:05:54PM +0300, Shlomi Fish wrote: > Hi Somu, > > On Sat, 14 Apr 2012 21:01:03 +0530 > Somu wrote: > > > OK. Can i ask "WHY?" > > Why can't it be done using regex. Isn't a html file just another long > > string with more, but similar special characters?? > > > > first

Re: Regex again..

2012-04-14 Thread Shlomi Fish

Hi Somu, On Sat, 14 Apr 2012 21:01:03 +0530 Somu wrote: > OK. Can i ask "WHY?" > Why can't it be done using regex. Isn't a html file just another long > string with more, but similar special characters?? > first of all I should note that you appear to be replying to the wrong messages which br

Re: Regex again..

2012-04-14 Thread Uri Guttman

On 04/14/2012 11:42 AM, Zheng Du wrote: > Hi Somu, > Of course it can be done by using regex, but if there is a single-line command that can do the job, that's absolutely more efficient and less buggy. Actually, it can't be done by a regex. Consider the issue of comments. Think about comments containing
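One way a comment can trip up a tag-stripping regex is sketched below. This is an illustration under stated assumptions, not necessarily the case Uri goes on to describe: the sample markup and variable names are made up, and the pattern is the simple non-greedy one suggested elsewhere in this thread.

```perl
use strict;
use warnings;

# Assumed sample input (hypothetical): a comment that contains '>'
# and a tag-like string.
my $html = '<p>Keep me</p><!-- if (a > b) { use <b> } -->';

# The non-greedy tag pattern stops at the first '>' it sees, which here
# is inside the comment, so part of the comment survives as visible text.
(my $out = $html) =~ s/<.*?>//g;

print "[$out]\n";
```

The output still contains fragments of the comment, including the closing `-->`, which a real HTML parser would have discarded as a whole.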

Re: Regex again..

2012-04-14 Thread Zheng Du

Hi Somu, Of course it can be done by using regex, but if there is a single-line command that can do the job, that's absolutely more efficient and less buggy. Unless you're eager to polish your Perl skills. =D Du Zheng 2012/4/14 Somu > OK. Can i ask "WHY?" > Why can't it be done using regex. Isn't a

Re: Regex again..

2012-04-14 Thread Somu

OK. Can I ask "WHY?" Why can't it be done using regex? Isn't an HTML file just another long string with more, but similar, special characters? Somu

Re: Regex again..

2012-04-14 Thread Shlomi Fish

Hi Somu, On Sat, 14 Apr 2012 14:46:50 +0530 Somu wrote: > Sir, what is this??: > > lynx -stdin -dump < in.html > out.txt > It's a UNIX command. What it does is take the file "in.html" (without the quotes), pipe it through "lynx -stdin -dump" and put its output in the "out.txt" fi

Re: Regex again..

2012-04-14 Thread Shlomi Fish

Hi Somu, On Sat, 14 Apr 2012 12:56:03 +0530 Somu wrote: > *Hi all, > I was trying to strip off all html tags and the special characters from a > html file using regex. > my code is as follows.. please don't use regular expressions to parse and process HTML: * http://perl-begin.org/FAQs/freeno

Re: Regex again..

2012-04-14 Thread Somu

Sir, what is this??: lynx -stdin -dump < in.html > out.txt For now, the job got done by HTML::Strip. @Zheng Du, I will try your suggestion, but might the other files be too big for one variable? (These are files containing words and meanings.) Somu.

Re: Regex again..

2012-04-14 Thread Zheng Du

Hi Somu, Looks like you want to do a minimal match, so you can change the code: $line =~ s/(<.*>)?//; => $line =~ s/<.*?>//g; But there is still a problem: you have '<' and '>' placed on different lines, so you can try to read all the file content into a variable, and replace them once for al
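A minimal sketch of the difference discussed in this message, assuming a small made-up HTML string in place of a real file (the variable names and sample markup are illustrative, not taken from Somu's code). It is meant only to show greedy versus non-greedy matching and the effect of reading the whole input at once; as others in the thread point out, a regex is not a reliable way to strip real-world HTML.

```perl
use strict;
use warnings;

# Assumed sample input (hypothetical): the <p> tag is split across two lines.
my $html = qq{<p class="a"\n   id="b">Hello <b>world</b></p>\n};

# Slurp everything into one variable. An in-memory filehandle stands in
# for the real file here; undefining $/ makes <$fh> return all the data at once.
open my $fh, '<', \$html or die "cannot open in-memory handle: $!";
my $content = do { local $/; <$fh> };
close $fh;

# Greedy: .* runs to the last '>' on the line, so the text between tags is lost.
(my $greedy = '<b>bold</b> text') =~ s/<.*>//;

# Non-greedy with /g removes tags one at a time; /s lets '.' match "\n",
# so a tag that spans two lines is removed as well.
(my $stripped = $content) =~ s/<.*?>//gs;

print "greedy:   [$greedy]\n";
print "stripped: [$stripped]\n";
```

The greedy pattern leaves only " text", while the non-greedy pattern with /g and /s leaves "Hello world" followed by the trailing newline.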

Re: Regex again..

2012-04-14 Thread Dr.Ruud

On 2012-04-14 09:26, Somu wrote: I was trying to strip off all html tags and the special characters from a html file using regex. Alternative: lynx -stdin -dump < in.html > out.txt -- Ruud

Regex again..

2012-04-14 Thread Somu

*Hi all, I was trying to strip off all html tags and the special characters from a html file using regex. my code is as follows.. * use strict; use warnings; sub strip_html{ my $line = shift; #something wrong in the following
