simple regex


Hi,
I am trying to extract all URLs ending in php, cgi or pl from a
website with the following code:
    foreach $judge( $res->content=~m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g)
    {
        print $judge,"\n";
    }
But for some reason I get redundant results:

http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi
cgi
http://www.bsnoop.de/cgi-bin/jenv.cgi
cgi

etc.

Could someone explain to me why the file extension is present in this
result set? What am I doing wrong?

On Jun 5, 12:33 pm, r3gis <regi@gmail.com> wrote:

You have multiple capturing parentheses in your pattern match.  A
pattern match in list context (such as that imposed by the foreach
loop) returns a list of ALL captured parentheses.
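
A minimal sketch of that behaviour, using the pattern from the original post. The sample string and the variable names $text and @captures are assumptions made for illustration; they stand in for $res->content, which was not shown in full.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text standing in for $res->content.
my $text = 'link: http://www.bsnoop.de/cgi-bin/jenv.cgi end';

# Pattern from the original post: three sets of capturing parentheses.
my @captures = $text =~ m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g;

# In list context every capturing group contributes one element per match,
# so the extension captured by the innermost group shows up as its own element.
print scalar(@captures), " elements returned for a single match\n";
print "$_\n" for @captures;
```

Each element printed corresponds to one pair of capturing parentheses, which is why the bare extension appears between the URLs in the loop output.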

Change the ones you don't want to capture to be non-capturing, by
adding ?: right after the opening parenthesis, e.g. (?:php|cgi|pl).

See also:
perldoc perlre
perldoc perlretut
perldoc perlreref

Paul Lalli

Well Paul, I tested this on Windows and it works fine. So is it really
related to the multiple parentheses?
I think the input has to be checked. But r3gis, please follow
Paul's advice, as I may be wrong, being a newbie.

#!/usr/bin/perl
use strict;
use warnings;
my @arr = ('http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi',
           'http://www.bsnoop.de/cgi-bin/jenv.cgi', 'asdadada');
foreach (@arr) {
    # Scalar-context match: only tests whether the element matches
    if ($_ =~ m!((http://[a-z-\/\.~]+\.(php|cgi|pl)))!g) {
        print $_, "\n";
    }
}

On Jun 6, 2:11 am, jeevs <jeevan.ing@gmail.com> wrote:

> Well Paul, I tested this on Windows and it works fine. So is it really
> related to the multiple parentheses?

You tested *what* on Windows?  The code that r3gis posted, or the code
that you posted?  The code that r3gis posted is incomplete, so I'd
like to see the actual program you used.  The code that you posted has
nothing at all to do with the original problem.

Confused,
Paul Lalli

On Jun 6, 3:38 pm, Paul Lalli <mri@gmail.com> wrote:

> On Jun 6, 2:11 am, jeevs <jeevan.ing@gmail.com> wrote:

> > Well Paul, I tested this on Windows and it works fine. So is it really
> > related to the multiple parentheses?
> You tested *what* on Windows?  The code that r3gis posted, or the code
> that you posted?

Sorry for my irrelevant post. I meant the code posted by me, which I
accept was not at all related to the original problem, and I apologize
for taking up your time and that of others.

r3gis, as suggested by Paul, you can replace the following line in your
code

foreach $judge( $res->content=~m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g)

with

foreach $judge( $res->content=~m!(http://[a-z-\/\.~]+\.(?:php|cgi|pl))!g)
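
For anyone who wants to try the corrected pattern without fetching a page, here is a rough sketch that runs it under strict against a made-up string. The variable $content and the surrounding HTML are assumptions standing in for $res->content; the two URLs are the ones from the output in the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical page content standing in for $res->content.
my $content = join "\n",
    '<a href="http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi">a</a>',
    '<a href="http://www.bsnoop.de/cgi-bin/jenv.cgi">b</a>';

# Only the outer group captures; (?:...) groups the alternation
# without adding an element to the returned list.
my @urls = $content =~ m!(http://[a-z-\/\.~]+\.(?:php|cgi|pl))!g;

foreach my $judge (@urls) {
    print $judge, "\n";
}
```

With a single capturing group, the list holds one element per matched URL, so the loop body sees only URLs.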

Thanks Paul. I will be careful next time.
