On 03/21/2011 10:07 AM, Terry Carmen wrote:
> I'm trying to match any URL that points to a URL shortener.
> 
> They typically consist of http(s) followed by a domain name,
> a slash and a small series of alphanumeric characters,
> *without a trailing "/" or file extension*.
> 
> I seem to be having pretty good luck matching the URL, however I
> can't figure out how to make the regex explicitly *not* match
> anything that ends in a slash or contains an extension.
> 
> For example, I want to match "http://asdf.ghi/j2kj4l23", but not
> "http://asdf.ghi/j2kj4l23/abc.html" or "http://asdf.ghi/j2kj4l23/"

In this specific case, I think you want a simple end-of-string anchor ($):

uri  ASDF_GHI_SHORT  m'^http://asdf\.ghi/[\w-]{1,12}$'i

In order to match  http://asdf.ghi/j2kj4l23#mno  as well, you might want:

uri  ASDF_GHI_SHORT  m'^http://asdf\.ghi/[\w-]{1,12}(?:[^/.\w-]|$)'i

(I used m'' instead of // so I didn't have to escape the slashes.  Any
punctuation can serve as the delimiter that way, though the leading "m"
may be omitted only when the delimiter is //.)
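
A quick way to sanity-check the two rules above outside SpamAssassin is to
run the same patterns over the sample URLs. This is only a sketch in Python,
on the assumption that its "re" module treats these particular constructs
(character classes, {1,12}, (?:...), $) the same way Perl does. The names
STRICT, LOOSE, SAMPLES and hits are made up for the illustration, and
re.IGNORECASE stands in for the trailing "i".

```python
import re

# Patterns copied from the two uri rules; only the m'' wrapper is dropped.
STRICT = re.compile(r'^http://asdf\.ghi/[\w-]{1,12}$', re.IGNORECASE)
LOOSE = re.compile(r'^http://asdf\.ghi/[\w-]{1,12}(?:[^/.\w-]|$)', re.IGNORECASE)

# Sample URLs taken from this thread.
SAMPLES = [
    "http://asdf.ghi/j2kj4l23",
    "http://asdf.ghi/j2kj4l23#mno",
    "http://asdf.ghi/j2kj4l23/",
    "http://asdf.ghi/j2kj4l23/abc.html",
]


def hits(pattern):
    """Return the sample URLs that the compiled pattern matches."""
    return [url for url in SAMPLES if pattern.search(url)]


if __name__ == "__main__":
    print("strict:", hits(STRICT))
    print("loose: ", hits(LOOSE))
```

The strict pattern should accept only the bare short URL, and the loose one
should additionally accept the "#mno" form. Both should reject the
trailing-slash and ".html" variants.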

> I tried using the perl negative look-ahead as both : (?!/) and
> (?!\/) without success.

As to using a negative look-ahead operator:  Though I'm not exactly sure
about when it's needed, you sometimes have to put something after it,
like  /foo(?!bar)(?:.|$)/  ... this is not mentioned in the spec.
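
One likely reason a bare (?!/) seems to do nothing after a quantifier is
backtracking: [\w-]{1,12} can give up its last character, so the look-ahead
ends up being tested against that character rather than the slash. The
sketch below shows this in Python, assuming its "re" module backtracks the
same way Perl does for these constructs. NAIVE and GUARDED are hypothetical
names. The guarded variant, which also forbids a following word character or
dot, is just one possible fix and is not something from the original rules.

```python
import re

# Look-ahead that only forbids a slash directly after the token.
NAIVE = re.compile(r'^http://asdf\.ghi/[\w-]{1,12}(?!/)', re.IGNORECASE)

# Look-ahead that also forbids more token characters and a dot, so the
# quantifier cannot satisfy it by giving back its last character.
GUARDED = re.compile(r'^http://asdf\.ghi/[\w-]{1,12}(?![\w/.-])', re.IGNORECASE)

URL_WITH_SLASH = "http://asdf.ghi/j2kj4l23/"


def matched_text(pattern, url):
    """Return the text the pattern matched in url, or None if no match."""
    m = pattern.search(url)
    return m.group(0) if m else None


if __name__ == "__main__":
    # NAIVE still matches, one character short of the slash.
    print(matched_text(NAIVE, URL_WITH_SLASH))
    # GUARDED rejects the trailing-slash URL but accepts the bare one.
    print(matched_text(GUARDED, URL_WITH_SLASH))
    print(matched_text(GUARDED, "http://asdf.ghi/j2kj4l23"))
```

With the naive pattern the match stops at "j2kj4l2", so the look-ahead sees
"3" rather than "/". Anchoring with $, as in the first rule above, avoids
the problem in the same way.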

