[Aeditor-talk] regexp ala perl6

Simon Strandgaard neoneye at adslhome.dk
Mon Feb 23 01:03:30 EST 2004


Attn. Mark Sparshatt: I see you are back in action ;-)


My guess is that these two documents are the definition of perl6 regexps.
Are there other resources on the subject?

http://www.perl.com/pub/a/2002/06/04/apo5.html
http://www.perl.com/pub/a/2002/08/22/exegesis5.html?page=1


There are some really delicious regexp constructions. I have tried to
make a summary (the corresponding perl5 regexps are in the comments).

... | ...           # alternation


[...]:              # grab (any atom), no backtracking into it, same as (?> ...)


[ cond :: yes | no ]              # conditional  if then else
[ <(defined $1)> :: yes | no ]    # conditional+register#1
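For comparison, a rough Ruby sketch of the same if-then-else idea in perl5-style syntax, where the condition tests whether register 1 took part in the match. It assumes Ruby 2.0 or later; the pattern and the sample strings are made up for illustration.

```ruby
# (?(1)yes|no): if group 1 matched, require ">", otherwise require ";".
pattern = /\A(<)?\w+(?(1)>|;)\z/

tagged     = "<tag>".match?(pattern)   # group 1 set, ">" required
terminated = "tag;".match?(pattern)    # group 1 unset, ";" required
mixed_a    = "<tag;".match?(pattern)   # group 1 set but ";" found
mixed_b    = "tag>".match?(pattern)    # group 1 unset but ">" found

puts tagged      # true
puts terminated  # true
puts mixed_a     # false
puts mixed_b     # false
```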


...*               # quantifier, 0 or more, greedy
...*?              # quantifier, 0 or more, non-greedy
...+               # quantifier, 1 or more, greedy
...+?              # quantifier, 1 or more, non-greedy
...?               # quantifier, 0 or 1 times, greedy
...??              # quantifier, 0 or 1 times, non-greedy
...<n,m>           # quantifier, n..m times, greedy
...<n,m>?          # quantifier, n..m times, non-greedy
...<!n,m>          # quantifier, 0 or more, exclude n..m, greedy
...<!n,m>?         # quantifier, 0 or more, exclude n..m, non-greedy
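To see what greedy and non-greedy mean in practice, here is a small Ruby sketch using the perl5-style spellings of these quantifiers ({n,m} instead of <n,m>). The sample strings are invented for illustration.

```ruby
text = "<b>bold</b> and <i>italic</i>"

greedy     = text[/<.+>/]    # .+  takes as much as it can
non_greedy = text[/<.+?>/]   # .+? takes as little as it can

counted_greedy = "aaaaa"[/a{2,4}/]    # n..m times, greedy
counted_lazy   = "aaaaa"[/a{2,4}?/]   # n..m times, non-greedy

puts greedy          # <b>bold</b> and <i>italic</i>
puts non_greedy      # <b>
puts counted_greedy  # aaaa
puts counted_lazy    # aa
```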


(...)               # capturing
(:i ...)            # capturing ignorecase, same as ((?i) ...)
[...]               # non-capturing brackets (?: ...)
[:i ...]            # non-capturing ignorecase, same as (?i: ...)
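The perl5-style forms named in the comments above can be tried directly in Ruby. This is only a sketch with made-up input, showing which groups capture and how far the ignorecase switch reaches.

```ruby
line = "Content-TYPE: text/plain"

# (?i: ...) groups without capturing and ignores case inside the group only;
# the two plain (...) groups are the ones that end up in the registers.
m = line.match(/(?i:content-type): (\w+)\/(\w+)/)
captures = m.captures

# ((?i) ...) captures, and the ignorecase switch ends with the group.
inner  = "ABC def".match(/((?i)abc) def/)
strict = "ABC DEF".match(/((?i)abc) def/)   # " DEF" is outside the group

p captures   # ["text", "plain"]
p inner[1]   # "ABC"
p strict     # nil
```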


$1                  # backref, same as \1


{ code }            # call Ruby code, ignore result
<( code )>          # call code as boolean assertion
<$rule>             # call regex in variable
<self>              # the regexp itself.. for nesting the expression.
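The nesting idea behind <self> has a rough counterpart in Ruby's current regexp engine, where \g<0> calls the whole pattern again. A minimal sketch, assuming Ruby 2.0 or later; the balanced-parentheses pattern is my own example, not taken from the perl6 documents.

```ruby
# An opening paren, then any mix of non-paren characters or a nested
# occurrence of the whole pattern, then a closing paren.
balanced = /\((?:[^()]|\g<0>)*\)/

nested = "f((a)(b)) x"[balanced]
flat   = "no parens"[balanced]

p nested   # "((a)(b))"
p flat     # nil
```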


<('...')>           # in-line comment


<after ...>         # positive lookbehind    (?<= ...)
<!after ...>        # negative lookbehind    (?<! ...)
<before ...>        # positive lookahead     (?= ...)
<!before ...>       # negative lookahead     (?! ...)
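The four lookaround forms behave as follows in Ruby's perl5-style syntax. The input strings are invented; the point is only that the lookaround text is tested but never becomes part of the match.

```ruby
prices = "EUR 10, USD 20, USD 35"

after_usd     = prices.scan(/(?<=USD )\d+/)     # positive lookbehind
not_after_usd = prices.scan(/(?<!USD )\b\d+/)   # negative lookbehind

words = "foobar foobaz"

before_bar     = words.scan(/foo(?=bar)/)       # positive lookahead
not_before_bar = words.scan(/foo(?!bar)\w+/)    # negative lookahead

p after_usd       # ["20", "35"]
p not_after_usd   # ["10"]
p before_bar      # ["foo"]
p not_before_bar  # ["foobaz"]
```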


<null>              # match nothing (epsilon transition)


<alpha>             # posix charclass        [[:alpha:]]
<-alpha>            # negated charclass      [^[:alpha:]]
<<alpha><digit>>    # more charclasses       [[:alpha:][:digit:]]
<[a-z]>             # custom charclass       [a-z]
<sp>                # match space
<'...'>             # match against literal string
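The bracketed POSIX spellings in the comments above already work in Ruby, so the charclass lines can be checked with a short sketch. The sample text is made up, and Regexp.escape is used here as the nearest Ruby way to match against a literal string.

```ruby
text = "abc 123 d4"

letters     = text.scan(/[[:alpha:]]+/)            # posix charclass
non_letters = text.scan(/[^[:alpha:]]+/)           # negated charclass
alnum       = text.scan(/[[:alpha:][:digit:]]+/)   # two classes combined
lower       = text.scan(/[a-z]+/)                  # custom charclass

# Literal match: escape the string so "+" loses its special meaning.
literal = Regexp.new(Regexp.escape("1+1"))
literal_hit  = "1+1=2".match?(literal)
literal_miss = "11=2".match?(literal)

p letters       # ["abc", "d"]
p non_letters   # [" 123 ", "4"]
p alnum         # ["abc", "123", "d4"]
p lower         # ["abc", "d"]
p literal_hit   # true
p literal_miss  # false
```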

.                   # match anything (also newline)

\N                  # match non-newline

^                   # anchor, begin of string
^^                  # anchor, begin of line
$                   # anchor, end of string
$$                  # anchor, end of line
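The split between string anchors and line anchors already exists in Ruby's current syntax, only spelled differently: \A and \z anchor to the string, while ^ and $ anchor to a line. The sketch below, with made-up input, also shows that Ruby's dot skips newlines unless /m is given, unlike the perl6 dot described above.

```ruby
text = "first line\nsecond line"

string_start = text.scan(/\A\w+/)   # begin of string only
line_starts  = text.scan(/^\w+/)    # begin of every line
string_end   = text.scan(/\w+\z/)   # end of string only
line_ends    = text.scan(/\w+$/)    # end of every line

dot_default   = text[/.+/]    # stops at the newline
dot_multiline = text[/.+/m]   # /m lets the dot match newline too

p string_start   # ["first"]
p line_starts    # ["first", "second"]
p string_end     # ["line"]
p line_ends      # ["line", "line"]
p dot_default    # "first line"
```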



Are there any constructions I have overlooked?
There is some more syntax described here:
http://www.perl.com/pub/a/2002/06/04/apo5.html?page=6

It has become really easy to refer to standard charclasses. I like that.

I think I saw <english> and <danish>. I guess those are charclasses
that contain the letters of the specific languages?


--
Simon Strandgaard


