[Aeditor-talk] regexp ala perl6
Simon Strandgaard
neoneye at adslhome.dk
Mon Feb 23 01:03:30 EST 2004
ATT, Mark Sparshatt: I see you are back in action ;-)
My guess is that these 2 documents are the definition of perl6-regexp.
Are there other resources on the subject?
http://www.perl.com/pub/a/2002/06/04/apo5.html
http://www.perl.com/pub/a/2002/08/22/exegesis5.html?page=1
There are some really delicious regexp constructions. I have tried to
make a summary (the corresponding perl5 regexps are in the comments).
... | ... # alternation
[...]: # grab (any atom), no backtracking into it, same as (?> ...)
[ cond :: yes | no ] # conditional if then else
[ <(defined $1)> :: yes | no ] # conditional+register#1
...* # quantifier, 0 or more, greedy
...*? # quantifier, 0 or more, non-greedy
...+ # quantifier, 1 or more, greedy
...+? # quantifier, 1 or more, non-greedy
...? # quantifier, 0 or 1 times, greedy
...?? # quantifier, 0 or 1 times, non-greedy
...<n,m> # quantifier, n..m times, greedy
...<n,m>? # quantifier, n..m times, non-greedy
...<!n,m> # quantifier, 0 or more, exclude n..m, greedy
...<!n,m>? # quantifier, 0 or more, exclude n..m, non-greedy
(...) # capturing
(:i ...) # capturing ignorecase, same as ((?i) ...)
[...] # non-capturing brackets (?: ...)
[:i ...] # non-capturing ignorecase, same as (?i: ...)
$1 # backref, same as \1
{ code } # call Ruby code, ignore result
<( code )> # call code as boolean assertion
<$rule> # call regex in variable
<self> # the regexp itself.. for nesting the expression.
<('...')> # in-line comment
<after ...> # positive lookbehind (?<= ...)
<!after ...> # negative lookbehind (?<! ...)
<before ...> # positive lookahead (?= ...)
<!before ...> # negative lookahead (?! ...)
<null> # match nothing (epsilon transition)
<alpha> # posix charclass [[:alpha:]]
<-alpha> # negated charclass [^[:alpha:]]
<<alpha><digit>> # more charclasses [[:alpha:][:digit:]]
<[a-z]> # custom charclass [a-z]
<sp> # match space
<'...'> # match against literal string
. # match anything (also newline)
\N # match non-newline
^ # anchor, begin of string
^^ # anchor, begin of line
$ # anchor, end of string
$$ # anchor, end of line
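To sanity-check a few of the perl5 equivalents quoted in the comments above, here is a minimal sketch. It assumes Python's `re` module as a stand-in, since it accepts the same perl5-style syntax for these particular constructs; it does not run any perl6 syntax, and the sample strings are made up for illustration.

```python
import re

# Greedy vs non-greedy quantifiers: ...* versus ...*?
tags = "<a><b>"
greedy = re.match(r"<.*>", tags).group(0)       # takes as much as possible
non_greedy = re.match(r"<.*?>", tags).group(0)  # takes as little as possible

# (?: ...) groups without capturing, ( ... ) captures.
captured = re.match(r"(?:foo|bar)(\d+)", "bar42").groups()

# (?i: ...) limits ignorecase to the group (needs Python 3.6 or later).
ci_ok = re.match(r"(?i:hello) World", "HeLLo World")
ci_fail = re.match(r"(?i:hello) World", "HeLLo world")

# \1 refers back to the first capture.
doubled = re.search(r"(\w)\1", "letter").group(0)

# Lookbehind (?<= ...) / (?<! ...) and lookahead (?= ...) / (?! ...).
prices = "cost $15, qty 3"
after = re.findall(r"(?<=\$)\d+", prices)
not_after = re.findall(r"(?<!\$)\b\d+", prices)
shout = "hey! you there!"
before = re.findall(r"\w+(?=!)", shout)
not_before = re.findall(r"\b\w+\b(?!!)", shout)

# In perl5-style engines "." skips newlines unless a flag is set
# (re.DOTALL here); [^\n] is the explicit non-newline class.
dot_all = re.match(r".+", "a\nb", re.DOTALL).group(0)
non_newline = re.match(r"[^\n]+", "a\nb").group(0)

print(greedy, non_greedy, captured, doubled)
print(after, not_after, before, not_before)
```

Only the perl5 side of each row is exercised here, so this checks the comments, not the perl6 constructs themselves.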
Are there any constructions I have overlooked?
There is some more syntax described here:
http://www.perl.com/pub/a/2002/06/04/apo5.html?page=6
It has become really easy to refer to standard charclasses.. I like that.
I think I saw <english> and <danish>.. I guess those are charclasses
that contain the letters of the specific languages?
--
Simon Strandgaard