[Aeditor-talk] regexp ala perl6

Simon Strandgaard neoneye at adslhome.dk
Wed Feb 25 03:16:13 EST 2004


On Mon, 2004-02-23 at 19:54, Mark wrote:
> I've made some notes
> 
> Simon Strandgaard wrote:
> 
> >ATT, Mark Sparshatt: I see you are back in action ;-)
> >
> >
> >My guess is that these 2 documents are the definition of perl6-regexp.
> >Are there other resources about the subject?
> >
> >http://www.perl.com/pub/a/2002/06/04/apo5.html
> >http://www.perl.com/pub/a/2002/08/22/exegesis5.html?page=1
> >
> >
> >  
> >
> Those are the two best places. The only other resource is to download 
> Parrot and look at the regexp implementation included.


While browsing around in Parrot's CVS repository, I found a BNF
definition of perl6re, which I have attached:
"parrot/languages/perl6/perl6re/Perl6RE.bnf"
However, it seems to be far from complete.

The "parrot/languages/regex/t" directory seems to contain only a
few perl5 tests (not usable).

The above 2 locations (perl6re + regex) seem to be the only places
where there is regexp code in Parrot.  Have I overlooked something?

--
Simon Strandgaard
-------------- next part --------------
rx		: "rx" (":" modifier)* body
		| "m" (":" modifier)* body

modifier	: "e" | "each"
		| "x" "(" digit+ ")"
		| digit+ "x"
		| digit+ letter+
		| "nth" "(" ( digit+ | identifier ) ")"
		| "p5" | "perl5"
		| "w" | "word"
		| "any"
		| "u0" | "u1" | "u2" | "u3"
		| "i"

body		: delimiter expression? delimiter
		| " " letter expression letter
		| "[" expression? "]"
		| "{" expression? "}"
		| "(" expression? ")"
		| "<" expression? ">"

delimiter	: "/" | "!" | "?" | "=" | "]" | "}" | ")" | ">" | "#"

expression	: term ( "|" term )*
term		: factor factor*
factor		: "(" expression ")"
		| "[" expression "]"
		| "<[" character_class "]>"
		| "<" identifier ">"
		| "#" character_not_newline*
		| metacharacter
		| identifier
#		| "<" perl_expression ">"
		|				# Will break stuff...

character_not_newline :	# Not going to enumerate this...

identifier	: sigil(?) letter ( letter | digit )*
# This needs to be rethought... And it's likely that it will break.
		| digit | "-" | "&" | "_" | "." | "+" | ","

sigil		: "$" | "@" | "%"

# Ranges aren't standard BNF

regular_char	: "a"..."z" | "A"..."Z" | "0"..."9"

special_char	: "\0" octal_digit+
		| "\x" hex_digit+
		| "\x" "{" hex_digit+ "}"
		| ( "\p" | "\P" ) "{" identifier "}"

metacharacter	: "+" | "*" | "?" | "^" | "^^" | "$" | "$$"
		| "\" ( "c" | "s" | "S" | "d" | "D" | "w" | "W" )
		| "\[" | "\]"
		| "\(" | "\)"
		| "\{" | "\}"
		| "\<" | "\>"
		| special_char
		| ":"+

# Ranges aren't standard BNF
digit		: "0"..."9"
letter		: "a"..."z" | "A"..."Z"
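
To get a feel for how the "rx" and "modifier" rules above read in practice,
here is a minimal Python sketch that splits an expression into its keyword,
its list of modifiers, and the remaining body. It is only an illustration of
those two rules, not code from Parrot: the names MODIFIER and split_rx are
made up for this example, the body is returned unparsed, and "identifier" is
simplified to an optional sigil followed by letters and digits.

```python
import re

# One alternative per line of the "modifier" rule in the attached grammar.
# Longer alternatives come first so that "each" is not read as "e".
# NOTE: MODIFIER and split_rx are hypothetical names for this sketch.
MODIFIER = re.compile(r"""
      each | perl5 | word | any
    | nth\( (?: [0-9]+ | [$@%]? [A-Za-z][A-Za-z0-9]* ) \)
    | x\( [0-9]+ \)
    | [0-9]+ [A-Za-z]+        # covers both digit+ "x" and digit+ letter+
    | p5 | u[0-3]
    | e | w | i
""", re.VERBOSE)


def split_rx(text):
    """Split 'rx:mod:mod<body>' into (keyword, [modifiers], body)."""
    for keyword in ("rx", "m"):
        if text.startswith(keyword):
            break
    else:
        raise ValueError("expected 'rx' or 'm'")
    pos = len(keyword)
    modifiers = []
    # (":" modifier)* : keep consuming while a colon follows.
    while text.startswith(":", pos):
        match = MODIFIER.match(text, pos + 1)
        if match is None:
            raise ValueError(f"bad modifier at offset {pos + 1}")
        modifiers.append(match.group())
        pos = match.end()
    # The body (delimiters plus expression) is left unparsed here.
    return keyword, modifiers, text[pos:]
```

For example, split_rx("rx:i:2x/ab+/") yields the keyword "rx", the modifiers
"i" and "2x", and the body "/ab+/". Parsing the body itself would need the
"expression", "term" and "factor" rules, which this sketch leaves out.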

