Close to a 4.2 release; experimenting with Ragel alternatives

Gaspard Bucher gaspard at teti.ch
Sun Jun 7 15:17:41 EDT 2009


Hi Jason!

Hmmm, this is good and bad news:

Good: Ruby hooks mean I could parse Textile customizations in zena in a
single pass instead of running two parsers. Nice.

Bad: I have just switched to Ragel for QueryBuilder to parse pseudo-SQL,
and I fear I will run into the same shortcomings you did.

Could you describe more precisely what you are missing with Ragel?
I'm parsing just about anything I want with it, but maybe I'm too
dumb to see the walls I'm running into...

Gaspard

On Sun, Jun 7, 2009 at 12:59 PM, Jason Garber <jg at jasongarber.com> wrote:
> I just went through the ticket list and dropped a bunch from the 4.2
> milestone that are just too difficult with Ragel.  Many of them I've poked
> at and they've left me saying, "how the heck am I supposed to do that!?"
> Multi-byte content will probably never work because Ragel docs say it won't
> with conditionals (actions that return true or false to determine if a state
> should be accepted), which I see no way around.  Not recognizing vertical
> pipes escaped with notextile tags in tables, exiting the HTML machine on the
> first closing block tag it sees, leaving pre blocks prematurely... all these
> bugs would require a lot of time and code to fix.  And they're just the tip
> of the iceberg.  If I walk through the code and look at it through the lens
> of nondeterminism, I can see lots more problems that people just haven't run
> into yet.
>
> I'd like to release RedCloth 4.2 once I fix the low-hanging fruit.  Then, I
> plan to poke around for alternatives to Ragel.  It's been great, but
> RedCloth has gotten really difficult to maintain because:
> 1.) It has to compile
> 2.) It compiles to three languages, has a couple binary gem distributions,
> and needs to work with Ruby 1.8 and 1.9, which is always a challenge
> 3.) Many reported bugs involve nondeterminism and require things that
> DFA-based tools like Ragel have a hard time doing
> 4.) Not that many people can fix bugs themselves because they don't know
> Ragel or they don't understand the code.
> 5.) It's hard to tell people they can't mix in extensions.  Right now
> RedCloth is a black box and you have to pre- or post-parse for extra
> patterns, like wiki links.  I want people to be able to use it how they
> want.  If that means mixing in their own cruddy patterns, awesome.
>
> A PEG might be the way to go.  Looking at Treetop, which is nice, decently
> maintained, has some history, and is used by Cucumber.  Doesn't let me
> manipulate the parser's acceptance of expressions in code, though.  It's a
> known problem, which is why you don't see any yaml parsers in treetop yet
> (they have a proposal on Global Parsing State and Semantic Backtrack
> Triggering).  Also, without backreferences or the equivalent in code, it
> would be hard to match things like HTML tags.
>
> Also looking at James Edward Gray II's Ghost Wheel.  I like the grammar
> syntax better and he says it "provides hooks for Ruby code that can be used
> to make parsing decisions or transform parsed results," but it's less widely
> used and less well documented, and I haven't tried it out, so I don't know
> its limitations.
> If anyone else has suggestions of things I should explore, do let me know!
> I want to keep RedCloth fast, but it also needs to be maintainable.
> Jason

