From thiago.arrais at gmail.com Mon Oct 1 07:33:25 2007
From: thiago.arrais at gmail.com (Thiago Arrais)
Date: Mon, 1 Oct 2007 08:33:25 -0300
Subject: [Mediacloth-devel] Ping
In-Reply-To: <46FFF496.8070908@sun.com>
References: <46FFF496.8070908@sun.com>
Message-ID:

Gregory,

I also have some patches to share and would prefer not to be
forced to fork the project. If only the current maintainers would
register someone else as a project developer on rubyforge.org,
we could get together to continue with development.

In any case, we can always get the code and start a new
project somewhere else. But we need to regard this as a last
resort, just to avoid the confusion for newcomers.

Cheers,

Thiago Arrais

From thiago.arrais at gmail.com Mon Oct 1 07:55:59 2007
From: thiago.arrais at gmail.com (Thiago Arrais)
Date: Mon, 1 Oct 2007 08:55:59 -0300
Subject: [Mediacloth-devel] Ping
In-Reply-To: <1191239081.20085.5.camel@noosnascla-laptop>
References: <46FFF496.8070908@sun.com> <1191239081.20085.5.camel@noosnascla-laptop>
Message-ID:

On 10/1/07, Claes Nästén wrote:
> > In any case, we can always get the code and start a new
> > project somewhere else. But we need to regard this as a last
> > resort, just to avoid the confusion for newcomers.
>
> Also, respect to the developers.

Indeed!

--
Thiago Arrais

From Gregory.Murphy at Sun.COM Mon Oct 1 10:44:45 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Mon, 01 Oct 2007 07:44:45 -0700
Subject: [Mediacloth-devel] Ping
In-Reply-To:
References: <46FFF496.8070908@sun.com>
Message-ID: <470107DD.3060008@sun.com>

Thiago Arrais wrote:
> Gregory,
>
> I also have some patches to share and would prefer not to be
> forced to fork the project. If only the current maintainers would
> register someone else as a project developer on rubyforge.org,
> we could get together to continue with development.
>
> In any case, we can always get the code and start a new
> project somewhere else.
But we need to regard this as a last > resort, just to avoid the confusion for newcomers. > > Cheers, > > Thiago Arrais > I can start by bringing over the patches that you have submitted, and I will submit patches for the few changes that I have made thus far. I suppose that if it is only two or three people doing development, sharing patches won't be too onerous, but it's not ideal. Is there any possibility that the RubyForge site admins would add one of us as a developer? // Gregory From thiago.arrais at gmail.com Mon Oct 1 11:08:10 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Mon, 1 Oct 2007 12:08:10 -0300 Subject: [Mediacloth-devel] Ping In-Reply-To: <470107DD.3060008@sun.com> References: <46FFF496.8070908@sun.com> <470107DD.3060008@sun.com> Message-ID: On 10/1/07, Gregory Murphy wrote: > I can start by bringing over the patches that you have submitted, and I > will submit patches for the few changes that I have made thus far. Publishing your changes here will certainly be a good idea anyway, since it gives the guys the chance to apply them back to the main codebase. > I suppose that if it is only two or three people doing development, > sharing patches won't be too onerous, but it's not ideal. In any case there is always the people that are anonymously interested in the development. Unfortunately, sharing patches in private will not include them. I think the best solution for everyone (those of us interested in improving Mediacloth, the current developers, future newcomers and current anonymous people) will be to use the current site. > Is there any possibility that the RubyForge site admins would add one of > us as a developer? I think there is, but we don't need to bother them if the project has not been abandoned (and Mediacloth certainly has not). Maybe if the current developers do not reply in a couple of days we could appeal to the RubyForge gods. 
Cheers, Thiago Arrais From me at pekdon.net Mon Oct 1 07:44:41 2007 From: me at pekdon.net (Claes =?ISO-8859-1?Q?N=E4st=E9n?=) Date: Mon, 01 Oct 2007 13:44:41 +0200 Subject: [Mediacloth-devel] Ping In-Reply-To: References: <46FFF496.8070908@sun.com> Message-ID: <1191239081.20085.5.camel@noosnascla-laptop> Hi, On Mon, 2007-10-01 at 08:33 -0300, Thiago Arrais wrote: > Gregory, > > I also have some patches to share and would prefer not to be > forced to fork the project. If only the current mantainers would > register someone else as a project developer on rubyforge.org, > we could get together to continue with development. > Same goes for me! :-O One maybe could create a GIT archive somewhere else and sync it back with SVN later on when somebody gets access or the developer(s) get back to the project? > > In any case, we can always get the code and start a new > project somewhere else. But we need to regard this as a last > resort, just to avoid the confusion for newcomers. > Also, respect to the developers. > > Cheers, > > Thiago Arrais Cheers! -- Claes N?st?n, , http://www.pekdon.net/ "Money has corrupted so much in this world, life would be meaningless if it kills music as well." - Bin?rpilot -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://rubyforge.org/pipermail/mediacloth-devel/attachments/20071001/0d1f00f6/attachment.bin From alexander.dymo at gmail.com Mon Oct 1 15:54:18 2007 From: alexander.dymo at gmail.com (Alexander Dymo) Date: Mon, 1 Oct 2007 22:54:18 +0300 Subject: [Mediacloth-devel] Pong :) Message-ID: <200710012254.19198.alexander.dymo@gmail.com> Hi, sorry for the long time it takes me to reply, but my free time is alas currently less than zero because I'm preparing my PhD for defense this year. 
To keep the project alive fresh blood is indeed necessary and it looks like Thiago and Gregory are the best candidates here. So I've added them as project developers with svn access. Have fun :) Thiago and Gregory, feel free to commit any patches and changes to svn. The only thing I and Gleb would like to do is to review the commits (as time permits of course). We'll set up the code review infrastructure today. From thiago.arrais at gmail.com Mon Oct 1 15:54:20 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Mon, 1 Oct 2007 16:54:20 -0300 Subject: [Mediacloth-devel] Pong :) In-Reply-To: <200710012254.19198.alexander.dymo@gmail.com> References: <200710012254.19198.alexander.dymo@gmail.com> Message-ID: Alexander, Thanks for the trust. Would you guys like that we create our own branches or do you think it is OK for us to commit changes to the trunk? Cheers, Thiago Arrais From alexander.dymo at gmail.com Mon Oct 1 16:17:19 2007 From: alexander.dymo at gmail.com (Alexander Dymo) Date: Mon, 1 Oct 2007 23:17:19 +0300 Subject: [Mediacloth-devel] Pong :) In-Reply-To: References: <200710012254.19198.alexander.dymo@gmail.com> Message-ID: <200710012317.19993.alexander.dymo@gmail.com> On Monday 01 October 2007 22:54, Thiago Arrais wrote: > Would you guys like that we create our own branches or do you think > it is OK for us to commit changes to the trunk? I think we can follow in the usual manner for free software projects: - commit changes that we'd like to see in the next (0.3) release into the trunk - create branches/work/ for any experimental short-term stuff - create branches/ for any new features that are not expected to be seen in the next release or that heavily break existing functionality tests The only thing I'd like to ask about is to commit tests together with code and check that existing tests run fine. 
From Gregory.Murphy at Sun.COM Tue Oct 2 01:00:03 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Mon, 01 Oct 2007 22:00:03 -0700
Subject: [Mediacloth-devel] Reference to "external" link
Message-ID: <4701D053.1010207@sun.com>

In mediawikihtmlgenerator.rb, line 73, there is a check for text
formatting corresponding to a link or an "external link". I believe that
this is a typo, and should read "internal link", since the grammar
returns external links as "links".

If there are no objections, I will fix this, and update the tests
accordingly:

Index: lib/mediacloth/mediawikihtmlgenerator.rb
===================================================================
--- lib/mediacloth/mediawikihtmlgenerator.rb (revision 39)
+++ lib/mediacloth/mediawikihtmlgenerator.rb (working copy)
@@ -70,7 +70,7 @@
                 tag = ["b", ""]
             elsif ast.formatting == :Italic
                 tag = ["i", ""]
-            elsif ast.formatting == :Link or ast.formatting == :ExternalLink
+            elsif ast.formatting == :Link or ast.formatting == :InternalLink
                 links = ast.contents.split
                 link = links[0]
                 link_name = links[1, links.length-1].join(" ")

// Gregory

From Gregory.Murphy at Sun.COM Tue Oct 2 01:10:40 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Mon, 01 Oct 2007 22:10:40 -0700
Subject: [Mediacloth-devel] Fix for bug #8804 Catching newlines
Message-ID: <4701D2D0.9040209@sun.com>

In order to allow the lexer to handle both "\n" and "\r\n" sequences, I
would like to change it as follows. Let me know if there are any
objections. I also have a modified set of tests which test for "\r\n".
// Gregory

Index: lib/mediacloth/mediawikilexer.rb
===================================================================
--- lib/mediacloth/mediawikilexer.rb (revision 39)
+++ lib/mediacloth/mediawikilexer.rb (working copy)
@@ -45,6 +45,7 @@
         @lexer_table["\n"] = method(:match_newline)
+        @lexer_table["\r"] = method(:match_carriagereturn)
     end

@@ -385,6 +387,21 @@
         match_other
     end

+    #Matches a new line and breaks the paragraph if two carriage return - newline
+    #sequences ("\r\n\r\n") are met.
+    def match_carriagereturn
+        if @text[@cursor, 4] == "\r\n\r\n"
+            if @para
+                @next_token[0] = :PARA_END
+#                @para = false
+                @sub_tokens = [[:PARA_START, ""]]
+                @cursor += 4
+                return
+            end
+        end
+        match_other
+    end
+
     #-- ================== Helper methods ================== ++#

     #Checks if the token is placed at the start of the line.
@@ -408,7 +425,9 @@
     #Returns true if the TEXT token is empty or contains newline only
     def empty_text_token?
-        @current_token == [:TEXT, ''] or @current_token == [:TEXT, "\n"]
+        if @current_token[0] == :TEXT
+            @current_token[1] == '' or @current_token[1] == "\n" or @current_token[1] == "\r\n"
+        end
     end

From me at pekdon.net Tue Oct 2 01:18:43 2007
From: me at pekdon.net (Claes Nästén)
Date: Tue, 2 Oct 2007 07:18:43 +0200
Subject: [Mediacloth-devel] Reference to "external" link
In-Reply-To: <4701D053.1010207@sun.com>
References: <4701D053.1010207@sun.com>
Message-ID: <20071002051843.GA11105@hydrogen.pekdon.net>

On Mon, Oct 01, 2007 at 10:00:03PM -0700, Gregory Murphy wrote:
> In mediawikihtmlgenerator.rb, line 73, there is a check for text
> formatting corresponding to a link or an "external link". I believe that
> this is a typo, and should read "internal link", since the grammar
> returns external links as "links".
>

However, the format of InternalLink is link|title, which would make the
split below break. Locally I've just added an
elsif ast.formatting == :InternalLink and handled it with a split('|') instead.
> > If there are no objections, I will fix this, and update the tests > accordingly: > > Index: lib/mediacloth/mediawikihtmlgenerator.rb > =================================================================== > --- lib/mediacloth/mediawikihtmlgenerator.rb (revision 39) > +++ lib/mediacloth/mediawikihtmlgenerator.rb (working copy) > @@ -70,7 +70,7 @@ > tag = ["b", ""] > elsif ast.formatting == :Italic > tag = ["i", ""] > - elsif ast.formatting == :Link or ast.formatting == :ExternalLink > + elsif ast.formatting == :Link or ast.formatting == :InternalLink > links = ast.contents.split > link = links[0] > link_name = links[1, links.length-1].join(" ") > > // Gregory > _______________________________________________ > Mediacloth-devel mailing list > Mediacloth-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/mediacloth-devel -- Claes N?st?n, , http://www.pekdon.net/ "Money has corrupted so much in this world, life would be meaningless if it kills music as well." - Bin?rpilot -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://rubyforge.org/pipermail/mediacloth-devel/attachments/20071002/59a80bef/attachment.bin From thiago.arrais at gmail.com Tue Oct 2 06:58:45 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Tue, 2 Oct 2007 07:58:45 -0300 Subject: [Mediacloth-devel] Fix for bug #8804 Catching newlines In-Reply-To: <4701D2D0.9040209@sun.com> References: <4701D2D0.9040209@sun.com> Message-ID: What about (for empty_text_token?) @current_token[0] == :TEXT and @current_token[1] == '' or @current_token[1] == "\n" or @current_token[1] == "\r\n" or even @current_token[0] == :TEXT and ['', "\n", "\r\n"].include?(@current_token[1]) I am in favour of this patch. Just make sure to provide the tests when checking it in. 
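One caveat on the first form above: in Ruby, `and` and `or` share the same precedence and associate left to right, so the chained expression lets a "\n" or "\r\n" payload match even when the token type is not :TEXT. The second form keeps the type check in front of every alternative. A minimal sketch, assuming plain [type, text] arrays as stand-ins for the lexer's @current_token (the two method names are illustrative and not part of Mediacloth):

```ruby
# Tokens are [type, text] pairs, mirroring the shape of @current_token.

# First suggestion, verbatim. Ruby reads it as
#   ((type == :TEXT and text == '') or text == "\n") or text == "\r\n"
def empty_text_chained?(token)
  token[0] == :TEXT and token[1] == '' or token[1] == "\n" or token[1] == "\r\n"
end

# Second suggestion: the type check guards all three alternatives.
def empty_text_include?(token)
  token[0] == :TEXT and ['', "\n", "\r\n"].include?(token[1])
end

p empty_text_chained?([:PARA_START, "\n"])
p empty_text_include?([:PARA_START, "\n"])
p empty_text_include?([:TEXT, "\r\n"])
```

The include? form is therefore probably the safer one to check in, or the first form with explicit parentheses around the three comparisons.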
Cheers,

Thiago Arrais

From thiago.arrais at gmail.com Tue Oct 2 07:12:18 2007
From: thiago.arrais at gmail.com (Thiago Arrais)
Date: Tue, 2 Oct 2007 08:12:18 -0300
Subject: [Mediacloth-devel] Reference to "external" link
In-Reply-To: <20071002051843.GA11105@hydrogen.pekdon.net>
References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net>
Message-ID:

I have made the same change in my custom HTML generator. I just make
sure to generate a valid URL for the page names (my wiki allows references
to internal pages by page name only) and remove the "nofollow" attribute
from the link.

Claes, I do not understand... I thought the only difference between
internal and external links would be the double brackets (and
the ability to refer to the pages with page names only, but that
is wiki-dependent).

Cheers,

Thiago Arrais

From me at pekdon.net Tue Oct 2 09:16:52 2007
From: me at pekdon.net (Claes Nästén)
Date: Tue, 2 Oct 2007 15:16:52 +0200
Subject: [Mediacloth-devel] Reference to "external" link
In-Reply-To:
References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net>
Message-ID: <20071002131652.GA15434@hydrogen.pekdon.net>

On Tue, Oct 02, 2007 at 08:12:18AM -0300, Thiago Arrais wrote:
>
> Claes, I do not understand... I thought the only difference between
> internal and external links would be the double brackets (and
> the ability to refer to the pages with page names only, but that
> is wiki-dependent).
>

Well, considering the MediaWiki syntax I'd guess doing the same for links
would be a natural thing as you can do:

[[HelloWorld|My link text]] in MediaWiki

But maybe it should be up to the user of the library to decide, which would
still be possible to do by subclassing.

Also, I wonder, should there be "mangle internal link" and "mangle external
link" methods for subclasses of the generator to use, to reduce the amount of
code that requires changing when supporting special URLs?
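The "mangle" hooks asked about above could look roughly like the sketch below. Everything in it is hypothetical: SketchHtmlGenerator, MyWikiGenerator and the two mangle_* method names are invented for illustration and are not Mediacloth classes, and the sketch skips HTML escaping.

```ruby
# Hypothetical generator with overridable link hooks (not Mediacloth's real class).
class SketchHtmlGenerator
  # Renders an internal link body written as "Page Name|label".
  def internal_link(contents)
    page, label = contents.split('|', 2)
    %(<a href="#{mangle_internal_link(page)}">#{label || page}</a>)
  end

  # Renders an external link body written as "url label words".
  def external_link(contents)
    url, label = contents.split(' ', 2)
    %(<a href="#{mangle_external_link(url)}">#{label || url}</a>)
  end

  protected

  # Default hooks return the target unchanged; subclasses override only these.
  def mangle_internal_link(page)
    page
  end

  def mangle_external_link(url)
    url
  end
end

# An application overrides one hook and inherits all the rendering code.
class MyWikiGenerator < SketchHtmlGenerator
  protected

  def mangle_internal_link(page)
    "/wiki/#{page.tr(' ', '_')}"
  end
end

puts MyWikiGenerator.new.internal_link("My Page|Click here")
```

With hooks like these a subclass only touches URL construction, which is the reduction in changed code the question is after.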
> > Cheers, > > Thiago Arrais > _______________________________________________ > Mediacloth-devel mailing list > Mediacloth-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/mediacloth-devel -- Claes N?st?n, , http://www.pekdon.net/ "Money has corrupted so much in this world, life would be meaningless if it kills music as well." - Bin?rpilot -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://rubyforge.org/pipermail/mediacloth-devel/attachments/20071002/23c7f9c0/attachment.bin From thiago.arrais at gmail.com Tue Oct 2 09:41:40 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Tue, 2 Oct 2007 10:41:40 -0300 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: <20071002131652.GA15434@hydrogen.pekdon.net> References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> Message-ID: On 10/2/07, Claes N?st?n wrote: > Well, considering the MediaWiki syntax I'd guess doing the same for links > would be a natural thing as you can do: > > [[HelloWorld|My link text]] in MediaWiki Hmmm, right. My bad. My wiki does not currently require the | separator between the page name and reference text, it just interprets the first word as the page name and the remaining text as reference text. It does not allow spaces in page names, thus it is able to do that. But it will obviously be impossible to do that with spaces in page names (like in Wikipedia). Anyway, if we _really_ want Mediacloth to be 100% compatible with MediaWiki, we better pay some attention to that. Thanks for the heads up. > But maybe it should be up to the user of the library to decide, which it still > would be possible to do by subclassing. Agreed. That is exactly what I've been doing. 
> Also, I wonder, should there be mangle internal and mangel external link
> methods for subclasses of the generator to use to reduce the amount of code
> that requires changing when support special URLs?

We can take this even further and not require any subclassing to customize
links.

This actually is my goal with the last patches. I was thinking about
having some kind of per-application URL generator that deals with internal
links. Different wikis may (and probably will) use different page locations
and I understand that this should be easier to do.

Maybe we could have something like

    class ApplicationSpecificUrlGenerator
      def url_for(page_name)
        # generate an url according to my app's scheme
        "/wiki/#{page_name}"
      end
    end

    parser = MediaWikiParser.new
    parser.lexer = MediaWikiLexer.new
    ast = parser.parse(input)
    generator = MediaWikiHTMLGenerator.new(ApplicationSpecificUrlGenerator.new)
    puts generator.parse(ast)

What do you think?

Best regards,

Thiago Arrais

From Gregory.Murphy at Sun.COM Tue Oct 2 12:25:54 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Tue, 02 Oct 2007 09:25:54 -0700
Subject: [Mediacloth-devel] Fix for bug #8804 Catching newlines
In-Reply-To:
References: <4701D2D0.9040209@sun.com>
Message-ID: <47027112.5050604@sun.com>

Thiago Arrais wrote:
> What about (for empty_text_token?)
>
> @current_token[0] == :TEXT and
> @current_token[1] == '' or @current_token[1] == "\n" or
> @current_token[1] == "\r\n"
>
> or even
>
> @current_token[0] == :TEXT and ['', "\n", "\r\n"].include?(@current_token[1])
>

These are more elegant expressions, thanks for the suggestions. I'm
still a bit new to Ruby (but enjoying it!). I will also add test cases.

// Gregory

> I am in favour of this patch. Just make sure to provide the tests when
> checking it in.
> > Cheers, > > Thiago Arrais > _______________________________________________ > Mediacloth-devel mailing list > Mediacloth-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/mediacloth-devel > From me at pekdon.net Tue Oct 2 14:29:34 2007 From: me at pekdon.net (Claes =?iso-8859-1?B?TuRzdOlu?=) Date: Tue, 2 Oct 2007 20:29:34 +0200 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> Message-ID: <20071002182934.GA22422@hydrogen.pekdon.net> On Tue, Oct 02, 2007 at 10:41:40AM -0300, Thiago Arrais wrote: > > Anyway, if we _really_ want Mediacloth to be 100% compatible with MediaWiki, > we better pay some attention to that. Thanks for the heads up. > Yeps, using Mediawiki at work so might try parsing some of those pages with mediacloth to see how it goes... When and if I get the time. :D > > > > Also, I wonder, should there be mangle internal and mangel external link > > methods for subclasses of the generator to use to reduce the amount of code > > that requires changing when support special URLs? > > We can take this even further and not require any subclassing to customize > links. > That's what I've done myself with a Proc handling the links in my case, not sure if subclassing is a design choise already done or not. > > This actually is my goal with the last patches. I was thinking about > having some kind of per-application URL generator that deals with internal > links. Different wikis may (and probably will) use different page locations > and I understand that this should be easier to do. > > What do you think? It should have url_for_internal and url_for_external also url and name data should both be included there IMHO. > > Best regards, > > Thiago Arrais Cheers! -- Claes N?st?n, , http://www.pekdon.net/ "Money has corrupted so much in this world, life would be meaningless if it kills music as well." 
- Bin?rpilot -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://rubyforge.org/pipermail/mediacloth-devel/attachments/20071002/f878d592/attachment.bin From thiago.arrais at gmail.com Tue Oct 2 14:34:02 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Tue, 2 Oct 2007 15:34:02 -0300 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: <20071002182934.GA22422@hydrogen.pekdon.net> References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> <20071002182934.GA22422@hydrogen.pekdon.net> Message-ID: On 10/2/07, Claes N?st?n wrote: > It should have url_for_internal and url_for_external also url and name data > should both be included there IMHO. Name data? Would you mind to detail that? -- Thiago Arrais From alexander.dymo at gmail.com Tue Oct 2 14:49:13 2007 From: alexander.dymo at gmail.com (Alexander Dymo) Date: Tue, 2 Oct 2007 21:49:13 +0300 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: References: <4701D053.1010207@sun.com> <20071002131652.GA15434@hydrogen.pekdon.net> Message-ID: <200710022149.14018.alexander.dymo@gmail.com> On Tuesday 02 October 2007 16:41, Thiago Arrais wrote: > Maybe we could have something like > class ApplicationSpecificUrlGenerator Maybe we do that by adding url generation method into MediaWikiParams class to avoid having extra classes? 
From thiago.arrais at gmail.com Tue Oct 2 15:02:32 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Tue, 2 Oct 2007 16:02:32 -0300 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: <200710022149.14018.alexander.dymo@gmail.com> References: <4701D053.1010207@sun.com> <20071002131652.GA15434@hydrogen.pekdon.net> <200710022149.14018.alexander.dymo@gmail.com> Message-ID: On 10/2/07, Alexander Dymo wrote: > Maybe we do that by adding url generation method into MediaWikiParams class to > avoid having extra classes? Do you mean an url_generation method that receives a block? Just to make it clear, the ApplicationSpecificUrlGenerator is client code, not Mediacloth code. We don't need to add any classes if we don't want to... Cheers, Thiago Arrais From Gregory.Murphy at Sun.COM Tue Oct 2 15:07:58 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Tue, 02 Oct 2007 12:07:58 -0700 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: <20071002182934.GA22422@hydrogen.pekdon.net> References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> <20071002182934.GA22422@hydrogen.pekdon.net> Message-ID: <4702970E.6020709@sun.com> Claes N?st?n wrote: > On Tue, Oct 02, 2007 at 10:41:40AM -0300, Thiago Arrais wrote: > >> This actually is my goal with the last patches. I was thinking about >> having some kind of per-application URL generator that deals with internal >> links. Different wikis may (and probably will) use different page locations >> and I understand that this should be easier to do. >> >> What do you think? >> > > It should have url_for_internal and url_for_external also url and name data > should both be included there IMHO. I also like the idea of allowing clients to register a handler for link events. I agree with Claes, a distinction should be made between internal and external URLs. 
In this case, should we enhance the grammar so that it deals with both parts (link, text) of a link node. In the case of an internal link, MediaWiki requires a '|' char to separate link from text, otherwise the entire content is treated as the link. In the case of external links, the first space is the separator. By "name data", I assume you mean the link text? Why would it be necessary to also pass the link text? // Gregory From me at pekdon.net Tue Oct 2 18:31:40 2007 From: me at pekdon.net (Claes =?iso-8859-1?B?TuRzdOlu?=) Date: Wed, 3 Oct 2007 00:31:40 +0200 Subject: [Mediacloth-devel] Reference to "external" link In-Reply-To: <4702970E.6020709@sun.com> References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> <20071002182934.GA22422@hydrogen.pekdon.net> <4702970E.6020709@sun.com> Message-ID: <20071002223140.GB22422@hydrogen.pekdon.net> Hi, On Tue, Oct 02, 2007 at 12:07:58PM -0700, Gregory Murphy wrote: > I also like the idea of allowing clients to register a handler for link > events. I agree with Claes, a distinction should be made between > internal and external URLs. In this case, should we enhance the grammar > so that it deals with both parts (link, text) of a link node. In the > case of an internal link, MediaWiki requires a '|' char to separate link > from text, otherwise the entire content is treated as the link. In the > case of external links, the first space is the separator. > Would that actually add some extra value? Giving the handler "url" and "other text" as parameters would IMHO make it simple and thus less error prone compared to changing the actual grammar. > > By "name data", I assume you mean the link text? Why would it be > necessary to also pass the link text? 
>

This would be necessary to handle MediaWiki-style image/file links which
AFAIK look like [[image:ImageFile.png|Alt Text]] generating something like:

Alt Text

To support that in a rather generic manner the handler would have to get all
information and return a tag / list of tags back to main code.

Or do you have any cleaner solution?

>
> // Gregory

--
Claes Nästén, http://www.pekdon.net/

"Money has corrupted so much in this world, life would be
meaningless if it kills music as well." - Binärpilot

From Gregory.Murphy at Sun.COM Tue Oct 2 20:30:07 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Tue, 02 Oct 2007 17:30:07 -0700
Subject: [Mediacloth-devel] Reference to "external" link
In-Reply-To: <20071002223140.GB22422@hydrogen.pekdon.net>
References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> <20071002182934.GA22422@hydrogen.pekdon.net> <4702970E.6020709@sun.com> <20071002223140.GB22422@hydrogen.pekdon.net>
Message-ID: <4702E28F.6000503@sun.com>

Claes Nästén wrote:
> Hi,
>
> On Tue, Oct 02, 2007 at 12:07:58PM -0700, Gregory Murphy wrote:
>
>> I also like the idea of allowing clients to register a handler for link
>> events. I agree with Claes, a distinction should be made between
>> internal and external URLs. In this case, should we enhance the grammar
>> so that it deals with both parts (link, text) of a link node. In the
>> case of an internal link, MediaWiki requires a '|' char to separate link
>> from text, otherwise the entire content is treated as the link. In the
>> case of external links, the first space is the separator.
>>
>>
> Would that actually add some extra value?
Giving the handler "url" and
> "other text" as parameters would IMHO make it simple and thus less error prone
> compared to changing the actual grammar.

If this distinction isn't part of the grammar, then... what is it part of?

Every "[[ ... ]]" character sequence in a mediawiki page contains either
a link; or, a link, followed by the '|' character, followed by the link
text. A link, in turn, may be a sequence of alpha-numeric characters,
including spaces, which indicates a reference to the named page; or, a
special link prefix (e.g. "image") that indicates a link to some other
type of resource, followed by the ':' character, followed by the
resource name.

If we want to treat this link syntax as outside of Mediawiki (which
would make it easier for users to describe new link syntaxes), then the
grammar should just treat everything between "[[ ... ]]" as a string.
But it seems to me that this syntax is part of Mediawiki.

Note that this also affects error handling. If we just treat "[[ ... ]]"
as a string, then users will have to perform their own validation of the
internals, report errors, etc.

// Gregory

From me at pekdon.net Wed Oct 3 01:59:50 2007
From: me at pekdon.net (Claes Nästén)
Date: Wed, 3 Oct 2007 07:59:50 +0200
Subject: [Mediacloth-devel] Reference to "external" link
In-Reply-To: <4702E28F.6000503@sun.com>
References: <4701D053.1010207@sun.com> <20071002051843.GA11105@hydrogen.pekdon.net> <20071002131652.GA15434@hydrogen.pekdon.net> <20071002182934.GA22422@hydrogen.pekdon.net> <4702970E.6020709@sun.com> <20071002223140.GB22422@hydrogen.pekdon.net> <4702E28F.6000503@sun.com>
Message-ID: <20071003055950.GC22422@hydrogen.pekdon.net>

On Tue, Oct 02, 2007 at 05:30:07PM -0700, Gregory Murphy wrote:
> Claes Nästén wrote:
> > Would that actually add some extra value? Giving the handler "url" and
> > "other text" as parameters would IMHO make it simple and thus less error prone
> > compared to changing the actual grammar.
> If this distinction isn't part of the grammar, then... what is it part of? > Trying to speak backwards ... :/ > > Every "[[ ... ]]" character sequence in a mediawiki page contains either > a link; or, a link, followed by the '|' character, followed by the link > text. A link, in turn, may be a sequence of alpha-numeric characters, > including spaces, which indicates a reference to the named page; or, a > special link prefix (e.g. "image") that indicates a link to some other > type of resource, followed by the ':' character, followed by the > resource name. > > If we want to treat this link syntax as outside of Mediawiki (which > would make it easier for users to describe new link syntaxes), then the > grammar should just treat everything between "[[ ... ]]" as a string. > But it seems to me that this is syntax is part of Mediawiki. > > Note that this also effects error handling. If we just treat "[[ ... ]]" > as a string, then users will have to perform there own validation of the > internals, report errors, etc. > Convinced me! :) Anyway, probably should have a look at the mediawiki docs on internal links to verify that the only parts an internal link can consist of is indeed resource, link and link text. On the url handling, any better idea than returning tag(s) from the link handlers? Cheers! -- Claes N?st?n, , http://www.pekdon.net/ "Money has corrupted so much in this world, life would be meaningless if it kills music as well." - Bin?rpilot -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/mediacloth-devel/attachments/20071003/4edb1acd/attachment.bin

From Gregory.Murphy at Sun.COM Wed Oct 3 18:50:45 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Wed, 03 Oct 2007 15:50:45 -0700
Subject: [Mediacloth-devel] More on links
Message-ID: <47041CC5.2000105@sun.com>

Looking around at some of the links on Wikipedia, I find constructs like
this:

[[Image:Calendário Asteca.jpg|thumb|140px|right| [[June]]: [[Aztec]] battles.]]

This is from the wiki page for 1520, http://en.wikipedia.org/wiki/1520.
It's the image of the Aztec calendar in the right-hand column.

I think this pretty conclusively illustrates why link syntax needs to be
part of the grammar. In the above case, the parser needs to recognize
links to pages embedded inside of the text portion of a link to an image.

I think this also illustrates why the user link handler should be passed
only the link portion, and be responsible for returning a URL.

// Gregory

From me at pekdon.net Wed Oct 3 19:00:53 2007
From: me at pekdon.net (Claes Nästén)
Date: Thu, 4 Oct 2007 01:00:53 +0200
Subject: [Mediacloth-devel] More on links
In-Reply-To: <47041CC5.2000105@sun.com>
References: <47041CC5.2000105@sun.com>
Message-ID: <20071003230053.GD22422@hydrogen.pekdon.net>

On Wed, Oct 03, 2007 at 03:50:45PM -0700, Gregory Murphy wrote:
>
> I think this also illustrates why the user link handler should be passed
> only the link portion, and be responsible for returning a URL.
>

What if the user wants to add a file icon or maybe some ajax linking to
their wiki? Handling only the URL would not be enough then.

But yes, that URL example is rather complex and would be rather painful to
implement all the time everywhere. So, for most cases handling only the url
should be enough and I very much do agree on this.
Myself I do not have the need for anything more advanced _if_ the grammar handles image constructs as well, but is it generic enough for most common use? If so then we should go for a URL-only handler.

>
> // Gregory

Cheers!

--
Claes Nästén, http://www.pekdon.net/

"Money has corrupted so much in this world, life would be meaningless if it kills music as well." - Binärpilot

From Gregory.Murphy at Sun.COM Fri Oct 5 17:49:13 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Fri, 05 Oct 2007 14:49:13 -0700
Subject: [Mediacloth-devel] Proposal for handling links
Message-ID: <4706B159.1000209@sun.com>

Here's a proposal for handling links, based on our discussion. I've got this working in my local copy, with a set of tests that encompass just about all the types of links that I have found on Wikipedia.

First, enhance the lexer to recognize the different parts that make up a link. For an external link, there are one or two parts: a URL followed by an optional text label, the two being separated by white space. For an internal link, there is a locator (such as a wiki page name), followed by zero or more text fields, with all being separated by the '|' character.

Second, enhance the grammar to contain productions for links and internal links. The URL of a link and the locator of an internal link must be text, but the remaining fields are repeatable content (with one exception: an external link cannot contain another link in its text label). Links are stored in an instance of LinkAST, which has a url attribute, and zero or one children; and internal links are stored in an instance of InternalLinkAST, which has a locator attribute, and zero or more children.
Third, enhance the HTML generator to recognize the new AST nodes. An external link is easy to handle:

    def parse_link(ast)
      text = super(ast)
      href = ast.url
      # Fall back to the URL itself when no label was given.
      text = href if text.length == 0
      "<a href=\"#{href}\">#{text}</a>"
    end

An internal link takes more work. It invokes a link handler, which can be set by the user to provide custom handling of internal links. The handler has two methods. The first method, "url_for", handles the simple case of an internal link with a reference to a page (e.g. "[[My Page]]"). Here the user need only resolve the location. The second method, "link_for", handles all the special Mediawiki internal links that are of the form "prefix:locator" (e.g. "[[Category:Help]]" or "[[Image:foo.jpg|100px|border]]"). Since these links can only be interpreted correctly in their entirety, all the link contents are passed to the handler, and the handler must return a complete HTML link.

    class MediaWikiLinkHandler

      #Method invoked to resolve references to wiki pages when they occur in an
      #internal link. In all the following internal links, the page name is
      #My Page:
      #* [[My Page]]
      #* [[My Page|Click here to view my page]]
      #* [[My Page|Click ''here'' to view my page]]
      #The return value should be a URL that references the page resource.
      def url_for(resource)
        "javascript:void(0)"
      end

      #Method invoked to resolve references to resources of unknown types. The
      #type is indicated by the resource prefix. Examples of inline links to
      #unknown references include:
      #* [[Media:video.mpg]] (prefix Media, resource video.mpg)
      #* [[Image:pretty.png|100px|A ''pretty'' picture]] (prefix Image,
      #  resource pretty.png, and options 100px and A
      #  pretty picture).
      #The return value should be a well-formed hyperlink, image, object or
      #applet tag.
      def link_for(prefix, resource, options=[])
        "#{prefix}:#{resource}(#{options.join(', ')})"
      end

    end

The generator has a "link_handler" property that the caller can use to set a custom handler.
The generator method for internal links uses the handler:

    def parse_internal_link(ast)
      tokens = ast.locator.split(':')
      resource = tokens.pop
      prefix = tokens.pop
      if prefix
        # Prefixed link (e.g. Image:...): render each option, then let the
        # handler build the complete element.
        options = ast.children.map do |node|
          r = parse_formatted(node) if node.class == FormattedAST
          r = parse_text(node) if node.class == TextAST
          r = parse_link(node) if node.class == LinkAST
          r = parse_internal_link(node) if node.class == InternalLinkAST
          r
        end
        link_handler.link_for(prefix, resource, options)
      else
        # Plain page link: the handler only resolves the URL.
        text = parse_wiki_ast(ast)
        text = ast.locator if text.length == 0
        href = link_handler.url_for(ast.locator)
        "<a href=\"#{href}\">#{text}</a>"
      end
    end

The advantage of this approach is that it provides a simple extension mechanism. If a wiki wants to support applets or other embedded objects, it can just extend the internal link syntax, and provide code in the link_for() method to handle it, e.g. "[[applet:foo|var1=one|var2=two]]".

With the above, I am able to handle all the following in my unit tests:

    parse("http://sun.com")
    parse("[http://sun.com]")
    parse("[http://sun.com stars]")
    parse("[http://sun.com stars and moon]")
    parse("[http://sun.com stars and '''moon''']")
    parse("[[sun]]")
    parse("[[sun|All about Sun]]")
    parse("[[image:sun|All about Sun]]")
    parse("[[image:sun|nofollow|All about Sun]]")
    parse("[[image:sun|nofollow|All about '''Sun''']]")
    parse("[[image:sun|All about [[Sun]]]]")
    parse("[[image:sun|All about [[sun|More about]]]]")
    parse("[ ]")
    parse("[]")
    parse("[[ ]]")
    parse("[[]]")

(The last four appear as plain text in the AST, rather than as empty links, since this is how Mediawiki treats them.)

I would like to commit these changes to the trunk if everyone agrees to them. I can post a complete set of diffs if you want to try it out first.
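For anyone who wants to try the handler interface described in this proposal, a custom handler might look roughly like the sketch below. Only the method names url_for and link_for and their arguments come from the proposal; the class name WikiLinkHandler, the base URL, the /wiki/ and /images/ path layout, and the assumption that options arrive as plain strings are all made up for illustration.

```ruby
require 'cgi'

# Hypothetical handler, not part of Mediacloth. It follows the two-method
# shape from the proposal: url_for for page links, link_for for prefixed links.
class WikiLinkHandler
  def initialize(base_url)
    @base_url = base_url # assumed site root, e.g. "http://example.org"
  end

  # Page links only need a URL; the generator builds the anchor itself.
  def url_for(resource)
    "#{@base_url}/wiki/#{escape_name(resource)}"
  end

  # Prefixed links must come back as a complete element.
  def link_for(prefix, resource, options = [])
    name = escape_name(resource)
    if prefix.downcase == 'image'
      # Assumption: the last option is the caption, used here as alt text.
      alt = CGI.escapeHTML(options.last.to_s)
      %(<img src="#{@base_url}/images/#{name}" alt="#{alt}" />)
    else
      label = CGI.escapeHTML("#{prefix}:#{resource}")
      %(<a href="#{@base_url}/wiki/#{prefix}:#{name}">#{label}</a>)
    end
  end

  private

  # Wiki-style page names: spaces become underscores, the rest is URL-escaped.
  def escape_name(resource)
    CGI.escape(resource.strip.tr(' ', '_'))
  end
end

handler = WikiLinkHandler.new('http://example.org')
puts handler.url_for('My Page')
puts handler.link_for('Image', 'pretty.png', ['100px', 'A pretty picture'])
puts handler.link_for('Media', 'video.mpg')
```

A caller would then hand an instance to the generator through the "link_handler" property mentioned above.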
// Gregory From Gregory.Murphy at Sun.COM Mon Oct 8 14:50:10 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Mon, 08 Oct 2007 11:50:10 -0700 Subject: [Mediacloth-devel] Proposal for handling links In-Reply-To: <4706B159.1000209@sun.com> References: <4706B159.1000209@sun.com> Message-ID: <470A7BE2.8090708@sun.com> I have committed the changes to a private branch for now, ./mediacloth/branches/gjmurphy // Gregory From thiago.arrais at gmail.com Sat Oct 13 09:32:30 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Sat, 13 Oct 2007 10:32:30 -0300 Subject: [Mediacloth-devel] Proposal for handling links In-Reply-To: <4706B159.1000209@sun.com> References: <4706B159.1000209@sun.com> Message-ID: Gregory, I have tested the latest revision on your branch and all I can say is "me likes it!" :-) At least the url_for method works greatly for me, I have not tested the other one just because I do not need it yet. I would still prefer to pass the link handler into the object constructor, though. I know this isn't mediacloth current style (look at the call "parser.lexer = MediaWikiLexer.new" for example), but I just like it. So you have my +1 for this. What do you people think about the constructor thing? We could use a hash or something else just to avoid parameter proliferation... Cheers, Thiago Arrais From thiago.arrais at gmail.com Sat Oct 13 10:54:30 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Sat, 13 Oct 2007 11:54:30 -0300 Subject: [Mediacloth-devel] HTML generator ignoring link text Message-ID: The HTML generator (in gjmurphy's branch) was ignoring the text provided for the links and always using the resource reference as link text. For example: [[sun|All about Sun]] was rendered as sun instead of All about Sun I took the liberty to fix that on gjmurphy's branch. Please revert the change if you feel this isn't expected behavior. 
Best regards, Thiago Arrais From Gregory.Murphy at Sun.COM Sun Oct 14 11:33:34 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Sun, 14 Oct 2007 08:33:34 -0700 Subject: [Mediacloth-devel] Proposal for handling links In-Reply-To: References: <4706B159.1000209@sun.com> Message-ID: <471236CE.4070708@sun.com> Thiago Arrais wrote: > Gregory, > > I have tested the latest revision on your branch and all I can say is > "me likes it!" :-) At least the url_for method works greatly for me, I > have not tested the other one just because I do not need it yet. > > I would still prefer to pass the link handler into the object constructor, > though. I know this isn't mediacloth current style (look at the call > "parser.lexer = MediaWikiLexer.new" for example), but I just like it. > > So you have my +1 for this. What do you people think about the > constructor thing? We could use a hash or something else just to > avoid parameter proliferation... Would you prefer passing the handler in as a parameter to the constructor, and also getting rid of the attribute? This might send the message that the generator is not re-usable with different handlers. Or do you prefer having the constructor parameter as a simple alternative to calling the setter? If so, I agree with you, as it makes for simpler client code. // Gregory From Gregory.Murphy at Sun.COM Sun Oct 14 12:36:47 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Sun, 14 Oct 2007 09:36:47 -0700 Subject: [Mediacloth-devel] Proposal for handling links In-Reply-To: References: <4706B159.1000209@sun.com> Message-ID: <4712459F.4090003@sun.com> Thiago Arrais wrote: > Gregory, > > I have tested the latest revision on your branch and all I can say is > "me likes it!" :-) At least the url_for method works greatly for me, I > have not tested the other one just because I do not need it yet. Is it OK if I merge this into the trunk? 
// Gregory From thiago.arrais at gmail.com Sun Oct 14 16:15:39 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Sun, 14 Oct 2007 17:15:39 -0300 Subject: [Mediacloth-devel] Proposal for handling links In-Reply-To: <4712459F.4090003@sun.com> References: <4706B159.1000209@sun.com> <4712459F.4090003@sun.com> Message-ID: On 10/14/07, Gregory Murphy wrote: > > Is it OK if I merge this into the trunk? At least for me it is. I have some suggestions on nofollow and other link attributes, but we can change that later. Cheers, Thiago Arrais From thiago.arrais at gmail.com Tue Oct 16 11:09:35 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Tue, 16 Oct 2007 12:09:35 -0300 Subject: [Mediacloth-devel] Pipe chars inside link text Message-ID: MediaWiki has two kinds of internal links: resource references and page links. Resource references are identified by a resource prefix followed by a resource locator, while page links have only the locator. This is a resource reference: [[Image:example.jpg|100px|80px|Just an example]] but this is a page link: [[MyPage|Please visit my page]] Pipe chars inside resource references should all be interpreted as options separators, but inside page links only the first one should. This text here, for example: [[MyOtherPage|With some|weird pipes inside|the text]] should be rendered like With some|weird pipes inside|the text The pipe chars are interpreted as common text, not as options separators. I have changed the lexer and parser to behave like this on my branch (branches/thiago, r56). What do you all think? Should I merge this into trunk? Best regards, Thiago Arrais From Gregory.Murphy at Sun.COM Tue Oct 16 19:18:09 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Tue, 16 Oct 2007 16:18:09 -0700 Subject: [Mediacloth-devel] Pipe chars inside link text In-Reply-To: References: Message-ID: <471546B1.7090901@sun.com> Thanks for noticing this! I merged my changes into the trunk already, so please merge yours too. 
I think that we have pretty good link handling now. Did you also add some test cases? // Gregory Thiago Arrais wrote: > MediaWiki has two kinds of internal links: resource references and > page links. Resource references are identified by a resource prefix > followed by a resource locator, while page links have only the > locator. This is a resource reference: > > [[Image:example.jpg|100px|80px|Just an example]] > > but this is a page link: > > [[MyPage|Please visit my page]] > > Pipe chars inside resource references should all be interpreted as > options separators, but inside page links only the first one should. > This text here, for example: > > [[MyOtherPage|With some|weird pipes inside|the text]] > > should be rendered like > > With some|weird pipes > inside|the text > > The pipe chars are interpreted as common text, not as options > separators. > > I have changed the lexer and parser to behave like this on my > branch (branches/thiago, r56). What do you all think? Should I merge this > into trunk? > > Best regards, > > Thiago Arrais > _______________________________________________ > Mediacloth-devel mailing list > Mediacloth-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/mediacloth-devel > From thiago.arrais at gmail.com Tue Oct 16 19:27:28 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Tue, 16 Oct 2007 20:27:28 -0300 Subject: [Mediacloth-devel] Pipe chars inside link text In-Reply-To: <471546B1.7090901@sun.com> References: <471546B1.7090901@sun.com> Message-ID: On 10/16/07, Gregory Murphy wrote: > Thanks for noticing this! I merged my changes into the trunk already, so > please merge yours too. I think that we have pretty good link handling now. > > Did you also add some test cases? The ones that we had before pretty much covered what I changed, I just had to change the expected output. By the way, I sometimes find it hard to pinpoint the exact failure point with all that output text. I did merge my changes back to trunk. 
We should be good to go with the links right now. I just need to try some changes to handle custom link attributes and nofollow directives. Cheers, Thiago Arrais From Gregory.Murphy at Sun.COM Tue Oct 16 20:18:01 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Tue, 16 Oct 2007 17:18:01 -0700 Subject: [Mediacloth-devel] Pipe chars inside link text In-Reply-To: References: <471546B1.7090901@sun.com> Message-ID: <471554B9.50206@sun.com> Thiago Arrais wrote: > On 10/16/07, Gregory Murphy wrote: > >> Thanks for noticing this! I merged my changes into the trunk already, so >> please merge yours too. I think that we have pretty good link handling now. >> >> Did you also add some test cases? >> > > The ones that we had before pretty much covered what I changed, I > just had to change the expected output. By the way, I sometimes > find it hard to pinpoint the exact failure point with all that output text. I agree! I find the current testing files very hard to work with, as both the expected lexer and HTML output files are hard to read. For the lexer output, it would be nice if the files contained something like YML or JSON formatted lexer tokens, as white space could be used to make them more readable. For the HTML, I'm not sure how to improve things. Perhaps do the assert comparison using DOMs instead of strings? It would also be nice to have a third set of tests based on comparing the actual AST with the expected AST, again, using a file format that makes the AST easy to read. // Gregory From thiago.arrais at gmail.com Thu Oct 18 16:03:01 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Thu, 18 Oct 2007 20:03:01 +0000 Subject: [Mediacloth-devel] Verbose test failure Message-ID: On 10/17/07, Gregory Murphy wrote: > > I agree! I find the current testing files very hard to work with, as > both the expected lexer and HTML output files are hard to read. 
> > For the lexer output, it would be nice if the files contained something > like YML or JSON formatted lexer tokens, as white space could be used to > make them more readable. > > For the HTML, I'm not sure how to improve things. Perhaps do the assert > comparison using DOMs instead of strings? > > It would also be nice to have a third set of tests based on comparing > the actual AST with the expected AST, again, using a file format that > makes the AST easy to read. Very good ideas, Gregory. I think we can even avoid using input/result files for some cases (wherever that does not represent a chance for duplication). I like, for example, how the lexer tests are structured with the input and expected output both in the same place. It makes them easier to read. Maybe we could do that with some of the other tests. Cheers, Thiago Arrais -- Mergulhando no Caos - http://thiagoarrais.wordpress.com Pensamentos, id?ias e devaneios sobre desenvolvimento de software e tecnologia em geral From thiago.arrais at gmail.com Fri Oct 19 19:58:06 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Fri, 19 Oct 2007 23:58:06 +0000 Subject: [Mediacloth-devel] Handling custom attributes in links Message-ID: As I have said before, I was missing some way to provide wiki links with custom attributes besides the referenced URL. Some wikis will want to apply a custom appearance to empty pages links, for example. The wiki I am working on, for example, is used for project tracking and links to already done features should be striked. Uses are many and people could always subclass the HTML generator to override the default behaviour, but maybe things will be easier for everyone if it is already supported in "vanilla" Mediacloth. The changes I have made my private branch try to accomplish that, but I would like to get some comments on the code before merging into trunk. I have changed the code Gregory wrote to accept two results from the url_for method (inside an array). 
The first one should be the URL, and the second one a hash with any extra attributes the link handler wishes to provide. The code still accepts URL-only returns and should not break old code and makes it easier for people that do not want to mess with link attributes. I, however, haven't decided about the final design. One of the other options I thought about was to expect either a hash or an string. If the link handler provides a string only, it would be interpreted as the resource URL and the link element would not have any extra attributes. Otherwise (if provided with a hash), the handler could return any attributes that it wanted and would have to remember to include an href attribute. Any ideas? By the way, for those that want to play with the code, it is in http://mediacloth.rubyforge.org/svn/mediacloth/branches/thiago Cheers, Thiago Arrais From Gregory.Murphy at Sun.COM Sat Oct 20 10:19:15 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Sat, 20 Oct 2007 07:19:15 -0700 Subject: [Mediacloth-devel] Handling custom attributes in links In-Reply-To: References: Message-ID: <471A0E63.7080502@sun.com> Thiago, I agree with the need to customize links. But it could require even more than custom attributes. Looking at how TWiki handles page links, for example, if the page does not yet exist, then the link text is modified. So in this case, the user really needs to customize the whole link element. I wonder if we should split the link handler into three methods: url_for_resource link_for_resource link_for The "link_for" method is invoked for internal links to resources other than pages (e.g. images, or anything with a special prefix). However, if the internal link is to a page resource, then "link_for_resource" is invoked. The default implementation of this method just creates a link, by calling "url_for_resource". If we make it easier for users to sub-class LinkHandler, then most users need only provide a custom implemenation of "url_for_resource". 
In your case, you would provide a custom implementation of "link_for_resource". The result might not even be a link, for example, if the page target has been deleted. I realize that all the above may seem very Java, but I've been doing C++ and Java for a long time, still new to Ruby. :-) // Gregory Thiago Arrais wrote: > As I have said before, I was missing some way to provide wiki links > with custom attributes besides the referenced URL. Some wikis will > want to apply a custom appearance to empty pages links, for > example. The wiki I am working on, for example, is used for project > tracking and links to already done features should be striked. > > Uses are many and people could always subclass the HTML > generator to override the default behaviour, but maybe things will be > easier for everyone if it is already supported in "vanilla" Mediacloth. > > The changes I have made my private branch try to accomplish that, > but I would like to get some comments on the code before merging > into trunk. I have changed the code Gregory wrote to accept two > results from the url_for method (inside an array). The first one should > be the URL, and the second one a hash with any extra attributes > the link handler wishes to provide. The code still accepts URL-only > returns and should not break old code and makes it easier for people > that do not want to mess with link attributes. > > I, however, haven't decided about the final design. One of the other > options I thought about was to expect either a hash or an string. If > the link handler provides a string only, it would be interpreted as the > resource URL and the link element would not have any extra > attributes. Otherwise (if provided with a hash), the handler could > return any attributes that it wanted and would have to remember to > include an href attribute. > > Any ideas? 
>
> By the way, for those that want to play with the code, it is in
>
> http://mediacloth.rubyforge.org/svn/mediacloth/branches/thiago
>
> Cheers,
>
> Thiago Arrais

From Gregory.Murphy at Sun.COM Thu Oct 25 18:35:34 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Thu, 25 Oct 2007 15:35:34 -0700
Subject: [Mediacloth-devel] Support for tables
Message-ID: <47211A36.5020306@sun.com>

As of revision 71, there is support for Mediawiki-style tables in the trunk. See the file "input10" in the test data for numerous examples of tables.

// Gregory

From alexander.dymo at gmail.com Fri Oct 26 03:20:43 2007
From: alexander.dymo at gmail.com (Alexander Dymo)
Date: Fri, 26 Oct 2007 10:20:43 +0300
Subject: [Mediacloth-devel] Support for tables
In-Reply-To: <47211A36.5020306@sun.com>
References: <47211A36.5020306@sun.com>
Message-ID:

This is really cool :) Thanks!

2007/10/26, Gregory Murphy:
> As of revision 71, there is support for Mediawiki-style tables in the
> trunk. See the file "input10" in the test data for numerous examples of
> tables.
>
> // Gregory

From Gregory.Murphy at Sun.COM Fri Oct 26 19:21:30 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Fri, 26 Oct 2007 16:21:30 -0700
Subject: [Mediacloth-devel] Dealing with (X)HTML markup
Message-ID: <4722767A.4040109@sun.com>

Mediawiki allows users to add some (X)HTML markup to the wiki text for a page. For example, to wrap a code sample, you use the XHTML "tt" element, as in "<tt>sprintf</tt>". In this case, what you get in the converted page is exactly what you typed in, since Mediawiki accepted it.
If an element is not acceptable, then the angle brackets are escaped. For example, if you type "<script>do_evil_things();</script>" in the editor, it is converted to "&lt;script&gt;do_evil_things();&lt;/script&gt;". Mediawiki also converts "<" and ">" (as well as "&") when they are not part of an element start or end tag.

This feature is important, because it allows wiki services to prevent cross-site scripting attacks, and other forms of evil.

(Mediawiki also supports the special, non-XHTML element "nowiki", as in "<nowiki>== this is not a heading ==</nowiki>", used to escape wiki syntax.)

I think it would be a good thing for Mediacloth to support this style of markup "white listing". Here is a suggested strategy.

The lexer would recognize well-formed XHTML start tags, end tags, and attribute name/value pairs. So, for example, the input string

    < <+> xxx> <xxx> zzz </xxx>

would be tokenized as

    [:TEXT, "< <+> xxx> "], [:TAG_START, "xxx"], [:TEXT, " zzz "], [:TAG_END, "xxx"]

In other words, anything that is not well-formed is simply not recognized as a start- or end-tag. The parser then has the job of checking for validity, by ensuring that all start- and end-tag pairs match. If they don't, a parse exception is thrown. I suggest that only XHTML be accepted, and not HTML, as it forces users to type empty tags in self-closed form (e.g. "<br />"), which the lexer can easily recognize.

The walker has default handlers for start and end tags, which perform the "white-list" processing. This could be implemented as a plug-in, so that users can customize the accepted markup. The default implementation should be based on current best practices for safe sanitizing of XHTML, see for example

    http://wiki.whatwg.org/wiki/Sanitization_rules

If a tag fails the sanitizing test, the default implementation would escape its angle brackets, just as Mediawiki does. If it passes, it is output as-is.

Comments?
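To make the white-listing step concrete, here is a minimal sketch of the escaping behaviour described above, collapsed into a single pass over the text rather than split across lexer, parser and walker. The ALLOWED_TAGS list and the sanitize method name are assumptions for illustration, and the pattern deliberately ignores attributes, so any tag carrying attributes falls through and is escaped.

```ruby
require 'cgi'

# Hypothetical white list; a real one would follow the sanitization rules
# referenced above.
ALLOWED_TAGS = %w[tt b i code pre br].freeze

# Either a well-formed start, end, or self-closed tag without attributes
# (group 1 captures the element name), or a stray "<", ">" or "&".
MARKUP = %r{</?([A-Za-z][A-Za-z0-9]*)\s*/?>|[<>&]}

def sanitize(text)
  text.gsub(MARKUP) do |match|
    name = Regexp.last_match(1)
    if name && ALLOWED_TAGS.include?(name.downcase)
      match                  # accepted tag: output as-is
    else
      CGI.escapeHTML(match)  # rejected tag or stray character: escape it
    end
  end
end

puts sanitize('<tt>sprintf</tt>')
puts sanitize('<script>do_evil_things();</script>')
puts sanitize('< <+> xxx> <tt> zzz </tt>')
```

This sketch does not check that start and end tags are balanced; in the strategy above that check belongs to the parser.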
// Gregory From Gregory.Murphy at Sun.COM Wed Oct 31 12:47:51 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Wed, 31 Oct 2007 09:47:51 -0700 Subject: [Mediacloth-devel] Lexer's treatment of newlines Message-ID: <4728B1B7.7080303@sun.com> I've just filed a bug about this, but thought I would also bring it to the attention of the list. Mediawiki generates paragraphs for every two newlines, no matter how many pairs of newlines occur in a row. Mediacloth generates only one. So, for example, the text one two three four five six is rendered by Mediawiki as

one

two


three


four



five



six

Mediacloth on the other hand suppresses the intervening empty paragraphs, so all six lines are equally spaced.

Mediawiki also wraps text in a paragraph, even when it is flanked top and bottom by a list or table, so for example the text

    text
    * a
    * b
    text
    {|
    | a
    | b
    |}
    text

is rendered by Mediawiki as

text

  • a
  • b

text

a b

text

Mediacloth on the other hand suppresses the paragraphs.

Making the lexer conform more closely to what Mediawiki does will have the added benefit of simplifying the lexer code considerably, as there will be no need for extra logic to remove empty paragraphs.

From Gregory.Murphy at Sun.COM Wed Oct 31 12:49:32 2007
From: Gregory.Murphy at Sun.COM (Gregory Murphy)
Date: Wed, 31 Oct 2007 09:49:32 -0700
Subject: [Mediacloth-devel] Time for another release?
Message-ID: <4728B21C.9000504@sun.com>

There have been quite a few bug fixes, and some new features, since the last Mediacloth release. Is it time for another?

I think only the project owners can arrange this. I'd be happy to put together the release notes.

// Gregory

From alexander.dymo at gmail.com Wed Oct 31 13:18:30 2007
From: alexander.dymo at gmail.com (Alexander Dymo)
Date: Wed, 31 Oct 2007 19:18:30 +0200
Subject: [Mediacloth-devel] Time for another release?
In-Reply-To: <4728B21C.9000504@sun.com>
References: <4728B21C.9000504@sun.com>
Message-ID: <200710311918.30981.alexander.dymo@gmail.com>

On Wednesday 31 October 2007 18:49, Gregory Murphy wrote:
> There have been quite a few bug fixes, and some new features, since the
> last Mediacloth release. Is it time for another?

Do we have tests for all new features? Does the gem build?

> I think only the project owners can arrange this. I'd be happy to put
> together the release notes.

No problem, please update gemspec and send release notes here and I'll do the release.
From thiago.arrais at gmail.com Wed Oct 31 13:24:55 2007 From: thiago.arrais at gmail.com (Thiago Arrais) Date: Wed, 31 Oct 2007 14:24:55 -0300 Subject: [Mediacloth-devel] Lexer's treatment of newlines In-Reply-To: <4728B1B7.7080303@sun.com> References: <4728B1B7.7080303@sun.com> Message-ID: On 10/31/07, Gregory Murphy wrote: > Making the lexer conform more closely to what Mediawiki does will have > the added benefit of simplifying the lexer code considerably, as there > will be no need of extra logic to look remove empty paragraphs. +1 -- Thiago Arrais From Gregory.Murphy at Sun.COM Wed Oct 31 14:19:07 2007 From: Gregory.Murphy at Sun.COM (Gregory Murphy) Date: Wed, 31 Oct 2007 11:19:07 -0700 Subject: [Mediacloth-devel] Time for another release? In-Reply-To: <200710311918.30981.alexander.dymo@gmail.com> References: <4728B21C.9000504@sun.com> <200710311918.30981.alexander.dymo@gmail.com> Message-ID: <4728C71B.5030501@sun.com> Alexander Dymo wrote: > On Wednesday 31 October 2007 18:49, Gregory Murphy wrote: > >> There have been quite a few bug fixes, and some new features, since the >> last Mediacloth release. Is it time for another? >> > Do we have tests for all new features? Does gem builds? > The tests are up-to-date, and the gem builds and installs correctly. > >> I think only the project owners can arrange this. I'd be happy to put >> together the release notes. >> > No problem, please update gemspec and send release notes here and I'll do the > release. I have incremented the release version in the gemspec to 0.0.3. I'll send you the release notes shortly. Thanks! // Gregory From alexander.dymo at gmail.com Wed Oct 31 15:16:51 2007 From: alexander.dymo at gmail.com (Alexander Dymo) Date: Wed, 31 Oct 2007 21:16:51 +0200 Subject: [Mediacloth-devel] Time for another release? 
In-Reply-To: <4728C71B.5030501@sun.com> References: <4728B21C.9000504@sun.com> <200710311918.30981.alexander.dymo@gmail.com> <4728C71B.5030501@sun.com> Message-ID: <200710312116.51493.alexander.dymo@gmail.com> On Wednesday 31 October 2007 20:19, Gregory Murphy wrote: > I have incremented the release version in the gemspec to 0.0.3. I'll > send you the release notes shortly. Released!