[kramdown-users] kramdown table-making has gone completely insane

Shawn Van Ittersum svicalifornia at gmail.com
Wed Oct 13 16:15:58 EDT 2010


What about the possibility that automatic line-wrapping (for example, by mail agents) breaks a code span across multiple lines?  Should it not still be considered a code span?  To fully support lazy indentation, code spans should be allowed to span multiple lines.

And if so, then this example:

>     This is *not* a code span, | it `continues on the
>     second line and ends` here | some other text here

is really like this:

>     This is *not* a code span, | it `continues on the second line and ends` here | some other text here

which yields a table with one row and three cells, the middle one containing a code span.

This is really simple: run a regexp search over the whole document for backtick pairs, and replace each pair (backticks and contents) with a hash, GFM-style.  Then apply your other kramdown parsers, line by line.  Then expand the hashes back into code spans at the end.
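
A rough sketch of that idea, in Python rather than kramdown's Ruby, and not kramdown's actual code.  The names protect_code_spans, restore_code_spans and table_rows are made up for illustration, MD5 is just one possible choice of hash, and the regexp assumes a code span is delimited by single unescaped backticks (runs of several backticks are not handled):

```python
import hashlib
import re

# Assumption: a code span is the text between two single, unescaped
# backticks, and it may contain newlines (hence re.DOTALL).
CODE_SPAN = re.compile(r"(?<!\\)`(.+?)(?<!\\)`", re.DOTALL)


def protect_code_spans(text):
    """Swap every code span for a hex hash; return new text and a lookup."""
    stash = {}

    def _stash(match):
        key = hashlib.md5(match.group(0).encode("utf-8")).hexdigest()
        stash[key] = match.group(0)
        return key

    return CODE_SPAN.sub(_stash, text), stash


def restore_code_spans(text, stash):
    """Expand the hashes back into the original code spans."""
    for key, span in stash.items():
        text = text.replace(key, span)
    return text


def table_rows(text):
    """Naive line-by-line table split, done while code spans are hidden."""
    protected, stash = protect_code_spans(text)
    rows = []
    for line in protected.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        rows.append([restore_code_spans(cell, stash) for cell in cells])
    return rows
```

Because the newline inside the code span disappears into the hash, the two-line example above reaches the table splitter as a single line, and table_rows returns one row of three cells for it.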

If someone puts a couple of unescaped backticks in the document, then there's going to be a huge code span.  But that's what is supposed to happen.  Backticks wrap code spans, pure and simple.  If users don't want code spans, they should escape their backticks.
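
To illustrate the difference between stray and escaped backticks, here is a small Python sketch.  It uses the same simplified rule (single unescaped backticks, newlines allowed inside a span); the helper name code_spans and the sample strings are made up for this example:

```python
import re

# Assumption: a code span is whatever lies between two unescaped backticks,
# newlines included.  A backtick preceded by a backslash is skipped.
CODE_SPAN = re.compile(r"(?<!\\)`(.+?)(?<!\\)`", re.DOTALL)


def code_spans(text):
    """Return the contents of every code span found in text."""
    return [match.group(1) for match in CODE_SPAN.finditer(text)]


# Two stray, unescaped backticks three lines apart.
stray = "a ` here | x\nrow two | y\nand ` there"

# The same text with both backticks escaped.
escaped = "a \\` here | x\nrow two | y\nand \\` there"
```

With these inputs, code_spans(stray) finds a single span that swallows everything between the two backticks, pipes and line breaks included, while code_spans(escaped) finds no span at all.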

Shawn

On Wed, 13 Oct 2010 16:00:29 +0200, Thomas Leitner wrote:
> On 2010-10-13 19:18 +0700 Shawn Van Ittersum wrote:
>>> However, think about the following example:
>>> 
>>>     This is *not* a `code | span, it continues on the
>>>     second line and ends` here | some other text here
>> 
>> Why not interpret backticks across multiple lines as multi-line code
>> spans?
> 
> This is currently done *if* the above is parsed as paragraph and not as
> table.
> 
>>> If I parse for code spans line by line, the code span that
>>> continues onto the second line is not found, and therefore this is
>>> interpreted as a table with two rows of two cells each. However, a
>>> human reader would probably see the code span...
>> 
>> Right, which is why kramdown should interpret it as a code span, the
>> same way a human reader would.  Principle of least surprise.
> 
> I know, I just wanted to let you know that the GFM approach with line by
> line parsing does not work. Another example:
> 
>     This is *not* a code span, | it `continues on the
>     second line and ends` here | some other text here
> 
> How would you interpret this? It looks like a table but the code span
> seems to be continued on the second line... So I think the only way to
> resolve this would be to post-process paragraphs which also may not
> work correctly (I'm thinking of definition lists...). Regarding the
> above example, I would treat it as a table with two rows and two
> columns.
> 
> So what about this:
> 
> * A line that contains a pipe potentially starts a table if found on a
>   block boundary.
> * Read the line with the table parser and continue with the next lines
>   until (a) a block boundary is found or (b) a line without a pipe is
>   found.
> * If we are in case (b), then this is definitely not a table and we go
>   back to the first line and run the rest of the block parsers on the
>   line.
> * If we are in case (a), the whole text (i.e. all parsed lines
>   together) is parsed with the code span parser. Then we have again
>   multiple possibilities:
> 
>   - (a1) No code span contains a pipe
>   - (a2) A code span that is on one line contains a pipe and there is
>     another pipe on the same line
>   - (a3) A code span that is on one line contains a pipe and there is no
>     other pipe on the same line
>   - (a4) A code span spanning two or more lines contains a pipe and
>     there is another pipe on the line where the pipe is
>   - (a5) A code span spanning two or more lines contains a pipe and
>     there is no other pipe on the line where the pipe is
> 
>   In case (a1) everything is fine and we parse the lines as table lines
>   even if a code span spans multiple lines.
>   In case (a2) we have to account for the pipe character in the code
>   span and not split the table line there but aside from this we can
>   parse the lines as table lines.
>   In cases (a3) and (a5) we have a line without a pipe (except the one
>   inside the code span) and therefore we go back to the first line and
>   run the rest of the block parsers on this first line.
>   Case (a4) is difficult, here is an example:
> 
>       bla bla | bla bla `bla bla | bla bla
>       bla bla bla bla` bla | bla bla bla
> 
>   Since this is similar to case (a2) I would treat this as a table.
>   However, since the code span spans multiple lines, treating each line
>   as a table line will destroy the code span and the second pipe on the
>   first line will become another table cell separator since it is not
>   enclosed in a code span. I don't think that this is a major problem,
>   though.
> 
> -- Thomas
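
The case analysis quoted above can be sketched roughly as follows.  This is an illustration in Python, not kramdown code, and it rests on several assumptions: the block boundaries have already been found, so the input is the list of lines of one candidate block; a code span is delimited by single unescaped backticks; and "another pipe on the same line" is read as "a pipe on that line that lies outside every code span".  The names classify_block and is_table are hypothetical:

```python
import re

# Assumption: single unescaped backticks delimit a code span, which may
# continue across line breaks.
CODE_SPAN = re.compile(r"(?<!\\)`(.+?)(?<!\\)`", re.DOTALL)


def classify_block(lines):
    """Return the set of case labels ('b', 'a1'..'a5') that apply."""
    # Case (b): some line has no pipe at all, so this is not a table.
    if any("|" not in line for line in lines):
        return {"b"}

    # Case (a): parse the whole text for code spans and remember, for each
    # character offset inside a span, whether that span is multi-line.
    text = "\n".join(lines)
    inside = {}
    for match in CODE_SPAN.finditer(text):
        multiline = "\n" in match.group(0)
        for pos in range(match.start(), match.end()):
            inside[pos] = multiline

    cases = set()
    offset = 0  # offset of the current line within text
    for line in lines:
        pipes = [offset + i for i, ch in enumerate(line) if ch == "|"]
        outside = [p for p in pipes if p not in inside]
        for p in pipes:
            if p not in inside:
                continue
            if inside[p]:
                cases.add("a4" if outside else "a5")
            else:
                cases.add("a2" if outside else "a3")
        offset += len(line) + 1  # +1 for the joining newline

    # No pipe sits inside any code span.
    return cases or {"a1"}


def is_table(lines):
    """Table for (a1), (a2), (a4); fall back to other parsers otherwise."""
    return classify_block(lines) <= {"a1", "a2", "a4"}
```

For the (a4) example in the quoted message, classify_block returns {"a4"} and is_table is true.  For the very first example in the thread, where the only pipe on the first line sits inside the multi-line code span, the result is {"a5"} and the block is handed back to the other block parsers.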

