[kramdown-users] possible bug: kramdown wrapping <br> in <p>

Thomas Leitner t_leitner at gmx.at
Thu Aug 12 06:14:04 EDT 2010


On 2010-08-11 23:15 +0700 Shawn Van Ittersum wrote:
> I disagree that contiguous is irrelevant.  I also see what kramdown
> is doing, and it's something that developers can figure out.
> However, when you or I use kramdown in some other product, its end
> users are non-developers, who are not going to figure this out.
> Non-developers who are going to copy in a block of HTML from
> somewhere, expecting it to be untouched as it would be in Markdown,
> and being very frustrated to see kramdown messing with it.
>
> Contiguous blocks of HTML should be left alone.  kramdown results
> should not diverge from Markdown results except to add clear benefit,
> and I fail to see the benefit of this particular divergence.

Gruber's Markdown implementation actually does the same in this case,
as does PHP Markdown.
 
> With regard to your example input of "test": the rule we discussed
> before was that the first line and any line that followed a blank
> line would be treated as the start of a new block.  Following this
> rule, it is fitting to wrap "test" in p tags.  It is not appropriate
> to apply p tags in the middle of a block of HTML, regardless of the
> opening and closing of DOM elements within that HTML block.  kramdown
> should respect the rules of Markdown first.

No, this is not correct: a blank line does not always start a new
block. For example, you can have a blank line between two parts of one
code block, like this:

    This is the first part of the code block.

    This is still the same code block.

Markdown is more or less a line-oriented markup language: each line is
processed on its own and assigned a certain meaning. If it starts with
a hash, it becomes a header; if it starts with four spaces or one tab,
it is part of a code block. This is also how kramdown does the parsing:
one line after another.
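
To make the idea concrete, here is a rough sketch in Python. It is not
kramdown's actual code (kramdown is written in Ruby and handles far
more cases); the names classify_line and group_blocks are made up for
this illustration, and only headers, code blocks and paragraphs are
covered:

```python
def classify_line(line):
    """Assign a block type to one line, looking only at that line."""
    if line.strip() == "":
        return "blank"
    if line.startswith("#"):
        return "header"
    if line.startswith("    ") or line.startswith("\t"):
        return "code"
    return "paragraph"


def group_blocks(text):
    """Walk the input line by line and merge adjacent lines into blocks.

    Blank lines between two code lines are kept inside the code block,
    so one code block can span blank lines. A blank line between two
    paragraph lines does start a new paragraph.
    """
    blocks = []        # list of (type, [lines])
    pending_blank = 0  # blank lines seen since the last non-blank line
    for line in text.split("\n"):
        kind = classify_line(line)
        if kind == "blank":
            pending_blank += 1
            continue
        last = blocks[-1] if blocks else None
        if kind == "code" and last and last[0] == "code":
            # Same code block: keep the blank lines in between.
            last[1].extend([""] * pending_blank)
            last[1].append(line)
        elif (kind == "paragraph" and last and last[0] == "paragraph"
              and pending_blank == 0):
            last[1].append(line)
        else:
            blocks.append((kind, [line]))
        pending_blank = 0
    return blocks
```

Feeding it the two-part code block from above yields a single "code"
entry that still contains the blank line, because the decision is made
per line and not per blank-line-separated chunk.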

There are some syntax elements that span multiple lines, like setext
headers, but even those never stop in the middle of a line; they always
consume whole lines.

The only block-level parser that behaves differently in this regard
is... the HTML parser, but there are good reasons for that!

> I still don't understand why kramdown has an HTML parser... HTML tags
> should simply pass through untouched.

HTML parsing is hard, although it seems easy. This is the reason why
there have been several completely different HTML parsers in kramdown.
The first incarnation did something very simple and wasn't very
powerful (i.e. no "markdown" attribute, no markdown-in-HTML parsing,
and many problematic edge cases).

The current parser is needed to ensure that everything works as well as
possible:

* There are clearly defined rules for how HTML is parsed in kramdown.
  If you follow the rules, you know what you will get as output. This
  is documented at http://kramdown.rubyforge.org/syntax.html#html-blocks
  If you find a difference from what kramdown actually does, please
  report it!

* kramdown-in-HTML is supported as well as the "markdown" attribute.

* Edge cases are reduced to a minimum.

* Adding HTML to a kramdown document is easier for end users because the
  kramdown HTML parser is more intelligent and tries to do the correct
  thing.

* I did not develop the rules for the HTML parser alone but was helped
  and often corrected by people on this ML.

-- Thomas

