[kramdown-users] using kramdown in xhtml

Matt Neuburg matt at tidbits.com
Wed Jun 9 15:46:29 EDT 2010

On Jun 9, 2010, at 12:07 PM, john muhl wrote:

> i don't think the readability argument is very strong in this case. of
> course if you're composing html by hand then named entities are
> preferable but when it's output by an application from markdown input
> then it doesn't matter to me what comes out the end.
> p.s. markdown/smartypants.pl outputs numeric entities; probably for
> the same reason.

The main reason numeric entities are preferable is that they are easier to generate and they exist for every character. You just put &#xxx; where xxx is the decimal code point of the character. What is the named entity for ☐ (a "ballot box")? There isn't one.
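To illustrate the point about generation, here is a minimal Python sketch (not kramdown's or SmartyPants's actual code). The helper name numeric_entity is hypothetical. It assumes the decimal form of the reference and uses the HTML 4 entity table from Python's standard library as a stand-in for the named entities XHTML defines.

```python
from html.entities import codepoint2name  # HTML 4 named-entity table

def numeric_entity(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"

ballot_box = "\u2610"  # the "ballot box" character

# A numeric reference can be built for any character.
print(numeric_entity(ballot_box))            # &#9744;

# The HTML 4 table has no name for it, although it does for an em-dash.
print(codepoint2name.get(ord(ballot_box)))   # None
print(codepoint2name.get(0x2014))            # mdash
```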

I think what I'd really like is the option for actual *characters* to be output. Only two characters must be entityized: ampersand and less-than. That's all. I'm making UTF-8 encoded Web pages and I'm starting with UTF-8 encoded Markdown / kramdown, so all the characters I'm using are legal as they stand. I don't need them transformed at all. I type my own em-dashes, ellipses, Ancient Greek, etc. The only things I'm not typing myself are the curly quotes and curly apostrophe. I think SmartyPants produces entities for its curly quotes etc. only because it doesn't know what encoding I'll be using. But I do know! :) So I'd be happiest if kramdown's SmartyPants function allowed me to specify that I just want characters for any transformations it produces. Then I'd be able to read my own XHTML.
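A rough Python sketch of that "characters, not entities" behavior, under the assumption that the output is UTF-8 and that the text is element content rather than an attribute value. The function name escape_minimal is hypothetical and this is not an existing kramdown option.

```python
def escape_minimal(text: str) -> str:
    """Escape only ampersand and less-than; leave every other character alone."""
    # The ampersand must go first, or the "&" in "&lt;" would be escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;")

# Curly quotes and the ellipsis pass through as real characters.
print(escape_minimal("“Tom & Jerry” <3 …"))  # “Tom &amp; Jerry” &lt;3 …
```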

On the other hand, if I were producing output for a non-UTF-8 milieu, then everything that isn't ASCII would have to be entityized. In that case, only numeric entities make sense, because there are no named entities for most of the characters I use.

m.
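For the non-UTF-8 case, one possible sketch in Python relies on the standard "xmlcharrefreplace" error handler, which turns every character the target encoding cannot represent into a decimal numeric reference. The function name to_ascii_xhtml is hypothetical, and ASCII is assumed as the target encoding.

```python
def to_ascii_xhtml(text: str) -> str:
    """Escape & and <, then turn every non-ASCII character into &#NNN;."""
    # Escape markup characters first so the references added below stay intact.
    escaped = text.replace("&", "&amp;").replace("<", "&lt;")
    # The codec error handler replaces each unencodable character.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_ascii_xhtml("café & ☐"))  # caf&#233; &amp; &#9744;
```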
