[kramdown-users] [ANN] kramdown 0.9.0 released
Eric Sunshine
sunshine at sunshineco.com
Wed Jun 23 05:55:19 EDT 2010
Hi Thomas,
On 6/23/2010 2:51 AM, Thomas Leitner wrote:
> The biggest change in this release is the addition of a kramdown
> converter. This converter together with the HTML parser enables one to
> convert an HTML document into a kramdown document.
Very nicely done. I ran some tests on this feature and the results were
very favorable, though there were a few issues worth reporting.
First, what is the intended behavior when feeding kramdown a
fully-structured HTML document containing <html>, <head>, <body>? In my
tests, upon converting to kramdown, a 'markdown="1"' attribute was added
to <body>; however, this was ignored when converting the document back
to HTML. Even when adding 'markdown="1"' manually to the parent <html>
node, conversion back to HTML failed (that is, no Markdown processing
was performed at all inside the body).
Second, once I stripped the <html> and <body> boilerplate from the
document, conversion on Windows from HTML to kramdown succeeded, but
conversion back to HTML failed with this exception:
C:\>kramdown test.kd > test.html
c:/ruby/lib/ruby/gems/1.9.1/gems/kramdown-0.9.0/lib/kramdown/parser/kramdown.rb:206:in
`check': incompatible encoding regexp match (UTF-8 regexp with IBM437
string) (Encoding::CompatibilityError)
Invoking 'set LANG=en_US.UTF-8' at the Windows command prompt resolved
this issue. Note that the original HTML contained &#xHH; entity
references for copyright, ellipses, etc.
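For anyone wanting to reproduce this class of error outside kramdown, here is a minimal sketch in plain Ruby. It assumes the input arrived tagged as IBM437 (the default console code page on many Windows setups) and was matched against a regexp containing non-ASCII UTF-8 characters. The sample text, the pattern, and the helper name try_match are all made up for illustration; this is not kramdown's actual code path.

```ruby
# Sketch only: reproduces the class of error, not kramdown's actual code path.

# A regexp built from a non-ASCII UTF-8 string has a fixed UTF-8 encoding.
UTF8_PATTERN = Regexp.new("\u00E9")

# Bytes as they might arrive from a console using code page 437,
# where byte 0x82 is "e with acute accent". The sample text is hypothetical.
CP437_TEXT = "caf\x82".b.force_encoding(Encoding::IBM437)

# Hypothetical helper: returns true/false for a match, or the exception
# object if the regexp and string encodings are incompatible.
def try_match(pattern, text)
  !(pattern =~ text).nil?
rescue Encoding::CompatibilityError => e
  e
end

# Matching the IBM437 string directly triggers the encoding clash.
puts try_match(UTF8_PATTERN, CP437_TEXT).class

# Transcoding the input to UTF-8 first lets the same regexp match.
puts try_match(UTF8_PATTERN, CP437_TEXT.encode(Encoding::UTF_8))
```

Setting LANG presumably has a similar effect by changing the encoding Ruby assumes for external input, so the string is treated as UTF-8 from the start.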
Third, this is an old HTML document still using <b>bold</b> elements
rather than <strong>...</strong>. The <b>bold</b> elements were not
converted to **bold** Markdown. I think it should be safe to treat <b>
as equivalent to <strong> for conversion purposes.
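As a stopgap, a pre-processing pass over the HTML could rewrite the tags before handing the document to kramdown. The sketch below is a rough regexp-based workaround, not a real HTML parser and not part of kramdown; the helper name bold_to_strong is made up for illustration.

```ruby
# Hypothetical helper, not part of kramdown: rewrites <b> tags to <strong>
# so that a later HTML-to-kramdown conversion sees <strong> elements.
def bold_to_strong(html)
  # Matches <b>, </b>, and <b ...attributes>, but not <body> or <br>,
  # because the tag name must be followed by whitespace or '>'.
  html.gsub(%r{<(/?)b(\s[^>]*)?>}i) { "<#{$1}strong#{$2}>" }
end

puts bold_to_strong('<p>Some <b>bold</b> text</p>')
```

This keeps any attributes on the tag, but it will also rewrite text that merely looks like a tag inside comments or scripts, so it is only suitable for simple documents.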
Fourth, I ran into a problem with a stand-alone <img/> element being
consumed by a subsequent paragraph. For instance, given HTML input:
<img src="foo.png" alt="foo" />
<p>bar</p>
Conversion to kramdown produced:
![foo](foo.png)
bar
And conversion back to HTML resulted in the <img/> becoming a child of
the <p>:
<p><img alt="foo" src="foo.png" />
bar</p>
-- ES