[kramdown-users] [ANN] kramdown 0.9.0 released
t_leitner at gmx.at
Sat Jul 17 05:19:41 EDT 2010
On 2010-06-25 09:26 -0400 Eric Sunshine wrote:
> Hi Thomas,
> On 6/25/2010 2:19 AM, Thomas Leitner wrote:
> >> First, what is the intended behavior when feeding kramdown a
> >> fully-structured HTML document containing <html>, <head>, <body>?
> > It should output it in a hybrid format, i.e. converting everything
> > possible to kramdown and leaving the rest as HTML. I just ran a
> > sample HTML document through html-to-kramdown-to-html and it worked
> > fine for all things except the DOCTYPE - I have put this on my TODO
> > list.
> I'm not sure that I understand. When I feed it the HTML input:
> Body <strong>text</strong>.
> The emitted kramdown is:
> <body markdown="1"># Header
> Body **text**.
> But in the conversion back to HTML, kramdown entities, such as "#
> Header" and "**text**" are not converted to HTML equivalents. In
> fact, the output of kramdown -> HTML is identical to the input (minus
> the markdown="1" attribute):
> <body># Header
> Body **text**.
The problem was that the <body> tag was not in the list of tags that may
contain block-level elements. I have fixed this, and the use case above
now works.
> >> C:\>kramdown test.kd > test.html
> >> c:/ruby/lib/ruby/gems/1.9.1/gems/kramdown-0.9.0/lib/kramdown/parser/kramdown.rb:206:in
> >> `check': incompatible encoding regexp match (UTF-8 regexp with
> >> IBM437 string) (Encoding::CompatibilityError)
> > Hmm... I have to look at this, and probably generate some test cases
> > for checking encodings under Ruby 1.9. Could you send me the test.kd
> > document so that I can dig into it and find the offending regexp?
> I narrowed it down to this fragment:
> The equivalent <p>François</p> is converted to kramdown and
> back to HTML without problem.
I have tried to reproduce the problem but wasn't successful. I have
used the following test program (named `tt.rb`):
text = "François"
p [text.encoding, Encoding.default_internal, Encoding.default_external]
$ ruby tt.rb
[#<Encoding:IBM437>, nil, #<Encoding:UTF-8>]
where the question mark character is ccedil in the IBM437 encoding.
Could you provide step-by-step instructions on how to reproduce the
problem?
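For reference, here is a minimal sketch of how this class of error can arise under Ruby 1.9 and later. It is not taken from kramdown's parser. It only assumes that a regexp with a fixed UTF-8 encoding gets matched against a string tagged as IBM437 that contains a non-ASCII byte. The names `ibm_text`, `pattern` and `try_match` are made up for this illustration.

```ruby
# IBM437 bytes for "François"; 0x87 is c-cedilla in that code page.
ibm_text = "Fran\x87ois".b.force_encoding(Encoding::IBM437)

# The /u flag fixes the regexp's encoding to UTF-8.
pattern = /Fran.ois/u

# Hypothetical helper: returns the match position, or the exception
# object if the encodings are incompatible.
def try_match(pattern, string)
  pattern =~ string
rescue Encoding::CompatibilityError => e
  e
end

# Matching the IBM437 string directly raises the compatibility error.
ibm_result = try_match(pattern, ibm_text)

# Transcoding to UTF-8 first lets the same regexp match.
utf8_result = try_match(pattern, ibm_text.encode(Encoding::UTF_8))

p ibm_result.class
p utf8_result
```

If the input really does arrive tagged as IBM437 (for example from a Windows console), transcoding it to UTF-8 before parsing would be one possible workaround, though I can't say yet whether that is the right fix for kramdown itself.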
> >> Third, this is an old HTML document still using <b>bold</b>
> >> elements rather than <strong>...</strong>. The <b>bold</b> elements
> >> were not converted to **bold** Markdown. I think it should be safe
> >> to treat <b> as equivalent to <strong> for conversion purposes.
> > Yeah, I thought about this... but decided against it, can't remember
> > why. But it should probably be okay converting <b> and <i>
> > to <strong> and <em>.
> If the intention is for perfect fidelity in the HTML -> kramdown ->
> HTML chain, then I can understand not touching <b> and <i> since you
> could not reproduce them in the final HTML. Perhaps an option in the
> HTML parser could control whether <b> and <i> are folded to <strong>
> and <em>.
Haven't decided on this one yet.
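To make the suggested option concrete, here is a rough sketch of what such a switch could look like. This is purely hypothetical: `fold_presentational_tags` and its `fold:` parameter are invented for illustration and are not part of kramdown's HTML parser, which works on a parsed element tree rather than on raw text.

```ruby
# Hypothetical mapping from presentational to semantic tags.
FOLD_MAP = { "b" => "strong", "i" => "em" }.freeze

# Hypothetical helper, not kramdown API: rewrites <b>/<i> tags to
# <strong>/<em> when fold is true, and returns the input untouched otherwise.
def fold_presentational_tags(html, fold: true)
  return html unless fold

  # Match opening or closing <b>/<i> tags, keeping any attributes.
  # Tags such as <body>, <br> or <img> do not match.
  html.gsub(%r{<(/?)(b|i)(\s[^>]*)?>}i) do
    "<#{$1}#{FOLD_MAP[$2.downcase]}#{$3}>"
  end
end

puts fold_presentational_tags("<b>bold</b> and <i>italic</i>")
puts fold_presentational_tags("<b>bold</b>", fold: false)
```

With folding turned off, the HTML -> kramdown -> HTML round trip keeps <b> and <i> as they were; with it turned on, they become eligible for conversion to ** and * markup.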