
[#29282] Option to skip unknown markdown tags

Date: 2011-06-16 11:37
Priority: 3
Submitted By: Bernard Potocki (imanel)
Assigned To: Thomas Leitner (gettalong)
Category: Parser
State: Closed
Summary: Option to skip unknown markdown tags

Detailed description
Kramdown currently has no option to strip non-Markdown syntax. Normally this would not be a problem, but some HTML
elements (like DIV) break rendering. If you convert HTML->MD and then MD->HTML, the Markdown parts inside
the HTML are left unparsed. Example:

<div><a href="/obrazek/149124/znalezione,na,murze..html" class="mOUrl"><img
alt="Znalezione na murze."
src="http://i1.kwejk.pl/site_media/obrazki/a2ec35c793ad433133428fbca04e1c97.jpg?1307970484" title="Znalezione
na murze." /></a></div>

This also shows that class parameters are not always parsed. Is there an option to suppress block parameters
(class and id)?


Followup

Message
Date: 2012-06-03 07:01
Sender: Thomas Leitner

So, just implemented a converter that removes HTML tags. Will
be available in the next release.

Date: 2012-06-03 05:40
Sender: Thomas Leitner

If you need full forward (MD->HTML) and backward (HTML->MD)
capability, you will need to use the parse_block_html option.

For example, in kramdown you could just write:

<div>
This should be *a* **paragraph**!
</div>

By default, the inside of the div is not converted. However, if you
use parse_block_html, block HTML tags like <div> _are_ automatically
parsed with kramdown (and their content therefore has to adhere to
kramdown's syntax spec). Inline HTML tags are always parsed.

If you do not want to use the parse_block_html option, you will
need to write some Ruby code for extracting the exact HTML elements
that you want to appear in the output since kramdown cannot know
which of the HTML elements it should drop so that the overall
text retains the correct meaning!

What I can do is: I will add a converter that just removes _ALL_
block level HTML tags. This should probably be sufficient for
your use case but may result in undesired output.
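
Until such a converter is available, the "write some Ruby code" route
could look roughly like the sketch below. It is not kramdown code: the
helper strip_block_tags and the BLOCK_TAGS list are hypothetical, and the
regular expression is a crude stand-in that only unwraps the listed tags
before the text is handed to kramdown. It will misbehave on attribute
values that contain ">".

```ruby
# Hypothetical pre-processing helper, not part of kramdown.
# Assumed list of block-level tags to unwrap; adjust to your input.
BLOCK_TAGS = %w[div section article blockquote].freeze

# Removes the opening and closing tags of the given elements but keeps
# everything between them, including inline HTML such as <a> or <img>.
def strip_block_tags(text, tags = BLOCK_TAGS)
  pattern = %r{</?(?:#{tags.join('|')})\b[^>]*>}i
  text.gsub(pattern, '')
end

source = "<div>\nThis should be *a* **paragraph**!\n</div>\n"
stripped = strip_block_tags(source)
puts stripped
```

The stripped text can then be converted as usual, so the Markdown that
was hidden inside the div gets parsed. As noted above, dropping tags
blindly may change the meaning of the document.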


Regarding the parsing of the class parameters: There was a bug
with parsing multi-line link titles which affected your example.
This is fixed now and will be available in the next release.

Changes:

Field       Old Value         Date              By
close_date  2012-06-03 07:01  2012-06-03 07:01  gettalong
status_id   Open              2012-06-03 07:01  gettalong