If a line starts with a non-ASCII Unicode character, the Maruku markup that follows it is not parsed. I ran into this
while using Maruku in a Rails environment, because Rails sets a Unicode locale, but it should be a general bug that hits
anyone with a Unicode locale. Here is an example:
---
Тестовый стринг <mail@mail.com>
Test string <mail@mail.com>
---
It is parsed to:
---
<p>Тестовый стринг <mail@mail.com></p>
<p>Test string <a href='mailto:mail@mail.com'>mail@mail.com</a></p>
---
The problem lies in the next_matches method of CharSourceManual (lib/maruku/input/charsource.rb). It uses the regexp
/.{#{@buffer_index}}#{r}/m to match the given regexp against the remaining part of the input string, but for a Unicode
string @buffer_index can't be used as the repetition counter. Instead, we can take the remaining part of the input string
and match it directly against the original regexp from the method's arguments. See the attached patch.
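
To illustrate the difference between the two approaches, here is a minimal sketch. It is not the actual Maruku source and not the attached patch: the method names matches_with_counter? and matches_with_slice? are hypothetical, the offset is assumed to be a byte count (my guess at why @buffer_index goes wrong), and anchoring the match with \A is also an assumption.

```ruby
# Illustrative sketch only; hypothetical helpers, not Maruku's code.

# Counter approach: skip `offset` characters with a repetition counter,
# then try the pattern. If `offset` is a byte count, a multibyte prefix
# makes the counter skip too far.
def matches_with_counter?(buffer, offset, pattern)
  !!(/\A.{#{offset}}#{pattern}/m =~ buffer)
end

# Slice approach: cut the buffer at the byte offset and match the
# original pattern against the remainder.
def matches_with_slice?(buffer, offset, pattern)
  rest = buffer.byteslice(offset..-1).to_s
  !!(/\A#{pattern}/ =~ rest)
end

pattern = /<[^>]+>/

ascii    = "Test <mail@mail.com>"
# "\u0422\u0435\u0441\u0442" is the Cyrillic word "Тест" (4 characters, 8 bytes).
cyrillic = "\u0422\u0435\u0441\u0442 <mail@mail.com>"

# Byte offsets of the "<" in each string (assumed to be what the index holds).
ascii_offset    = ascii[0, ascii.index("<")].bytesize       # 5
cyrillic_offset = cyrillic[0, cyrillic.index("<")].bytesize # 9

ascii_counter    = matches_with_counter?(ascii, ascii_offset, pattern)       # true
cyrillic_counter = matches_with_counter?(cyrillic, cyrillic_offset, pattern) # false
ascii_slice      = matches_with_slice?(ascii, ascii_offset, pattern)         # true
cyrillic_slice   = matches_with_slice?(cyrillic, cyrillic_offset, pattern)   # true
```

With the ASCII line both approaches agree. With the Cyrillic line the counter approach skips nine characters instead of nine bytes and lands past the "<", so the address is never recognized, which matches the behaviour in the example above.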