[Ironruby-core] Encoding problem

Albert-Jan Pieter Nijburg albertjan at curit.com
Thu Jan 13 09:59:50 EST 2011


I just found this:

 

#<Encoding:UTF-8>

puts "patiënt"

 

which outputs:  pati´┐¢nt 

 

It doesn’t crash anymore :)
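One possible reading of the garbled output above, offered as a guess rather than a diagnosis: if the "ë" was replaced by the Unicode replacement character U+FFFD and then written as UTF-8 to a console using code page 850, the three bytes would show up as exactly those three symbols. The lookup table below is a hand-written excerpt for illustration only, and the assumption that the console was on code page 850 is not confirmed anywhere in this thread.

```ruby
# -*- coding: utf-8 -*-
# Hypothetical illustration: a hand-written excerpt of code page 850,
# containing only the three entries needed for this example.
cp850_excerpt = {
  0xEF => "\u00B4", # acute accent
  0xBF => "\u2510", # box-drawing corner
  0xBD => "\u00A2"  # cent sign
}

# UTF-8 bytes of the replacement character U+FFFD: EF BF BD
replacement_bytes = "\uFFFD".bytes.to_a

# Read each byte as if it were a code page 850 character.
misread = replacement_bytes.map { |b| cp850_excerpt.fetch(b) }.join

puts "pati#{misread}nt"
```

If that reading is right, the string is already damaged before it reaches the console, so changing the console code page alone would not bring the "ë" back.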

From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at rubyforge.org] On behalf of Dezso Zoltan
Sent: Thursday, January 13, 2011 14:52
To: ironruby-core at rubyforge.org
Subject: Re: [Ironruby-core] Encoding problem

 

Hi,

 

I don't really know the solution to your question, but this might help:

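ë is Unicode code point U+00EB, which is encoded as the two bytes 0xC3 0xAB in UTF-8. Your error message mentions U+00EB rather than those bytes, so we are dealing with a Unicode code point rather than UTF-8 bytes, which I assume is because IronRuby uses the immutable .NET strings internally (these are UTF-16, which .NET calls the "Unicode" encoding).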
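The code point and byte values mentioned above can be checked with a few lines of Ruby. This is a small sketch for CRuby 1.9 or later; the variable names are made up for the example, and it has not been tried on IronRuby.

```ruby
# -*- coding: utf-8 -*-
# Written as an escape so the example does not depend on the file's encoding.
e_umlaut = "\u00EB"

# The Unicode code point of the character.
codepoint = e_umlaut.unpack("U*").first

# The bytes used to store it in UTF-8, formatted as hex.
utf8_hex = e_umlaut.bytes.to_a.map { |b| "%02X" % b }.join(" ")

puts "U+%04X" % codepoint   # prints "U+00EB"
puts utf8_hex               # prints "C3 AB"
```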

 

The errors are expected if your default encoding is US-ASCII, because US-ASCII does not contain ë. It is a 7-bit, single-byte encoding, so 0x00EB would be broken into two bytes and your script would choke on the second one, 0xEB. You will need to set your encoding to something compatible, like UTF-8.
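Both failure modes can be reproduced in CRuby 1.9 or later with the sketch below. Note that CRuby raises Encoding::UndefinedConversionError for the US-ASCII case, so the class name differs from the IronRuby message quoted further down. The last line assumes the stray 0xEB byte comes from Latin-1 (ISO-8859-1) data; that is only a guess about the source of the byte, not something established in this thread.

```ruby
# -*- coding: utf-8 -*-
word = "pati\u00EBnt"

# 1. US-ASCII has no way to represent U+00EB, so transcoding fails.
ascii_error = begin
  word.encode("US-ASCII")
  nil
rescue Encoding::UndefinedConversionError => e
  e.class
end

# 2. A lone 0xEB byte is not a complete UTF-8 sequence.
lone_byte = "\xEB".dup.force_encoding("UTF-8")
lone_valid = lone_byte.valid_encoding?

# 3. Assumption: the bytes are really Latin-1. Relabel them, then transcode.
latin1_fixed = "pati\xEBnt".dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts ascii_error.inspect
puts lone_valid
puts latin1_fixed == word
```

If the third check holds for the real data, the fix is to tell Ruby the true encoding of the incoming bytes instead of letting them be treated as UTF-8.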

 

I don't quite know how to do that properly in IronRuby, but in CRuby 1.9 you could use a "magic comment" at the top of your Ruby file, and in 1.8 something like $KCODE = 'u' could work. You might also be able to drop back into .NET and set the encoding there, but I'm not sure how that affects IronRuby assemblies.
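For reference, the two CRuby options look roughly like this. It is a sketch only: the magic comment has to be the first line of the file (or the second, after a shebang line), the $KCODE assignment is guarded because it only has an effect on 1.8, and whether IronRuby honours either one is not something this example verifies.

```ruby
# -*- coding: utf-8 -*-
# The line above is the "magic comment": it declares the encoding of this
# source file, so string literals containing "ë" are read as UTF-8.

# Ruby 1.8 only: $KCODE controls how multibyte characters are handled.
$KCODE = 'u' if RUBY_VERSION < "1.9"

word = "patiënt"
source_encoding = __ENCODING__   # the encoding Ruby assumed for this file

puts source_encoding
puts word
```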

 

I would start with $KCODE = 'u'. Let me know how that works for you.

 

Zaki

 

On Thu, Jan 13, 2011 at 6:33 PM, Albert-Jan Pieter Nijburg <albertjan at curit.com> wrote:

Hi Guys,

 

My boss thought it would be cool to use “ë” in an SQL table name; many of you will want to shoot her now :)

 

But I did find something weird: I can’t even print “ë”.

 

It says:

 

tabaco.rb:16:in `puts': character U+00EB can't be encoded in US-ASCII (Encoding::InvalidByteSequenceError)

        from tabaco.rb:16

 

or

 

when I print the string somewhere else, after it comes back from a method :S

 

System::Text::DecoderFallbackException at /patient/0

Unable to translate bytes [EB] at index 3 from specified code page to Unicode.

 

Or, when I don’t mess with it:

 

Encoding::InvalidByteSequenceError at /patient/0

invalid byte sequence EB on UTF-8

 

 

It is all the same problem, coming from three places.

 

Is this a fundamental issue or should this be solvable?

 

If you could point me in the right direction, I could try to fix it.

 

 

Thanks,

 

Albert-Jan

 

