[Ironruby-core] Encoding problem

Albert-Jan Pieter Nijburg albertjan at curit.com
Thu Jan 13 10:48:36 EST 2011


Hey,

 

I found out that if I put nothing at the top and I do this:

 

puts "\x89"

 

it puts "ë"

 

If I put the #Encoding: UTF-8 at the top, this happens :)

 

ëmscorlib:0:in `Throw': Unable to translate bytes [89] at index -1 from specified code page to Unicode. (System::Text::DecoderFallbackException)
        from mscorlib:0:in `Fallback'
        from mscorlib:0:in `InternalFallback'
        from mscorlib:0:in `GetCharCount'
        from mscorlib:0:in `GetCharCount'
        from mscorlib:0:in `GetChars'
        from tabaco.rb:2:in `puts'
        from tabaco.rb:2

 

It does print it but then it dies.

 

#<Encoding: UTF-8>

puts "\x89".force_encoding("UTF-8") does the same

 

#<Encoding: UTF-8>

puts "ë".force_encoding("UTF-8") does the same as before.

Also without the #<Encoding thing>
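
A possible explanation for the first result, assuming the console uses an OEM code page such as 437 or 850 (the thread does not say which one is active): in those code pages the single byte 0x89 is "ë", but on its own it is not a valid UTF-8 sequence. A minimal sketch in plain Ruby (CRuby 2.0 or later, not tested on IronRuby); the helper names are hypothetical:

```ruby
# Hypothetical helpers for illustration; they are not part of IronRuby or this thread.

# Relabel the raw bytes as UTF-8 (no conversion) and report whether they are valid.
def valid_as_utf8?(raw)
  raw.dup.force_encoding("UTF-8").valid_encoding?
end

# Interpret the raw bytes in the given code page, then transcode them to UTF-8.
def decode_from(raw, code_page)
  raw.dup.force_encoding(code_page).encode("UTF-8")
end

raw = "\x89".b                                # one byte, tagged ASCII-8BIT (binary)
puts valid_as_utf8?(raw)                      # is a lone 0x89 valid UTF-8?
puts decode_from(raw, "IBM437") == "\u00EB"   # is 0x89 "ë" in code page 437?
puts decode_from(raw, "IBM850") == "\u00EB"   # is 0x89 "ë" in code page 850?
```

If that assumption holds, the "ë" seen on screen comes from the console interpreting the byte, not from Ruby treating it as a Unicode character.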

 

So I thought I had it with the puts "\x89" and I tried this:

 

class PatGeg < ActiveRecord::Base

      set_table_name "Pati\x89ntGegevens"

end

 

PatGeg.first.Achternaam

 

and here's what I got

 

c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/connection_adapters/abstract_adapter.rb:200:in `log': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-sqlserver-adapter-3.0.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:217:in `raw_select'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-sqlserver-adapter-3.0.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:178:in `select'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/connection_adapters/abstract/query_cache.rb:56:in `select_all'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/base.rb:467:in `find_by_sql'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation.rb:64:in `to_a'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/finder_methods.rb:333:in `find_first'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/finder_methods.rb:122:in `first'
        from c:6:in `__send__'
        from c:6:in `first'
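
For reference, this kind of CompatibilityError is what Ruby 1.9-style strings raise when two strings with different encodings are combined and both contain non-ASCII bytes. A minimal sketch in plain Ruby (CRuby, not tested on IronRuby), assuming without confirmation from the trace that one side is UTF-8 text and the other is the table name tagged as binary; `join_error` is a hypothetical helper:

```ruby
# Hypothetical helper: try to concatenate two strings and return the
# Encoding::CompatibilityError if Ruby refuses, or nil if it succeeds.
def join_error(left, right)
  left + right
  nil
rescue Encoding::CompatibilityError => e
  e
end

utf8_text    = "Pati\u00EBnt"      # UTF-8 string containing a non-ASCII character
binary_high  = "Pati\x89nt".b      # ASCII-8BIT string containing a high byte
binary_ascii = "Gegevens".b        # ASCII-8BIT string with only 7-bit bytes

puts join_error(utf8_text, binary_high).class   # both sides non-ASCII: incompatible
puts join_error(utf8_text, binary_ascii).inspect # ASCII-only side: compatible
```

The second call shows why the problem only appears once a non-ASCII byte ends up in the binary-tagged string.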

 

I've tried calling force_encoding("UTF-8") on the table name, which results in something very similar:

 

c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-sqlserver-adapter-3.0.0/lib/active_record/connection_adapters/sqlserver/quoting.rb:31:in `=~': invalid byte sequence 89 on UTF-8 (Encoding::InvalidByteSequenceError)
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-sqlserver-adapter-3.0.0/lib/active_record/connection_adapters/sqlserver/quoting.rb:31:in `quote_table_name'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/base.rb:597:in `quoted_table_name'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/query_methods.rb:234:in `build_select'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/query_methods.rb:159:in `build_arel'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/query_methods.rb:110:in `arel'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation.rb:64:in `to_a'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/finder_methods.rb:333:in `find_first'
        from c:/ir/irtest/External.LCA_RESTRICTED/Languages/Ruby/ruby19/lib/ruby/gems/1.9.1/gems/activerecord-3.0.0/lib/active_record/relation/finder_methods.rb:122:in `first'
        from c:6:in `__send__'
        from c:6:in `first'
        from tabaco.rb:34

 

I have a feeling that IronRuby and .NET are not in sync on encodings.

 

Albert-Jan

 

From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at rubyforge.org] On Behalf Of Dezso Zoltan
Sent: Thursday, January 13, 2011 16:04
To: ironruby-core at rubyforge.org
Subject: Re: [Ironruby-core] Encoding problem

 

Hi,

 

> warning: variable $KCODE is no longer effective

 

This means that you are in 1.9 mode :) In that case there are two things you could try:

1) set the encoding at the top of the file in the form of the comment:

# encoding: UTF-8

 

2) if 1) fails in IronRuby, force an encoding on the string(s) in question with the method:

.force_encoding("UTF-8")
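
One caveat worth keeping in mind with option 2: force_encoding only changes the label on the existing bytes and does not convert them, whereas String#encode actually transcodes. A small sketch in plain CRuby (behaviour on IronRuby not verified), using ISO-8859-1 purely as an example source encoding; the helper names are hypothetical:

```ruby
# Hypothetical helpers for illustration only.

# Change the encoding label on a copy of the string; the bytes stay the same.
def relabel(bytes, encoding)
  bytes.dup.force_encoding(encoding)
end

# Declare what the bytes currently are, then convert them to the target encoding.
def transcode(bytes, from, to)
  bytes.dup.force_encoding(from).encode(to)
end

latin1_e   = "\xEB".b                                  # "ë" as a single ISO-8859-1 byte
relabeled  = relabel(latin1_e, "UTF-8")                # still the single byte EB
transcoded = transcode(latin1_e, "ISO-8859-1", "UTF-8") # converted to UTF-8 bytes

puts relabeled.valid_encoding?
puts transcoded.valid_encoding?
puts transcoded.bytes.map { |b| format("%02X", b) }.join(" ")
```

So force_encoding("UTF-8") only helps when the bytes already are UTF-8 and merely carry the wrong label.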

 

Zaki

 

On Thu, Jan 13, 2011 at 11:20 PM, Albert-Jan Pieter Nijburg <albertjan at curit.com> wrote:

Hey Zaki,

 

WARNING: YAML.add_builtin_type is not implemented

unknown:0: warning: variable $KCODE is no longer effective

tabaco.rb:11:in `puts': character U+00EB can't be encoded in US-ASCII (Encoding::InvalidByteSequenceError)

        from tabaco.rb:11

 

Too bad... thanks though. I'll have a look in the source to see if I can find something.

 

Annoying Europeans :P 

 

Albert-Jan

 

From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at rubyforge.org] On Behalf Of Dezso Zoltan
Sent: Thursday, January 13, 2011 14:52
To: ironruby-core at rubyforge.org
Subject: Re: [Ironruby-core] Encoding problem

 

Hi,

 

I don't really know the solution to your question, but this might help:

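ë is Unicode code point U+00EB, which is encoded as the two bytes 0xC3 0xAB in UTF-8. The error mentions U+00EB, so we are dealing with the Unicode code point rather than UTF-8 bytes, which I assume is because IronRuby internally uses the immutable .NET strings, whose "Unicode" encoding is UTF-16.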
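
To make the difference between the code point and its encoded bytes concrete, here is a short sketch in plain CRuby. UTF-16LE is included because that is the in-memory form of .NET strings; `hex_bytes` is a hypothetical helper:

```ruby
# Hypothetical helper: show the bytes of a string in a given encoding as hex.
def hex_bytes(str, encoding)
  str.encode(encoding).bytes.map { |b| format("%02X", b) }.join(" ")
end

e_diaeresis = "\u00EB"                        # "ë" written as a code point escape

puts format("U+%04X", e_diaeresis.ord)        # the code point itself
puts hex_bytes(e_diaeresis, "UTF-8")          # its UTF-8 byte sequence
puts hex_bytes(e_diaeresis, "UTF-16LE")       # its UTF-16 little-endian byte sequence
```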

 

The errors are expected if your default encoding is US-ASCII, because it does not contain ë: US-ASCII only covers single bytes in the range 0x00 to 0x7F, so 0xEB falls outside it and your script chokes on it. You will need to set your encoding to something compatible, like UTF-8.
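
The same limitation can be reproduced in plain CRuby, which is a rough sketch rather than a statement about IronRuby internals. Note that CRuby reports Encoding::UndefinedConversionError here, while the IronRuby output in this thread names Encoding::InvalidByteSequenceError. `ascii_error` is a hypothetical helper:

```ruby
# Hypothetical helper: try to convert a string to US-ASCII and return the
# conversion error, or nil when every character fits in US-ASCII.
def ascii_error(str)
  str.encode("US-ASCII")
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

puts ascii_error("Patient").inspect          # plain ASCII converts cleanly
puts ascii_error("Pati\u00EBnt").class       # U+00EB has no US-ASCII equivalent

# Lossy fallback: substitute a placeholder for characters US-ASCII cannot hold.
puts "Pati\u00EBnt".encode("US-ASCII", undef: :replace, replace: "?")
```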

 

I don't quite know how to do that properly in IronRuby, but in CRuby 1.9 you could use "magic comments" in your Ruby file, and in 1.8 something like $KCODE='u' could work. You might also be able to drop back into .NET and set the encoding there, but I'm not sure how that affects IronRuby assemblies.

 

I would start with $KCODE = 'u'. Let me know how that works for you.

 

Zaki

 

On Thu, Jan 13, 2011 at 6:33 PM, Albert-Jan Pieter Nijburg <albertjan at curit.com> wrote:

Hi Guys,

 

My boss thought it would be cool to use "ë" in an SQL table name; many of you will want to shoot her now :)

 

But now I did find something weird, I can't even print "ë".

 

It says:

 

tabaco.rb:16:in `puts': character U+00EB can't be encoded in US-ASCII (Encoding::InvalidByteSequenceError)

        from tabaco.rb:16

 

Or, when I print the string somewhere else after it comes back from a method :S

 

System::Text::DecoderFallbackException at /patient/0

Unable to translate bytes [EB] at index 3 from specified code page to Unicode.

 

Or, when I don't mess with it:

 

Encoding::InvalidByteSequenceError at /patient/0

invalid byte sequence EB on UTF-8

 

 

All the same problem coming from 3 places. 

 

Is this a fundamental issue or should this be solvable?

 

If you could point me in the right direction I could try to maybe fix it.

 

 

Thanks,

 

Albert-Jan

 


_______________________________________________
Ironruby-core mailing list
Ironruby-core at rubyforge.org
http://rubyforge.org/mailman/listinfo/ironruby-core

 



 
