[Win32utils-devel] Ruby 1.9 and Encoding.default_external

Daniel Berger djberg96 at gmail.com
Tue May 22 02:12:37 UTC 2012


On Mon, May 21, 2012 at 8:09 PM, Heesob Park <phasis at gmail.com> wrote:
> Hi,
>
> 2012/5/22 Daniel Berger <djberg96 at gmail.com>
>>
>> On Mon, May 21, 2012 at 7:02 PM, Heesob Park <phasis at gmail.com> wrote:
>> > Hi,
>> >
>> > 2012/5/22 Daniel Berger <djberg96 at gmail.com>
>> >>
>> >> Hi,
>> >>
>> >> Just curious, when using Ruby 1.9 on my Windows 7 laptop, strings are
>> >> encoded in IBM437 by default. However, when I check my default code
>> >> page using GetCPInfoEx, I get Windows-1252.
>> >>
>> >> This is causing some confusion when trying to port code to FFI and
>> >> JRuby, which by default encodes strings as Windows-1252.
>> >>
>> >> # code_page.rb
>> >> require 'ffi'
>> >>
>> >> class Windows
>> >>  extend FFI::Library
>> >>  ffi_convention :stdcall
>> >>  ffi_lib :kernel32
>> >>
>> >>  attach_function :GetConsoleCP, [], :uint
>> >>  attach_function :GetCPInfoEx, :GetCPInfoExA, [:uint, :ulong, :pointer], :bool
>> >>
>> >>  # From WinNls.h
>> >>  MAX_LEADBYTES = 12
>> >>  MAX_DEFAULTCHAR = 2
>> >>  CP_ACP = 0
>> >>
>> >>  # From WinDef.h
>> >>  MAX_PATH = 260
>> >>
>> >>  class CPINFOEX < FFI::Struct
>> >>    layout(
>> >>      :MaxCharSize, :uint,
>> >>      :DefaultChar, [:uchar, MAX_DEFAULTCHAR],
>> >>      :LeadByte, [:uchar, MAX_LEADBYTES],
>> >>      :UnicodeDefaultChar, [:char, 2],
>> >>      :CodePage, :uint,
>> >>      :CodePageName, [:char, MAX_PATH]
>> >>    )
>> >>  end
>> >>
>> >>  def self.cp_number
>> >>    GetConsoleCP()
>> >>  end
>> >>
>> >>  def self.cp_name
>> >>    ptr = CPINFOEX.new
>> >>
>> >>    unless GetCPInfoEx(CP_ACP, 0, ptr)
>> >>      raise SystemCallError.new("GetCPInfoEx", FFI.errno)
>> >>    end
>> >>
>> >>    ptr[:CodePageName].to_s
>> >>  end
>> >> end
>> >>
>> >> p Windows.cp_number # 437
>> >> p Windows.cp_name # 1252  (ANSI - Latin I)
>> >>
>> >> Is this a case of the system default not being the same as the console
>> >> code page? If so, isn't this a bug in MRI then?
>> >>
>> >>
>> > IBM437 is a legacy of MS-DOS and is used for console applications.
>> >
>> > Refer to
>> > http://en.wikipedia.org/wiki/Code_page
>> > http://blogs.msdn.com/b/michkap/archive/2005/02/08/369197.aspx
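
To make the difference between the two code pages concrete, here is a
minimal sketch in plain Ruby (no FFI needed). It decodes one and the same
byte as Windows-1252 (the ANSI code page reported by GetCPInfoEx above) and
as IBM437 (the console code page reported by GetConsoleCP). The helper name
decode and the sample byte 0xE9 are my own choices for illustration, not
anything from win32-dir.

```ruby
# One raw byte with no character meaning attached yet.
raw = "\xE9".force_encoding("BINARY")

# Hypothetical helper: label the bytes with a code page, then transcode
# the result to UTF-8 so the two readings can be compared directly.
def decode(bytes, encoding_name)
  bytes.dup.force_encoding(encoding_name).encode("UTF-8")
end

as_ansi = decode(raw, "Windows-1252") # ANSI code page 1252
as_oem  = decode(raw, "IBM437")       # OEM/console code page 437

puts as_ansi.unpack("U*").inspect # code point under Windows-1252
puts as_oem.unpack("U*").inspect  # code point under IBM437
```

The byte comes out as U+00E9 (e with acute accent) under Windows-1252 and
as U+0398 (Greek capital theta) under IBM437, so a string labelled with the
wrong one of the two will silently show the wrong characters.
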
>>
>> Ok, so what's the correct way to encode strings by default then?
>>
>> This all started as the result of the ffi branch of the win32-dir
>> project. The tests pass with MRI using 1.9.3 but if I try to use JRuby
>> with the --1.9 option I get InvalidByteSequence failures in the
>> Dir.getwd method. Which is odd, because I can't duplicate the issues
>> when I run standalone code with JRuby.
>>
>
> Well, I have no idea because I am not a JRuby user or tester.
> You can get a better answer on the JRuby mailing lists.
> http://jruby.org/community or
> http://www.ruby-forum.com/forum/jruby

Ok, sorry, I will ask there.

Regards,

Dan
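
P.S. For anyone reading the archive, here is a rough sketch of how an
InvalidByteSequenceError of the kind mentioned above can come about in
general. I am not claiming this is what happens inside JRuby or in
Dir.getwd; the sample bytes and variable names are made up for
illustration. The idea is only that bytes produced under a single-byte
code page, but labelled as UTF-8, fail as soon as they are transcoded.

```ruby
# Hypothetical path bytes: "caf" followed by 0xE9, which is an accented
# "e" in Windows-1252 but not a valid sequence on its own in UTF-8.
path_bytes = "caf\xE9".force_encoding("BINARY")

# Mislabel the bytes as UTF-8. Nothing fails yet; only the tag changes.
mislabelled = path_bytes.dup.force_encoding("UTF-8")
valid_before = mislabelled.valid_encoding?

# Transcoding forces Ruby to actually interpret the bytes, which raises.
error = begin
  mislabelled.encode("UTF-16LE")
  nil
rescue Encoding::InvalidByteSequenceError => e
  e
end

# Labelled with the code page that produced them, the bytes convert fine.
utf8_path = path_bytes.dup.force_encoding("Windows-1252").encode("UTF-8")

puts valid_before
puts error.class
puts utf8_path
```

Checking valid_encoding? right after a string comes back from a native
call is a cheap way to notice a wrong label before the exception shows up
somewhere further away.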

