[Ironruby-core] MutableString encoding issue
olegtk at microsoft.com
Mon Jul 14 12:17:42 EDT 2008
The problem is probably in StringContent.ToByteArray(). It uses Encoding.GetBytes(string), which follows .NET encoding semantics and by default replaces any nonconvertible characters with '?'.
MutableStringOps.Dump() then uses it to create the string representation.
We could stop StringContent.ToByteArray() from replacing nonconvertible characters by using an encoder fallback (EncoderFallback). BinaryContent.ToString()/ToStringBuilder() has the same issue.
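To illustrate the difference between a replacing conversion and a strict one, here is a minimal sketch in plain Ruby (1.9 or later). It is only an analogy: it uses Ruby's own String#encode rather than the .NET Encoding/EncoderFallback classes, and the helper names lossy_ascii and strict_ascii are made up for this example.

```ruby
# Analogy only: plain Ruby String#encode, not IronRuby's StringContent.
# lossy_ascii and strict_ascii are hypothetical helper names.

# Lossy conversion: characters with no US-ASCII representation are
# silently replaced with '?', similar in effect to a replacement fallback.
def lossy_ascii(text)
  text.encode("US-ASCII", undef: :replace, replace: "?")
end

# Strict conversion: the same characters raise an error instead of being
# replaced, similar in effect to an exception fallback. Returns the
# exception class, or nil if the conversion succeeded.
def strict_ascii(text)
  text.encode("US-ASCII")
  nil
rescue Encoding::UndefinedConversionError => e
  e.class
end

sample = "caf\u0084"          # U+0084 cannot be represented in US-ASCII
puts lossy_ascii(sample)      # the last character has become '?'
puts strict_ascii(sample).inspect
```

The lossy path is the one that loses data without any notice, which matches the symptom described above. With the strict path the caller at least finds out that the content cannot be converted.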
From: Curt Hagenlocher
Sent: Monday, July 14, 2008 6:27 AM
To: Michael Letterle; ironruby-core at rubyforge.org
Cc: IronRuby Team
Subject: RE: [Ironruby-core] MutableString encoding issue
MutableString can have one of three internal representations, depending on how it was last used. One of these is a byte array. This particular problem may be in the scanner or parser and not in the actual string class, as we don't otherwise have a problem storing the character:
>>> $s = "\204"
>>> $s = 132
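For reference, "\204" is an octal escape for a single byte: 0204 octal is 132 decimal (0x84). A small sketch in plain Ruby (1.9 or later, not IronRuby internals) that checks this; raw_bytes is a hypothetical helper name used only here.

```ruby
# Plain Ruby sketch, not IronRuby's MutableString.
# raw_bytes is a hypothetical helper name.
def raw_bytes(str)
  # View the string as binary so no encoding conversion takes place.
  str.dup.force_encoding(Encoding::BINARY).bytes
end

s = "\204"
p raw_bytes(s)   # => [132]
p s.bytesize     # => 1
```

A string that keeps this byte intact should report 132 here; a '?' (63) would indicate that a lossy conversion happened somewhere along the way.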
From: Michael Letterle [mailto:michael.letterle at gmail.com]
Sent: Monday, July 14, 2008 6:21 AM
To: ironruby-core at rubyforge.org
Cc: IronRuby Team
Subject: Re: [Ironruby-core] MutableString encoding issue
This was a known issue a while back; it's the reason the Zlib library didn't work well with binary files. I'm fairly certain there was work being done on making String be backed by a byte array, and in fact I thought this was already done.
On Mon, Jul 14, 2008 at 2:20 AM, Oleg Tkachenko <olegtk at microsoft.com> wrote:
Stumbled on this when testing yaml.
I believe a Ruby string can hold arbitrary byte values, but because we store the content as a string, we obviously lose all values that cannot be represented in the default encoding. Tomas, what do you think?