[Ironruby-core] Code Review: EncodingsFinal

Tomas Matousek Tomas.Matousek at microsoft.com
Fri Mar 13 14:16:30 EDT 2009

tfpt review "/shelveset:EncodingsFinal;REDMOND\tomat"

Outer DLR:

-          Adds Invariant, Ensures, Result, Parameter and Out stubs to ContractUtils mimicking Dev10 contracts. These allow us to specify post-conditions and object invariants in code rather than comments.


-          Implements infrastructure for the $KCODE variable. Only three encodings can be assigned to $KCODE (UTF8, SJIS, EUC). These are implemented as special encodings (aka "k-codings", the RubyEncoding.KCode* singletons) and need to be special-cased. For example, String#size on a string containing a single 2-byte UTF8 character returns 1 if its encoding is UTF8, but 2 if it is KCodeUTF8. This emulates MRI 1.8, where strings have no associated encoding.

-          $KCODE is generally considered obsolete and is not available in the Silverlight build.

-          Replaces List<byte> and StringBuilder MutableString representations with byte[] and char[]. Reimplements basic char/byte/string buffer operations and moves them to Utils.cs.

-          Improves the implementation of MutableString.GetHashCode: the hash code is now cached on the string until the string is modified. The hash code calculation includes the encoding only if the string contains non-ASCII characters; otherwise the encoding is not part of the hash.

-          Adds support for multi-byte identifiers in source code if the file has a non-binary encoding or a k-coding. Any non-ASCII character is considered a lowercase letter for the purpose of identifier classification (constant, global var, instance var, class var, local, method name).

-          Fixes \xXX escapes in encoded strings: consecutive escaped bytes can form a single character or part of a character. In both cases the string's representation is switched to binary so that no information is lost. StringContentBuilder takes care of constructing such strings. At runtime, a string with an incomplete character suffix can be concatenated with a string that holds the missing part of the character, and together these bytes might form a valid character.

-          Adds a set of unit tests for MutableString and encodings.

-          Reimplements String#dump and String#inspect to handle encoded strings correctly. Moves the implementation to MutableString so that we can use it as a debug view for MutableString as well.

-          Fixes specs: $KCODE was set to UTF8 by one spec and not restored, which affected subsequent specs.
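
As a rough illustration of the String#size and \xXX items above, here is a minimal sketch written for a current Ruby (1.9 or later). It is not IronRuby's implementation and it does not use $KCODE; a binary copy of the string stands in for a k-coded string, where size counts bytes. The helper names char_count, byte_count and join_fragments are hypothetical and exist only for this example.

```ruby
# Hypothetical helpers for illustration only; none of these names exist in IronRuby.

# Character count as seen by a string that carries a real encoding.
def char_count(str)
  str.size
end

# Byte count, which is what String#size reports when the string is treated
# as raw bytes (the situation the k-codings emulate).
def byte_count(str)
  str.b.size # String#b returns a binary (ASCII-8BIT) copy
end

# Concatenates two binary fragments and reinterprets the bytes as UTF-8.
def join_fragments(head, tail)
  (head.b + tail.b).force_encoding(Encoding::UTF_8)
end

e_acute = "\u00E9"       # one character, two bytes (0xC3 0xA9) in UTF-8

puts char_count(e_acute) # 1
puts byte_count(e_acute) # 2

head = "\xC3".b          # incomplete character: lead byte only
tail = "\xA9".b          # the missing continuation byte

# On its own the lead byte is not a valid UTF-8 character.
puts head.dup.force_encoding(Encoding::UTF_8).valid_encoding? # false

# Joined with the missing byte, the two fragments form one valid character.
joined = join_fragments(head, tail)
puts joined.valid_encoding? # true
puts joined == e_acute      # true
```

Keeping the fragments binary until both halves are present mirrors the reason given above for switching the representation to binary: no bytes are dropped or replaced while the character is still incomplete.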

-------------- next part --------------
A non-text attachment was scrubbed...
Name: EncodingsFinal.diff
Type: application/octet-stream
Size: 316782 bytes
Desc: EncodingsFinal.diff
URL: <http://rubyforge.org/pipermail/ironruby-core/attachments/20090313/7070138b/attachment-0001.obj>