[rhg-discussion] [ANN] Ruby Hacking Guide - New chapters (and a bonus)

Vincent Isambart vincent.isambart at gmail.com
Wed Apr 5 04:58:36 EDT 2006

Hi everyone,

Here they are, translations of chapters 3, 4 and 6 of the Ruby Hacking Guide!
We know the translation is far from perfect, and we welcome any
corrections to the text or diagrams of any chapter (even chapter 2).
Please send them as patches (attached to the mail, not just in the
body of the message) on the rhg-discussion mailing list
(http://rubyforge.org/mailman/listinfo/rhg-discussion). The patches
should be made against the text files in the SVN repository.

We also introduced a new feature: previews. This means we put chapters
on the web page that have not been fully proofread and that may have
missing diagrams. They are labelled with a big 'PREVIEW'. So do
not hesitate to check our web page often to be able to read the
chapters before they are announced (and send us corrections!). For
example, the preview chapters released today were made available on
the web page more than one week ago.

I may be repeating myself, but we still need people to help, especially
translators. If you can, even if it's only for one chapter, come help
us. Proofreaders are also welcome: the more, the better.

I would like to especially thank the following people for making it possible:
- Clifford Caoile for translating chapter 3
- Meinrad Recheis for making the diagrams
- Jim Driscoll for his proofreading
- and of course Minero Aoki for allowing us to translate his book.

So if you want to read it, the official web page is still
http://rhg.rubyforge.org/ :).

But wait! Today I also have a bonus: a quick translation of matz'
YAPC::Asia 2006 slides. The slides in Japanese are available here:
They are mainly about multilingualisation in Ruby 2. Many thanks to
matz for letting me post this translation and correcting my stupid

If you have no idea what TRON or Mojikyo is, or what the problems
with Unicode in Japan are, you should check this:

So here comes the translation. It's far from being perfect, but it's
still easier to understand than the Japanese version ;)

-- beginning of the translation
YAPC::Asia 2006

Ruby on Perl(s)

Yukihiro "Matz" Matsumoto
matz at ruby-lang.org

Copyright (c) 2006 Yukihiro "Matz" Matsumoto, No rights reserved though.
How was Ruby born?
* in a Lisp(ish) system
* I added object oriented capabilities
* and took in some Perl functions
That's why
Perl is Ruby's big brother
Ruby's big sister

Hello World is
print "hello world\n"
in Perl, Ruby or Python
But in PHP it's
<?php echo "hello world"?>
quite different on this point
Ruby and Perl are similar but
* Perl has everything
* Ruby's heart is object oriented
Ruby and Perl are similar but
* Perl uses (most of the time) a functional word order
* Ruby uses a Japanese word order
Functional word order

print reverse(<ARGV>);

prints the reversed ARGV.
Japanese word order


Take ARGF,
call readlines on it,
reverse readlines' result,
display reverse's result
(this is the natural order in the Japanese language)
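
The four steps above map onto a left-to-right method chain in Ruby. The
following is only a minimal sketch of that idea, not the code from the
slide: StringIO stands in for ARGF so that it runs without input files,
and the helper name reversed_lines is made up for illustration.

```ruby
require "stringio"

# Hypothetical helper (not from the slides): take an IO-like object,
# call readlines on it, reverse the result, and join it back into
# one string, reading left to right like the description above.
def reversed_lines(io)
  io.readlines.reverse.join
end

# StringIO is an assumption standing in for ARGF.
input = StringIO.new("first\nsecond\nthird\n")
print reversed_lines(input)
```

Each call operates on the result of the previous one, so the code reads
in the same order as the spoken description.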
Ruby and Perl are similar but
- Larry is American (even if he studies the Japanese language)
- Matz is Japanese (even if he studies the English language)
Ruby and Perl are similar but
* Perl is Unicode-centered
* Ruby is decentralized
Ruby and Perl are similar but
* Perl uses UCS (Universal Character Set)
* Ruby is (will be) CSI (Character Set Independent)
What are your complaints towards Unicode?
* it's thoroughly used, isn't it.
* resentment towards Han unification?
* inferiority complex of Japanese people?
What are your complaints towards Unicode?
* no, no, I do not have any complaints about Unicode
* in the domains where Unicode is adequate
Then, why CSI?

In most applications, UCS is enough thanks to Unicode.
However, there are also applications for which this is not the case.
Fields for which Unicode is not enough
Big character sets
* Konjaku-Mojikyo (Japanese encoding which includes many more characters than Unicode)
* TRON code
* GB18030
Fields for which Unicode is not a good fit
Legacy encodings
* conversion to UCS is useless
* big conversion tables
* round-trip problem
If a language chooses the UCS system
* you cannot write non-UCS applications
* you can't handle text that can't be expressed with Unicode
If a language chooses the CSI system
* CSI is a superset of UCS
* Unicode just has to be handled in CSI
... is what we can say but
* CSI is difficult
* can it really be implemented?
That's where Japan's traditional art comes in

Adapting applications for the Japanese language
* Modifying English-language applications so that they can process Japanese
Adapting applications for the Japanese language

* Something engineers of the past surely went through
  - Emacs (NEmacs)
  - Perl (JPerl)
  - Bash
Accumulation of know-how

In Japan, the know-how of adaptation for the Japanese language
(multi-byte text processing)
has been accumulated.
Accumulation of know-how

in the first place, just for local use,
text in 3 encodings circulates
(4 if including UTF-8)
Based on this know-how
* multibyte text encodings
* switching between encodings at the string level
* processing them at practical speed
is finished
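
As a rough illustration of what "encodings at the string level" can look
like, here is a small sketch using the String API of present-day Ruby
(1.9 and later). That API postdates this talk, so this is an assumption
about how the idea ended up looking, not something shown on the slides.

```ruby
# "\u..." escapes are used so the example does not depend on the
# encoding of the source file. The text is "Japanese language" in kanji.
utf8 = "\u65E5\u672C\u8A9E"
euc  = utf8.encode("EUC-JP")      # same characters, EUC-JP bytes
sjis = utf8.encode("Shift_JIS")   # same characters, Shift_JIS bytes

# Every string carries its own encoding; no global setting is involved.
[utf8, euc, sjis].each do |s|
  puts "#{s.encoding}: #{s.length} chars, #{s.bytesize} bytes"
end
```

All three strings report three characters, while the byte count depends
on the encoding each string carries (9 bytes in UTF-8, 6 in the other two).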
Available encodings

euc_tw   euc_jp   iso8859_*  utf-8     utf-32le
ascii    euc_kr   koi8       utf-16le  utf-32be
big5     gb2312   sjis       utf-16be

...and many others
If it's a stateless encoding, in principle it can be made available.
It means
For applications using only one encoding, code conversion is not needed
Applications wanting to handle multiple encodings can choose an
internal encoding (generally Unicode) that includes all others
If you want to
* you can also handle multiple encodings without conversion, leaving
characters as they are
* but this is difficult so I do not recommend it
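
The two approaches just described can be compared in a short sketch.
It again relies on the String API of present-day Ruby (1.9 and later),
which is an assumption relative to this 2006 talk; the variable names
are made up for illustration.

```ruby
# Two pieces of text arriving in different legacy encodings.
euc  = "\u65E5\u672C".encode("EUC-JP")
sjis = "\u8A9E".encode("Shift_JIS")

# Approach 1: pick one internal encoding (here UTF-8) and convert
# everything to it before combining.
internal = euc.encode("UTF-8") + sjis.encode("UTF-8")

# Approach 2: keep the strings as they are. Combining non-ASCII strings
# in two different encodings is rejected instead of silently mixing bytes.
begin
  euc + sjis
  mixed_ok = true
rescue Encoding::CompatibilityError
  mixed_ok = false
end

puts internal.encoding   # encoding of the combined string
puts mixed_ok            # whether the unconverted concatenation worked
```

Keeping strings unconverted works as long as each one is processed on
its own; the difficulty appears as soon as they have to be combined.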
only the basic part is done,
it's far from being ready for practical use
* code conversion
* guessing encoding
* etc.
For the time being, today
I want to tell everyone:
* UCS is practical
* but not all-purpose
* CSI is not impossible
The reason I'm saying that
They may add CSI to Perl6, just as they added
* Methods called by "."
* Continuations
from Ruby.
Basically, they hate losing.
Thank you
-- end of the translation

Vincent "scritch" ISAMBART
