[libxml-devel] UTF-8 problem

Charlie Savage cfis at savagexi.com
Wed Nov 19 03:23:00 EST 2008


Hi David,

> I'm trying to input and output special characters, but I'm getting parse 
> errors from libxml.
> 
> Code snippet:
>         node = LibXML::XML::Node.new 'node'
>         node.content = ''
>         doc = LibXML::XML::Document.new
>         doc.encoding = 'UTF-8'
>         doc.root = node
>         xml = doc.to_s true, 'UTF-8'
> 
>         parser = LibXML::XML::Parser.new
>         parser.string = xml
>         doc = parser.parse
>         p doc.root.content
> 
> The first part sets xml to:
> <?xml version="1.0" encoding="UTF-8"?>
> <node>\1</node>
> 
> Where \1 is the byte 1.
> 
> The second part tries to parse this.  But libxml gives this error on the 
> call to Parser.parse:
> Entity: line 2:
> parser
> error :
> internal error
> <node></node>

I see the same thing.  It's hard to know where things are going wrong. Do 
you have time to dig into this?

The first thing to figure out is whether the output is correct:

1.  Find an XML editor that lets you write this document to disk.
2.  Then run the code above, but save its output to a file.
3.  Compare the two files.
4.  Then compare the files to Ruby's in-memory string (looking at the bytes).

Either there will be an error in the string or, if not, the error 
must be happening when the string is translated back into libxml.
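
For the byte-level comparison in step 4, something along these lines might 
help.  It is only a rough sketch in plain Ruby: hex_bytes and 
illegal_control_offsets are made-up helper names, not part of libxml-ruby, 
and the xml string is a hand-written stand-in for whatever doc.to_s returns 
in your snippet.  The second helper assumes the XML 1.0 rule that control 
characters below 0x20, other than tab, LF and CR, are not allowed in a 
document, which may or may not turn out to be the cause here.

```ruby
# Hypothetical helper (not part of libxml-ruby): render a string's bytes as hex.
def hex_bytes(str)
  str.unpack('C*').map { |b| format('%02x', b) }.join(' ')
end

# Hypothetical helper: byte offsets of control characters below 0x20,
# excluding tab (0x09), LF (0x0a) and CR (0x0d).
def illegal_control_offsets(str)
  offsets = []
  str.unpack('C*').each_with_index do |b, i|
    offsets << i if b < 0x20 && ![0x09, 0x0a, 0x0d].include?(b)
  end
  offsets
end

# Stand-in for the string produced by doc.to_s in the snippet above.
xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<node>\x01</node>\n"

puts hex_bytes(xml)
puts "suspicious offsets: #{illegal_control_offsets(xml).inspect}"
```

Running the same two helpers over the contents of the files saved in steps 
1 and 2 should show whether the three versions differ at the byte level.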

Charlie