From rosco at roscopeco.co.uk Sun Jan 1 10:22:48 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Sun, 01 Jan 2006 15:22:48 -0000
Subject: [libxml-devel] GCC 4 warnings
Message-ID:

Hi,

I made a bit of a start on these warnings with GCC 4, just really
tightening up pointer passing and replacing some uses of STR2CSTR with
StringValuePtr (the former is marked obsolete in ruby.h).

Anyway, since I've been out of the game for a bit I've just done
ruby_xml_node.c for now, I'd like someone to just let me know if the
approach is valid and not storing up problems for other platforms or
anything like that. Assuming it is okay, I could work through all those
warnings and patch them up. A patch for node is attached.

Happy new year folks.

Cheers,

--
Ross Bamford - rosco at roscopeco.co.uk
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node_gcc4warnings_fix.patch
Type: application/octet-stream
Size: 5879 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20060101/2c587afb/node_gcc4warnings_fix-0001.obj

From sean at gigave.com Mon Jan 2 16:17:59 2006
From: sean at gigave.com (Sean Chittenden)
Date: Mon, 2 Jan 2006 13:17:59 -0800
Subject: [libxml-devel] GCC 4 warnings
In-Reply-To:
References:
Message-ID: <20060102211759.GC12999@mailhost.gigave.com>

> I made a bit of a start on these warnings with GCC 4, just really
> tightening up pointer passing and replacing some uses of STR2CSTR
> with StringValuePtr (the former is marked obsolete in ruby.h).

Excellent. The patch looks good to me.

> Anyway, since I've been out of the game for a bit I've just done
> ruby_xml_node.c for now, I'd like someone to just let me know if the
> approach is valid and not storing up problems for other platforms or
> anything like that. Assuming it is okay, I could work through all
> those warnings and patch them up. A patch for node is attached.

Casting const char * is sometimes problematic, but necessary and this
work is certainly a step forward for the code. You going to commit
this or should I? -sc

--
Sean Chittenden

From jason.e.stewart at gmail.com Sun Jan 1 08:36:29 2006
From: jason.e.stewart at gmail.com (Jason Stewart)
Date: Sun, 1 Jan 2006 19:06:29 +0530
Subject: [libxml-devel] Proposal to add Xerces bindings for Ruby
Message-ID: <41c1ade50601010536s6f4abfacjd9631558a058063f@mail.gmail.com>

Hi,

I am the maintainer for the Xerces-C SWIG bindings at Apache.org - I use
them to generate the Perl API for Xerces. In the past two releases I have
separated out all Perl dependencies from the SWIG bindings so they could
be used for any language that SWIG supports.

I would like to test this with a popular language like Ruby and wonder if
anyone on this list would be interested in helping. The main problem I
have is testing whether it works or not - so if someone can help write
Ruby scripts that use Xerces (DOM, SAX, SAX2, DOMBuilder, DOMWriter, etc.)
that would be really helpful.

Thanks for your time,
jas.

From rosco at roscopeco.co.uk Mon Jan 2 16:37:47 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Mon, 02 Jan 2006 21:37:47 -0000
Subject: [libxml-devel] GCC 4 warnings
In-Reply-To: <20060102211800.8206A217A01@mailhost.gigave.com>
References: <20060102211800.8206A217A01@mailhost.gigave.com>
Message-ID:

On Mon, 02 Jan 2006 21:17:59 -0000, Sean Chittenden wrote:

>> I made a bit of a start on these warnings with GCC 4, just really
>> tightening up pointer passing and replacing some uses of STR2CSTR
>> with StringValuePtr (the former is marked obsolete in ruby.h).
>
> Excellent. The patch looks good to me.
>

Great, just wanted to be sure before I got into it.
Thanks for checking :)

>> Anyway, since I've been out of the game for a bit I've just done
>> ruby_xml_node.c for now, I'd like someone to just let me know if the
>> approach is valid and not storing up problems for other platforms or
>> anything like that. Assuming it is okay, I could work through all
>> those warnings and patch them up. A patch for node is attached.
>
> Casting const char * is sometimes problematic, but necessary and this
> work is certainly a step forward for the code. You going to commit
> this or should I? -sc
>

I'll commit this one now, then get started on cleaning up the others the
same way. If I have any questions I'll ask about them before making
changes.

Cheers,

--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Mon Jan 2 18:31:11 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Mon, 02 Jan 2006 23:31:11 -0000
Subject: [libxml-devel] GCC 4 warnings
In-Reply-To:
References: <20060102211800.8206A217A01@mailhost.gigave.com>
Message-ID:

On Mon, 02 Jan 2006 21:37:47 -0000, I wrote:

>
> I'll commit this one now, then get started on cleaning up the others the
> same way. If I have any questions I'll ask about them before making
> changes.
>

All seemed pretty straightforward, and I've just committed it. We now
compile with no warnings on GCC 4 / i686 / Linux. I'm still working on a
few testcases that may expose segfaults and other stuff so I'll post again
on that.

Cheers,

--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Fri Jan 6 09:49:57 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Fri, 06 Jan 2006 14:49:57 -0000
Subject: [libxml-devel] xml-smart
Message-ID:

Hi,

Just wondering if anyone uses or knows of ruby-xml-smart
(http://raa.ruby-lang.org/project/ruby-xml-smart/) ?

Cheers,
Ross

--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Tue Jan 10 04:54:04 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Tue, 10 Jan 2006 09:54:04 -0000
Subject: [libxml-devel] Libxml memory dump?
Message-ID:

Hi,

I finally got Libxml compiled with memory debugging (my GCC installation
was incomplete, I had to compile that from source first), and Libxml-ruby
says 'true' to XML::Parser.enabled_memory_debug? . Calling memory_dump
does return true, but I don't seem to get the dump...? Reading the Libxml
site suggests it's supposed to go to .memdump or /.memdump (??) but I get
neither of these.

Sorry if this is a(nother) dumb question but I just wanted to see if it's
something someone's seen and solved before?

Cheers,
Ross

--
Ross Bamford - rosco at roscopeco.co.uk

From sean at gigave.com Fri Jan 13 15:30:20 2006
From: sean at gigave.com (Sean Chittenden)
Date: Fri, 13 Jan 2006 12:30:20 -0800
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To:
References:
Message-ID: <20060113203020.GN387@mailhost.gigave.com>

> I finally got Libxml compiled with memory debugging (my GCC installation
> was incomplete, I had to compile that from source first), and Libxml-ruby
> says 'true' to XML::Parser.enabled_memory_debug? . Calling memory_dump
> does return true, but I don't seem to get the dump...? Reading the Libxml
> site suggests it's supposed to go to .memdump or /.memdump (??) but I get
> neither of these.

Hrm...

> Sorry if this is a(nother) dumb question but I just wanted to see if it's
> something someone's seen and solved before?

I've seen this when ruby-libxml was linked against the wrong version of
libxml, but that's about it. :-/ Using ldd, can you confirm that
you've linked against the correct/current version of libxml? -sc

--
Sean Chittenden

From rosco at roscopeco.co.uk Sat Jan 14 13:38:09 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Sat, 14 Jan 2006 18:38:09 -0000
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To: <20060113203021.73861216FF8@mailhost.gigave.com>
References: <20060113203021.73861216FF8@mailhost.gigave.com>
Message-ID:

On Fri, 13 Jan 2006 20:30:20 -0000, Sean Chittenden wrote:

>> Sorry if this is a(nother) dumb question but I just wanted to see if
>> it's
>> something someone's seen and solved before?
>
> I've seen this when ruby-libxml was linked against the wrong version of
> libxml, but that's about it. :-/ Using ldd, can you confirm that
> you've linked against the correct/current version of libxml? -sc
>

No problem. It's libxml2 2.6.22, and it looks like the ruby library is
linked to the right one:

$ ldd libxml.so
        linux-gate.so.1 => (0x00e54000)
        libxslt.so.1 => /usr/lib/libxslt.so.1 (0x0067b000)
        libxml2.so.2 => /home/rosco/dev/lib/libxml2.so.2 (0x00eae000)
        libz.so.1 => /usr/lib/libz.so.1 (0x00362000)
        libm.so.6 => /lib/libm.so.6 (0x00782000)
        libc.so.6 => /lib/libc.so.6 (0x00111000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x0023a000)
        libdl.so.2 => /lib/libdl.so.2 (0x0028a000)
        libcrypt.so.1 => /lib/libcrypt.so.1 (0x00250000)
        /lib/ld-linux.so.2 (0x003d3000)

And it does claim memory debug is enabled, as I say, but still no .memdump
file. It's not the end of the world I guess, but it would be pretty useful
to be able to get that dump. My plan for this week is to work on Libxml2,
to take a look at those apparent leaks and segfaults, and see if I can get
anything out of Valgrind with it.

Cheers,

--
Ross Bamford - rosco at roscopeco.co.uk

From sean at gigave.com Sun Jan 15 01:12:49 2006
From: sean at gigave.com (Sean Chittenden)
Date: Sat, 14 Jan 2006 22:12:49 -0800
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To:
References: <20060113203021.73861216FF8@mailhost.gigave.com>
Message-ID: <20060115061249.GX2654@mailhost.gigave.com>

> >> Sorry if this is a(nother) dumb question but I just wanted to see if
> >> it's
> >> something someone's seen and solved before?
> >
> > I've seen this when ruby-libxml was linked against the wrong version of
> > libxml, but that's about it. :-/ Using ldd, can you confirm that
> > you've linked against the correct/current version of libxml? -sc
> >
>
> No problem. It's libxml2 2.6.22, and it looks like the ruby library is
> linked to the right one:
>
> $ ldd libxml.so
> linux-gate.so.1 => (0x00e54000)
> libxslt.so.1 => /usr/lib/libxslt.so.1 (0x0067b000)
> libxml2.so.2 => /home/rosco/dev/lib/libxml2.so.2 (0x00eae000)
> libz.so.1 => /usr/lib/libz.so.1 (0x00362000)
> libm.so.6 => /lib/libm.so.6 (0x00782000)
> libc.so.6 => /lib/libc.so.6 (0x00111000)
> libnsl.so.1 => /lib/libnsl.so.1 (0x0023a000)
> libdl.so.2 => /lib/libdl.so.2 (0x0028a000)
> libcrypt.so.1 => /lib/libcrypt.so.1 (0x00250000)
> /lib/ld-linux.so.2 (0x003d3000)
>
>
> And it does claim memory debug is enabled, as I say, but still no .memdump
> file. It's not the end of the world I guess, but it would be pretty useful
> to be able to get that dump. My plan for this week is to work on Libxml2,
> to take a look at those apparent leaks and segfaults, and see if I can get
> anything out of Valgrind with it.

Wild guess, but does your directory have write perms? cd /tmp; ./foo.rb and
see if that works. :) -sc

--
Sean Chittenden

From rosco at roscopeco.co.uk Mon Jan 16 07:25:05 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Mon, 16 Jan 2006 12:25:05 -0000
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To: <20060115061250.F2C73216E04@mailhost.gigave.com>
References: <20060113203021.73861216FF8@mailhost.gigave.com> <20060115061250.F2C73216E04@mailhost.gigave.com>
Message-ID:

On Sun, 15 Jan 2006 06:12:49 -0000, Sean Chittenden wrote:

> Wild guess, but does your directory have write perms? cd /tmp; ./foo.rb
> and
> see if that works. :) -sc
>

Now, I know I'm dumb, but I hope I'm not that dumb ... (checks) ...
phew :))

What I figured I'd do is write a small C program that just does a dump, to
find out whether it's libxml or the ruby bindings that's causing the
problem, and take it from there - if it happens with plain libxml then
I'll go and bug them about it :).

For now, I've been trying things out and looking through valgrind (amazing
piece of kit, that :)) and here is what I've found so far. Running this
script:

    require 'libxml'

    xp = XML::Parser.string ''
    d = xp.parse

    p d.root.properties
    p d.root.properties

    puts "still here"

Outputs:

    $ ruby ../libxml-local/bug_test2.rb
    #
    #
    still here
    *** glibc detected *** ruby: double free or corruption (fasttop): 0x098af2c0 ***
    ======= Backtrace: =========
    /lib/libc.so.6[0x454124]
    /lib/libc.so.6(__libc_free+0x77)[0x45465f]
    ruby(ruby_xfree+0x25)[0x806a865]
    /usr/lib/libxml2.so.2(xmlFreeNodeList+0x18b)[0x5564dcb]
    /usr/lib/libxml2.so.2(xmlFreeProp+0xc8)[0x5564bd8]
    /usr/lib/libxml2.so.2(xmlFreeNode+0x190)[0x5565000]
    ../libxml/libxml.so(ruby_xml_attr_free+0x52)[0xc18b02]
    ruby(rb_gc_call_finalizer_at_exit+0x7d)[0x806b21d]
    ruby[0x8052902]
    ruby(ruby_cleanup+0xd1)[0x8060a31]
    ruby(ruby_stop+0xe)[0x8060aee]
    ruby[0x8066578]
    ruby(rb_secure+0x0)[0x80521d4]
    /lib/libc.so.6(__libc_start_main+0xdf)[0x405d5f]
    ruby[0x805212d]
    ======= Memory map: ========
    [ ... snipped ... ]

If you remove the *second* "p d.root.properties", then all appears to be
well (i.e. no backtrace).

From that, I tried commenting out this line:

Index: ruby_xml_attr.c
===================================================================
RCS file: /var/cvs/libxml/libxml/ruby_xml_attr.c,v
retrieving revision 1.2
diff -u -r1.2 ruby_xml_attr.c
--- ruby_xml_attr.c 2 Jan 2006 23:19:21 -0000 1.2
+++ ruby_xml_attr.c 16 Jan 2006 10:46:50 -0000
@@ -11,7 +11,7 @@
 ruby_xml_attr_free(ruby_xml_attr *rxa) {
   if (rxa->attr != NULL && !rxa->is_ptr) {
     xmlUnlinkNode((xmlNodePtr)rxa->attr);
-    xmlFreeNode((xmlNodePtr)rxa->attr);
+    //xmlFreeNode((xmlNodePtr)rxa->attr);
     rxa->attr = NULL;
   }

And sure enough, no crash, but running with Valgrind now shows leakage (of
course) going on around the attributes. So from this, what I'm guessing is
(lines may be approx, I have local changes):

  + In 'ruby_xml_node.c', ruby_xml_node_properties_get(...), line 1621,
    we do:

        attr = node->node->properties;
        return(ruby_xml_attr_new(cXMLAttr, node->xd, attr));

  + 'ruby_xml_attr_new' (ruby_xml_attr.c, line 151) looks like:

        VALUE
        ruby_xml_attr_new(VALUE class, VALUE xd, xmlAttrPtr attr) {
          ruby_xml_attr *rxa;

          rxa = ALLOC(ruby_xml_attr);
          rxa->attr = attr;
          rxa->xd = xd;
          rxa->is_ptr = 0;
          return(Data_Wrap_Struct(class, ruby_xml_attr_mark,
                                  ruby_xml_attr_free, rxa));
        }

  + Sooo, multiple calls to .properties will result in two Ruby objects,
    with both having the same rxa->attr. (Setting is_ptr = 1 doesn't seem
    to prevent the crash - I'd expect that to have the same effect as
    commenting out the xmlFreeNode above, but it doesn't seem to. Am I
    way off?)

  + (either...) Given that, when freeing the attributes, free is called
    for both Ruby objects, resulting in calls to Free on the same node,
    which eventually results in a double-free somewhere along the line.
    Although we set rxa->attr = NULL after freeing, the other objects
    have their own ruby_xml_attr (with its own rxa->attr pointer).

  + (or...) xmlAttr references data shared with other xmlAttrs, and
    freeing this twice causes problems.
    Don't know enough yet about how libxml2 works to know if this is the
    case.

I'm not totally au fait yet with libxml2's memory handling, so I'm not
sure exactly how to fix this - any pointers (no pun intended)? I first
thought that we could do something like copying the xmlAttr but I guess
that would drastically increase memory usage, and it doesn't seem the
right way to me. Maybe we can determine whether or not a property has
previously been wrapped (cache the VALUEs on the element node? or
something?) and set is_ptr appropriately?

I'll keep hacking away at it but I'd appreciate any ideas, or insight into
how this might be best handled. I tried following the node set code, which
seems to successfully manage this with the nodes it has, but I'm having
problems seeing what's being done differently there?

Cheers,

--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Tue Jan 17 06:12:56 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Tue, 17 Jan 2006 11:12:56 -0000
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To: <20060115061250.F2C73216E04@mailhost.gigave.com>
References: <20060113203021.73861216FF8@mailhost.gigave.com> <20060115061250.F2C73216E04@mailhost.gigave.com>
Message-ID:

On Sun, 15 Jan 2006 06:12:49 -0000, Sean Chittenden wrote:

>> >> Sorry if this is a(nother) dumb question but I just wanted to see if
>> >> it's
>> >> something someone's seen and solved before?
>> >
>> > I've seen this when ruby-libxml was linked against the wrong version
>> of
>> > libxml, but that's about it. :-/ Using ldd, can you confirm that
>> > you've linked against the correct/current version of libxml? -sc
>> >
>

Further to this, I just tried compiling my local libxml2 (./configure
--with-memory-debug) and ran the tests. This gave a .memdump in libxml2
directory. Then, linking ruby-libxml against it again, I'm getting
enabled_memory_debug?
false and a warning that it's compiled without memory debugging :( So I'm
going to go right back to square one and then get onto the libxml2 mailing
list...

--
Ross Bamford - rosco at roscopeco.co.uk

From sean at gigave.com Tue Jan 17 06:25:04 2006
From: sean at gigave.com (Sean Chittenden)
Date: Tue, 17 Jan 2006 03:25:04 -0800
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To:
References: <20060113203021.73861216FF8@mailhost.gigave.com> <20060115061250.F2C73216E04@mailhost.gigave.com>
Message-ID: <20060117112504.GO3216@mailhost.gigave.com>

> >> >> Sorry if this is a(nother) dumb question but I just wanted to see if
> >> >> it's
> >> >> something someone's seen and solved before?
> >> >
> >> > I've seen this when ruby-libxml was linked against the wrong version
> >> of
> >> > libxml, but that's about it. :-/ Using ldd, can you confirm that
> >> > you've linked against the correct/current version of libxml? -sc
>
> Further to this, I just tried compiling my local libxml2
> (./configure --with-memory-debug) and ran the tests. This gave a
> .memdump in libxml2 directory. Then, linking ruby-libxml against it
> again, I'm getting enabled_memory_debug? false and a warning that
> it's compiled without memory debugging :( So I'm going to go right
> back to square one and then get onto the libxml2 mailing list...

Isn't there a way of setting enable_memory_debug to something !=
false? It's been a while, but... could you post your test script
here? -sc

--
Sean Chittenden

From rosco at roscopeco.co.uk Tue Jan 17 07:20:21 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Tue, 17 Jan 2006 12:20:21 -0000
Subject: [libxml-devel] Libxml memory dump?
In-Reply-To: <20060117112505.91E44216C10@mailhost.gigave.com>
References: <20060113203021.73861216FF8@mailhost.gigave.com> <20060115061250.F2C73216E04@mailhost.gigave.com> <20060117112505.91E44216C10@mailhost.gigave.com>
Message-ID:

On Tue, 17 Jan 2006 11:25:04 -0000, Sean Chittenden wrote:

> Isn't there a way of setting enable_memory_debug to something !=
> false? It's been a while, but... could you post your test script
> here? -sc
>

I figured out why it was reporting 'false' for memory debug now: the
libxml-ruby extconf seems to be ignoring the --with-xml2-dir option, so I
was doing a manual relink but of course that's only half the story. So I
manually edited the Makefile for now to both compile *and* link against
the right library. Sorry for that noise.

So I'm now getting 'true' for XML::Parser.enabled_memory_debug?, but
calling memory_dump still yields no .memdump file. I don't think I can set
enabled_memory_debug - all those 'enabled_xxxx?' methods on parser seem to
just report compile options from libxml. Memory debug is defined like:

    VALUE
    ruby_xml_parser_enabled_memory_debug_location_q(VALUE class) {
    #ifdef DEBUG_MEMORY_LOCATION
      return(Qtrue);
    #else
      return(Qfalse);
    #endif
    }

(This is set in the class as singleton meth 'enabled_memory_debug?')

The script I'm currently testing with follows. I added a couple of printfs
to ruby_xml_attr.c to get a read on this, and also thought I'd cheat with
an 'xmlMemoryDump()' here too. Still doesn't work, so I guess this is not
a problem in the bindings...? Here's the new free func:

Index: ruby_xml_attr.c
===================================================================
RCS file: /var/cvs/libxml/libxml/ruby_xml_attr.c,v
retrieving revision 1.2
diff -u -r1.2 ruby_xml_attr.c
--- ruby_xml_attr.c 2 Jan 2006 23:19:21 -0000 1.2
+++ ruby_xml_attr.c 17 Jan 2006 12:06:38 -0000
@@ -10,10 +10,14 @@
 void
 ruby_xml_attr_free(ruby_xml_attr *rxa) {
   if (rxa->attr != NULL && !rxa->is_ptr) {
+    printf("FREE ATTR! (Ruby struct: %x) %x ... ", rxa, rxa->attr);
     xmlUnlinkNode((xmlNodePtr)rxa->attr);
     xmlFreeNode((xmlNodePtr)rxa->attr);
+    printf("done\n");
     rxa->attr = NULL;
   }
+
+  xmlMemoryDump();

   free(rxa);
 }

Here's the script:

    require '../libxml/libxml'

    xp = XML::Parser.string ''
    d = xp.parse

    def ameth(d)
      p d.root.properties
      p d.root.properties
    end

    ameth(d)

    p XML::Parser.enabled_memory_debug?
    p XML::Parser.memory_dump
    p XML::Parser.memory_used

    # I guessed they'd be gone now, but seems not?
    GC.start
    ObjectSpace.each_object(XML::Attr) do |o|
      puts o.inspect
    end

    puts "END"

And the output:

    $ ruby bug_test2.rb
    #
    #
    true
    true
    0
    #
    #
    END
    FREE ATTR! (Ruby struct: 96c0b90) 96c09c0 ... done
    *** glibc detected *** ruby: double free or corruption (fasttop): 0x096c0a08 ***
    ======= Backtrace: =========
    /lib/libc.so.6[0x454124]
    /lib/libc.so.6(__libc_free+0x77)[0x45465f]
    ruby(ruby_xfree+0x25)[0x806a865]
    /home/rosco/dev/lib/libxml2.so.2(xmlFreeNodeList+0x16b)[0xa4639b]
    /home/rosco/dev/lib/libxml2.so.2(xmlFreeProp+0x52)[0xa46182]
    /home/rosco/dev/lib/libxml2.so.2(xmlFreeNode+0x1c9)[0xa46609]
    ../libxml/libxml.so(ruby_xml_attr_free+0x6a)[0xbeabba]
    ruby(rb_gc_call_finalizer_at_exit+0x7d)[0x806b21d]
    ruby[0x8052902]
    ruby(ruby_cleanup+0xd1)[0x8060a31]
    ruby(ruby_stop+0xe)[0x8060aee]
    ruby[0x8066578]
    ruby(rb_secure+0x0)[0x80521d4]
    /lib/libc.so.6(__libc_start_main+0xdf)[0x405d5f]
    ruby[0x805212d]
    ======= Memory map: ========
    [.. snipped ..]
    FREE ATTR! (Ruby struct: 96c0b48) 96c09c0 ... Aborted

--
Ross Bamford - rosco at roscopeco.co.uk

From dan at jloreview.com Mon Jan 23 14:58:57 2006
From: dan at jloreview.com (Dan Check)
Date: Mon, 23 Jan 2006 14:58:57 -0500
Subject: [libxml-devel] libxml + Ruby
Message-ID: <43D53581.4030208@jloreview.com>

Hi,

I'm currently working through how to handle a large (~1 GB) xml file in
ruby, and I'm trying to get SAX parsing to work in libxml.
My file is a collection of roughly 4 million nodes, and what I've been
doing is using REXML's SAX library to fire off individual nodes to libxml,
where they are handled as DOM objects (which allows for XPath querying).
This prevents memory starvation and paging and all that, but REXML is
terribly slow.

I'd really like to use libxml's SAX Parser, but I can't find any
documentation for it, and looking through the source code, I'm at a loss
to see how to create a listener. Any help on this would be appreciated.

Thanks,
Dan

From sean at gigave.com Mon Jan 23 15:27:50 2006
From: sean at gigave.com (Sean Chittenden)
Date: Mon, 23 Jan 2006 12:27:50 -0800
Subject: [libxml-devel] libxml + Ruby
In-Reply-To: <43D53581.4030208@jloreview.com>
References: <43D53581.4030208@jloreview.com>
Message-ID: <20060123202750.GE11012@mailhost.gigave.com>

> I'm currently working through how to handle a large (~1 GB ) xml
> file in ruby, and I'm trying to get SAX parsing to work in libxml. My
> file is a collection of roughly 4 million nodes, and what I've been
> doing is using REXML's SAX library to fire off individual nodes
> to libxml, where they are handled as DOM objects (which allows for
> XPath querying). This prevents memory starvation and paging and all
> that, but REXML is terribly slow.
>
> I'd really like to use libxml's SAX Parser, but I can't find any
> documentation for it, and looking through the source code, I'm at a
> loss to see how to create a listener. Any help on this would be
> appreciated.

At the moment, the SAX interface for libxml isn't complete. It will
be, I just haven't had a reason to get to it yet. Patches anyone? :)

--
Sean Chittenden