From aaron at tenderlovemaking.com Thu May 7 02:29:41 2009
From: aaron at tenderlovemaking.com (Aaron Patterson)
Date: Wed, 6 May 2009 23:29:41 -0700
Subject: [Nokogiri-talk] [ANN] nokogiri 1.3.0rc1 has been released!
Message-ID: <20090507062941.GA38769@Jordan.local>

nokogiri version 1.3.0rc1 has been released!

Thanks to herculean efforts by my nokogiri partner in crime, Mike Dalessio,
nokogiri now works on JRuby 1.3.0RC1 via FFI.

To install this prerelease gem do this:

  $ jgem install nokogiri -s http://tenderlovemaking.com/

Then you should be able to do this:

  $ jirb
  irb(main):001:0> require 'open-uri'
  => true
  irb(main):002:0> require 'rubygems'
  => true
  irb(main):003:0> require 'nokogiri'
  => true
  irb(main):004:0> doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))
  => #
  irb(main):005:0> doc.css('h3.r a.l').length
  => 10
  irb(main):006:0>

== CAVEATS!

* The JRuby FFI gem only works with JRuby 1.3.0RC1
* You MUST install it from my gem server
* The gem version will say 1.2.4, that is actually because I couldn't get
  pre release gem versions working.  Don't worry, it's actually the 1.3.0
  release candidate.
* You can get an MRI version and the JRuby version from my gem server, no
  windows support yet.

== ACCOLADES

* Mike made this FFI monster happen!  I can't thank him enough.
* Thanks to the JRuby team for making FFI work!

== CHANGELOG

* hahahahahaha
* hahahahahahaha
* hahahaha
* hahahahahahahha
* You'll get to see the actual changes when this isn't a release candidate
* Or check out the git repository

== More information

* http://github.com/tenderlove/nokogiri
* http://nokogiri.rubyforge.org/

-- 
Aaron Patterson
http://tenderlovemaking.com/

From phlip2005 at gmail.com Thu May 7 09:55:35 2009
From: phlip2005 at gmail.com (Phlip)
Date: Thu, 07 May 2009 06:55:35 -0700
Subject: [Nokogiri-talk] [ANN] nokogiri 1.3.0rc1 has been released!
In-Reply-To: <20090507062941.GA38769@Jordan.local>
References: <20090507062941.GA38769@Jordan.local>
Message-ID: <4A02E857.6030808@gmail.com>

>   $ jgem install nokogiri -s http://tenderlovemaking.com/

(I can hack the repo myself, but..) wouldn't "gem install" work for Vanilla
Ruby?

From aaron.patterson at gmail.com Thu May 7 11:15:45 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 7 May 2009 08:15:45 -0700
Subject: [Nokogiri-talk] [ANN] nokogiri 1.3.0rc1 has been released!
In-Reply-To: <4A02E857.6030808@gmail.com>
References: <20090507062941.GA38769@Jordan.local>
	<4A02E857.6030808@gmail.com>
Message-ID: <6959e1680905070815ue83917fn3f87bac50208ef31@mail.gmail.com>

On Thu, May 7, 2009 at 6:55 AM, Phlip wrote:
>>   $ jgem install nokogiri -s http://tenderlovemaking.com/
>
> (I can hack the repo myself, but..) wouldn't "gem install" work for Vanilla
> Ruby?

Ya, it should work fine.  I pushed a jruby version and a version for
MRI.  The only difference between the two is that the jruby version
doesn't attempt to compile the nokogiri native extensions.

-- 
Aaron Patterson
http://tenderlovemaking.com/

From phlip2005 at gmail.com Thu May 7 12:04:13 2009
From: phlip2005 at gmail.com (Phlip)
Date: Thu, 7 May 2009 09:04:13 -0700
Subject: [Nokogiri-talk] v1.2.4 broke assert_xhtml!
Message-ID: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>

>> (I can hack the repo myself, but..) wouldn't "gem install" work for Vanilla
>> Ruby?
>
> Ya, it should work fine.  I pushed a jruby version and a version for
> MRI.  The only difference between the two is that the jruby version
> doesn't attempt to compile the nokogiri native extensions.

I got bigger fry to fish. The two or three people using assert_xhtml
might have noticed that v1.2.4 breaks it. I don't know why yet.

Sorry _I_ didn't notice (the slight matter of an unexpected job hunt
put my hobbies on the back burner!). So now I'm off to debug against
1.2.4 before even getting to your RC.
Keep up the (honestly!) good work!

-- 
  Phlip

From phlip2005 at gmail.com Thu May 7 13:00:15 2009
From: phlip2005 at gmail.com (Phlip)
Date: Thu, 07 May 2009 10:00:15 -0700
Subject: [Nokogiri-talk] v1.2.4 broke assert_xhtml!
In-Reply-To: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>
References: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>
Message-ID: <4A03139F.1070503@gmail.com>

> I got bigger fry to fish. The two or three people using assert_xhtml
> might have noticed that v1.2.4 breaks it. I don't know why yet.

I fixed it - here's the patch. My assert_xhtml needs this to pass. This
test is not for Nokogiri Prime:

  def test_nokogiri_builder_likes_bangs
    built = Nokogiri::HTML::Builder.new{ harlequin! }
    assert{ built.doc.to_html =~ /harlequin\!/ }
  end

Users of Nokogiri do not need the !, they need their node. I need the
bang - essentially so I can present the _same_ interface as Nokogiri,
but I can see the bang and make use of it with verbose!, without!, etc.

Long term, we might want to decorate the node, and say "this node had a
bang on it", like node.bang? or something. Or maybe
node.original_tag == 'harlequin!'.

For now, I'm going with this MP:

  module Nokogiri
    module XML
      class Builder

        def cleanse_element_name(method)
          method.to_s.sub(/[_]$/, '')  # or [_!]
        end  # monkey patch me!

        def method_missing method, *args, &block # :nodoc:
          if @context && @context.respond_to?(method)
            @context.send(method, *args, &block)
          else
            node = Nokogiri::XML::Node.new(cleanse_element_name(method), @doc) { |n|
              args.each do |arg|
                case arg
                when Hash
                  arg.each { |k,v| n[k.to_s] = v.to_s }
                else
                  n.content = arg
                end
              end
            }
            insert(node, &block)
          end
        end
      end
    end
  end

Notice I extract-method-refactored cleanse_element_name(). This is a
suggestion to Aaron that the next version of N should accept a monkey
patch there. Maybe different users have different bang needs, for
example. And it's safer to MP a one-liner than a big method!
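To make the "one-liner hook" idea concrete, here is a minimal pure-Ruby sketch. It does not use Nokogiri at all: TinyBuilder, BangKeepingBuilder and element_name_for are hypothetical names invented for illustration, and the sketch overrides the hook in a subclass rather than monkey patching, which is an assumption about how a client might prefer to do it.

```ruby
# Hypothetical miniature of the builder; not Nokogiri's real class.
class TinyBuilder
  # The one-line hook: strip a single trailing "_" or "!".
  def cleanse_element_name(method)
    method.to_s.sub(/[_!]$/, '')
  end

  # Stands in for the larger method_missing body, which stays untouched.
  def element_name_for(method)
    cleanse_element_name(method)
  end
end

# A client that needs to see the bang overrides only the hook.
class BangKeepingBuilder < TinyBuilder
  def cleanse_element_name(method)
    method.to_s.sub(/_$/, '')
  end
end

p TinyBuilder.new.element_name_for(:harlequin!)        # => "harlequin"
p BangKeepingBuilder.new.element_name_for(:harlequin!) # => "harlequin!"
p BangKeepingBuilder.new.element_name_for(:id_)        # => "id"
```

Because only the small hook changes, the larger dispatch method is never copied or rewritten by the client.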
If Aaron applies that patch, he needs to put the ! back into
cleanse_element_name().

So, end of crisis, thanks y'all, and the next version of assert2 will
have the fix for assert_xhtml!

-- 
  Phlip

From phlip2005 at gmail.com Thu May 7 12:41:24 2009
From: phlip2005 at gmail.com (Phlip)
Date: Thu, 07 May 2009 09:41:24 -0700
Subject: [Nokogiri-talk] v1.2.4 broke assert_xhtml!
In-Reply-To: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>
References: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>
Message-ID: <4A030F34.70701@gmail.com>

> I got bigger fry to fish. The two or three people using assert_xhtml
> might have noticed that v1.2.4 breaks it. I don't know why yet.

Builder strips ! off the end of nodes. Put another way, Aaron's patch for
the "node with the same name as a method" problem conflicts with my patch,
because sometimes I need to see the ! !

I will monkey patch a fix, and then submit the monkey to the committee. (-;

From aaron.patterson at gmail.com Thu May 7 14:30:13 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 7 May 2009 11:30:13 -0700
Subject: [Nokogiri-talk] v1.2.4 broke assert_xhtml!
In-Reply-To: <4A030F34.70701@gmail.com>
References: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>
	<4A030F34.70701@gmail.com>
Message-ID: <6959e1680905071130s7c80c68bwfaf9df3ca8efa386@mail.gmail.com>

On Thu, May 7, 2009 at 9:41 AM, Phlip wrote:
>> I got bigger fry to fish. The two or three people using assert_xhtml
>> might have noticed that v1.2.4 breaks it. I don't know why yet.
>
> Builder strips ! off the end of nodes. Put another way, Aaron's patch for
> the "node with the same name as a method" problem conflicts with my patch,
> because sometimes I need to see the ! !

It only strips the last one.  So you can do this:

  xml.foo!!   # => a node with one !
  xml.foo!_   # => a node with one !
  xml.foo!!!  # => a node with two !
  xml.foo!!_  # => a node with two !
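The "only the last one" rule can be checked without Nokogiri. The sketch below is an assumption-laden stand-in: NameRecorder is a hypothetical class that applies the same single-character strip inside method_missing, and the calls go through public_send with quoted symbols so the example does not depend on how Ruby's parser treats names with several trailing bangs.

```ruby
# Hypothetical stand-in for the builder: records the element name that
# would be created after one trailing "_" or "!" is stripped.
class NameRecorder
  attr_reader :names

  def initialize
    @names = []
  end

  def method_missing(method, *_args)
    @names << method.to_s.sub(/[_!]$/, '')
    self
  end

  def respond_to_missing?(_method, _include_private = false)
    true
  end
end

xml = NameRecorder.new
xml.public_send(:"foo!!")   # one "!" survives
xml.public_send(:"foo!_")   # one "!" survives
xml.public_send(:"foo!!!")  # two "!" survive
xml.public_send(:foo!)      # nothing survives
p xml.names  # => ["foo!", "foo!", "foo!!", "foo"]
```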
-- 
Aaron Patterson
http://tenderlovemaking.com/

From phlip2005 at gmail.com Thu May 7 14:50:47 2009
From: phlip2005 at gmail.com (Phlip)
Date: Thu, 07 May 2009 11:50:47 -0700
Subject: [Nokogiri-talk] v1.2.4 broke assert_xhtml!
In-Reply-To: <6959e1680905071130s7c80c68bwfaf9df3ca8efa386@mail.gmail.com>
References: <860c114f0905070904p1c6897a7jb6621fb76cef02c5@mail.gmail.com>
	<4A030F34.70701@gmail.com>
	<6959e1680905071130s7c80c68bwfaf9df3ca8efa386@mail.gmail.com>
Message-ID: <4A032D87.5050602@gmail.com>

Aaron Patterson wrote:

> It only strips the last one.  So you can do this:
>
>   xml.foo!!   # => a node with one !
>   xml.foo!_   # => a node with one !

.foo!! appears to be illegal. Matz only "maximum munches" one trailing
punctuation there!

Actually, that's a "minimum munch" (-:

From pyurify at gmail.com Thu May 7 14:59:38 2009
From: pyurify at gmail.com (pYuri.fy)
Date: Thu, 7 May 2009 14:59:38 -0400
Subject: [Nokogiri-talk] Spaces in predicate searches
Message-ID: <8c2311470905071159k22a16f5cv42738a035775b795@mail.gmail.com>

When I perform a predicate search like:
  input[@name^="as_"]

if there is any spaces in the predicate, I get an error:
  Nokogiri::XML::XPath::SyntaxError: Invalid predicate

if I remove the spaces, the search runs without any problems.

Is this intentional?
---------------------------------

require 'nokogiri'
require 'open-uri'

url = 'http://www.google.com/advanced_search?hl=en'
doc = Nokogiri.parse( open(url).read )

list = [
  'input[@name^="as_"]',    # okay
  'input[@name^= "as_"]',   # error
  'input[@name ^="as_"]',   # error
  'input[@name ^= "as_"]'   # error
]

list.each do | css |
  begin
    doc.search( css )
    puts "#{css} - okay"
  rescue StandardError => e
    puts "#{css} - #{e.message}"
  end
end

From aaron.patterson at gmail.com Thu May 7 17:14:36 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 7 May 2009 14:14:36 -0700
Subject: [Nokogiri-talk] Spaces in predicate searches
In-Reply-To: <8c2311470905071159k22a16f5cv42738a035775b795@mail.gmail.com>
References: <8c2311470905071159k22a16f5cv42738a035775b795@mail.gmail.com>
Message-ID: <6959e1680905071414t5fbf03fel198d2982b02d584e@mail.gmail.com>

On Thu, May 7, 2009 at 11:59 AM, pYuri.fy wrote:
> When I perform a predicate search like:
>   input[@name^="as_"]
>
> if there is any spaces in the predicate, I get an error:
>   Nokogiri::XML::XPath::SyntaxError: Invalid predicate
>
> if I remove the spaces, the search runs without any problems.
>
> Is this intentional?

Nope, this is definitely a bug.  I will fix it.
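For releases that still have this bug, one possible workaround is to strip the whitespace around the attribute operator before handing the selector to Nokogiri. The following is a minimal pure-Ruby sketch; tighten_attribute_selector and ATTR_OPERATOR_SPACING are hypothetical helpers written for illustration, not part of Nokogiri, and they only handle the simple [name op "value"] shape shown in this thread.

```ruby
# Matches "[", an attribute name (optionally with a leading "@"), an
# attribute operator (=, ^=, $=, *=, ~=, |=) and any whitespace around them.
# The quoted value that follows is deliberately left untouched.
ATTR_OPERATOR_SPACING = /\[\s*(@?[\w:-]+)\s*([~|^$*]?=)\s*/

# Hypothetical helper: returns the selector with the spacing removed.
def tighten_attribute_selector(css)
  css.gsub(ATTR_OPERATOR_SPACING) do
    "[#{Regexp.last_match(1)}#{Regexp.last_match(2)}"
  end
end

selectors = [
  'input[@name^="as_"]',
  'input[@name^= "as_"]',
  'input[@name ^="as_"]',
  'input[@name ^= "as_"]'
]

selectors.each do |css|
  puts tighten_attribute_selector(css)  # each line prints input[@name^="as_"]
end
```

The normalized string can then be passed to doc.search as in the script above. A selector whose quoted value itself contains something shaped like "[name = " would also be rewritten, so this is only suitable for simple selectors.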
-- 
Aaron Patterson
http://tenderlovemaking.com/

From aaron.patterson at gmail.com Thu May 7 17:59:25 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 7 May 2009 14:59:25 -0700
Subject: [Nokogiri-talk] Spaces in predicate searches
In-Reply-To: <6959e1680905071414t5fbf03fel198d2982b02d584e@mail.gmail.com>
References: <8c2311470905071159k22a16f5cv42738a035775b795@mail.gmail.com>
	<6959e1680905071414t5fbf03fel198d2982b02d584e@mail.gmail.com>
Message-ID: <6959e1680905071459m2c13649ck71f3bd63d186017d@mail.gmail.com>

On Thu, May 7, 2009 at 2:14 PM, Aaron Patterson wrote:
> On Thu, May 7, 2009 at 11:59 AM, pYuri.fy wrote:
>> When I perform a predicate search like:
>>   input[@name^="as_"]
>>
>> if there is any spaces in the predicate, I get an error:
>>   Nokogiri::XML::XPath::SyntaxError: Invalid predicate
>>
>> if I remove the spaces, the search runs without any problems.
>>
>> Is this intentional?
>
> Nope, this is definitely a bug.  I will fix it.

Fixed here:

  http://github.com/tenderlove/nokogiri/commit/21c3478b25ac96ff4a36a1958461012d38a78741

This will be released with 1.3.0

-- 
Aaron Patterson
http://tenderlovemaking.com/

From xavi.caballe at gmail.com Tue May 12 12:50:32 2009
From: xavi.caballe at gmail.com (Xavi Caballé)
Date: Tue, 12 May 2009 18:50:32 +0200
Subject: [Nokogiri-talk] problem after upgrading from 1.1.0 to 1.2.3
Message-ID: <40fdbe870905120950x5c8932f4q4ef024441de7b322@mail.gmail.com>

This is on a Mac, running Leopard. I was using Nokogiri 1.1.0 and I've
upgraded to 1.2.3 . Now when I run the first example...

require 'open-uri'
require 'nokogiri'

# Perform a google search
doc = Nokogiri::HTML(open('http://google.com/search?q=tenderlove'))

# Print out each link using a CSS selector
doc.css('h3.r > a.l').each do |link|
  puts link.content
end


...it doesn't print anything. If I print the value of doc (puts doc),
this is what's displayed...




and that's all!
I guess that's what's not expected, and now I have a
broken installation of Nokogiri. Any idea on what can be the problem?
I would really appreciate it.
Thanks.

xavi

From aaron.patterson at gmail.com Tue May 12 13:36:16 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Tue, 12 May 2009 10:36:16 -0700
Subject: [Nokogiri-talk] problem after upgrading from 1.1.0 to 1.2.3
In-Reply-To: <40fdbe870905120950x5c8932f4q4ef024441de7b322@mail.gmail.com>
References: <40fdbe870905120950x5c8932f4q4ef024441de7b322@mail.gmail.com>
Message-ID: <6959e1680905121036w7736d1e7le087fedf30931d7@mail.gmail.com>

On Tue, May 12, 2009 at 9:50 AM, Xavi Caballé wrote:
> This is on a Mac, running Leopard. I was using Nokogiri 1.1.0 and I've
> upgraded to 1.2.3 . Now when I run the first example...
>
> require 'open-uri'
> require 'nokogiri'
>
> # Perform a google search
> doc = Nokogiri::HTML(open('http://google.com/search?q=tenderlove'))
>
> # Print out each link using a CSS selector
> doc.css('h3.r > a.l').each do |link|
>   puts link.content
> end
>
>
> ...it doesn't print anything. If I print the value of doc (puts doc),
> this is what's displayed...
>
>
>
>
> and that's all! I guess that's what's not expected, and now I have a
> broken installation of Nokogiri. Any idea on what can be the problem?
> I would really appreciate it.
> Thanks.

This seems to be working for me.  Can you try one more time?  Also,
please send along the results of these lines:

  Nokogiri::LIBXML_VERSION
  Nokogiri::LIBXML_PARSER_VERSION

I suspect that you're running with an out of date version of libxml2.  Thanks!
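The two constants describe the same thing in different notations. As a rough illustration, and assuming libxml2's usual packing of major * 10000 + minor * 100 + patch for the parser version number, a packed value can be turned back into a dotted version like this (decode_libxml_parser_version is a hypothetical helper, not a Nokogiri method):

```ruby
# Assumption: the packed number is major * 10000 + minor * 100 + patch.
# Hypothetical helper, written only to illustrate the encoding.
def decode_libxml_parser_version(packed)
  number = Integer(packed.to_s, 10)
  major, rest = number.divmod(10_000)
  minor, patch = rest.divmod(100)
  "#{major}.#{minor}.#{patch}"
end

puts decode_libxml_parser_version('20616')  # prints 2.6.16
puts decode_libxml_parser_version(20703)    # prints 2.7.3
```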
-- 
Aaron Patterson
http://tenderlovemaking.com/

From xavi.caballe at gmail.com Tue May 12 14:38:22 2009
From: xavi.caballe at gmail.com (Xavi Caballé)
Date: Tue, 12 May 2009 20:38:22 +0200
Subject: [Nokogiri-talk] problem after upgrading from 1.1.0 to 1.2.3
In-Reply-To: <6959e1680905121036w7736d1e7le087fedf30931d7@mail.gmail.com>
References: <40fdbe870905120950x5c8932f4q4ef024441de7b322@mail.gmail.com>
	<6959e1680905121036w7736d1e7le087fedf30931d7@mail.gmail.com>
Message-ID: <40fdbe870905121138j4be6c42bi2e48a89a41163017@mail.gmail.com>

thank you very much for your prompt response!

I'm not in front of the computer with the problem now, but I'll try it
again and I'll send the info you requested tomorrow first thing in the
morning (in Barcelona).

xavi

On Tue, May 12, 2009 at 7:36 PM, Aaron Patterson wrote:
> On Tue, May 12, 2009 at 9:50 AM, Xavi Caballé wrote:
>> This is on a Mac, running Leopard. I was using Nokogiri 1.1.0 and I've
>> upgraded to 1.2.3 . Now when I run the first example...
>>
>> require 'open-uri'
>> require 'nokogiri'
>>
>> # Perform a google search
>> doc = Nokogiri::HTML(open('http://google.com/search?q=tenderlove'))
>>
>> # Print out each link using a CSS selector
>> doc.css('h3.r > a.l').each do |link|
>>   puts link.content
>> end
>>
>>
>> ...it doesn't print anything. If I print the value of doc (puts doc),
>> this is what's displayed...
>>
>>
>>
>>
>> and that's all! I guess that's what's not expected, and now I have a
>> broken installation of Nokogiri. Any idea on what can be the problem?
>> I would really appreciate it.
>> Thanks.
>
> This seems to be working for me.  Can you try one more time?  Also,
> please send along the results of these lines:
>
>   Nokogiri::LIBXML_VERSION
>   Nokogiri::LIBXML_PARSER_VERSION
>
> I suspect that you're running with an out of date version of libxml2.  Thanks!
>
> --
> Aaron Patterson
> http://tenderlovemaking.com/
> _______________________________________________
> Nokogiri-talk mailing list
> Nokogiri-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>

From xavi.caballe at gmail.com Wed May 13 05:55:02 2009
From: xavi.caballe at gmail.com (Xavi Caballé)
Date: Wed, 13 May 2009 11:55:02 +0200
Subject: [Nokogiri-talk] problem after upgrading from 1.1.0 to 1.2.3
In-Reply-To: <40fdbe870905121138j4be6c42bi2e48a89a41163017@mail.gmail.com>
References: <40fdbe870905120950x5c8932f4q4ef024441de7b322@mail.gmail.com>
	<6959e1680905121036w7736d1e7le087fedf30931d7@mail.gmail.com>
	<40fdbe870905121138j4be6c42bi2e48a89a41163017@mail.gmail.com>
Message-ID: <40fdbe870905130255r7a736ca9o37fdbe0b67ac083a@mail.gmail.com>

LIBXML: 2.6.16
LIBXML_PARSER: 20616

Should I upgrade? Any pointer on how to do this?
Thanks!

xavi


On Tue, May 12, 2009 at 8:38 PM, Xavi Caballé wrote:
> thank you very much for your prompt response!
>
> I'm not in front of the computer with the problem now, but I'll try it
> again and I'll send the info you requested tomorrow first thing in the
> morning (in Barcelona).
>
> xavi
>
> On Tue, May 12, 2009 at 7:36 PM, Aaron Patterson
> wrote:
>> On Tue, May 12, 2009 at 9:50 AM, Xavi Caballé wrote:
>>> This is on a Mac, running Leopard. I was using Nokogiri 1.1.0 and I've
>>> upgraded to 1.2.3 . Now when I run the first example...
>>>
>>> require 'open-uri'
>>> require 'nokogiri'
>>>
>>> # Perform a google search
>>> doc = Nokogiri::HTML(open('http://google.com/search?q=tenderlove'))
>>>
>>> # Print out each link using a CSS selector
>>> doc.css('h3.r > a.l').each do |link|
>>>   puts link.content
>>> end
>>>
>>>
>>> ...it doesn't print anything. If I print the value of doc (puts doc),
>>> this is what's displayed...
>>>
>>>
>>>
>>>
>>> and that's all! I guess that's what's not expected, and now I have a
>>> broken installation of Nokogiri. Any idea on what can be the problem?
>>> I would really appreciate it.
>>> Thanks.
>>
>> This seems to be working for me.  Can you try one more time?  Also,
>> please send along the results of these lines:
>>
>>   Nokogiri::LIBXML_VERSION
>>   Nokogiri::LIBXML_PARSER_VERSION
>>
>> I suspect that you're running with an out of date version of libxml2.  Thanks!
>>
>> --
>> Aaron Patterson
>> http://tenderlovemaking.com/
>> _______________________________________________
>> Nokogiri-talk mailing list
>> Nokogiri-talk at rubyforge.org
>> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>>
>

From xavi.caballe at gmail.com Wed May 13 09:56:18 2009
From: xavi.caballe at gmail.com (Xavi Caballé)
Date: Wed, 13 May 2009 15:56:18 +0200
Subject: [Nokogiri-talk] problem after upgrading from 1.1.0 to 1.2.3
In-Reply-To: <618c07250905130554t617617d4x79e31987bc2195d7@mail.gmail.com>
References: <40fdbe870905120950x5c8932f4q4ef024441de7b322@mail.gmail.com>
	<6959e1680905121036w7736d1e7le087fedf30931d7@mail.gmail.com>
	<40fdbe870905121138j4be6c42bi2e48a89a41163017@mail.gmail.com>
	<40fdbe870905130255r7a736ca9o37fdbe0b67ac083a@mail.gmail.com>
	<618c07250905130554t617617d4x79e31987bc2195d7@mail.gmail.com>
Message-ID: <40fdbe870905130656iee7c9cdvd467b2d423c93409@mail.gmail.com>

It worked!

Thanks,
xavi

On Wed, May 13, 2009 at 2:54 PM, Mike Dalessio wrote:
> $ sudo port install libxml2
> $ sudo gem install nokogiri
> it's important that you reinstall nokogiri after upgrading libxml2.
>
> On Wed, May 13, 2009 at 5:55 AM, Xavi Caballé
> wrote:
>>
>> LIBXML: 2.6.16
>> LIBXML_PARSER: 20616
>>
>> Should I upgrade? Any pointer on how to do this?
>> Thanks!
>>
>> xavi
>>
>>
>> On Tue, May 12, 2009 at 8:38 PM, Xavi Caballé
>> wrote:
>> > thank you very much for your prompt response!
>> >
>> > I'm not in front of the computer with the problem now, but I'll try it
>> > again and I'll send the info you requested tomorrow first thing in the
>> > morning (in Barcelona).
>> >
>> > xavi
>> >
>> > On Tue, May 12, 2009 at 7:36 PM, Aaron Patterson
>> > wrote:
>> >> On Tue, May 12, 2009 at 9:50 AM, Xavi Caballé
>> >> wrote:
>> >>> This is on a Mac, running Leopard. I was using Nokogiri 1.1.0 and I've
>> >>> upgraded to 1.2.3 . Now when I run the first example...
>> >>>
>> >>> require 'open-uri'
>> >>> require 'nokogiri'
>> >>>
>> >>> # Perform a google search
>> >>> doc = Nokogiri::HTML(open('http://google.com/search?q=tenderlove'))
>> >>>
>> >>> # Print out each link using a CSS selector
>> >>> doc.css('h3.r > a.l').each do |link|
>> >>>   puts link.content
>> >>> end
>> >>>
>> >>>
>> >>> ...it doesn't print anything. If I print the value of doc (puts doc),
>> >>> this is what's displayed...
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> and that's all! I guess that's what's not expected, and now I have a
>> >>> broken installation of Nokogiri. Any idea on what can be the problem?
>> >>> I would really appreciate it.
>> >>> Thanks.
>> >>
>> >> This seems to be working for me.  Can you try one more time?  Also,
>> >> please send along the results of these lines:
>> >>
>> >>   Nokogiri::LIBXML_VERSION
>> >>   Nokogiri::LIBXML_PARSER_VERSION
>> >>
>> >> I suspect that you're running with an out of date version of libxml2.
>> >>  Thanks!
>> >>
>> >> --
>> >> Aaron Patterson
>> >> http://tenderlovemaking.com/
>> >> _______________________________________________
>> >> Nokogiri-talk mailing list
>> >> Nokogiri-talk at rubyforge.org
>> >> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>> >>
>> >
>> _______________________________________________
>> Nokogiri-talk mailing list
>> Nokogiri-talk at rubyforge.org
>> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>
>
>
> --
> mike dalessio
> mike at csa.net
>
> _______________________________________________
> Nokogiri-talk mailing list
> Nokogiri-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>
>

From aaron.patterson at gmail.com Thu May 14 18:00:08 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Thu, 14 May 2009 15:00:08 -0700
Subject: [Nokogiri-talk] Namespaces giving me headaches
In-Reply-To: <88b3725b0905141417i4ecba59ag7377c3dc7aea1e07@mail.gmail.com>
References: <88b3725b0905141417i4ecba59ag7377c3dc7aea1e07@mail.gmail.com>
Message-ID: <6959e1680905141500k6436b48bx52e024aab0eaf941@mail.gmail.com>

Hi Bob!

On Thu, May 14, 2009 at 2:17 PM, Bob Fitterman wrote:
> I'm having a few problems with namespaces. The first one I want to bring up
> is this example. I've reduced the XML from a real feed from Feedburner. If
> you read through it, you'll see that this tag: "<thr:total
> xmlns:thr="http://purl.org/syndication/thread/1.0">71</thr:total>" can't be
> referred to by my xpath unless I first explicitly merge in the thr
> namespace. I'm not all that familiar with namespaces, but it seems odd that
> the tag is allowed to declare its namespace right inside the tag that's
> using the namespace. And it seems even odder that I'd have to first parse
> for the namespace, and then tell the XML parser it's needed. Am I missing
> something?

Yes, I agree, this namespace behavior is strange, but it is also legal
(according to spec).  You should be able to know the URL before
parsing the document, so hard coding the url is OK.
In fact, if the namespace url in your feed changes, that means the
fundamental format of your XML has probably changed.

Nokogiri tries to be helpful by automatically registering the
namespaces on the root node for you.  This seems to fit the 80% case.
Unfortunately for your case, you must register the URL:

  doc = Nokogiri::XML(XML)
  totals = doc.xpath('//foo:total', doc.root.namespaces.merge(
    'foo' => "http://purl.org/syndication/thread/1.0"
  ))
  puts totals.length

I've written an article to hopefully help with namespace confusion.
Maybe it can help you:

  http://tenderlovemaking.com/2009/04/23/namespaces-in-xml/

Feel free to ask more questions if you've got them!

-- 
Aaron Patterson
http://tenderlovemaking.com/

From aaron.patterson at gmail.com Fri May 15 13:28:55 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Fri, 15 May 2009 10:28:55 -0700
Subject: [Nokogiri-talk] Namespaces giving me headaches
In-Reply-To: <88b3725b0905150736k1fdeb2f8je3455504d0a1cd84@mail.gmail.com>
References: <88b3725b0905141417i4ecba59ag7377c3dc7aea1e07@mail.gmail.com>
	<6959e1680905141500k6436b48bx52e024aab0eaf941@mail.gmail.com>
	<88b3725b0905150736k1fdeb2f8je3455504d0a1cd84@mail.gmail.com>
Message-ID: <6959e1680905151028o3e5cc32bu9dcd3339352d3cb6@mail.gmail.com>

On Fri, May 15, 2009 at 7:36 AM, Bob Fitterman wrote:
> Aaron, thanks. In fact, I am writing a general-purpose utility where
> individuals will put in some xpaths to get at their data, so it's not a
> matter of the feed changing, it's actually more an issue of the feed handler
> trying to discern the structure so the user doesn't have to do any extra
> work. Your response (and the excellent article) give me a couple of ideas
> how to get around this.
>
> Problem #2 with Namespaces: I don't have any examples from the wild but I
> know it is possible to have attributes that are prefixed by a namespace.
> For example, you might have an attribute that's called "atom:foo".
> My naive users would expect to access it by specifying an attribute with
> that name. If you go through the example you'll see that what comes out at
> the end is just plain "foo". Am I doing something wrong?

No, you're not doing anything wrong.  Right now we don't have any
methods to fetch an attribute with a specific namespace.

I have a hard time imagining what the use case would be for adding
namespaces to attributes.  I guess that would be if someone wanted to
have duplicate attribute names, but have the values change depending
on the namespace.  The namespace would let you avoid attribute name
collisions.

>
> require 'nokogiri'
> XML = %{
>
>
>
>       atom:foo="bar">tag:blogger.com,1999:blog-1932214040062195180
>       xmlns:thr="http://purl.org/syndication/thread/1.0">71
>
>
> }
>
> doc = Nokogiri.XML(XML)
> items = doc/'item'
> items.first.xpath('./atom:id')
> items.first.xpath('./atom:id').first.attributes
>
> The output from the last line is:  {"foo"=>bar}

Yes, this makes sense.  The *name* of the attribute is "foo".  It
sounds like you are providing an abstraction on top of nokogiri, so I
would suggest for now that if one of your users specifies a namespace
for an attribute, just strip it out.  This will cause problems only in
situations where you have a node that contains two attributes with the
same name but different namespaces.

*or*

If you're worried about that edge case, you could examine each of the
attribute nodes:

  require 'nokogiri'
  XML = %{ hello }

  doc = Nokogiri.XML(XML)
  item = doc.xpath('//atom:id').first
  item.attribute_nodes.each do |a_node|
    puts "#{a_node.namespace || 'nil'} : #{a_node.name} => #{a_node.value}"
  end

I've created a ticket to deal with this problem here:

  http://github.com/tenderlove/nokogiri/issues#issue/53

We'll make the interface better for this situation in 1.3.0.
-- 
Aaron Patterson
http://tenderlovemaking.com/

From sprsquish at gmail.com Wed May 20 18:21:22 2009
From: sprsquish at gmail.com (Jeff Smick)
Date: Wed, 20 May 2009 15:21:22 -0700
Subject: [Nokogiri-talk] Building documents with namespaces
Message-ID: <669ac2d0905201521k4dd65318ya45fa0556004f34a@mail.gmail.com>

Hey guys,

I wrote an XMPP library that uses the push parser to receive data from
the wire and reconstruct the stanzas one node at a time. I'm running
into some major issues with namespaces though.

Here's an example of what's happening (the code from is below):
http://gist.github.com/114203

I'm digging around the libxml2 docs and libxml-ruby's code trying to
figure out what's going on because the library's currently written on
libxml-ruby and all the namespacing works. I want to move to Nokogiri
though.

Thanks for any help!
--Jeff

require 'nokogiri'
include Nokogiri::XML

root = Node.new('root', Document.new)
root.add_namespace nil, 'default:ns'
root.add_namespace 'prefix', 'prefix:ns'
root << (child = Node.new('child', root.document))
puts root
#
#
#

root = Node.new('root', Document.new)
root.add_namespace 'prefix', 'prefix:ns'
root.add_namespace nil, 'default:ns'
root << (child = Node.new('child', root.document))
puts root
#
#
#

# Desired result:
#
#
#

From aaron.patterson at gmail.com Wed May 20 19:54:41 2009
From: aaron.patterson at gmail.com (Aaron Patterson)
Date: Wed, 20 May 2009 16:54:41 -0700
Subject: [Nokogiri-talk] Building documents with namespaces
In-Reply-To: <669ac2d0905201521k4dd65318ya45fa0556004f34a@mail.gmail.com>
References: <669ac2d0905201521k4dd65318ya45fa0556004f34a@mail.gmail.com>
Message-ID: <6959e1680905201654i39a6e23cm47ce7b6034aef509@mail.gmail.com>

On Wed, May 20, 2009 at 3:21 PM, Jeff Smick wrote:
> Hey guys,
>
> I wrote an XMPP library that uses the push parser to receive data from
> the wire and reconstruct the stanzas one node at a time. I'm running
> into some major issues with namespaces though.
>
> Here's an example of what's happening (the code from is below):
> http://gist.github.com/114203
>
> I'm digging around the libxml2 docs and libxml-ruby's code trying to
> figure out what's going on because the library's currently written on
> libxml-ruby and all the namespacing works. I want to move to Nokogiri
> though.
>
> Thanks for any help!

Yes, what you're doing is kind of broken in the current release of
nokogiri.  In master (as of about 10 minutes ago), you must explicitly
set the namespace for a node (unless you add a default namespace
declaration).

Like so:

  require 'nokogiri'

  include Nokogiri::XML

  root = Node.new('root', Document.new)

  root.add_namespace_declaration nil, 'url'
  foobar = root.add_namespace_declaration 'foo', 'bar'

  root.namespace = foobar

Setting a namespace with a nil prefix will automatically associate
that node with the default namespace.  Here is my thought process
behind that decision:

  p Nokogiri::XML('').root.namespace # => Namespace instance
  p Nokogiri::XML('').root.namespace # => nil

Hope that helps.

-- 
Aaron Patterson
http://tenderlovemaking.com/

From sprsquish at gmail.com Thu May 21 11:15:45 2009
From: sprsquish at gmail.com (Jeff Smick)
Date: Thu, 21 May 2009 08:15:45 -0700
Subject: [Nokogiri-talk] Building documents with namespaces
In-Reply-To: <6959e1680905201654i39a6e23cm47ce7b6034aef509@mail.gmail.com>
References: <669ac2d0905201521k4dd65318ya45fa0556004f34a@mail.gmail.com>
	<6959e1680905201654i39a6e23cm47ce7b6034aef509@mail.gmail.com>
Message-ID: <669ac2d0905210815l1bd5f0f8k395c7bb3c508522d@mail.gmail.com>

Beautiful!

On Wed, May 20, 2009 at 4:54 PM, Aaron Patterson wrote:
> On Wed, May 20, 2009 at 3:21 PM, Jeff Smick wrote:
>> Hey guys,
>>
>> I wrote an XMPP library that uses the push parser to receive data from
>> the wire and reconstruct the stanzas one node at a time. I'm running
>> into some major issues with namespaces though.
>>
>> Here's an example of what's happening (the code from is below):
>> http://gist.github.com/114203
>>
>> I'm digging around the libxml2 docs and libxml-ruby's code trying to
>> figure out what's going on because the library's currently written on
>> libxml-ruby and all the namespacing works. I want to move to Nokogiri
>> though.
>>
>> Thanks for any help!
>
> Yes, what you're doing is kind of broken in the current release of
> nokogiri.  In master (as of about 10 minutes ago), you must explicitly
> set the namespace for a node (unless you add a default namespace
> declaration).
>
> Like so:
>
> require 'nokogiri'
>
> include Nokogiri::XML
>
> root = Node.new('root', Document.new)
>
> root.add_namespace_declaration nil, 'url'
> foobar = root.add_namespace_declaration 'foo', 'bar'
>
> root.namespace = foobar
>
> Setting a namespace with a nil prefix will automatically associate
> that node with the default namespace.  Here is my thought process
> behind that decision:
>
> p Nokogiri::XML('').root.namespace # => Namespace instance
> p Nokogiri::XML('').root.namespace # => nil
>
> Hope that helps.
>
> --
> Aaron Patterson
> http://tenderlovemaking.com/
> _______________________________________________
> Nokogiri-talk mailing list
> Nokogiri-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/nokogiri-talk
>

From sprsquish at gmail.com Thu May 21 14:27:46 2009
From: sprsquish at gmail.com (Jeff Smick)
Date: Thu, 21 May 2009 11:27:46 -0700
Subject: [Nokogiri-talk] more namespace stuff
Message-ID: <669ac2d0905211127h212c52e7i90df08c0b099019e@mail.gmail.com>

Is there a way to get the actual namespace objects from a node?
Node#namespaces returns a hash, but I want the actual objects so I can
attach other nodes to the namespace.
From aaron.patterson at gmail.com Thu May 21 14:41:58 2009 From: aaron.patterson at gmail.com (Aaron Patterson) Date: Thu, 21 May 2009 11:41:58 -0700 Subject: [Nokogiri-talk] more namespace stuff In-Reply-To: <669ac2d0905211127h212c52e7i90df08c0b099019e@mail.gmail.com> References: <669ac2d0905211127h212c52e7i90df08c0b099019e@mail.gmail.com> Message-ID: <6959e1680905211141l42e89465sb7b9015d61b05671@mail.gmail.com> On Thu, May 21, 2009 at 11:27 AM, Jeff Smick wrote: > Is there a way to get the actual namespace objects from a node. > Node#namespaces returns a hash, but want the actual objects so I can > attach other nodes to the namespace. Node#namespace_declarations or Node#namespace (in master) -- Aaron Patterson http://tenderlovemaking.com/ From xavi.caballe at gmail.com Mon May 25 06:41:53 2009 From: xavi.caballe at gmail.com (=?ISO-8859-1?Q?Xavi_Caball=E9?=) Date: Mon, 25 May 2009 12:41:53 +0200 Subject: [Nokogiri-talk] next_sibling not working? Message-ID: <40fdbe870905250341p3fd9a3ddj5d286152a772ad91@mail.gmail.com> I have this code... doc = Nokogiri::HTML.parse(<<-eohtml)
<li>line item 1</li>
<li>line item 2</li>
<li>line item 3</li>
eohtml

first_li = doc.css('li').first
second_li = first_li.next_sibling
puts "first_li == #{first_li}"
puts "second_li == #{second_li}"

When running it, I get...

first_li == <li>line item 1</li>
second_li ==

Shouldn't next_sibling get the second <li> element here?

(I'm using version 1.2.3 of Nokogiri with version 2.7.3 of libxml)

From jesse at jesseclark.com Wed May 27 15:33:29 2009
From: jesse at jesseclark.com (Jesse Clark)
Date: Wed, 27 May 2009 12:33:29 -0700
Subject: [Nokogiri-talk] Nokogiri::XML::SAX::Document and elements with invalid xml contents
Message-ID: <62F4FF55-6041-4E59-BD6D-402DAE401B60@jesseclark.com>

Hi All,

I have an xml document which I am trying to parse with Nokogiri::XML::SAX::Parser which contains an element that contains unescaped html fragments. I want to get the entire inner contents of this element but #characters is never being called because each inner html element is getting parsed as well.

From what I remember of the last time I did SAX parsing in Java, I believe they had some method that would allow you to pull out the inner contents of an element as if it were CDATA and then proceed with normal parsing. Is there anything similar in Nokogiri? I didn't see anything like it in the docs.

Alternatively, does anyone have any suggestions for other ways I could get this accomplished?

TIA,
-Jesse

From jeff at somethingsimilar.com Wed May 27 16:04:14 2009
From: jeff at somethingsimilar.com (Jeff Hodges)
Date: Wed, 27 May 2009 13:04:14 -0700
Subject: [Nokogiri-talk] Nokogiri::XML::SAX::Document and elements with invalid xml contents
In-Reply-To: <62F4FF55-6041-4E59-BD6D-402DAE401B60@jesseclark.com>
References: <62F4FF55-6041-4E59-BD6D-402DAE401B60@jesseclark.com>
Message-ID:

Could you give an example? I imagine the answer would be checking specifically to see what tag you're on, and if it's the parent tag of the text you want, calling inner_html on that, flipping a boolean that says to the rest of your code "don't worry about this data" until the end tag event happens for the tag you care about when you flip it back. Since you've been handed a pile of angle brackets and told it was XML, you might have to get a little dirty.
-- Jeff On Wed, May 27, 2009 at 12:33 PM, Jesse Clark wrote: > Hi All, > > I have an xml document which I am trying to parse with > Nokogiri::XML::SAX::Parser which contains an element that contains unescaped > html fragments. I want to get the entire inner contents of this element but > #characters is never being called because each inner html element is getting > parsed as well. > > From what I remember of the last time I did SAX parsing in Java, I believe > they had some method that would allow to pull out the inner contents of an > element as if it were CDATA and then proceed with normal parsing. Is there > anything similar in Nokogiri? I didn't see anything like it in the docs. > > Alternatively, does anyone have any suggestions for other ways I could get > this accomplished? > > TIA, > -Jesse > _______________________________________________ > Nokogiri-talk mailing list > Nokogiri-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/nokogiri-talk > From jesse at jesseclark.com Wed May 27 18:55:55 2009 From: jesse at jesseclark.com (Jesse Clark) Date: Wed, 27 May 2009 15:55:55 -0700 Subject: [Nokogiri-talk] Nokogiri::XML::SAX::Document and elements with invalid xml contents In-Reply-To: References: <62F4FF55-6041-4E59-BD6D-402DAE401B60@jesseclark.com> Message-ID: <0068B52B-6C03-4925-9059-755D4AD92CD1@jesseclark.com> Yes, that is what I was planning on doing. My problem is that since SAX parsing is event driven: http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/SAX/Document.html there isn't anything to call inner_text or inner_html on. So, in my SAX::Document class I would have to set a flag and create a buffer somewhere that can be used to accumulate the characters from all of the inlined html tags which seems icky... In my situation all this parsing is happening out of band anyway so I am going to sanitize the incoming document at a higher level before it gets passed in to the SAX parser. 
-Jesse On May 27, 2009, at 1:04 PM, Jeff Hodges wrote: > Could you give an example? I imagine the answer would be checking > specifically to see what tag you're on, and if it's the parent tag of > the text you want, calling inner_html on that, flipping a boolean that > says to the rest of your code "don't worry about this data" until the > end tag event happens for the tag you care about when you flip it > back. Since you've been handed a pile of angle brackets and told it > was XML, you might have get a little dirty. > -- > Jeff > > On Wed, May 27, 2009 at 12:33 PM, Jesse Clark > wrote: >> Hi All, >> >> I have an xml document which I am trying to parse with >> Nokogiri::XML::SAX::Parser which contains an element that contains >> unescaped >> html fragments. I want to get the entire inner contents of this >> element but >> #characters is never being called because each inner html element >> is getting >> parsed as well. >> >> From what I remember of the last time I did SAX parsing in Java, I >> believe >> they had some method that would allow to pull out the inner >> contents of an >> element as if it were CDATA and then proceed with normal parsing. >> Is there >> anything similar in Nokogiri? I didn't see anything like it in the >> docs. >> >> Alternatively, does anyone have any suggestions for other ways I >> could get >> this accomplished? 
>> >> TIA, >> -Jesse >> _______________________________________________ >> Nokogiri-talk mailing list >> Nokogiri-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/nokogiri-talk >> > _______________________________________________ > Nokogiri-talk mailing list > Nokogiri-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/nokogiri-talk From jeff at somethingsimilar.com Wed May 27 19:05:53 2009 From: jeff at somethingsimilar.com (Jeff Hodges) Date: Wed, 27 May 2009 16:05:53 -0700 Subject: [Nokogiri-talk] Nokogiri::XML::SAX::Document and elements with invalid xml contents In-Reply-To: <0068B52B-6C03-4925-9059-755D4AD92CD1@jesseclark.com> References: <62F4FF55-6041-4E59-BD6D-402DAE401B60@jesseclark.com> <0068B52B-6C03-4925-9059-755D4AD92CD1@jesseclark.com> Message-ID: Sorry, I'm dumb. Of course you couldn't do that. Lemme get back to you. I had to solve a problem just like this a little while ago. -- Jeff On Wed, May 27, 2009 at 3:55 PM, Jesse Clark wrote: > Yes, that is what I was planning on doing. My problem is that since SAX > parsing is event driven: > http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/SAX/Document.html > there isn't anything to call inner_text or inner_html on. > > So, in my SAX::Document class I would have to set a flag and create a buffer > somewhere that can be used to accumulate the characters from all of the > inlined html tags which seems icky... > > In my situation all this parsing is happening out of band anyway so I am > going to sanitize the incoming document at a higher level before it gets > passed in to the SAX parser. > > -Jesse > > On May 27, 2009, at 1:04 PM, Jeff Hodges wrote: > >> Could you give an example? 
I imagine the answer would be checking >> specifically to see what tag you're on, and if it's the parent tag of >> the text you want, calling inner_html on that, flipping a boolean that >> says to the rest of your code "don't worry about this data" until the >> end tag event happens for the tag you care about when you flip it >> back. Since you've been handed a pile of angle brackets and told it >> was XML, you might have get a little dirty. >> -- >> Jeff >> >> On Wed, May 27, 2009 at 12:33 PM, Jesse Clark >> wrote: >>> >>> Hi All, >>> >>> I have an xml document which I am trying to parse with >>> Nokogiri::XML::SAX::Parser which contains an element that contains >>> unescaped >>> html fragments. I want to get the entire inner contents of this element >>> but >>> #characters is never being called because each inner html element is >>> getting >>> parsed as well. >>> >>> From what I remember of the last time I did SAX parsing in Java, I >>> believe >>> they had some method that would allow to pull out the inner contents of >>> an >>> element as if it were CDATA and then proceed with normal parsing. Is >>> there >>> anything similar in Nokogiri? I didn't see anything like it in the docs. >>> >>> Alternatively, does anyone have any suggestions for other ways I could >>> get >>> this accomplished? 
>>> >>> TIA, >>> -Jesse >>> _______________________________________________ >>> Nokogiri-talk mailing list >>> Nokogiri-talk at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/nokogiri-talk >>> >> _______________________________________________ >> Nokogiri-talk mailing list >> Nokogiri-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/nokogiri-talk > > _______________________________________________ > Nokogiri-talk mailing list > Nokogiri-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/nokogiri-talk > From jesse at jesseclark.com Wed May 27 19:25:25 2009 From: jesse at jesseclark.com (Jesse Clark) Date: Wed, 27 May 2009 16:25:25 -0700 Subject: [Nokogiri-talk] Nokogiri::XML::SAX::Document and elements with invalid xml contents In-Reply-To: References: <62F4FF55-6041-4E59-BD6D-402DAE401B60@jesseclark.com> <0068B52B-6C03-4925-9059-755D4AD92CD1@jesseclark.com> Message-ID: <9B2180A2-E6CC-4635-9E24-1AEC294746C4@jesseclark.com> No worries. Like I said I am currently sanitizing the document before it gets to the SAX parser but I would still be interested to see any solutions you have come up with... Thanks, -Jesse On May 27, 2009, at 4:05 PM, Jeff Hodges wrote: > Sorry, I'm dumb. Of course you couldn't do that. Lemme get back to > you. I had to solve a problem just like this a little while ago. > -- > Jeff > > On Wed, May 27, 2009 at 3:55 PM, Jesse Clark > wrote: >> Yes, that is what I was planning on doing. My problem is that since >> SAX >> parsing is event driven: >> http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/SAX/Document.html >> there isn't anything to call inner_text or inner_html on. >> >> So, in my SAX::Document class I would have to set a flag and create >> a buffer >> somewhere that can be used to accumulate the characters from all of >> the >> inlined html tags which seems icky... 
>> >> In my situation all this parsing is happening out of band anyway so >> I am >> going to sanitize the incoming document at a higher level before it >> gets >> passed in to the SAX parser. >> >> -Jesse >> >> On May 27, 2009, at 1:04 PM, Jeff Hodges wrote: >> >>> Could you give an example? I imagine the answer would be checking >>> specifically to see what tag you're on, and if it's the parent tag >>> of >>> the text you want, calling inner_html on that, flipping a boolean >>> that >>> says to the rest of your code "don't worry about this data" until >>> the >>> end tag event happens for the tag you care about when you flip it >>> back. Since you've been handed a pile of angle brackets and told it >>> was XML, you might have get a little dirty. >>> -- >>> Jeff >>> >>> On Wed, May 27, 2009 at 12:33 PM, Jesse Clark >>> wrote: >>>> >>>> Hi All, >>>> >>>> I have an xml document which I am trying to parse with >>>> Nokogiri::XML::SAX::Parser which contains an element that contains >>>> unescaped >>>> html fragments. I want to get the entire inner contents of this >>>> element >>>> but >>>> #characters is never being called because each inner html element >>>> is >>>> getting >>>> parsed as well. >>>> >>>> From what I remember of the last time I did SAX parsing in Java, I >>>> believe >>>> they had some method that would allow to pull out the inner >>>> contents of >>>> an >>>> element as if it were CDATA and then proceed with normal parsing. >>>> Is >>>> there >>>> anything similar in Nokogiri? I didn't see anything like it in >>>> the docs. >>>> >>>> Alternatively, does anyone have any suggestions for other ways I >>>> could >>>> get >>>> this accomplished? 
>>>> >>>> TIA, >>>> -Jesse >>>> _______________________________________________ >>>> Nokogiri-talk mailing list >>>> Nokogiri-talk at rubyforge.org >>>> http://rubyforge.org/mailman/listinfo/nokogiri-talk >>>> >>> _______________________________________________ >>> Nokogiri-talk mailing list >>> Nokogiri-talk at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/nokogiri-talk >> >> _______________________________________________ >> Nokogiri-talk mailing list >> Nokogiri-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/nokogiri-talk >> > _______________________________________________ > Nokogiri-talk mailing list > Nokogiri-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/nokogiri-talk From aaron at tenderlovemaking.com Sat May 30 17:10:40 2009 From: aaron at tenderlovemaking.com (Aaron Patterson) Date: Sat, 30 May 2009 14:10:40 -0700 Subject: [Nokogiri-talk] [ANN] nokogiri 1.3.0 Released Message-ID: <20090530211040.GA30510@kroche-x60.pace.edu> nokogiri version 1.3.0 has been released! * * * * * Nokogiri (?) is an HTML, XML, SAX, and Reader parser. Changes: ### 1.3.0 / 2009-05-30 * New Features * Builder changes scope based on block arity * Builder supports methods ending in underscore similar to tagz * Nokogiri::XML::Node#<=> compares nodes based on Document position * Nokogiri::XML::Node#matches? returns true if Node can be found with given selector. 
  * Nokogiri::XML::Node#ancestors now returns a Nokogiri::XML::NodeSet
  * Nokogiri::XML::Node#ancestors will match parents against optional selector
  * Nokogiri::HTML::Document#meta_encoding for getting the meta encoding
  * Nokogiri::HTML::Document#meta_encoding= for setting the meta encoding
  * Nokogiri::XML::Document#encoding= to set the document encoding
  * Nokogiri::XML::Schema for validating documents against XSD schema
  * Nokogiri::XML::RelaxNG for validating documents against RelaxNG schema
  * Nokogiri::HTML::ElementDescription for fetching HTML element descriptions
  * Nokogiri::XML::Node#description to fetch the node description
  * Nokogiri::XML::Node#accept implements Visitor pattern
  * bin/nokogiri for easily examining documents (Thanks Yutaka HARA!)
  * Nokogiri::XML::NodeSet now supports more Array and Enumerable operators: index, delete, slice, - (difference), + (concatenation), & (intersection), push, pop, shift, ==
  * Nokogiri.XML, Nokogiri.HTML take blocks that receive Nokogiri::XML::ParseOptions objects
  * Nokogiri::XML::Node#namespace returns a Nokogiri::XML::Namespace
  * Nokogiri::XML::Node#namespace= for setting a node's namespace
  * Nokogiri::XML::DocumentFragment and Nokogiri::HTML::DocumentFragment have a sensible API and a more robust implementation.
  * JRuby 1.3.0 support via FFI.

* Bugfixes
  * Fixed a problem with nil passed to CDATA constructor
  * Fragment method deals with regular expression characters (Thanks Joel!) LH #73
  * Fixing builder scope issues LH #61, LH #74, LH #70
  * Fixed a problem when adding a child could remove the child namespace LH#78
  * Fixed bug with unlinking a node then reparenting it. (GH#22)
  * Fixed failure to catch errors during XSLT parsing (GH#32)
  * Fixed a bug with attribute conditions in CSS selectors (GH#36)
  * Fixed intolerance of HTML attributes without values in Node#before/after/inner_html=. (GH#35)

* * * * *

--
Aaron Patterson
http://tenderlovemaking.com/