From dorikick at gmail.com Mon May 3 01:14:31 2010
From: dorikick at gmail.com (doridori Jo)
Date: Sun, 2 May 2010 22:14:31 -0700
Subject: [Celerity-users] how to just mute all annoying popup ads ?
Message-ID:

is there any way to have javascript and ajax support, but be able to block all popup ads ?

browser.ignore_pattern, simply returns an empty page, if the URL matches say google ads

anyways, it would be very convenient to be able to block off all popups without having to turn off javascript support.

currently, i am relying on retry methods, by matching browser url against any popup ad urls like ad.doubleclick.net and etc....

browserurl = ""
begin
  browserz = linkz.click_and_attach
  browserurl << browserz.url
rescue
  retry if browserurl =~ /doubleclick/
end
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dorikick at gmail.com Mon May 3 05:36:37 2010
From: dorikick at gmail.com (doridori Jo)
Date: Mon, 3 May 2010 02:36:37 -0700
Subject: [Celerity-users] click_and_attach and popup ads
In-Reply-To:
References: <201004301216.59802.tomasz2k@poczta.onet.pl>
Message-ID:

okay, so i have the page i want....not clear exactly what web_window_event listener is doing...

how to set it to browser.page ?

i ended up with Java::ComGargoylesoftwareHtmlunitHtml::HtmlPage but how can i set this into browser.page ?

browser = Celerity::Browser.new()
browser.add_listener(:web_window_event) { |window|
  if window.getNewPage == nil #this might be a popup window
  else
    browser.page = window.getNewPage #Got the page i need here....
    puts window
  end
}
puts browser
linkz = browser.element_by_xpath(someXpath)
newbrowser = linkz.click_and_attach

newbrowser should now contain the filtered page....but it doesn't

On Fri, Apr 30, 2010 at 12:58 PM, doridori Jo wrote:

> Jari, i've submitted a new issue ticket on github.
>
> Tomasz, how will i use new_browser ? do i just declare it in the very
> beginning, and just continue using browser.link.click_and_attach ? kinda
> confused.
>
> For now, i think best thing is if i just don't use click_and_attach, maybe
> just use click....but click_and_attach is very useful !
>
> 2010/4/30 Tomasz Kalkosiński
>
>> On Friday 30 of April 2010 11:32:35 Jari Bakken wrote:
>> > On Fri, Apr 30, 2010 at 11:14 AM, doridori Jo wrote:
>> > >
>> > > So any way to deal with this ? i would very much prefer to use
>> > > click_and_attach, and be able to filter windows (like if URL contains,
>> > > ad.doubleclick, ignore it)
>> > >
>> >
>> > That sounds like a sensible feature request, even though it's rarely
>> > requested.
>> > Could you add an issue to the tracker on GitHub? Or even better, write
>> > a patch :)
>>
>> I've dealt with it in my project. You have to look up for windows
>> collection. Something like snippet below, you have to experiment for
>> yourself.
>>
>>   def new_browser
>>
>>     @browser = Celerity::Browser.new
>>
>>     @browser.add_listener(:web_window_event) { |window|
>>
>>       if window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) &&
>>          window.getEventType == Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN
>>
>>         @old_top_window = window.getWebWindow.getParentWindow
>>         set_actual_page window.page
>>
>>       end
>>
>>       if window.getWebWindow.getName == "YourPreferredName" &&
>>          window.getEventType == Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE
>>
>>         ...
>>
>>       end
>>
>>     }
>>
>>   def set_actual_page(page)
>>     # Add to collection on top
>>     @pages << @browser.page
>>
>>     # Set actual page
>>     @browser.page = page
>>   end
>>
>>
>> Greetings,
>> Tomasz Kalkosiński
>> _______________________________________________
>> Celerity-users mailing list
>> Celerity-users at rubyforge.org
>> http://rubyforge.org/mailman/listinfo/celerity-users
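
The retry idea from the first message can be sketched in plain Ruby. Note that `retry` inside a `rescue` clause only re-runs the `begin` block after an exception, so a click that succeeds but lands on an ad would not be retried by that code. The sketch below is one possible alternative, not Celerity API: `AD_PATTERNS`, `ad_url?` and `click_skipping_ads` are hypothetical names, the pattern list is only an example, and the only assumption about the clicked result is that it responds to `url`.

```ruby
# Hypothetical helper names; the pattern list is only an example.
AD_PATTERNS = [/doubleclick/i, /googlesyndication/i].freeze

# True when the URL matches any of the ad patterns above.
def ad_url?(url)
  AD_PATTERNS.any? { |pattern| url.to_s =~ pattern }
end

# Calls the block (which should perform the click and return an object
# responding to #url) until the result is not an ad, at most
# max_attempts times. Returns nil if every attempt landed on an ad.
def click_skipping_ads(max_attempts = 3)
  max_attempts.times do
    result = yield
    return result unless ad_url?(result.url)
  end
  nil
end
```

With Celerity this might be called as `click_skipping_ads { linkz.click_and_attach }`, where `linkz` is the element from the message above. Bounding the number of attempts avoids looping forever when a link only ever opens an ad.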
From jasoninclass at googlemail.com Mon May 3 07:02:18 2010
From: jasoninclass at googlemail.com (jason franklin-stokes)
Date: Mon, 3 May 2010 13:02:18 +0200
Subject: [Celerity-users] click_and_attach and popup ads
In-Reply-To:
References: <201004301216.59802.tomasz2k@poczta.onet.pl>
Message-ID: <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com>

you need to set the @page variable

- in general it looks like you want to scrape pages from different websites - if you are, celerity is not the right tool.
you should consider getting rid of it and using HtmlUnit directly.

best Jason.
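
Tomasz's `set_actual_page`, quoted earlier in the thread, pushes onto `@pages`, which that snippet never initialises, so as posted the `<<` call would fail on nil unless `@pages` is set up somewhere else. A minimal plain-Ruby sketch of the same bookkeeping follows. `FakeBrowser` and `PageHistory` are hypothetical stand-ins, not Celerity classes, and whether assigning to `page` on a real Celerity::Browser behaves this way is an assumption to check against the Celerity source.

```ruby
# Stand-in for a browser object; only the `page` accessor matters here.
# FakeBrowser and PageHistory are hypothetical names, not Celerity classes.
FakeBrowser = Struct.new(:page)

class PageHistory
  attr_reader :pages

  def initialize(browser)
    @browser = browser
    @pages = [] # initialise the stack before anything is pushed
  end

  # Remember the page currently shown, then switch to the new one.
  def switch_to(page)
    @pages << @browser.page
    @browser.page = page
  end

  # Go back to the most recently remembered page, if there is one.
  def restore
    return @browser.page if @pages.empty?
    @browser.page = @pages.pop
  end
end

browser = FakeBrowser.new(:main_page)
history = PageHistory.new(browser)
history.switch_to(:popup_page) # browser now shows the popup
history.restore                # browser is back on the main page
```

Keeping the previous pages in a stack makes it possible to return to the original window after inspecting (or discarding) a popup.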
URL: From jari.bakken at gmail.com Mon May 3 07:10:27 2010 From: jari.bakken at gmail.com (Jari Bakken) Date: Mon, 3 May 2010 13:10:27 +0200 Subject: [Celerity-users] click_and_attach and popup ads In-Reply-To: References: <201004301216.59802.tomasz2k@poczta.onet.pl> Message-ID: On Mon, May 3, 2010 at 11:36 AM, doridori Jo wrote: > okay, so i have the page i want....not clear exactly what web_window_event > listener is doing... > > how to set it to browser.page ? > > i ended up with Java::ComGargoylesoftwareHtmlunitHtml::HtmlPage but how can > i set this into browser.page ? > ??? ??? ??? browser = Celerity::Browser.new() > ??? ??? ??? browser.add_listener(:web_window_event) { |window| > ??? ??? ??? if window.getNewPage == nil #this might be a popup window > ??? ??? ??? else > ??? ??? ??? ??? browser.page = window.getNewPage #Got the page i need > here.... > ??? ??? ??? ??? puts window > ??? ??? ??? end > ??? ??? ??? } > ??? ??? ??? puts browser > ??????????? linkz = browser.element_by_xpath(someXpath) > ??? ??? ??? newbrowser = linkz.click_and_attach > > newbrowser should now contain the filtered page....but it doesn't > > click_and_attach creates a new browser instance, and your listener won't be attached to this instance. Celerity uses the :web_window_event to track page changes internally - if you look at the source for click_and_attach you'll see how it's disabled while the element is clicked: http://github.com/jarib/celerity/blob/master/lib/celerity/clickable_element.rb#L38 If you access the underlying HtmlUnit object and click() that, you will get the new page without Celerity knowing anything about it: element.object.click #=> HtmlPage However I'd much rather see a nice patch that exposes the needed functionality through a sane API, instead of HtmlUnit's internals :) > On Fri, Apr 30, 2010 at 12:58 PM, doridori Jo wrote: >> >> Jari, i've submitted a new issue ticket on github. >> >> Tomasz, how will i use new_browser ? 
do i just declare it in the very >> beginning, and just continue using browser.link.click_and_attach ? kinda >> confused. >> >> For now, i think best thing is if i just don't use click_and_attach, maybe >> just use click....but click_and_attach is very useful ! >> >> >> >> 2010/4/30 Tomasz Kalkosi?ski >>> >>> On Friday 30 of April 2010 11:32:35 Jari Bakken wrote: >>> > On Fri, Apr 30, 2010 at 11:14 AM, doridori Jo >>> > wrote: >>> > > >>> > > So any way to deal with this ? i would very much prefer to use >>> > > click_and_attach, and be able to filter windows (like if URL >>> > > contains, >>> > > ad.doubleclick, ignore it) >>> > > >>> > >>> > That sounds like a sensible feature request, even though it's rarely >>> > requested. >>> > Could you add an issue to the tracker on GitHub? Or even better, write >>> > a patch :) >>> >>> I've dealt with it in my project. You have to look up for windows >>> collection. Something like snippet below, you have to experiment for >>> yourself. >>> >>> ? ?def new_browser >>> >>> ? ? ?@browser = Celerity::Browser.new >>> >>> ? ? ?@browser.add_listener(:web_window_event) { |window| >>> >>> ? ? ?if >>> window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) >>> && >>> ? ? ? ? ?window.getEventType == >>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN >>> >>> ? ? ?@old_top_window = window.getWebWindow.getParentWindow >>> ? ? ?set_actual_page window.page >>> >>> ? ? ?end >>> >>> ? ? ?if window.getWebWindow.getName == "YourPreferredName" && >>> ? ? ? ? ?window.getEventType == >>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE >>> >>> ? ? ?... >>> >>> >>> ? ? ?end >>> >>> ? ? ?} >>> >>> ?def set_actual_page(page) >>> ? ?# Add to collection on top >>> ? ?@pages << @browser.page >>> >>> ? ?# Set actual page >>> ? 
?@browser.page = page >>> ?end >>> >>> >>> Greetings, >>> Tomasz Kalkosi?ski >>> _______________________________________________ >>> Celerity-users mailing list >>> Celerity-users at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/celerity-users >> > > > _______________________________________________ > Celerity-users mailing list > Celerity-users at rubyforge.org > http://rubyforge.org/mailman/listinfo/celerity-users > > From dorikick at gmail.com Mon May 3 07:40:25 2010 From: dorikick at gmail.com (doridori Jo) Date: Mon, 3 May 2010 04:40:25 -0700 Subject: [Celerity-users] click_and_attach and popup ads In-Reply-To: <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> Message-ID: aye you are right jason. but i dont know Java ! :( On Mon, May 3, 2010 at 4:02 AM, jason franklin-stokes < jasoninclass at googlemail.com> wrote: > you need to set the @page variable > > - in general it looks like you want to scrape pages from different websites > - if you are celerity is not the right tool. > you should consider getting rid of it and using HtmlUnit directly. > > best Jason. > > > On May 3, 2010, at 11:36 AM, doridori Jo wrote: > > okay, so i have the page i want....not clear exactly what web_window_event > listener is doing... > > how to set it to browser.page ? > > i ended up with Java::ComGargoylesoftwareHtmlunitHtml::HtmlPage but how can > i set this into browser.page ? > browser = Celerity::Browser.new() > browser.add_listener(:web_window_event) { |window| > if window.getNewPage == nil #this might be a popup window > else > browser.page = window.getNewPage #Got the page i need > here.... 
> puts window > end > } > puts browser > linkz = browser.element_by_xpath(someXpath) > newbrowser = linkz.click_and_attach > > newbrowser should now contain the filtered page....but it doesn't > > > On Fri, Apr 30, 2010 at 12:58 PM, doridori Jo wrote: > >> Jari, i've submitted a new issue ticket on github. >> >> Tomasz, how will i use new_browser ? do i just declare it in the very >> beginning, and just continue using browser.link.click_and_attach ? kinda >> confused. >> >> For now, i think best thing is if i just don't use click_and_attach, maybe >> just use click....but click_and_attach is very useful ! >> >> >> >> 2010/4/30 Tomasz Kalkosi?ski >> >> On Friday 30 of April 2010 11:32:35 Jari Bakken wrote: >>> > On Fri, Apr 30, 2010 at 11:14 AM, doridori Jo >>> wrote: >>> > > >>> > > So any way to deal with this ? i would very much prefer to use >>> > > click_and_attach, and be able to filter windows (like if URL >>> contains, >>> > > ad.doubleclick, ignore it) >>> > > >>> > >>> > That sounds like a sensible feature request, even though it's rarely >>> requested. >>> > Could you add an issue to the tracker on GitHub? Or even better, write >>> > a patch :) >>> >>> I've dealt with it in my project. You have to look up for windows >>> collection. Something like snippet below, you have to experiment for >>> yourself. >>> >>> def new_browser >>> >>> @browser = Celerity::Browser.new >>> >>> @browser.add_listener(:web_window_event) { |window| >>> >>> if >>> window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) >>> && >>> window.getEventType == >>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN >>> >>> @old_top_window = window.getWebWindow.getParentWindow >>> set_actual_page window.page >>> >>> end >>> >>> if window.getWebWindow.getName == "YourPreferredName" && >>> window.getEventType == >>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE >>> >>> ... 
>>> >>> >>> end >>> >>> } >>> >>> def set_actual_page(page) >>> # Add to collection on top >>> @pages << @browser.page >>> >>> # Set actual page >>> @browser.page = page >>> end >>> >>> >>> Greetings, >>> Tomasz Kalkosi?ski >>> _______________________________________________ >>> Celerity-users mailing list >>> Celerity-users at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/celerity-users >>> >> >> > _______________________________________________ > Celerity-users mailing list > Celerity-users at rubyforge.org > http://rubyforge.org/mailman/listinfo/celerity-users > > > > _______________________________________________ > Celerity-users mailing list > Celerity-users at rubyforge.org > http://rubyforge.org/mailman/listinfo/celerity-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dorikick at gmail.com Mon May 3 07:58:53 2010 From: dorikick at gmail.com (doridori Jo) Date: Mon, 3 May 2010 04:58:53 -0700 Subject: [Celerity-users] click_and_attach and popup ads In-Reply-To: References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> Message-ID: that code complains about window.page....is that waht Tomasz meant ??? 2010/5/3 doridori Jo > okay still not working.... > > how do i set the page into celerity's browser.page ? > > set the @page variable ? where ? > > btw, i am now using this code now (i think i should learn java now): > > > browser.add_listener(:web_window_event) { |window| > if > window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) > && > window.getEventType == > Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE > > @old_top_window = > window.getWebWindow.getParentWindow > set_actual_page window.page > end > } > > 2010/5/3 doridori Jo > > aye you are right jason. but i dont know Java ! 
:( >> >> >> On Mon, May 3, 2010 at 4:02 AM, jason franklin-stokes < >> jasoninclass at googlemail.com> wrote: >> >>> you need to set the @page variable >>> >>> - in general it looks like you want to scrape pages from different >>> websites - if you are celerity is not the right tool. >>> you should consider getting rid of it and using HtmlUnit directly. >>> >>> best Jason. >>> >>> >>> On May 3, 2010, at 11:36 AM, doridori Jo wrote: >>> >>> okay, so i have the page i want....not clear exactly what >>> web_window_event listener is doing... >>> >>> how to set it to browser.page ? >>> >>> i ended up with Java::ComGargoylesoftwareHtmlunitHtml::HtmlPage but how >>> can i set this into browser.page ? >>> browser = Celerity::Browser.new() >>> browser.add_listener(:web_window_event) { |window| >>> if window.getNewPage == nil #this might be a popup window >>> else >>> browser.page = window.getNewPage #Got the page i need >>> here.... >>> puts window >>> end >>> } >>> puts browser >>> linkz = browser.element_by_xpath(someXpath) >>> newbrowser = linkz.click_and_attach >>> >>> newbrowser should now contain the filtered page....but it doesn't >>> >>> >>> On Fri, Apr 30, 2010 at 12:58 PM, doridori Jo wrote: >>> >>>> Jari, i've submitted a new issue ticket on github. >>>> >>>> Tomasz, how will i use new_browser ? do i just declare it in the very >>>> beginning, and just continue using browser.link.click_and_attach ? kinda >>>> confused. >>>> >>>> For now, i think best thing is if i just don't use click_and_attach, >>>> maybe just use click....but click_and_attach is very useful ! >>>> >>>> >>>> >>>> 2010/4/30 Tomasz Kalkosi?ski >>>> >>>> On Friday 30 of April 2010 11:32:35 Jari Bakken wrote: >>>>> > On Fri, Apr 30, 2010 at 11:14 AM, doridori Jo >>>>> wrote: >>>>> > > >>>>> > > So any way to deal with this ? 
i would very much prefer to use >>>>> > > click_and_attach, and be able to filter windows (like if URL >>>>> contains, >>>>> > > ad.doubleclick, ignore it) >>>>> > > >>>>> > >>>>> > That sounds like a sensible feature request, even though it's rarely >>>>> requested. >>>>> > Could you add an issue to the tracker on GitHub? Or even better, >>>>> write >>>>> > a patch :) >>>>> >>>>> I've dealt with it in my project. You have to look up for windows >>>>> collection. Something like snippet below, you have to experiment for >>>>> yourself. >>>>> >>>>> def new_browser >>>>> >>>>> @browser = Celerity::Browser.new >>>>> >>>>> @browser.add_listener(:web_window_event) { |window| >>>>> >>>>> if >>>>> window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) >>>>> && >>>>> window.getEventType == >>>>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN >>>>> >>>>> @old_top_window = window.getWebWindow.getParentWindow >>>>> set_actual_page window.page >>>>> >>>>> end >>>>> >>>>> if window.getWebWindow.getName == "YourPreferredName" && >>>>> window.getEventType == >>>>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE >>>>> >>>>> ... 
>>>>> >>>>> >>>>> end >>>>> >>>>> } >>>>> >>>>> def set_actual_page(page) >>>>> # Add to collection on top >>>>> @pages << @browser.page >>>>> >>>>> # Set actual page >>>>> @browser.page = page >>>>> end >>>>> >>>>> >>>>> Greetings, >>>>> Tomasz Kalkosi?ski >>>>> _______________________________________________ >>>>> Celerity-users mailing list >>>>> Celerity-users at rubyforge.org >>>>> http://rubyforge.org/mailman/listinfo/celerity-users >>>>> >>>> >>>> >>> _______________________________________________ >>> Celerity-users mailing list >>> Celerity-users at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/celerity-users >>> >>> >>> >>> _______________________________________________ >>> Celerity-users mailing list >>> Celerity-users at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/celerity-users >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dorikick at gmail.com Mon May 3 07:58:08 2010 From: dorikick at gmail.com (doridori Jo) Date: Mon, 3 May 2010 04:58:08 -0700 Subject: [Celerity-users] click_and_attach and popup ads In-Reply-To: References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> Message-ID: okay still not working.... how do i set the page into celerity's browser.page ? set the @page variable ? where ? btw, i am now using this code now (i think i should learn java now): browser.add_listener(:web_window_event) { |window| if window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) && window.getEventType == Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE @old_top_window = window.getWebWindow.getParentWindow set_actual_page window.page end } 2010/5/3 doridori Jo > aye you are right jason. but i dont know Java ! 
:( > > > On Mon, May 3, 2010 at 4:02 AM, jason franklin-stokes < > jasoninclass at googlemail.com> wrote: > >> you need to set the @page variable >> >> - in general it looks like you want to scrape pages from different >> websites - if you are celerity is not the right tool. >> you should consider getting rid of it and using HtmlUnit directly. >> >> best Jason. >> >> >> On May 3, 2010, at 11:36 AM, doridori Jo wrote: >> >> okay, so i have the page i want....not clear exactly what web_window_event >> listener is doing... >> >> how to set it to browser.page ? >> >> i ended up with Java::ComGargoylesoftwareHtmlunitHtml::HtmlPage but how >> can i set this into browser.page ? >> browser = Celerity::Browser.new() >> browser.add_listener(:web_window_event) { |window| >> if window.getNewPage == nil #this might be a popup window >> else >> browser.page = window.getNewPage #Got the page i need >> here.... >> puts window >> end >> } >> puts browser >> linkz = browser.element_by_xpath(someXpath) >> newbrowser = linkz.click_and_attach >> >> newbrowser should now contain the filtered page....but it doesn't >> >> >> On Fri, Apr 30, 2010 at 12:58 PM, doridori Jo wrote: >> >>> Jari, i've submitted a new issue ticket on github. >>> >>> Tomasz, how will i use new_browser ? do i just declare it in the very >>> beginning, and just continue using browser.link.click_and_attach ? kinda >>> confused. >>> >>> For now, i think best thing is if i just don't use click_and_attach, >>> maybe just use click....but click_and_attach is very useful ! >>> >>> >>> >>> 2010/4/30 Tomasz Kalkosi?ski >>> >>> On Friday 30 of April 2010 11:32:35 Jari Bakken wrote: >>>> > On Fri, Apr 30, 2010 at 11:14 AM, doridori Jo >>>> wrote: >>>> > > >>>> > > So any way to deal with this ? 
i would very much prefer to use >>>> > > click_and_attach, and be able to filter windows (like if URL >>>> contains, >>>> > > ad.doubleclick, ignore it) >>>> > > >>>> > >>>> > That sounds like a sensible feature request, even though it's rarely >>>> requested. >>>> > Could you add an issue to the tracker on GitHub? Or even better, write >>>> > a patch :) >>>> >>>> I've dealt with it in my project. You have to look up for windows >>>> collection. Something like snippet below, you have to experiment for >>>> yourself. >>>> >>>> def new_browser >>>> >>>> @browser = Celerity::Browser.new >>>> >>>> @browser.add_listener(:web_window_event) { |window| >>>> >>>> if >>>> window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) >>>> && >>>> window.getEventType == >>>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN >>>> >>>> @old_top_window = window.getWebWindow.getParentWindow >>>> set_actual_page window.page >>>> >>>> end >>>> >>>> if window.getWebWindow.getName == "YourPreferredName" && >>>> window.getEventType == >>>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE >>>> >>>> ... 
>>>> >>>> >>>> end >>>> >>>> } >>>> >>>> def set_actual_page(page) >>>> # Add to collection on top >>>> @pages << @browser.page >>>> >>>> # Set actual page >>>> @browser.page = page >>>> end >>>> >>>> >>>> Greetings, >>>> Tomasz Kalkosi?ski >>>> _______________________________________________ >>>> Celerity-users mailing list >>>> Celerity-users at rubyforge.org >>>> http://rubyforge.org/mailman/listinfo/celerity-users >>>> >>> >>> >> _______________________________________________ >> Celerity-users mailing list >> Celerity-users at rubyforge.org >> http://rubyforge.org/mailman/listinfo/celerity-users >> >> >> >> _______________________________________________ >> Celerity-users mailing list >> Celerity-users at rubyforge.org >> http://rubyforge.org/mailman/listinfo/celerity-users >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dorikick at gmail.com Mon May 3 08:24:36 2010 From: dorikick at gmail.com (doridori Jo) Date: Mon, 3 May 2010 05:24:36 -0700 Subject: [Celerity-users] click_and_attach and popup ads In-Reply-To: References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> Message-ID: 1) how to set the HtmlPage returned by HtmlUnit into Celerity's browser page ? set_actual_page window.getWebWindow.getParentWindow #<-- this is HtmlPage....need to convert it to <#Celerity::Browser> object.....dont know how def set_actual_page(page) # Add to collection on top @page << @browser.page #<-- did u mean fix @pages to @page ? # Set actual page @browser.page = page end okay well gonna sleep for now....been awake for 30 hours straight. 2010/5/3 doridori Jo > that code complains about window.page....is that waht Tomasz meant ??? > > > 2010/5/3 doridori Jo > >> okay still not working.... >> >> how do i set the page into celerity's browser.page ? >> >> set the @page variable ? where ? 
>> >> btw, i am now using this code now (i think i should learn java now): >> >> >> browser.add_listener(:web_window_event) { |window| >> if >> window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) >> && >> window.getEventType == >> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE >> >> @old_top_window = >> window.getWebWindow.getParentWindow >> set_actual_page window.page >> end >> } >> >> 2010/5/3 doridori Jo >> >> aye you are right jason. but i dont know Java ! :( >>> >>> >>> On Mon, May 3, 2010 at 4:02 AM, jason franklin-stokes < >>> jasoninclass at googlemail.com> wrote: >>> >>>> you need to set the @page variable >>>> >>>> - in general it looks like you want to scrape pages from different >>>> websites - if you are celerity is not the right tool. >>>> you should consider getting rid of it and using HtmlUnit directly. >>>> >>>> best Jason. >>>> >>>> >>>> On May 3, 2010, at 11:36 AM, doridori Jo wrote: >>>> >>>> okay, so i have the page i want....not clear exactly what >>>> web_window_event listener is doing... >>>> >>>> how to set it to browser.page ? >>>> >>>> i ended up with Java::ComGargoylesoftwareHtmlunitHtml::HtmlPage but how >>>> can i set this into browser.page ? >>>> browser = Celerity::Browser.new() >>>> browser.add_listener(:web_window_event) { |window| >>>> if window.getNewPage == nil #this might be a popup window >>>> else >>>> browser.page = window.getNewPage #Got the page i need >>>> here.... >>>> puts window >>>> end >>>> } >>>> puts browser >>>> linkz = browser.element_by_xpath(someXpath) >>>> newbrowser = linkz.click_and_attach >>>> >>>> newbrowser should now contain the filtered page....but it doesn't >>>> >>>> >>>> On Fri, Apr 30, 2010 at 12:58 PM, doridori Jo wrote: >>>> >>>>> Jari, i've submitted a new issue ticket on github. >>>>> >>>>> Tomasz, how will i use new_browser ? do i just declare it in the very >>>>> beginning, and just continue using browser.link.click_and_attach ? kinda >>>>> confused. 
>>>>> >>>>> For now, i think best thing is if i just don't use click_and_attach, >>>>> maybe just use click....but click_and_attach is very useful ! >>>>> >>>>> >>>>> >>>>> 2010/4/30 Tomasz Kalkosi?ski >>>>> >>>>> On Friday 30 of April 2010 11:32:35 Jari Bakken wrote: >>>>>> > On Fri, Apr 30, 2010 at 11:14 AM, doridori Jo >>>>>> wrote: >>>>>> > > >>>>>> > > So any way to deal with this ? i would very much prefer to use >>>>>> > > click_and_attach, and be able to filter windows (like if URL >>>>>> contains, >>>>>> > > ad.doubleclick, ignore it) >>>>>> > > >>>>>> > >>>>>> > That sounds like a sensible feature request, even though it's rarely >>>>>> requested. >>>>>> > Could you add an issue to the tracker on GitHub? Or even better, >>>>>> write >>>>>> > a patch :) >>>>>> >>>>>> I've dealt with it in my project. You have to look up for windows >>>>>> collection. Something like snippet below, you have to experiment for >>>>>> yourself. >>>>>> >>>>>> def new_browser >>>>>> >>>>>> @browser = Celerity::Browser.new >>>>>> >>>>>> @browser.add_listener(:web_window_event) { |window| >>>>>> >>>>>> if >>>>>> window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) >>>>>> && >>>>>> window.getEventType == >>>>>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN >>>>>> >>>>>> @old_top_window = window.getWebWindow.getParentWindow >>>>>> set_actual_page window.page >>>>>> >>>>>> end >>>>>> >>>>>> if window.getWebWindow.getName == "YourPreferredName" && >>>>>> window.getEventType == >>>>>> Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE >>>>>> >>>>>> ... 
>>>>>> >>>>>> >>>>>> end >>>>>> >>>>>> } >>>>>> >>>>>> def set_actual_page(page) >>>>>> # Add to collection on top >>>>>> @pages << @browser.page >>>>>> >>>>>> # Set actual page >>>>>> @browser.page = page >>>>>> end >>>>>> >>>>>> >>>>>> Greetings, >>>>>> Tomasz Kalkosi?ski >>>>>> _______________________________________________ >>>>>> Celerity-users mailing list >>>>>> Celerity-users at rubyforge.org >>>>>> http://rubyforge.org/mailman/listinfo/celerity-users >>>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> Celerity-users mailing list >>>> Celerity-users at rubyforge.org >>>> http://rubyforge.org/mailman/listinfo/celerity-users >>>> >>>> >>>> >>>> _______________________________________________ >>>> Celerity-users mailing list >>>> Celerity-users at rubyforge.org >>>> http://rubyforge.org/mailman/listinfo/celerity-users >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jasoninclass at googlemail.com Mon May 3 08:28:41 2010 From: jasoninclass at googlemail.com (jason franklin-stokes) Date: Mon, 3 May 2010 14:28:41 +0200 Subject: [Celerity-users] click_and_attach and popup ads In-Reply-To: References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> Message-ID: <03BD2307-1920-4AC8-BF4A-E0A068C05969@googlemail.com> me neither - you don't really need to when you are on jruby - java becomes very ruby like On May 3, 2010, at 1:40 PM, doridori Jo wrote: > aye you are right jason. but i dont know Java ! :( > > On Mon, May 3, 2010 at 4:02 AM, jason franklin-stokes wrote: > you need to set the @page variable > > - in general it looks like you want to scrape pages from different websites - if you are celerity is not the right tool. > you should consider getting rid of it and using HtmlUnit directly. > > best Jason. 

From lists at iDIAcomputing.com Mon May 3 14:07:43 2010
From: lists at iDIAcomputing.com (George Dinwiddie)
Date: Mon, 03 May 2010 14:07:43 -0400
Subject: [Celerity-users] click_and_attach and popup ads
In-Reply-To:
References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com>
Message-ID: <4BDF10EF.2010207@iDIAcomputing.com>

doridori Jo wrote:
> aye you are right jason. but i dont know Java ! :(

You can use JRuby and work directly with HtmlUnit. See the way that Celerity does so.

 - George
--
  ----------------------------------------------------------------------
   * George Dinwiddie *                      http://blog.gdinwiddie.com
   Software Development                    http://www.idiacomputing.com
   Consultant and Coach                    http://www.agilemaryland.org
  ----------------------------------------------------------------------

From dorikick at gmail.com Tue May 4 04:20:47 2010
From: dorikick at gmail.com (doridori Jo)
Date: Tue, 4 May 2010 01:20:47 -0700
Subject: [Celerity-users] click_and_attach and popup ads
In-Reply-To: <4BDF10EF.2010207@iDIAcomputing.com>
References: <201004301216.59802.tomasz2k@poczta.onet.pl> <9F76F1D7-39D6-4684-9F06-4FE9AA18283C@googlemail.com> <4BDF10EF.2010207@iDIAcomputing.com>
Message-ID:

okay, so any other suggestions? i will be moving on to the htmlunit mailing list if i can't do anything more with Celerity. i have tried everything and it seems like it's just not saving the page.

it would be nice if there were some way to modify click_and_attach to automatically disregard popups.

From tomasz2k at poczta.onet.pl Tue May 4 17:37:32 2010
From: tomasz2k at poczta.onet.pl (Tomasz Kalkosiński)
Date: Tue, 4 May 2010 23:37:32 +0200
Subject: [Celerity-users] click_and_attach and popup ads
In-Reply-To:
References:
Message-ID: <201005042337.33241.tomasz2k@poczta.onet.pl>

On Monday 03 of May 2010 11:36:37 doridori Jo wrote:
> okay, so i have the page i want....not clear exactly what web_window_event
> listener is doing...

A listener is a 'callback function' that gets called when an event occurs. For example, if you have a browser object and you add a listener for alerts, your function is called whenever alert() occurs on a page. Like this:

  @my_browser = Celerity::Browser.new()
  @my_browser.add_listener(:alert) { |page|
    # your code
  }
  @my_browser.goto("http://page.with.alert.on.it")
  # your code that 'listens' on alert will be called

Now, what I do is just keep track of all opened windows, since I know that there might be some windows that I don't want to deal with - just like your ad window. To do this I add_listener(:web_window_event) to my browser. This event fires very often, so I have some code to recognize whether it is a new window, whether it is a top-level window, what the window's name is, etc. Then I check the event against these rules and, if it matches, change the current page.

You don't need to get into Java, nor to use HtmlUnit directly - you would eventually come to this point anyway.

Tell me if it's clear enough for you.
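
A rough sketch of the listener idea described above, written in plain Ruby so it runs without Celerity or JRuby. FakeBrowser, WindowEvent, AD_PATTERN and tracking_browser are made-up names for illustration only, not Celerity API; with the real library the block would receive an HtmlUnit event object instead of this stand-in struct.

```ruby
# Stand-in for the listener pattern; nothing here is Celerity or HtmlUnit API.
WindowEvent = Struct.new(:type, :url)

class FakeBrowser
  attr_reader :current_url

  def initialize
    @listeners = Hash.new { |hash, key| hash[key] = [] }
    @current_url = nil
  end

  # Register a block to be called whenever an event of this kind fires.
  def add_listener(kind, &block)
    @listeners[kind] << block
  end

  # Simulate a window being opened: every registered listener gets the event.
  def open_window(url)
    event = WindowEvent.new(:open, url)
    @listeners[:web_window_event].each { |listener| listener.call(event) }
  end

  def switch_to(url)
    @current_url = url
  end
end

# Hypothetical rule for "windows I don't want to deal with".
AD_PATTERN = /doubleclick/

def tracking_browser
  browser = FakeBrowser.new
  browser.add_listener(:web_window_event) do |event|
    next unless event.type == :open   # only react to newly opened windows
    next if event.url =~ AD_PATTERN   # skip anything that looks like an ad
    browser.switch_to(event.url)      # otherwise make it the current page
  end
  browser
end

browser = tracking_browser
browser.open_window("http://ad.doubleclick.net/popup")
browser.open_window("http://example.com/article")
puts browser.current_url
```

The ad window fires the same event as the real one; the listener just declines to switch to it, which is the filtering doridori asked about.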
Sorry for not responding earlier, we've had holidays here in Poland :)

Greetings,
Tomasz Kalkosiński

The general snippet is just like I've shown you:

  # @browser is a variable that is my browser
  # @pages is just a collection of pages I've visited,
  # you can remove it

  # Call a method that constructs new browser
  # and adds all event listeners that I need
  @browser = new_browser

  # Now do some real work
  @browser.goto("http://somewhere.to")

  # Just remember that when you click something
  # and a new window appears
  # my code that handles :web_window_event is called
  @browser.click(:id, "one")
  @browser.click(:id, "two")

  def new_browser
    @browser = Celerity::Browser.new

    @browser.add_listener(:web_window_event) { |window|
      if window.getWebWindow.is_a?(Java::ComGargoylesoftwareHtmlunit::TopLevelWindow) &&
         window.getEventType == Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::OPEN

        @old_top_window = window.getWebWindow.getParentWindow
        set_actual_page window.page
      end

      if window.getWebWindow.getName == "YourPreferredName" &&
         window.getEventType == Java::ComGargoylesoftwareHtmlunit::WebWindowEvent::CHANGE
        ...
      end
    }

    # Return the browser itself so that "@browser = new_browser" works
    @browser
  end

  def set_actual_page(page)
    # Add to collection on top
    @pages << @browser.page

    # Set actual page
    @browser.page = page
  end

From peter at hexagile.com Wed May 5 02:26:28 2010
From: peter at hexagile.com (Peter Szinek)
Date: Wed, 5 May 2010 08:26:28 +0200
Subject: [Celerity-users] Celerity vs Mechanize
Message-ID:

Hey guys,

I am just finishing the ground-up rewrite of my scraping framework, scRUBYt!. The last missing piece compared to the original version is scraping JS sites. In the old version I used mechanize for 'normal' sites and FireWatir for JS. Obviously I'd like to replace FireWatir with Celerity. However, I am wondering whether I should bother with mechanize at all - or just use celerity, with JS disabled for normal pages and 'full' celerity for the rest.

Did anyone compare celerity with mechanize before? I think the API / functionality / stability / parsing capabilities of celerity are at least as good, if not better (?). What about speed - if celerity (JS disabled) is in the same ballpark, I'd go with it... Any other considerations I should take into account?

In other words, any ideas/comments why I should NOT ditch mechanize for celerity?

Cheers,
Peter

From sjur.kvammen at gmail.com Tue May 11 03:45:51 2010
From: sjur.kvammen at gmail.com (Sjur Kvammen)
Date: Tue, 11 May 2010 09:45:51 +0200
Subject: [Celerity-users] jruby + celerity.jar(trunk) hangs forever
Message-ID:

Hello!

I'm trying to make celerity run without installing jruby, and jruby-complete seems to solve the problem.
When I found that Celerity in trunk also includes a rake task to make a jar, I thought my problems were solved. However, the Celerity script runs just fine, but it never exits.

Here are the steps to reproduce it (jeweler is a dependency of celerity, inclusion taken from http://blog.nicksieger.com/articles/2009/01/10/jruby-1-1-6-gems-in-a-jar):

  mkdir celerity_test
  cd celerity_test
  wget http://jruby.org.s3.amazonaws.com/downloads/1.4.1/jruby-complete-1.4.1.jar
  java -jar jruby-complete-1.4.1.jar -S gem install -i ./jeweler jeweler
  jar uf jruby-complete-1.4.1.jar -C jeweler/ .
  git clone git://github.com/jarib/celerity.git
  cd celerity/
  java -jar ../jruby-complete-1.4.1.jar -S rake jar:fat
  cd ..
  java -jar jruby-complete-1.4.1.jar -r celerity/pkg/celerity-complete-0.7.9.jar test.rb
  java -jar jruby-complete-1.4.1.jar -r celerity/pkg/celerity-complete-0.7.9.jar google.rb

where test.rb consists of

  puts "hello"

which prints out "hello" when run, and google.rb consists of

  require "rubygems"
  require "celerity"
  browser = Celerity::Browser.new
  browser.goto('http://www.google.com')
  puts "done"

which prints out "done", but never finishes/returns.

I also tried requiring the celerity jar in google.rb, but that gives me a LoadError.

The same behaviour is displayed with an installed jruby (prints "done", but hangs):

  jruby -r celerity/pkg/celerity-complete-0.7.9.jar google.rb

Am I making any obvious n00b mistakes?

Platform: Ubuntu 10.04 (Lucid)

-Sjur

From jari.bakken at gmail.com Tue May 11 04:30:27 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 11 May 2010 10:30:27 +0200
Subject: [Celerity-users] jruby + celerity.jar(trunk) hangs forever
In-Reply-To:
References:
Message-ID:

Hi Sjur,

On Tue, May 11, 2010 at 9:45 AM, Sjur Kvammen wrote:
> However, the Celerity-script runs just fine, but it never exits..

This is a (presumed) bug in the later HtmlUnit snapshots - you need to call browser.close for the JVM to exit. I've filed this here:

http://sourceforge.net/tracker/?func=detail&aid=2985827

>
> Also tried requiring the celerity-jar in google.rb, but that gives me a
> LoadError.

Could you post the complete LoadError? IIRC you need two requires when using the jar:

  require "celerity-complete.jar"
  require "celerity"

Hope that helps.

From sjur.kvammen at gmail.com Tue May 11 05:20:27 2010
From: sjur.kvammen at gmail.com (Sjur Kvammen)
Date: Tue, 11 May 2010 11:20:27 +0200
Subject: [Celerity-users] jruby + celerity.jar(trunk) hangs forever
In-Reply-To:
References:
Message-ID:

Great! browser.close() solved it. Thank you!

LoadError when using the following file as google.rb:

  require "rubygems"
  require "celerity-complete-0.7.9.jar"
  require "celerity"
  browser = Celerity::Browser.new
  browser.goto('http://www.google.com')
  browser.close()
  puts "done"

jruby-complete:

  sjukva at t410s:~/temp/clean2$ java -jar jruby-complete-1.4.1.jar -r celerity/pkg/celerity-complete-0.7.9.jar google.rb
  file:/home/sjukva/temp/clean2/jruby-complete-1.4.1.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require': no such file to load -- celerity-complete-0.7.9 (LoadError)
          from file:/home/sjukva/temp/clean2/jruby-complete-1.4.1.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
          from google.rb:2

jruby (celerity not installed via gem):

  sjukva at t410s:~/temp/clean2$ jruby -r celerity/pkg/celerity-complete-0.7.9.jar google.rb
  /usr/lib/jruby//lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require': no such file to load -- celerity-complete-0.7.9 (LoadError)
          from /usr/lib/jruby//lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
          from google.rb:2

Works smoothly if I run without the jar-requirement, so not a problem for me.

-Sjur

From jari.bakken at gmail.com Tue May 11 05:31:43 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Tue, 11 May 2010 11:31:43 +0200
Subject: [Celerity-users] jruby + celerity.jar(trunk) hangs forever
In-Reply-To:
References:
Message-ID:

On Tue, May 11, 2010 at 11:20 AM, Sjur Kvammen wrote:
> sjukva at t410s:~/temp/clean2$ java -jar jruby-complete-1.4.1.jar -r celerity/pkg/celerity-complete-0.7.9.jar google.rb
> file:/home/sjukva/temp/clean2/jruby-complete-1.4.1.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require': no such file to load -- celerity-complete-0.7.9 (LoadError)
>         from file:/home/sjukva/temp/clean2/jruby-complete-1.4.1.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
>         from google.rb:2

Looks like you're requiring the jar both from the file and from the command line. Since 'celerity/pkg' is not on the $LOAD_PATH, that require will fail. Try this instead:

  java -jar jruby-complete-1.4.1.jar -I celerity/pkg google.rb

The -I flag puts the given directory on the $LOAD_PATH.

From sjur.kvammen at gmail.com Tue May 11 06:00:38 2010
From: sjur.kvammen at gmail.com (Sjur Kvammen)
Date: Tue, 11 May 2010 12:00:38 +0200
Subject: [Celerity-users] jruby + celerity.jar(trunk) hangs forever
In-Reply-To:
References:
Message-ID:

Aaaaaaah!
User-mistake :)

-Sjur

From andrew.gehring at gmail.com Mon May 17 09:03:11 2010
From: andrew.gehring at gmail.com (Andrew Gehring)
Date: Mon, 17 May 2010 07:03:11 -0600
Subject: [Celerity-users] Downloads
Message-ID:

I have a link that downloads a file when selected (in the browser). How do I capture that data transfer with celerity?

I've tried something like:

  browser.link(:id, 'download_xml_all_lb').download #=> output.xml

But I never get the output...

Thanks!

From thirdreplicator at gmail.com Thu May 27 02:16:58 2010
From: thirdreplicator at gmail.com (David Beckwith)
Date: Wed, 26 May 2010 23:16:58 -0700
Subject: [Celerity-users] pixel tracking with Celerity
Message-ID:

Is it possible to monitor asynchronous calls to other servers?
For example, I want to test to make sure our API is being properly implemented on a partner site: 'notify' signals are to be sent to my company's site. I want to log that their JQuery is generating the 'notify' signals. There are no changes to the browser. The signal is just a request for a pixel on my company's server with a bunch of GET params set which contain all the tracking info. Is this possible to test using Celerity?

Thanks

From jari.bakken at gmail.com Thu May 27 02:45:30 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Thu, 27 May 2010 08:45:30 +0200
Subject: [Celerity-users] pixel tracking with Celerity
In-Reply-To:
References:
Message-ID:

On Thu, May 27, 2010 at 8:16 AM, David Beckwith wrote:
> Is it possible to monitor asynchronous calls to other servers? For
> example, I want to test to make sure our API is being properly
> implemented on a partner site: 'notify' signals are to be sent to my
> company's site. I want to log that their JQuery is generating the
> 'notify' signals. There are no changes to the browser. The signal is
> just a request for a pixel on my company's server with a bunch of GET
> params set which contain all the tracking info. Is this possible to
> test using Celerity?
>

There's no API in Celerity that does this, so you'll have to dig into HtmlUnit internals. If you implement your own WebConnection, and use that by doing

  browser.webclient.setWebConnection(your_connection)

you'll be able to track all outgoing requests. We use this trick to implement Browser#ignore_pattern= [1].

Since you probably only want to enable this for a short duration, consider implementing a method similar to Browser#debug_web_connection [2], which only enables the alternate WebConnection for the duration of the block.

At least this is the only way I know of - the HtmlUnit list could probably give you more input.

Hope that helps.
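
The "wrap the connection, but only for the duration of a block" idea above might look roughly like the following. This is a plain-Ruby stand-in so it runs anywhere: FakeConnection, LoggingConnection, FakeClient and track_requests are hypothetical names, not HtmlUnit or Celerity API. A real version would wrap HtmlUnit's WebConnection and install it with setWebConnection as described above.

```ruby
# Stand-in for the connection the client normally uses.
class FakeConnection
  def get_response(url)
    "response for #{url}"
  end
end

# Decorator: records every requested URL, then delegates to the wrapped connection.
class LoggingConnection
  attr_reader :requests

  def initialize(inner)
    @inner = inner
    @requests = []
  end

  def get_response(url)
    @requests << url
    @inner.get_response(url)
  end
end

# Stand-in for the web client that owns the connection.
class FakeClient
  attr_accessor :connection

  def initialize
    @connection = FakeConnection.new
  end

  def fetch(url)
    @connection.get_response(url)
  end
end

# Swap in the logging connection only while the block runs, and put the
# original back afterwards even if the block raises.
def track_requests(client)
  original = client.connection
  logger = LoggingConnection.new(original)
  client.connection = logger
  yield
  logger.requests
ensure
  client.connection = original
end

client = FakeClient.new
seen = track_requests(client) do
  client.fetch("http://partner.example/page")
  client.fetch("http://tracker.example/pixel.gif?event=notify")
end
puts seen.grep(/notify/).inspect
```

Restoring the original connection in the ensure clause keeps the logging confined to the block, so the rest of a test run is unaffected.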

[1] http://github.com/jarib/celerity/blob/master/lib/celerity/ignoring_web_connection.rb
[2] http://github.com/jarib/celerity/blob/master/lib/celerity/browser.rb#L484-492

From peter at hexagile.com Thu May 27 06:37:31 2010
From: peter at hexagile.com (Peter Szinek)
Date: Thu, 27 May 2010 12:37:31 +0200
Subject: [Celerity-users] Parse document from string?
Message-ID:

Hi guys,

I am looking for the celerity equivalent of

  Nokogiri::HTML.parse(html_string)

Another question - how to evaluate an XPath if I have an element? i.e. I'd like to evaluate the XPath where the root element is a specific element, rather than the whole document.

Cheers,
Peter

From jari.bakken at gmail.com Thu May 27 07:28:09 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Thu, 27 May 2010 13:28:09 +0200
Subject: [Celerity-users] Parse document from string?
In-Reply-To:
References:
Message-ID:

On Thu, May 27, 2010 at 12:37 PM, Peter Szinek wrote:
> Hi guys,
>
> I am looking for the celerity equivalent of
>
> Nokogiri::HTML.parse(html_string)
>

There's no API in Celerity for this, but it is probably doable in HtmlUnit somehow. What's your use case?

> Another question - how to evaluate an XPath if I have an element? i.e. I'd
> like to evaluate the XPath where the root element is a specific element,
> rather than the whole document.
>

You can build your xpath by appending to the absolute xpath of an existing element. Check Element#xpath in the docs.

From peter at hexagile.com Thu May 27 08:36:48 2010
From: peter at hexagile.com (Peter Szinek)
Date: Thu, 27 May 2010 14:36:48 +0200
Subject: [Celerity-users] Parse document from string?
In-Reply-To:
References:
Message-ID:

Hi Jari

> There's no API in Celerity for this, but it is probably doable in
> HtmlUnit somehow. What's your use case?

The use case is the same for both cases, see below

> You can build your xpath by appending to the absolute xpath of an
> existing element. Check Element#xpath in the docs.

This does not solve my problem in general.

I am not sure you ever used scRUBYt! (a web scraping framework using a ton of XPath crunching), so a quick example:

  book "//div[@class='book']" do
    author "./span[1]/strong"
    price "./span[contains(.,'Price: ')]"
  end

this snippet first extracts all books, then in each of them extracts author and price, yielding a result structure like

  [:book => {:author => 'x', :price => '1'}, :book => {:author => 'y', :price => '2'} ]

etc.

Now, the above code could be easily transformed to use absolute XPaths. The problem is that XPaths are just part of how scRUBYt! matches elements. There might be (and quite often there are) other predicates involved in finding 'book' (or other 'patterns' as they are called in scRUBYt!) not expressible with XPaths - the final result, however, is a set of nodes, on which the further XPaths (and other predicates/operators) are evaluated.

Currently this is solved (with nokogiri) by evaluating all the elements that should belong to "book", and using their HTML code to evaluate the further XPaths on (that's why I was asking the first question). Or, the second possibility would be to run XPaths on the result elements (the second question).

I hope the above is clear, didn't want to get into lengthy explanations - I am open to all ideas etc.

Cheers,
Peter

From jari.bakken at gmail.com Thu May 27 09:09:49 2010
From: jari.bakken at gmail.com (Jari Bakken)
Date: Thu, 27 May 2010 15:09:49 +0200
Subject: [Celerity-users] Parse document from string?
In-Reply-To:
References:
Message-ID:

On Thu, May 27, 2010 at 2:36 PM, Peter Szinek wrote:
>
> I am not sure you ever used scRUBYt!
> (a web scraping framework using a ton of
> XPath crunching), so a quick example:
>

Are you trying to implement the scRUBYt API on top of Celerity?

> Currently this is solved (with nokogiri) by evaluating all the elements that
> should belong to "book", and using their HTML code to evaluate the further
> XPaths on (that's why I was asking the first question).
> Or, the second possibility would be to run XPaths on the result elements
> (the second question).
>

I'm still a bit unclear on what you're trying to achieve. Celerity doesn't try to be a generic XPath interpreter - Nokogiri seems like the right tool for the job here. If you have a string of HTML you want to parse - why use Celerity at all?

From peter at hexagile.com Thu May 27 10:43:56 2010
From: peter at hexagile.com (Peter Szinek)
Date: Thu, 27 May 2010 16:43:56 +0200
Subject: [Celerity-users] Parse document from string?
In-Reply-To:
References:
Message-ID:

> Are you trying to implement the scRUBYt API on top of Celerity?

Yes - I wrote a mail earlier asking if it's a good idea, but got no response...

> > Currently this is solved (with nokogiri) by evaluating all the elements that
> > should belong to "book", and using their HTML code to evaluate the further
> > XPaths on (that's why I was asking the first question).
> > Or, the second possibility would be to run XPaths on the result elements
> > (the second question).
>
> I'm still a bit unclear on what you're trying to achieve. Celerity
> doesn't try to be a generic XPath interpreter - Nokogiri seems like
> the right tool for the job here.

That's true, and I have dozens of scrapers in production with current (nokogiri based) scrubyt - everything is cool, until we hit a Javascript site.
I also have quite a few production scrapers that are using celerity (for JS sites) - I found celerity as good for this as nokogiri (it seems to me that celerity's XPath support is as good as nokogiri's - isn't that the case?).
Except that scrubyt scrapers are much faster to create and more maintainable than plain nokogiri or celerity ones.

> If you have a string of HTML you want
> to parse - why use Celerity at all?

Because of JavaScript support.

An earlier version of scrubyt supported mechanize/nokogiri and firewatir (for JS) and it worked well - but I think celerity is much faster and has a better API than firewatir. It's true that firewatir handles 100% of JS sites while celerity has some problems here and there, but I could put up with that.

Cheers,
Peter

From peter at hexagile.com Thu May 27 10:48:01 2010
From: peter at hexagile.com (Peter Szinek)
Date: Thu, 27 May 2010 16:48:01 +0200
Subject: [Celerity-users] Parse document from string?
In-Reply-To:
References:
Message-ID:

Maybe a hybrid solution would be better? That is, celerity would be used just for navigation (clicking JS links, loading a JS page, etc.) and for getting the response when the site is JavaScript-heavy, and the HTML would always be passed to nokogiri. Celerity would then never have to care about scraping, XPaths or HTML parsing.

On Thu, May 27, 2010 at 4:43 PM, Peter Szinek wrote:

>> Are you trying to implement the scRUBYt API on top of Celerity?
>
> Yes - I wrote a mail earlier asking if it's a good idea, but got no
> response...
>
>> > Currently this is solved (with nokogiri) by evaluating all the elements that
>> > should belong to "book", and using their HTML code to evaluate the further
>> > XPaths on (that's why I was asking the first question).
>> > Or, the second possibility would be to run XPaths on the result elements
>> > (the second question).
>>
>> I'm still a bit unclear on what you're trying to achieve. Celerity
>> doesn't try to be a generic XPath interpreter - Nokogiri seems like
>> the right tool for the job here.
>
>
> That's true, and I have dozens of scrapers in production with current
> (nokogiri based) scrubyt - everything is cool, until we hit a Javascript
> site.
> I also have quite a few production scrapers that are using celerity (for JS
> sites) - I found celerity as good for this as nokogiri (it seems to me that
> celerity's XPath support is as good as nokogiri's - isn't that the case?).
> Except that it's much faster and more maintainable to create scrubyt
> scrapers than either nokogiri or celerity.
>
>
>> If you have a string of HTML you want
>> to parse - why use Celerity at all?
>
>
> Because of javascript support.
>
> An earlier version of scrubyt supported mechanize/nokogiri and firewatir
> (for JS) and it worked well - but I think celerity is much faster and has a
> better API than firewatir. It's true that firewatir handles 100% of JS sites
> while celerity has some problems here and there, but I could put up with
> that.
>
> Cheers,
> Peter
>
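
A minimal sketch of the parsing half of the hybrid idea discussed above, assuming the JavaScript-capable browser has already done the navigation and handed over the page markup as a string. REXML, which ships with Ruby, stands in for nokogiri here only so that the sketch runs without extra gems; extract_books and SAMPLE_MARKUP are hypothetical names, and the markup is made up to match the book/author/price example from earlier in the thread. It is not how scRUBYt! itself is implemented.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of scRUBYt!, Celerity or nokogiri).
# Finds every "book" node first, then evaluates the child XPaths relative
# to each matched node instead of against the whole document.
def extract_books(markup)
  doc = REXML::Document.new(markup)
  REXML::XPath.match(doc, "//div[@class='book']").map do |book|
    author = REXML::XPath.first(book, "./span[1]/strong")
    price  = REXML::XPath.first(book, "./span[contains(.,'Price: ')]")
    {
      :author => author && author.text,
      :price  => price && price.text.sub('Price: ', '')
    }
  end
end

# Made-up stand-in for markup obtained after navigation. It is kept
# well-formed because REXML is an XML parser, not an HTML parser.
SAMPLE_MARKUP = <<-XML
<html><body>
  <div class="book"><span><strong>x</strong></span><span>Price: 1</span></div>
  <div class="book"><span><strong>y</strong></span><span>Price: 2</span></div>
  <div class="note"><span><strong>z</strong></span></div>
</body></html>
XML

books = extract_books(SAMPLE_MARKUP)
books.each { |b| puts "#{b[:author]}: #{b[:price]}" }
```

The point of the split is that the relative XPaths only ever see nodes that were already selected as books, so any extra non-XPath filtering could be applied to that node list before the author and price lookups run. With real pages the markup would rarely be well-formed XML, which is one reason an HTML-tolerant parser such as nokogiri would be the more realistic choice for this step.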