From barjunk at attglobal.net Fri Jul 1 21:29:31 2011
From: barjunk at attglobal.net (barsalou)
Date: Fri, 01 Jul 2011 17:29:31 -0800
Subject: [Mechanize-users] [ANN] Mechanize 2.0
In-Reply-To: <6BD7DA9B-F270-461A-A02E-3BFDE7FFBD34@segment7.net>
References: <6BD7DA9B-F270-461A-A02E-3BFDE7FFBD34@segment7.net>
Message-ID: <20110701172931.buvws3qrr40gw0kc@192.168.0.101>

Quoting Eric Hodel :

> mechanize version 2.0 has been released!
>
> *
> *
>
> The Mechanize library is used for automating interaction with websites.
> Mechanize automatically stores and sends cookies, follows redirects,
> can follow links, and submit forms. Form fields can be populated and
> submitted. Mechanize also keeps track of the sites that you have visited as
> a history.
>
> Changes:
>
> ### 2.0 / 2011-06-27
>
> Mechanize is now under the MIT license
>
> * API changes
>   * Mechanize#put now accepts headers instead of an options Hash as the last
>     argument
> *

Would someone be willing to post an example of how to use headers
instead of the options hash?

Thanks.

Mike B.

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

From barjunk at attglobal.net Sat Jul 2 17:11:01 2011
From: barjunk at attglobal.net (barsalou)
Date: Sat, 02 Jul 2011 13:11:01 -0800
Subject: [Mechanize-users] Doing a post using Mechanize 2.0 - a solution
Message-ID: <20110702131101.qq4r8bcmookgosss@192.168.0.101>

I'm working with a web page where it is required that you log in, then
use a POST to submit the information via a form.

Since this isn't part of my everyday life, it was a struggle to figure
out the pieces that were needed to make this happen.

We needed to:

- login
- get the cookie info
- make a new request that included the returned cookie info
- post the request with necessary form data

The example also shows how to add custom headers to the posted request.
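The steps listed above can be sketched without Mechanize, using only Ruby's standard Net::HTTP classes to build (but not send) the request. This is a minimal sketch and not the contents of the gist; the path, cookie value, form field, and the X-Requested-With header are all hypothetical placeholders.

```ruby
require 'net/http'

# Build (but do not send) a POST request that carries a session cookie,
# URL-encoded form data, and extra custom headers.
# Every value passed in below is a hypothetical placeholder.
def build_post_request(path, cookie, form_fields, extra_headers = {})
  request = Net::HTTP::Post.new(path, extra_headers)
  request['Cookie'] = cookie          # cookie string returned by the login step
  request.set_form_data(form_fields)  # sets the body and the form Content-Type
  request
end

request = build_post_request(
  '/submit',
  'session_id=abc123',
  { 'name' => 'value' },
  { 'X-Requested-With' => 'XMLHttpRequest' }
)

puts request.method
puts request['Cookie']
puts request['X-Requested-With']
puts request.body
```

Mechanize stores and sends cookies on its own, so with the library the cookie step would normally not need to be written by hand; the sketch only makes the individual pieces of the request visible.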
Firefox's TamperData Add-on was needed to see what the server was
expecting in order to put this together.

Here is my solution:

https://gist.github.com/1061645

Suggestions for slimming this down or other changes are welcome.

Mike B.

From barjunk at attglobal.net Sat Jul 2 19:42:02 2011
From: barjunk at attglobal.net (barsalou)
Date: Sat, 02 Jul 2011 15:42:02 -0800
Subject: [Mechanize-users] Good examples
Message-ID: <20110702154202.kgzm441es00og80s@192.168.0.101>

Found these examples while looking for information about posting with
Mechanize:

http://snippets.dzone.com/posts/show/9725

Mike B.

From ghislain.de.fresnoye at gmail.com Sun Jul 3 19:39:11 2011
From: ghislain.de.fresnoye at gmail.com (Ghislain de Fresnoye)
Date: Mon, 4 Jul 2011 01:39:11 +0200
Subject: [Mechanize-users] encoding issue
Message-ID:

Hello

Very happy to see this new version of mechanize :)
But I've got a problem with an old script that logs on to a website, which doesn't work anymore.
By taking a tcpdump, I saw that '*' characters contained in the password are encoded to their charcode (%2A) when posting the login form.
So the authentication fails.

Any idea??

Thanks
Regards
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lonny6 at gmail.com Sun Jul 3 20:54:20 2011
From: lonny6 at gmail.com (Lonny Eachus)
Date: Sun, 3 Jul 2011 17:54:20 -0700
Subject: [Mechanize-users] encoding issue
In-Reply-To:
References:
Message-ID: <0F26B27D-49E3-4495-B8A1-620B9778F2C7@gmail.com>

I don't think Mechanize should be escaping your strings if you aren't telling it to.

In any case, if it is, you can do this:

require 'cgi'

Then

CGI::unescape(my_login_string)

before logging in.
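To make the suggestion above concrete, here is a minimal sketch using only Ruby's standard library; the password value is a hypothetical placeholder. It shows that %2A is the percent-encoded form of '*', and that CGI.unescape returns a string unchanged when it contains no percent sequences, so calling it on a plain password changes nothing.

```ruby
begin
  require 'cgi'
rescue LoadError
  require 'cgi/escape' # fallback in case the full cgi library is unavailable
end

# Hypothetical password containing the character from the report above.
PLAIN_PASSWORD = 'secret*word'

# Percent-encode a value with CGI.escape.
def wire_form(value)
  CGI.escape(value)
end

# Decode a percent-encoded value back to the original string.
def decoded_form(value)
  CGI.unescape(value)
end

puts wire_form(PLAIN_PASSWORD)        # secret%2Aword
puts decoded_form('secret%2Aword')    # secret*word
puts decoded_form(PLAIN_PASSWORD)     # secret*word (nothing to decode)
```

A server that decodes the form body in the usual way would turn %2A back into '*', so whether the encoded form is a problem depends on how the receiving site reads the password field.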
Lonny Eachus
============

On [Jul03], at 16:39 , Ghislain de Fresnoye wrote:

> Hello
>
> Very happy to see this new version of mechanize :)
> But I've got a problem with an old script that logs on to a website, which doesn't work anymore.
> By taking a tcpdump, I saw that '*' characters contained in the password are encoded to their charcode (%2A) when posting the login form.
> So the authentication fails.
>
> Any idea??
>
> Thanks
> Regards
> _______________________________________________
> Mechanize-users mailing list
> Mechanize-users at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mechanize-users

From ghislain.de.fresnoye at gmail.com Mon Jul 4 19:14:09 2011
From: ghislain.de.fresnoye at gmail.com (Ghislain de Fresnoye)
Date: Tue, 5 Jul 2011 01:14:09 +0200
Subject: [Mechanize-users] encoding issue
In-Reply-To: <0F26B27D-49E3-4495-B8A1-620B9778F2C7@gmail.com>
References: <0F26B27D-49E3-4495-B8A1-620B9778F2C7@gmail.com>
Message-ID:

Thanks Lonny,

I tried what you suggested, but unfortunately it didn't work.
But I managed to send the correct data with a simple post, so it's ok for me...

Regards

2011/7/4 Lonny Eachus

>
> I don't think Mechanize should be escaping your strings if you aren't
> telling it to.
>
> In any case, if it is, you can do this:
>
> require 'cgi'
>
>
> Then
>
> CGI::unescape(my_login_string)
>
> before logging in.
>
>
> Lonny Eachus
> ============
>
>
> On [Jul03], at 16:39 , Ghislain de Fresnoye wrote:
>
> Hello
>
> Very happy to see this new version of mechanize :)
> But I've got a problem with an old script that logs on to a website, which
> doesn't work anymore.
> By taking a tcpdump, I saw that '*' characters contained in the password
> are encoded to their charcode (%2A) when posting the login form.
> So the authentication fails.
>
> Any idea??
>
> Thanks
> Regards

From eggie5 at gmail.com Sun Jul 10 23:11:51 2011
From: eggie5 at gmail.com (Alex Egg)
Date: Sun, 10 Jul 2011 20:11:51 -0700
Subject: [Mechanize-users] Mechanize doesn't add cookies to request
Message-ID:

These are my cookies:

pp agent.cookies
[SESS0d5421b77d41416bd1d57db5eba42384=aaa86QA6Nf_ISbkcjkxet,
 bbsessionhash=b5af88de76a01ee20340a88dc433793d,
 bblastvisit=1310353559,
 bblastactivity=1310353559,
 bbuserid=6425,
 bbpassword=4afd254205b86456d8e4d7ce969f634d,
 JSESSIONID=aaaJhfsv_RHIWfp2ikxet]

However, when I make a request:

page = agent.get("https://asdf.com/fdsa")

The cookies aren't added to the request:

D, [2011-07-10T20:07:35.523103 #47281] DEBUG -- : saved cookie: SESS0d5421b77d41416bd1d57db5eba42384=aaaXsYX8MGC8XdPzGkxet
D, [2011-07-10T20:07:35.523477 #47281] DEBUG -- : saved cookie: SESS0d5421b77d41416bd1d57db5eba42384=aaaqEBLpnek9AgkBGkxet
D, [2011-07-10T20:07:35.523822 #47281] DEBUG -- : saved cookie: bbsessionhash=bd5f66648bbfeb3e6ad09d3cba6d0a29
D, [2011-07-10T20:07:35.524147 #47281] DEBUG -- : saved cookie: bblastvisit=1310353655
D, [2011-07-10T20:07:35.524466 #47281] DEBUG -- : saved cookie: bblastactivity=1310353655
D, [2011-07-10T20:07:35.524788 #47281] DEBUG -- : saved cookie: bbuserid=6425
D, [2011-07-10T20:07:35.525111 #47281] DEBUG -- : saved cookie: bbpassword=4afd254205b86456d8e4d7ce969f634d
I, [2011-07-10T20:07:35.525216 #47281] INFO -- : follow redirect to: https://asdf.com/fdsa
I, [2011-07-10T20:07:35.529606 #47281] INFO -- : Net::HTTP::Get: /fdsa
D, [2011-07-10T20:07:35.529671 #47281] DEBUG -- : request-header: accept => */*
D, [2011-07-10T20:07:35.529712 #47281] DEBUG -- : request-header: user-agent => Mechanize/2.0.1 Ruby/1.9.2p180 (http://github.com/tenderlove/mechanize/)
D, [2011-07-10T20:07:35.529751 #47281] DEBUG -- : request-header: accept-encoding => gzip,deflate,identity
D, [2011-07-10T20:07:35.529790 #47281] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
D, [2011-07-10T20:07:35.529828 #47281] DEBUG -- : request-header: accept-language => en-us,en;q=0.5
D, [2011-07-10T20:07:35.529866 #47281] DEBUG -- : request-header: cookie => JSESSIONID=aaa8ieMR1av0S-XwGkxet
D, [2011-07-10T20:07:35.529903 #47281] DEBUG -- : request-header: host => asdf.com
I, [2011-07-10T20:07:35.854720 #47281] INFO -- : status: Net::HTTPForbidden 1.1 403 Forbidden

The login fails because the 7 cookies aren't in the request. Why is this happening?

Alex

From sandy at broadgairhill.com Sun Jul 17 07:46:00 2011
From: sandy at broadgairhill.com (Sandy Reid)
Date: Sun, 17 Jul 2011 12:46:00 +0100
Subject: [Mechanize-users] Get a page without css/js ?
Message-ID: <1A6EA402A1864D208CD6603AE40BFE08@SandyReidPC>

When getting a page using mechanize is it possible to avoid downloading css/js ?

From sirbeep at gmail.com Wed Jul 20 09:32:56 2011
From: sirbeep at gmail.com (Brian Kennedy)
Date: Wed, 20 Jul 2011 09:32:56 -0400
Subject: [Mechanize-users] file_upload stripping characters?
In-Reply-To:
References:
Message-ID:

Heya Mechanics,

I'm hoping someone has stumbled across this and knows an answer.

I am scraping an upload to send a .csv to a .Net app. I've verified
that the .csv I've created is:
file update.csv
update.csv: ASCII text, with very long lines, with CRLF line terminators
but those CR's are getting stripped and what winds up at the other end
is only LF line terminations.
It being a brainless .Net app on the
other end, that's not good enough and it's broken.

I've been through the Mechanize code and it appears to be handling it
all binary all the time, so I can't find anywhere it'd be stripping
the CRs.

It occurs with both Mech 1.0 and 2.0, ruby 1.8.7 and 1.9.2.

Does anyone have a clue where my CRs are going and how to stop it?

Thanks,
Brian

From drbrain at segment7.net Wed Jul 20 14:50:02 2011
From: drbrain at segment7.net (Eric Hodel)
Date: Wed, 20 Jul 2011 11:50:02 -0700
Subject: [Mechanize-users] file_upload stripping characters?
In-Reply-To:
References:
Message-ID: <573FA9F0-5503-4C76-92BF-EF339B8F9A6C@segment7.net>

On Jul 20, 2011, at 6:32 AM, Brian Kennedy wrote:

> Heya Mechanics,
>
> I'm hoping someone has stumbled across this and knows an answer.
>
> I am scraping an upload to send a .csv to a .Net app. I've verified
> that the .csv I've created is:
> file update.csv
> update.csv: ASCII text, with very long lines, with CRLF line terminators
> but those CR's are getting stripped and what winds up at the other end
> is only LF line terminations. It being a brainless .Net app on the
> other end, that's not good enough and it's broken.
>
> I've been through the Mechanize code and it appears to be handling it
> all binary all the time, so I can't find anywhere it'd be stripping
> the CRs.
>
> It occurs with both Mech 1.0 and 2.0, ruby 1.8.7 and 1.9.2.
>
> Does anyone have a clue where my CRs are going and how to stop it?

Are you on windows?

How are you reading in update.csv, File.read?

Try File.binread instead.

From sirbeep at gmail.com Wed Jul 20 15:11:31 2011
From: sirbeep at gmail.com (Brian Kennedy)
Date: Wed, 20 Jul 2011 15:11:31 -0400
Subject: [Mechanize-users] file_upload stripping characters?
In-Reply-To: <573FA9F0-5503-4C76-92BF-EF339B8F9A6C@segment7.net>
References: <573FA9F0-5503-4C76-92BF-EF339B8F9A6C@segment7.net>
Message-ID:

On Wed, Jul 20, 2011 at 2:50 PM, Eric Hodel wrote:
> On Jul 20, 2011, at 6:32 AM, Brian Kennedy wrote:
>
>> Heya Mechanics,
>>
>> I'm hoping someone has stumbled across this and knows an answer.
>>
>> I am scraping an upload to send a .csv to a .Net app. I've verified
>> that the .csv I've created is:
>> file update.csv
>> update.csv: ASCII text, with very long lines, with CRLF line terminators
>> but those CR's are getting stripped and what winds up at the other end
>> is only LF line terminations. It being a brainless .Net app on the
>> other end, that's not good enough and it's broken.
>>
>> I've been through the Mechanize code and it appears to be handling it
>> all binary all the time, so I can't find anywhere it'd be stripping
>> the CRs.
>>
>> It occurs with both Mech 1.0 and 2.0, ruby 1.8.7 and 1.9.2.
>>
>> Does anyone have a clue where my CRs are going and how to stop it?
>
> Are you on windows?
>
> How are you reading in update.csv, File.read?
>
> Try File.binread instead.

I'm not, my server is SUSE, but obviously the receiving end is windows
with the .Net app.

I've tried using

upload_form.file_uploads.first.file_name = TMPFILE

for mechanize to handle it itself (which I verified in source does do a
binread) and

upload_form.file_uploads.first.file_data = File.binread(TMPFILE)

The result is same stuff, different fan.

Thanks for your thought into my matter,
Brian

From drbrain at segment7.net Thu Jul 21 02:47:21 2011
From: drbrain at segment7.net (Eric Hodel)
Date: Wed, 20 Jul 2011 23:47:21 -0700
Subject: [Mechanize-users] Get a page without css/js ?
In-Reply-To: <1A6EA402A1864D208CD6603AE40BFE08@SandyReidPC>
References: <1A6EA402A1864D208CD6603AE40BFE08@SandyReidPC>
Message-ID: <02E32618-13C3-4616-B89D-0231CA3128BC@segment7.net>

On Jul 17, 2011, at 4:46 AM, Sandy Reid wrote:
> When getting a page using mechanize is it possible to avoid downloading css/js ?

Mechanize doesn't download linked CSS or javascript unless you specifically request it.

From godfreykfc at gmail.com Thu Jul 21 04:27:50 2011
From: godfreykfc at gmail.com (Godfrey Chan)
Date: Thu, 21 Jul 2011 01:27:50 -0700
Subject: [Mechanize-users] Get a page without css/js ?
In-Reply-To: <02E32618-13C3-4616-B89D-0231CA3128BC@segment7.net>
References: <1A6EA402A1864D208CD6603AE40BFE08@SandyReidPC> <02E32618-13C3-4616-B89D-0231CA3128BC@segment7.net>
Message-ID:

Do we currently lazy load them? If not, it might make sense to do that like we did with iframes.. I could take a look at that at some point

Sent from my phone

On 2011-07-20, at 11:47 PM, Eric Hodel wrote:

> On Jul 17, 2011, at 4:46 AM, Sandy Reid wrote:
>> When getting a page using mechanize is it possible to avoid downloading css/js ?
>
> Mechanize doesn't download linked CSS or javascript unless you specifically request it.