[Mechanize-users] BUG: Possible issue with escaped hrefs

Mat Schaffer schapht at gmail.com
Mon Sep 18 09:46:46 EDT 2006


I noticed an interesting problem today when scripting against a web
app.  The application contained a link whose href used %20 in place of
spaces.  After running mechanize through the Charles debugging proxy,
I found that mechanize was converting each %20 to %2520 (escaping the
already-escaped % a second time).
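
To make the double escaping concrete, here is a minimal sketch of how a
%20 can turn into %2520.  Note that naive_escape is a hypothetical
helper written for this illustration; it is not mechanize's actual
code, only an assumption about the kind of escaping that would produce
this symptom.

```ruby
# Hypothetical helper (NOT mechanize's real implementation): percent-encode
# every character outside a small "safe" set.  Because '%' is not in the
# safe set, an already-escaped string gets escaped a second time.
def naive_escape(str)
  str.gsub(/[^A-Za-z0-9\-._~\/]/) do |ch|
    ch.bytes.map { |b| format('%%%02X', b) }.join
  end
end

# Escaping the raw file name once gives the href used in start.html.
once = naive_escape('link with spaces.html')

# Escaping that href again rewrites each '%' as '%25'.
twice = naive_escape(once)

puts once   # link%20with%20spaces.html
puts twice  # link%2520with%2520spaces.html
```

The second value matches what Charles showed going over the wire, which
is why the server cannot find the file.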

This appears to happen under both 0.5.4 and 0.6.0.

Here's a simple set of files that demonstrate the issue:

--- start.html
<html>
<body>
<a href="link%20with%20spaces.html">This link has spaces</a>
</body>
</html>
--- end start.html

--- 'link with spaces.html'
This page is after the link.
--- end 'link with spaces.html'

--- test.rb
require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
# Un-comment to debug using Charles
# agent.set_proxy('localhost', '8888')
first = agent.get('http://localhost/~schapht/link_test/start.html')
second = agent.click(first.links.first)
puts second.body
--- end test.rb

Expected: This page is after the link.
Actual: /opt/local/lib/ruby/1.8/net/http.rb:1049:in `request':  
Unhandled response (WWW::Mechanize::ResponseCodeError)  [likely due  
to 404]
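
For comparison, here is a rough sketch of escaping that leaves existing
%XX sequences alone, which is the behaviour I expected.  escape_once is
a hypothetical name of my own, not part of mechanize, and this is only
one possible approach.

```ruby
# Hypothetical helper (not part of mechanize): escape unsafe characters,
# but treat a '%' followed by two hex digits as an existing escape and
# leave it untouched.
def escape_once(str)
  str.gsub(/%(?![0-9A-Fa-f]{2})|[^A-Za-z0-9\-._~\/%]/) do |ch|
    ch.bytes.map { |b| format('%%%02X', b) }.join
  end
end

raw     = escape_once('link with spaces.html')      # spaces get escaped
escaped = escape_once('link%20with%20spaces.html')  # already escaped, unchanged
literal = escape_once('100% done.html')             # a bare '%' is still escaped

puts raw      # link%20with%20spaces.html
puts escaped  # link%20with%20spaces.html
puts literal  # 100%25%20done.html
```

One caveat: this cannot tell an intentional escape from a literal '%'
that happens to be followed by two hex digits, so it is a heuristic
rather than a complete fix.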

Finally, I'm attaching a CSV of Charles' output to this email.
Hopefully the attachment makes it through to the list.
-Mat

-------------- next part --------------
A non-text attachment was scrubbed...
Name: link_debug.csv
Type: application/octet-stream
Size: 507 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/mechanize-users/attachments/20060918/2b7a80a7/attachment.obj 

