[Mechanize-users] BUG: Possible issue with escaped hrefs
Mat Schaffer
schapht at gmail.com
Mon Sep 18 09:46:46 EDT 2006
I noticed an interesting problem today when scripting against a web
app. The application contained a link whose href used %20 in place of
spaces. After routing mechanize through the Charles debugging proxy,
I found that mechanize was converting %20 to %2520 (escaping the
already-escaped %20 a second time).
This appears to happen under both 0.5.4 and 0.6.0.
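
To illustrate the kind of double escaping I mean, here is a minimal
sketch in plain Ruby. It is not mechanize's actual code; both helper
names (naive_escape and escape_preserving) are hypothetical and exist
only for this example. The first escapes every '%' it sees, the second
leaves existing %XX sequences alone.

```ruby
# Hypothetical helpers for illustration only; not taken from mechanize.

# Percent-encode one character, byte by byte.
def pct(ch)
  ch.bytes.map { |b| format('%%%02X', b) }.join
end

# Escapes everything outside a small safe set, including '%'.
# An href that is already escaped gets escaped a second time.
def naive_escape(path)
  path.gsub(/[^A-Za-z0-9\-._~\/]/) { |ch| pct(ch) }
end

# Same idea, but a '%' followed by two hex digits is left untouched,
# so already-escaped hrefs pass through unchanged.
def escape_preserving(path)
  path.gsub(/%(?![0-9A-Fa-f]{2})|[^A-Za-z0-9\-._~\/%]/) { |ch| pct(ch) }
end

puts naive_escape('link%20with%20spaces.html')
puts escape_preserving('link%20with%20spaces.html')
puts escape_preserving('link with spaces.html')
```

The first call prints the %2520 form that Charles showed; the other two
both print link%20with%20spaces.html.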
Here's a simple set of files that demonstrates the issue:
--- start.html
<html>
<body>
<a href="link%20with%20spaces.html">This link has spaces</a>
</body>
</html>
--- end start.html
--- 'link with spaces.html'
This page is after the link.
--- end 'link with spaces.html'
--- test.rb
require 'rubygems'
require 'mechanize'
agent = WWW::Mechanize.new
# Un-comment to debug using Charles
# agent.set_proxy('localhost', '8888')
first = agent.get('http://localhost/~schapht/link_test/start.html')
second = agent.click(first.links.first)
puts second.body
--- end test.rb
Expected: This page is after the link.
Actual: /opt/local/lib/ruby/1.8/net/http.rb:1049:in `request':
Unhandled response (WWW::Mechanize::ResponseCodeError) [likely due
to 404]
Finally, I'm attaching a CSV of Charles' output to this email.
Hopefully it'll work on the list.
-Mat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: link_debug.csv
Type: application/octet-stream
Size: 507 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/mechanize-users/attachments/20060918/2b7a80a7/attachment.obj