[Umlaut-general] updated Amazon adapter in svn

Jonathan Rochkind rochkind at jhu.edu
Tue Sep 6 16:09:55 EDT 2011


I've committed updates to the Amazon adapter. It now screen-scrapes the 
latest Amazon interface instead of an old Amazon interface at a 
different URL.

The Amazon adapter needs to screen-scrape to determine whether "look 
inside" and "search inside" are available, so that it can link to them. 
To do this, it predicts the URL for the "look inside" page, and then 
screen-scrapes what's there.
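
As a rough illustration of the "predict the URL, then fetch the page" 
step, here is a minimal sketch. It is not the code in amazon.rb: the URL 
template, the constant name, and both method names are hypothetical 
placeholders, since the real URL pattern is not given in this message.

```ruby
require "net/http"
require "uri"

# Hypothetical template: NOT the real Amazon URL pattern, which this
# message does not state. %s is replaced with the item identifier.
READER_URL_TEMPLATE = "https://www.example.com/reader/%s"

# Build the predicted "look inside" page URL for an identifier
# (for example an ISBN-10). Hypothetical helper name.
def predicted_reader_url(identifier)
  URI.parse(format(READER_URL_TEMPLATE, URI.encode_www_form_component(identifier)))
end

# Fetch the predicted page and return its HTML, or nil when the
# request does not succeed. Hypothetical helper name.
def fetch_reader_page(identifier)
  response = Net::HTTP.get_response(predicted_reader_url(identifier))
  response.is_a?(Net::HTTPSuccess) ? response.body : nil
end
```

Returning nil on a non-success response lets the caller treat "page not 
found" the same as "nothing available" without raising.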

The previous version was screen-scraping an old version of the "look 
inside" page that was somehow still available at an alternate URL from 
Amazon, but was behaving increasingly erratically.

I fixed it to screen-scrape the latest version of this interface 
instead. It's a bit tricky to reverse engineer the HTML to figure out 
what to look for to determine availability of "look inside" and "search 
inside", but I think I figured it out properly. I tested it on a bunch 
of sample data, and it appears to be working.
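
To give a sense of what "looking for something in the HTML" can mean, 
here is a minimal sketch that checks a fetched page for two markers. The 
marker patterns and the method name are invented for illustration; the 
actual markers the adapter looks for are not described in this message, 
and the real code may well use an HTML parser rather than regular 
expressions.

```ruby
# Hypothetical markers: the real page structure is an assumption here.
LOOK_INSIDE_MARKER   = /id=["']look-inside["']/i
SEARCH_INSIDE_MARKER = /id=["']search-inside["']/i

# Report which features the given HTML appears to offer.
# Returns a hash with two boolean values. Hypothetical helper name.
def inside_availability(html)
  {
    look_inside: html.match?(LOOK_INSIDE_MARKER),
    search_inside: html.match?(SEARCH_INSIDE_MARKER)
  }
end

sample = '<div id="look-inside">Preview</div>'
result = inside_availability(sample)
puts "look inside: #{result[:look_inside]}, search inside: #{result[:search_inside]}"
```

Keeping the markers in named constants means that when the page layout 
changes again, only those two lines should need updating.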

If you update the lib/service_adapters/amazon.rb file from svn, you'll 
have the new code.

