[Wtr-general] W3C validation problems
Paul Rogers
paul.rogers at shaw.ca
Mon Apr 3 23:26:08 EDT 2006
If I remember correctly, the HTML you see using that method is IE's
version of the HTML, not what it actually received. So if you right-click
and choose View Source, and then in Watir do a
ie.document.body.parentelement.outerhtml
you are likely to see different things.
I don't think there was a way to find the 'raw' HTML.
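One possible workaround, offered only as a sketch: request the page a second time from Ruby and validate that response instead of IE's serialized DOM. The helper names with_scheme and fetch_raw_html are made up for illustration, and the approach assumes the page can be fetched again without the browser's cookies or session.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: add a scheme when the address is a bare host,
# such as the 'validator.w3.org/' form that ie.goto accepts.
def with_scheme(address)
  address =~ %r{\A[a-z][a-z0-9+.\-]*://}i ? address : "http://#{address}"
end

# Hypothetical helper: fetch the page again and return the body as the
# server sent it. This is a separate request, so it carries no browser
# cookies or session state, and it does not follow redirects.
def fetch_raw_html(address)
  Net::HTTP.get(URI.parse(with_scheme(address)))
end

# Possible use inside a Watir test, in place of the outerhtml call:
#   html = fetch_raw_html(ie.url)
```

This would not help for pages behind a login or pages built by script after loading, since the second request may return something other than what IE is showing.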
Paul
-----Original Message-----
From: wtr-general-bounces at rubyforge.org
[mailto:wtr-general-bounces at rubyforge.org] On Behalf Of Jørgen Bang Erichsen
Sent: 03 April 2006 15:01
To: wtr-general at rubyforge.org
Subject: [Wtr-general] W3C validation problems
Hi,
Inspired by
http://redgreenblu.com/svn/projects/assert_valid_markup/lib/assert_valid_markup.rb
I would like to have an easy way to validate the html on the page IE is
currently showing. Unfortunately, I have a problem with the html that
ie.document.body.parentelement.outerhtml outputs :-(
Take a look at the following example:
require 'test/unit'
require 'watir'
require 'net/http'
require 'cgi'
require 'xmlsimple'

class ValidationExample < Test::Unit::TestCase
  include Watir

  def test_w3c_validate
    ie = IE.new
    ie.goto 'validator.w3.org/'
    html = ie.document.body.parentelement.outerhtml
    response = Net::HTTP.start('validator.w3.org').post2('/check',
      "fragment=#{CGI.escape(html)}&output=xml")
    markup_is_valid = response['x-w3c-validator-status'] == 'Valid'
    message = markup_is_valid ? '' :
      XmlSimple.xml_in(response.body)['messages'][0]['msg'].collect { |m|
        "Invalid markup: line #{m['line']}: #{CGI.unescapeHTML(m['content'])}"
      }.join("\n")
    assert markup_is_valid, message
    ie.close
  end
end
When I run the example I get stuff like:
Invalid markup: line 1: no document type declaration; implying "<!DOCTYPE HTML SYSTEM>"
Invalid markup: line 1: there is no attribute "XML:LANG"
Invalid markup: line 1: there is no attribute "XMLNS"
The html returned by ie.document.body.parentelement.outerhtml is
<HTML lang=en xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<HEAD>
<TITLE>The W3C Markup Validation Service</TITLE>
<LINK rev=made href="mailto:www-validator at w3.org">
<LINK title="Home Page" rev=start href="./">
but if I view the source from IE itself it is something like
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>The W3C Markup Validation Service</title>
<link rev="made" href="mailto:www-validator at w3.org" />
<link rev="start" href="./" title="Home Page" />
...
The DOCTYPE line and several quotes are missing. Is there any way to get
the unmodified html for the current page?
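To make the difference concrete, a rough check along these lines could flag both symptoms in a given string of markup. The names doctype? and unquoted_attributes are hypothetical, and the regular expressions are approximations rather than a real HTML parser.

```ruby
# Hypothetical helper: true when the markup contains a DOCTYPE declaration.
def doctype?(html)
  !(html =~ /<!DOCTYPE\s/i).nil?
end

# Hypothetical helper: names of attributes whose values are written
# without quotes, for example lang=en. Rough: an equals sign inside a
# quoted value that follows whitespace can produce a false positive.
def unquoted_attributes(html)
  html.scan(/<[^>]*>/).flat_map do |tag|
    tag.scan(/\s([\w:-]+)=[^"'\s>]/).flatten
  end
end
```

Run against the two samples above, the outerhtml version would be expected to report no DOCTYPE and unquoted lang and rev attributes, while the View Source version would report neither.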
If people are doing automatic validation any other way I am open to
suggestions.
Best regards,
Jørgen