[Wtr-general] get the meta tag out of a page
Jeff Wood
jeff at dark-light.com
Sun Aug 27 06:52:52 EDT 2006
Alien Ruby wrote:
> Thank you so much, Jeff.
>
> This is the key, right here - ie.document.getElementsByTagName( "meta" ).item(i).outerHTML
>
> I couldn't figure out which function to use by myself, but now it works very well.
>
> But my other question is: why can't '...outerHTML.getContents' or '...outerHTML.getText' or some similar function just spit out the value/content?
>
>
> # I leave it to you to come up with good regular expressions for parsing
> # things out.
>
> # these work but are overly greedy.
> # String#[] with a capture index returns the first group, or nil on no match.
> name = body[/name="(.+)"/, 1] || ""
> content = body[/content="(.+)"/, 1] || ""
> http_equiv = body[/http-equiv="{0,1}(.+)"{0,1}/, 1] || ""
>
> http_equivs[http_equiv] = content if http_equiv != ""
> metas[name] = content if name != ""
> ---------------------------------------------------------------------
> Posted via Jive Forums
> http://forums.openqa.org/thread.jspa?threadID=3625&messageID=10467#10467
>
Well, you could always write a small function specifically to parse
meta tags ... then you would use it like this:

meta = parseMeta( ie.document.getElementsByTagName( "meta" ).item(i).outerHTML )
meta[:content]
meta[:"content-type"]
... etc.

You get the idea ... it just requires a little ingenuity.
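
For what it's worth, here is a minimal sketch of what such a helper might look like. It is named parse_meta here, in Ruby's usual snake_case, and it is purely hypothetical: nothing like it ships with Watir. It assumes the argument is the outerHTML string of a single meta tag, and that attribute values may be double-quoted, single-quoted, or unquoted (IE tends to drop the quotes in outerHTML). Keys are the lower-cased attribute names as symbols.

```ruby
# Hypothetical helper (not part of Watir): turn the outerHTML of one
# <meta> tag into a Hash keyed by lower-cased attribute name.
def parse_meta(outer_html)
  attrs = {}
  # Matches attr="value", attr='value', or attr=value.
  pattern = /([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/
  outer_html.scan(pattern) do |name, double_quoted, single_quoted, bare|
    # Exactly one of the three value groups is non-nil for each match.
    attrs[name.downcase.to_sym] = double_quoted || single_quoted || bare
  end
  attrs
end

# Usage on a literal string, in the shape IE typically reports it:
meta = parse_meta('<META content="text/html; charset=utf-8" http-equiv=Content-Type>')
puts meta[:content]
puts meta[:"http-equiv"]
```

Inside a Watir script you would pass it ie.document.getElementsByTagName( "meta" ).item(i).outerHTML instead of the literal string. Note that a symbol containing a hyphen has to be written with quotes, as in :"http-equiv".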
jd