[sup-talk] [PATCH] Unwrap broken URLs.

William Morgan wmorgan-sup at masanjin.net
Thu Feb 28 12:29:49 EST 2008


I would love to have a feature like this in Sup. This patch is still
over-aggressive in some cases. What I would really like to see as a
starting point is a corpus of broken URL examples that we can build
unit tests from. Then we can tweak these regexes until we get something
that has both high precision and high recall.
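
A minimal sketch of what such a corpus-driven check could look like.
Everything here is an assumption made for illustration: the two corpus
entries, the helper names unwrap_urls and precision_recall, and the
deliberately naive regex, which is not the one from the patch.

```ruby
# Hypothetical corpus: each entry pairs a wrapped message body with the
# URLs a reader would expect to recover from it.
CORPUS = [
  ["see http://example.com/a/very/long/\npath/to/page.html for details",
   ["http://example.com/a/very/long/path/to/page.html"]],
  ["see http://example.com/short\nand then reply",
   ["http://example.com/short"]],
]

# Deliberately naive candidate (an assumption, not the patch's regex):
# glue a URL to the first word of the next line whenever the URL runs
# up to a line break, then collect every URL in the result.
def unwrap_urls(text)
  joined = text.gsub(%r{(https?://\S+)\n(\S+)}) { "#{$1}#{$2}" }
  joined.scan(%r{https?://\S+})
end

# Score a candidate against the corpus by counting exact URL matches.
def precision_recall(corpus)
  tp = fp = fn = 0
  corpus.each do |text, expected|
    found = unwrap_urls(text)
    tp += (found & expected).size   # recovered correctly
    fp += (found - expected).size   # reported but wrong
    fn += (expected - found).size   # expected but missed
  end
  [tp.to_f / (tp + fp), tp.to_f / (tp + fn)]
end

precision, recall = precision_recall(CORPUS)
puts "precision=#{precision} recall=#{recall}"
```

The second entry is the over-aggressive case: the naive regex glues
"and" onto a URL that was never wrapped, so that entry counts as both a
false positive and a miss. Each new broken-URL report would become one
more corpus entry.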

Also, have you looked at URI.regexp? I think that can do a lot of the
dirty work.
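
A rough sketch of leaning on the standard library's URI pattern rather
than a hand-written one. Newer Ruby versions flag URI.regexp as
obsolete, so this builds the same RFC 2396 pattern through
URI::RFC2396_Parser#make_regexp. The find_urls helper and the sample
text are hypothetical.

```ruby
require "uri"

# Same pattern URI.regexp(%w[http https]) has historically returned,
# restricted to the schemes we care about.
URL_RE = URI::RFC2396_Parser.new.make_regexp(%w[http https])

# Return every full match in the text. The pattern has many capture
# groups, so take the whole match rather than scan's group arrays.
def find_urls(text)
  urls = []
  text.scan(URL_RE) { urls << Regexp.last_match[0] }
  urls
end

wrapped = "see http://example.com/a/very/long/\npath/to/page.html"
puts find_urls(wrapped).inspect
```

A line break is not a legal URI character, so the match stops at the
wrap point. That gives a reliable "the URL ends here" signal, and the
unwrapping logic then only has to decide whether the next line is a
continuation.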

-- 
William <wmorgan-sup at masanjin.net>

