如何检查给定的字符串是否是有效的URL地址?

我对正则表达式的知识是基本的,不允许我从我已经在网上看到的数百个正则表达式中进行选择。


当前回答

下面的表达式将适用于所有流行域。它将接受以下网址:

www.yourwebsite.com http://www.yourwebsite.com www.yourwebsite.com yourwebsite.com yourwebsite.co.in

此外,它将使消息与url作为链接也 例如,请访问你的网站 在上面的例子中,它将使yourwebsite.com作为超链接

if (new RegExp("([-a-z0-9]{1,63}\\.)*?[a-z0-9][-a-z0-9]{0,61}[a-z0-9]\\.(com|com/|org|gov|cm|net|online|live|biz|us|uk|co.us|co.uk|in|co.in|int|info|edu|mil|ca|co|co.au|org/|gov/|cm/|net/|online/|live/|biz/|us/|uk/|co.us/|co.uk/|in/|co.in/|int/|info/|edu/|mil/|ca/|co/|co.au/)(/[-\\w@\\+\\.~#\\?*&/=% ]*)?$").test(strMessage) || (new RegExp("^[a-z ]+[\.]?[a-z ]+?[\.]+[a-z ]+?[\.]+[a-z ]+?[-\\w@\\+\\.~#\\?*&/=% ]*").test(strMessage) && new RegExp("([a-zA-Z0-9]+://)?([a-zA-Z0-9_]+:[a-zA-Z0-9_]+@)?([a-zA-Z0-9.-]+\\.[A-Za-z]{2,4})(:[0-9]+)?(/.*)?").test(strMessage)) || (new RegExp("^[a-z ]+[\.]?[a-z ]+?[-\\w@\\+\\.~#\\?*&/=% ]*").test(strMessage) && new RegExp("([a-zA-Z0-9]+://)?([a-zA-Z0-9_]+:[a-zA-Z0-9_]+@)?([a-zA-Z0-9.-]+\\.[A-Za-z]{2,4})(:[0-9]+)?(/.*)?").test(strMessage))) {
  if (new RegExp("^[a-z ]+[\.]?[a-z ]+?[\.]+[a-z ]+?[\.]+[a-z ]+?$").test(strMessage) && new RegExp("([a-zA-Z0-9]+://)?([a-zA-Z0-9_]+:[a-zA-Z0-9_]+@)?([a-zA-Z0-9.-]+\\.[A-Za-z]{2,4})(:[0-9]+)?(/.*)?").test(strMessage)) {
    var url1 = /(^|<|\s)([\w\.]+\.(?:com|org|gov|cm|net|online|live|biz|us|uk|co.us|co.uk|in|co.in|int|info|edu|mil|ca|co|co.au))(\s|>|$)/g;
    var html = $.trim(strMessage);
    if (html) {
      html = html.replace(url1, '$1<a style="color:blue; text-decoration:underline;" target="_blank"  href="http://$2">$2</a>$3');
    }
    returnString = html;
    return returnString;
  } else {
    var url1 = /(^|&lt;|\s)(www\..+?\.(?:com|org|gov|cm|net|online|live|biz|us|uk|co.us|co.uk|in|co.in|int|info|edu|mil|ca|co|co.au)[^,\s]*)(\s|&gt;|$)/g,
      url2 = /(^|&lt;|\s)(((https?|ftp):\/\/|mailto:).+?\.(?:com|org|gov|cm|net|online|live|biz|us|uk|co.us|co.uk|in|co.in|int|info|edu|mil|ca|co|co.au)[^,\s]*)(\s|&gt;|$)/g,
      url3 = /(^|&lt;|\s)([\w\.]+\.(?:com|org|gov|cm|net|online|live|biz|us|uk|co.us|co.uk|in|co.in|int|info|edu|mil|ca|co|co.au)[^,\s]*)(\s|&gt;|$)/g;

    var html = $.trim(strMessage);
    if (html) {
      html = html.replace(url1, '$1<a style="color:blue; text-decoration:underline;" target="_blank"  href="http://$2">$2</a>$3').replace(url2, '$1<a style="color:blue; text-decoration:underline;" target="_blank"  href="$2">$2</a>$5').replace(url3, '$1<a style="color:blue; text-decoration:underline;" target="_blank"  href="http://$2">$2</a>$3');
    }
    returnString = html;

    return returnString;
  }
}

其他回答

function validateURL(textval) {
            var urlregex = new RegExp(
            "^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&amp;%\$#\=~])*$");
            return urlregex.test(textval);
        }

匹配 http://www.asdah.com/~joe | ftp://ftp.asdah.co.uk:2828/asdah%20asdah.gif | https://asdah.gov/asdh-ah.as

对于Python,这是Django 1.5.1中使用的验证正则表达式的实际URL:

import re
regex = re.compile(
        r'^(?:http|ftp)s?://'  # http:// or https://
        r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|'  # domain...
        r'localhost|'  # localhost...
        r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|'  # ...or ipv4
        r'\[?[A-F0-9]*:[A-F0-9:]+\]?)'  # ...or ipv6
        r'(?::\d+)?'  # optional port
        r'(?:/?|[/?]\S+)$', re.IGNORECASE)

这既处理ipv4和ipv6地址,也处理端口和GET参数。

在代码44行中找到。

我无法找到我正在寻找的正则表达式,所以我修改了一个正则表达式来满足我的要求,显然现在它似乎工作得很好。我的要求是:

匹配带有协议的url (www.gooogle.com) 使用查询参数和路径匹配url (http://subdomain.web-site.com/cgi-bin/perl.cgi?key1=value1&key2=value2e) 不要匹配有不可接受字符的url(例如。' '£),例如:(www.google.com/somthing"/somethingmore)

以下是我的想法,任何建议都很感激:

@Test
    public void testWebsiteUrl(){
        String regularExpression = "((http|ftp|https):\\/\\/)?[\\w\\-_]+(\\.[\\w\\-_]+)+([\\w\\-\\.,@?^=%&amp;:/~\\+#]*[\\w\\-\\@?^=%&amp;/~\\+#])?";

        assertTrue("www.google.com".matches(regularExpression));
        assertTrue("www.google.co.uk".matches(regularExpression));
        assertTrue("http://www.google.com".matches(regularExpression));
        assertTrue("http://www.google.co.uk".matches(regularExpression));
        assertTrue("https://www.google.com".matches(regularExpression));
        assertTrue("https://www.google.co.uk".matches(regularExpression));
        assertTrue("google.com".matches(regularExpression));
        assertTrue("google.co.uk".matches(regularExpression));
        assertTrue("google.mu".matches(regularExpression));
        assertTrue("mes.intnet.mu".matches(regularExpression));
        assertTrue("cse.uom.ac.mu".matches(regularExpression));

        assertTrue("http://www.google.com/path".matches(regularExpression));
        assertTrue("http://subdomain.web-site.com/cgi-bin/perl.cgi?key1=value1&key2=value2e".matches(regularExpression));
        assertTrue("http://www.google.com/?queryparam=123".matches(regularExpression));
        assertTrue("http://www.google.com/path?queryparam=123".matches(regularExpression));

        assertFalse("www..dr.google".matches(regularExpression));

        assertFalse("www:google.com".matches(regularExpression));

        assertFalse("https://www@.google.com".matches(regularExpression));

        assertFalse("https://www.google.com\"".matches(regularExpression));
        assertFalse("https://www.google.com'".matches(regularExpression));

        assertFalse("http://www.google.com/path'".matches(regularExpression));
        assertFalse("http://subdomain.web-site.com/cgi-bin/perl.cgi?key1=value1&key2=value2e'".matches(regularExpression));
        assertFalse("http://www.google.com/?queryparam=123'".matches(regularExpression));
        assertFalse("http://www.google.com/path?queryparam=12'3".matches(regularExpression));

    }

改进的

检测像这样的url:

https://www.example.pl http://www.example.com www.example.pl example.com http://blog.example.com http://www.example.com/product http://www.example.com/products?id=1&page=2 http://www.example.com#up http://255.255.255.255 255.255.255.255 http:// www.site.com: 8008

正则表达式:

/^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$/gm

有趣的是,上面的答案都不能满足我的需要,所以我想我可以提供我的解决方案。我需要做到以下几点:

匹配http(s)://www.google.com, http://google.com, www.google.com和google.com 匹配Github降价风格的链接,如[谷歌](http://www.google.com) 匹配所有可能的域名扩展名,比如。com,或。io,或。guru等。基本上长度在2-6个字符之间 将所有内容分成适当的组,以便我可以根据需要访问每个部分。

解决办法是这样的:

/^(\[[A-z0-9 _]*\]\()?((?:(http|https):\/\/)?(?:[\w-]+\.)+[a-z]{2,6})(\))?$

这就满足了上述所有要求。如果需要,你可以选择添加ftp和file功能:

/^(\[[A-z0-9 _]*\]\()?((?:(http|https|ftp|file):\/\/)?(?:[\w-]+\.)+[a-z]{2,6})(\))?$