How do I access named capturing groups in a .NET Regex?

I'm having a hard time finding a good resource that explains how to use named capturing groups in C#. This is the code I have so far:

string page = Encoding.ASCII.GetString(bytePage);
Regex qariRegex = new Regex("<td><a href=\"(?<link>.*?)\">(?<name>.*?)</a></td>");
MatchCollection mc = qariRegex.Matches(page);
CaptureCollection cc = mc[0].Captures;
MessageBox.Show(cc[0].ToString());

However, this always shows just the whole line:

<td><a href="/path/to/file">Name of File</a></td>

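I have tried several other "methods" that I found on various websites, but I keep getting the same result.

How can I access the named capturing groups that are specified in my regex?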

Current answer

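This answer improves on Pandit's answer, which is in a way better than the rest because it seems to completely resolve the exact problem detailed in the question.

The bad part is that it is inefficient and does not use the IgnoreCase option consistently.

It is inefficient because a regex can be expensive to construct and execute, and in that answer it could have been constructed just once (calling Regex.IsMatch just constructs the regex again behind the scenes). The Match method could also have been called only once and its result stored in a variable, and then link and name should call Result on that variable.

The IgnoreCase option was used only in the Match part, not in the Regex.IsMatch part.

I also moved the Regex definition outside the method so that it is constructed just once (I think this is the sensible approach if we are compiling it with the RegexOptions.Compiled option).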

private static Regex hrefRegex = new Regex("<td>\\s*<a\\s*href\\s*=\\s*(?:\"(?<link>[^\"]*)\"|(?<link>\\S+))\\s*>(?<name>.*)\\s*</a>\\s*</td>",  RegexOptions.IgnoreCase | RegexOptions.Compiled);

public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
{
    var matches = hrefRegex.Match(htmlTd);
    if (matches.Success)
    {
        link = matches.Result("${link}");
        name = matches.Result("${name}");
        return true;
    }
    else
    {
        link = null;
        name = null;
        return false;
    }
}
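
As a hedged aside, the same values can also be read through the Groups indexer of the Match instead of calling Result. The sketch below reuses the pattern from this answer; the class name HrefParser and the sample row are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefParser
{
    // Built once and reused; same pattern as in the answer above.
    private static readonly Regex HrefRegex = new Regex(
        "<td>\\s*<a\\s*href\\s*=\\s*(?:\"(?<link>[^\"]*)\"|(?<link>\\S+))\\s*>(?<name>.*)\\s*</a>\\s*</td>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
    {
        // Run the regex exactly once and read both groups from the same Match.
        Match match = HrefRegex.Match(htmlTd);
        link = match.Success ? match.Groups["link"].Value : null;
        name = match.Success ? match.Groups["name"].Value : null;
        return match.Success;
    }

    public static void Main()
    {
        // Hypothetical input; upper-case tags exercise RegexOptions.IgnoreCase.
        string row = "<TD><A HREF=\"/path/to/file\">Name of File</A></TD>";
        if (TryGetHrefDetails(row, out string link, out string name))
        {
            Console.WriteLine(link + " | " + name);  // prints "/path/to/file | Name of File"
        }
    }
}
```

Whether to use Result("${link}") or Groups["link"].Value is largely a matter of taste here; the indexer avoids parsing a replacement pattern for each lookup.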

2019-10-29 18:20:57

Other answers

Additionally, if someone has a use case where they need the group names before running a search with the Regex object, they can use:

var regex = new Regex(pattern); // initialized somewhere
// ...
var groupNames = regex.GetGroupNames();
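
To make the returned array concrete, here is a minimal sketch that applies GetGroupNames to the pattern from the question. The class name GroupNamesDemo and the helper NamesOf are hypothetical names introduced for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupNamesDemo
{
    // Returns the group names a pattern declares, without running a match.
    public static string[] NamesOf(string pattern)
    {
        return new Regex(pattern).GetGroupNames();
    }

    public static void Main()
    {
        string pattern = "<td><a href=\"(?<link>.*?)\">(?<name>.*?)</a></td>";
        // Group "0" is always present and stands for the whole match.
        Console.WriteLine(string.Join(", ", NamesOf(pattern)));  // prints "0, link, name"
    }
}
```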

2017-07-28 10:24:26

The following code sample matches the pattern even when there are whitespace characters in between. For example:

<td><a href='/path/to/file'>Name of File</a></td>

as well as:

<td> <a      href='/path/to/file' >Name of File</a>  </td>

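The method returns true or false depending on whether the input htmlTd string matches the pattern. If it matches, the out parameters contain the link and the name, respectively.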

/// <summary>
/// Assigns proper values to link and name, if the htmlTd matches the pattern
/// </summary>
/// <returns>true if success, false otherwise</returns>
public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
{
    link = null;
    name = null;

    string pattern = "<td>\\s*<a\\s*href\\s*=\\s*(?:\"(?<link>[^\"]*)\"|(?<link>\\S+))\\s*>(?<name>.*)\\s*</a>\\s*</td>";

    if (Regex.IsMatch(htmlTd, pattern))
    {
        Regex r = new Regex(pattern,  RegexOptions.IgnoreCase | RegexOptions.Compiled);
        link = r.Match(htmlTd).Result("${link}");
        name = r.Match(htmlTd).Result("${name}");
        return true;
    }
    else
        return false;
}

I have tested this and it works correctly.

2009-05-25 14:01:39

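You specify the named capture group by passing its name as a string to the indexer of the Groups property of the resulting Match object.

Here is a small example: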

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        String sample = "hello-world-";
        Regex regex = new Regex("-(?<test>[^-]*)-");

        Match match = regex.Match(sample);

        if (match.Success)
        {
            Console.WriteLine(match.Groups["test"].Value);
        }
    }
}

2009-05-25 12:18:09

Use the Groups collection of the Match object, indexing it with the capturing group name.

foreach (Match m in mc){
    MessageBox.Show(m.Groups["link"].Value);
}
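
To tie this back to the code in the question: mc[0].Captures holds the captures of the match as a whole, which is why the entire row was displayed. A minimal sketch of the loop above follows, assuming a console program instead of MessageBox and a hypothetical two-row page; the class name LinkExtractor and the helper Extract are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Same pattern as in the question.
    private static readonly Regex QariRegex =
        new Regex("<td><a href=\"(?<link>.*?)\">(?<name>.*?)</a></td>");

    // Collects (link, name) pairs for every row that matches.
    public static List<KeyValuePair<string, string>> Extract(string page)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (Match m in QariRegex.Matches(page))
        {
            results.Add(new KeyValuePair<string, string>(
                m.Groups["link"].Value, m.Groups["name"].Value));
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical sample input with two table cells.
        string page = "<td><a href=\"/a.mp3\">First</a></td>" +
                      "<td><a href=\"/b.mp3\">Second</a></td>";
        foreach (var pair in Extract(page))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
        // prints:
        // /a.mp3 -> First
        // /b.mp3 -> Second
    }
}
```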

2009-05-25 12:18:00
