Regular expression to match balanced parentheses

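I need a regular expression to select all the text between two outer parentheses.

Example: START_TEXT(text here(possible text)text(possible text(more text)))END_TXT

Result: (text here(possible text)text(possible text(more text)))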

Current answer

[^\(]*(\(.*\))[^\)]*

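[^\(]* matches everything at the beginning of the string that is not an opening parenthesis, (\(.*\)) captures the required substring enclosed in parentheses, and [^\)]* matches everything at the end of the string that is not a closing parenthesis. Note that this expression does not try to balance the parentheses: it simply captures from the first opening parenthesis to the last closing one. A simple parser (see dehmann's answer) would be better suited for real balancing.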
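To make the limitation concrete, here is a minimal Python sketch that applies the pattern above with the standard `re` module. The helper name `outer_span` is made up for this illustration. The second call shows that two separate groups are merged into one capture, because the greedy `.*` runs to the last closing parenthesis.

```python
import re

# The pattern from the answer above: greedy, with no notion of nesting depth.
PATTERN = re.compile(r"[^\(]*(\(.*\))[^\)]*")

def outer_span(text):
    """Return the text from the first '(' to the last ')', or None."""
    m = PATTERN.match(text)
    return m.group(1) if m else None

# A single outer group: the capture is what the question asks for.
print(outer_span("START_TEXT(text here(possible text)text(possible text(more text)))END_TXT"))

# Two sibling groups: the capture spans both, so it is not a balanced match.
print(outer_span("f(a) + g(b)"))
```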

2009-02-13 15:51:55

Other answers

You can use regex recursion (supported by PCRE and Perl, among others):

\(([^()]|(?R))*\)
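Python's built-in `re` module does not support `(?R)`. One workaround, sketched below under the assumption that a fixed maximum nesting depth is acceptable, is to expand the recursion by hand a set number of times. The function name `nested_paren_pattern` and the depth of 5 are choices made for this illustration, not part of the answer above.

```python
import re

def nested_paren_pattern(max_depth):
    """Build a pattern for parentheses nested at most max_depth levels deep."""
    pattern = r"\([^()]*\)"  # depth 1: a group with no inner parentheses
    for _ in range(max_depth - 1):
        # Each pass allows one more level: plain characters or a complete
        # group of the previous depth, repeated, inside a new pair.
        pattern = r"\((?:[^()]|" + pattern + r")*\)"
    return re.compile(pattern)

balanced = nested_paren_pattern(5)
text = "START_TEXT(text here(possible text)text(possible text(more text)))END_TXT"
print(balanced.search(text).group(0))
```

If the input nests deeper than `max_depth`, the outer group no longer matches and the search falls through to an inner group, so this is an approximation of true recursion rather than a replacement for it.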

2013-11-08 16:22:24

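In addition to bobble bubble's answer, there are other regex flavors that support recursive constructs.

Lua

Use %b() (or %b{} / %b[] for curly braces / square brackets):

for s in string.gmatch("Extract (a(b)c) and ((d)f(g))", "%b()") do print(s) end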

Raku (formerly Perl 6):

Multiple non-overlapping balanced parentheses matches:

my regex paren_any { '(' ~ ')' [ <-[()]>+ || <&paren_any> ]* }
say "Extract (a(b)c) and ((d)f(g))" ~~ m:g/<&paren_any>/;
# => (｢(a(b)c)｣ ｢((d)f(g))｣)

Multiple overlapping balanced parentheses matches:

say "Extract (a(b)c) and ((d)f(g))" ~~ m:ov:g/<&paren_any>/;
# => (｢(a(b)c)｣ ｢(b)｣ ｢((d)f(g))｣ ｢(d)｣ ｢(g)｣)


Python non-regex solution

See poke's answer to "How to get an expression between balanced parentheses".
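For readers without access to that answer, a non-regex approach can be sketched with a simple depth counter. This is not poke's code; the function name `balanced_substrings` and its parameters are assumptions made for this example.

```python
def balanced_substrings(text, open_ch="(", close_ch=")"):
    """Return the outermost balanced open_ch...close_ch groups found in text."""
    groups = []
    depth = 0    # current nesting depth
    start = -1   # index where the current outermost group began
    for i, ch in enumerate(text):
        if ch == open_ch:
            if depth == 0:
                start = i
            depth += 1
        elif ch == close_ch and depth > 0:  # stray closers are ignored
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])
    return groups

print(balanced_substrings("Extract (a(b)c) and ((d)f(g))"))
# ['(a(b)c)', '((d)f(g))']
```

A group that is opened but never closed is dropped rather than reported, which keeps the result limited to balanced substrings.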

Java customizable non-regex solution

Here is a customizable solution for Java that works with single-character delimiters:

import java.util.ArrayList;
import java.util.List;

public class BalancedSubstrings {
    // Collects every top-level substring enclosed by markStart ... markEnd.
    public static List<String> getBalancedSubstrings(String s, char markStart,
                                                     char markEnd, boolean includeMarkers) {
        List<String> subTreeList = new ArrayList<>();
        int level = 0;               // current nesting depth
        int lastOpenDelimiter = -1;  // start index of the current top-level group
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == markStart) {
                level++;
                if (level == 1) {
                    lastOpenDelimiter = (includeMarkers ? i : i + 1);
                }
            } else if (c == markEnd) {
                if (level == 1) {
                    subTreeList.add(s.substring(lastOpenDelimiter, (includeMarkers ? i + 1 : i)));
                }
                if (level > 0) level--;  // ignore unmatched closing delimiters
            }
        }
        return subTreeList;
    }
}

Sample usage:

String s = "some text(text here(possible text)text(possible text(more text)))end text";
List<String> balanced = getBalancedSubstrings(s, '(', ')', true);
System.out.println("Balanced substrings:\n" + balanced);
// => [(text here(possible text)text(possible text(more text)))]

2016-05-13 10:40:20

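This does not fully address the OP's question, but I think it may be useful to people who come here looking for a regexp that handles nested structures:

Parse params from a function string (with nested structures) in JavaScript

The regexp matches curly braces, square brackets, parentheses, and single and double quotes.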

/**
 * get param content of function string.
 * only params string should be provided without parentheses
 * WORK even if some/all params are not set
 * @return [param1, param2, param3]
 */
exports.getParamsSAFE = (str, nbParams = 3) => {
    const nextParamReg = /^\s*((?:(?:['"([{](?:[^'"()[\]{}]*?|['"([{](?:[^'"()[\]{}]*?|['"([{][^'"()[\]{}]*?['")}\]])*?['")}\]])*?['")}\]])|[^,])*?)\s*(?:,|$)/;
    const params = [];
    while (str.length) { // this is to avoid a BIG performance issue in javascript regexp engine
        str = str.replace(nextParamReg, (full, p1) => {
            params.push(p1);
            return '';
        });
    }
    return params;
};

2019-06-02 13:58:06

Using Ruby (1.9.3 or later) regex:

/(?<match>\((?:\g<match>|[^()]++)*\))/


2013-08-21 08:38:47

This is the definitive regex:

\(
(?<arguments> 
(  
  ([^\(\)']*) |  
  (\([^\(\)']*\)) |
  '(.*?)'

)*
)
\)

Example:

input: ( arg1, arg2, arg3, (arg4), '(pip' )

output: arg1, arg2, arg3, (arg4), '(pip'

Note that '(pip' is correctly handled as a string. (Tried in Regulator: http://sourceforge.net/projects/regulator/)
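The pattern above uses the `(?<arguments>...)` named-group syntax, which Python's `re` module spells `(?P<arguments>...)`. The sketch below is an adaptation for Python rather than the answer's exact regex: the plain-text alternative matches one character at a time to keep backtracking predictable, the inner groups are non-capturing, and the constant name `ARGS` is made up for this example. Like the original, it handles only one nested level of parentheses.

```python
import re

ARGS = re.compile(
    r"""
    \(
    (?P<arguments>
      (?:
          [^()']           # a plain character: no parenthesis, no quote
        | \([^()']*\)      # one nested level of parentheses
        | '.*?'            # a single-quoted string, which may contain parentheses
      )*
    )
    \)
    """,
    re.VERBOSE,
)

m = ARGS.search("( arg1, arg2, arg3, (arg4), '(pip' )")
print(m.group("arguments").strip())
# arg1, arg2, arg3, (arg4), '(pip'
```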

2012-05-15 07:53:26
