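I want to use the .replace function to replace multiple strings.

I currently have

string.replace("condition1", "")

but would like something like

string.replace("condition1", "").replace("condition2", "text")

although that does not feel like good syntax.

What is the proper way to do this? Something like how in grep/regex you can use \1 and \2 to replace fields of certain search strings.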


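Current answer

Here is my $0.02. It is based on Andrew Clark's answer, just a little clearer, and it also covers the case where one string to replace is a substring of another string to replace (the longer string wins).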

import re

def multireplace(string, replacements):
    """
    Given a string and a replacement map, it returns the replaced string.

    :param str string: string to execute replacements on
    :param dict replacements: replacement dictionary {value to find: value to replace}
    :rtype: str

    """
    # Place longer ones first to keep shorter substrings from matching
    # where the longer ones should take place
    # For instance given the replacements {'ab': 'AB', 'abc': 'ABC'} against 
    # the string 'hey abc', it should produce 'hey ABC' and not 'hey ABc'
    substrs = sorted(replacements, key=len, reverse=True)

    # Create a big OR regex that matches any of the substrings to replace
    regexp = re.compile('|'.join(map(re.escape, substrs)))

    # For each match, look up the new string in the replacements
    return regexp.sub(lambda match: replacements[match.group(0)], string)

This is also available as a gist; feel free to modify it if you have any suggestions.
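For what it's worth, here is a minimal sketch of how that function might be exercised. It repeats the definition so the block runs on its own; the sample inputs and the early return for an empty map are my own additions, not part of the answer. It shows the longer key winning, and that the string is scanned in a single pass, so replaced text is never replaced again (unlike chained .replace calls).

```python
import re

def multireplace(string, replacements):
    # Assumption: return early on an empty map, because an empty
    # pattern would match at every position and the lookup would fail
    if not replacements:
        return string
    # Longest keys first, then one alternation regex, as in the answer
    substrs = sorted(replacements, key=len, reverse=True)
    regexp = re.compile('|'.join(map(re.escape, substrs)))
    return regexp.sub(lambda match: replacements[match.group(0)], string)

# The longer key wins over its own prefix
print(multireplace('hey abc', {'ab': 'AB', 'abc': 'ABC'}))  # hey ABC

# Single pass: swapping two tokens works as expected
print(multireplace('a b', {'a': 'b', 'b': 'a'}))  # b a

# Chained calls see earlier output, so the swap collapses
print('a b'.replace('a', 'b').replace('b', 'a'))  # a a
```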

Other answers

I feel this question needs a single-line recursive lambda function answer for completeness, just because. So there:

>>> mrep = lambda s, d: s if not d else mrep(s.replace(*d.popitem()), d)

Usage:

>>> mrep('abcabc', {'a': '1', 'c': '2'})
'1b21b2'

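Notes:

This consumes the input dictionary. Python dicts preserve key insertion order as of 3.6 (an implementation detail in CPython 3.6, guaranteed by the language from 3.7); the corresponding caveats in other answers are no longer relevant. For backward compatibility, one could resort to a tuple-based version: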

>>> mrep = lambda s, d: s if not d else mrep(s.replace(*d.pop()), d)
>>> mrep('abcabc', [('a', '1'), ('c', '2')])

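Note: as with all recursive functions in Python, too large a recursion depth (i.e. too large a replacement dictionary) will result in an error (RecursionError).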
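If the recursion limit or the consumed dictionary is a concern, a non-recursive variant is easy to sketch with functools.reduce. The name mrep_iter is made up for this example. Note that it applies the pairs in the order given, whereas the popitem version above works from the last item backwards, so results can differ when replacements interact.

```python
from functools import reduce

def mrep_iter(s, pairs):
    # Apply each (old, new) pair in turn; no recursion is involved
    # and the iterable of pairs is only read, never modified
    return reduce(lambda acc, pair: acc.replace(*pair), pairs, s)

d = {'a': '1', 'c': '2'}
print(mrep_iter('abcabc', d.items()))  # 1b21b2
print(d)  # {'a': '1', 'c': '2'}, the dict is left intact

# A list of tuples works the same way
print(mrep_iter('abcabc', [('a', '1'), ('c', '2')]))  # 1b21b2
```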

Note: test this for your own case; see the comments.

Here is a sample that is more efficient on long strings with many small replacements.

import re

source = "Here is foo, it does moo!"

replacements = {
    'is': 'was', # replace 'is' with 'was'
    'does': 'did',
    '!': '?'
}

def replace(source, replacements):
    finder = re.compile("|".join(re.escape(k) for k in replacements.keys())) # matches every string we want replaced
    result = []
    pos = 0
    while True:
        match = finder.search(source, pos)
        if match:
            # cut off the part up until match
            result.append(source[pos : match.start()])
            # cut off the matched part and replace it in place
            result.append(replacements[source[match.start() : match.end()]])
            pos = match.end()
        else:
            # the rest after the last match
            result.append(source[pos:])
            break
    return "".join(result)

print(replace(source, replacements))

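The point is to avoid many concatenations of long strings. We chop the source string into fragments, replacing some of the fragments as we build the list, and then join the whole thing back into a string.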
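The same slice-and-join idea can be sketched a bit more compactly with finditer instead of a manual search loop. This is only an illustration of the technique; the function name replace_with_finditer and the early return for an empty map are my own assumptions.

```python
import re

def replace_with_finditer(source, replacements):
    # Assumption: nothing to do for an empty map (an empty pattern
    # would otherwise match at every position)
    if not replacements:
        return source
    finder = re.compile("|".join(re.escape(k) for k in replacements))
    parts = []
    pos = 0
    for match in finder.finditer(source):
        # untouched text between the previous match and this one
        parts.append(source[pos:match.start()])
        # the replacement for the matched key
        parts.append(replacements[match.group(0)])
        pos = match.end()
    # the rest after the last match
    parts.append(source[pos:])
    # a single join at the end instead of repeated concatenation
    return "".join(parts)

print(replace_with_finditer(
    "Here is foo, it does moo!",
    {'is': 'was', 'does': 'did', '!': '?'},
))  # Here was foo, it did moo?
```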

In my case, I needed a simple replacement of unique keys with names, so I came up with this:

>>> a = 'This is a test string.'
>>> b = {'i': 'I', 's': 'S'}
>>> for x,y in b.items():
...     a = a.replace(x, y)
...
>>> a
'ThIS IS a teSt StrIng.'
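One caveat with a loop like this: each pass sees the output of the previous one, so the result depends on the order of the dictionary. When every key is a single character, as here, str.translate might be a simpler single-pass alternative. A small sketch (the 'ab' example is invented to show the difference):

```python
a = 'This is a test string.'
b = {'i': 'I', 's': 'S'}

# str.maketrans accepts a dict whose keys are single characters
print(a.translate(str.maketrans(b)))  # ThIS IS a teSt StrIng.

# Order dependence of the loop: 'a' becomes 'b', which then becomes 'c'
cascade = {'a': 'b', 'b': 'c'}
looped = 'ab'
for old, new in cascade.items():
    looped = looped.replace(old, new)
print(looped)  # cc

# translate looks at each original character exactly once
translated = 'ab'.translate(str.maketrans(cascade))
print(translated)  # bc
```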

You really shouldn't do it this way, but I just find it way too cool:

>>> s = 'cond1 and cond2'  # example input, made up for illustration
>>> replacements = {'cond1':'text1', 'cond2':'text2'}
>>> cmd = 'answer = s'
>>> for k,v in replacements.items():
...     cmd += ".replace(%r, %r)" %(k,v)
...
>>> exec(cmd)

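Now answer is the result of all the replacements in turn.

Again, this is very hacky and is not something that you should be using regularly. But it's nice to know that you can do something like this if you need to.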
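If you do want to play with this inside a function under Python 3, keep in mind that exec cannot create local variables there, so the result has to be read back from an explicit namespace dict. A rough sketch, still not something to use in real code; the function name chained_replace_via_exec is hypothetical, and %r is used so the keys and values are emitted as quoted string literals.

```python
def chained_replace_via_exec(s, replacements):
    # Build the source text "answer = s.replace(...).replace(...)"
    cmd = "answer = s"
    for k, v in replacements.items():
        # %r writes each string as a quoted Python literal
        cmd += ".replace(%r, %r)" % (k, v)
    # exec reads s from this dict and stores answer back into it
    namespace = {"s": s}
    exec(cmd, namespace)
    return namespace["answer"]

print(chained_replace_via_exec(
    "cond1 and cond2",
    {"cond1": "text1", "cond2": "text2"},
))  # text1 and text2
```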

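Starting from Andrew's precious answer, I developed a script that loads the dictionary from a file and processes all the files in the opened folder to do the replacements. The script loads the mappings from an external file in which you can set the separator. I'm a beginner, but I found this script very useful for doing multiple substitutions in multiple files. It loaded a dictionary with more than 1000 entries in seconds. It is not elegant, but it worked for me.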

import glob
import re

mapfile = input("Enter map file name with extension eg. codifica.txt: ")
sep = input("Enter map file column separator eg. |: ")
mask = input("Enter search mask with extension eg. 2010*txt for all files to be processed: ")
suff = input("Enter suffix with extension eg. _NEW.txt for newly generated files: ")

rep = {} # create an empty dictionary

with open(mapfile) as temprep: # load the definitions into the dictionary from the map file, using the prompted separator
    for line in temprep:
        (key, val) = line.strip('\n').split(sep)
        rep[key] = val

for filename in glob.iglob(mask): # iterate over all the files matching the prompted mask

    with open(filename, "r") as textfile: # load each file into the variable text
        text = textfile.read()

        # start replacement
        #rep = dict((re.escape(k), v) for k, v in rep.items()) left commented out so the map file may contain re special characters; the lookup below then only works when the matched text equals a key exactly
        pattern = re.compile("|".join(rep.keys()))
        text = pattern.sub(lambda m: rep[m.group(0)], text)

        # write the output file: drop the 4-character extension and append the prompted suffix
        with open(filename[:-4] + suff, "w") as target:
            target.write(text)
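For anyone who wants the map keys treated as literal text rather than regex syntax, the two core steps could be pulled into small functions along these lines. This is only a sketch under my own assumptions: the helper names parse_map_lines and apply_map are made up, blank lines are skipped, and each line is split on the first separator only.

```python
import re

def parse_map_lines(lines, sep):
    """Build a replacement dict from 'key<sep>value' lines (hypothetical helper)."""
    rep = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines instead of failing on unpacking
        key, val = line.split(sep, 1)  # split once so values may contain sep
        rep[key] = val
    return rep

def apply_map(text, rep):
    """Replace every key of rep found in text, treating keys as literals."""
    if not rep:
        return text
    # Longest keys first so a short key cannot shadow a longer one
    keys = sorted(rep, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, keys)))
    return pattern.sub(lambda m: rep[m.group(0)], text)

rep = parse_map_lines(["a.b|X\n", "\n", "c|d|e\n"], "|")
print(rep)  # {'a.b': 'X', 'c': 'd|e'}
print(apply_map("a.b axb c", rep))  # X axb d|e
```

Because the keys are escaped, 'a.b' matches only the literal text and not 'axb'.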