我想使用.replace函数替换多个字符串。

我目前有

string.replace("condition1", "")

但想要一些像

string.replace("condition1", "").replace("condition2", "text")

尽管这样的语法感觉不太好

正确的做法是什么?有点像在grep/regex中,你可以用\1和\2来替换某些搜索字符串的字段


当前回答

从Python 3.8开始,并引入赋值表达式(PEP 572)(:=运算符),我们可以在一个列表理解式中应用替换:

# text = "The quick brown fox jumps over the lazy dog"
# replacements = [("brown", "red"), ("lazy", "quick")]
[text := text.replace(a, b) for a, b in replacements]
# text = 'The quick red fox jumps over the quick dog'

其他回答

您可以使用pandas库和replace函数,它既支持精确匹配,也支持正则表达式替换。例如:

df = pd.DataFrame({'text': ['Billy is going to visit Rome in November', 'I was born in 10/10/2010', 'I will be there at 20:00']})

to_replace=['Billy','Rome','January|February|March|April|May|June|July|August|September|October|November|December', '\d{2}:\d{2}', '\d{2}/\d{2}/\d{4}']
replace_with=['name','city','month','time', 'date']

print(df.text.replace(to_replace, replace_with, regex=True))

修改后的文本为:

0    name is going to visit city in month
1                      I was born in date
2                 I will be there at time

你可以在这里找到一个例子。请注意,文本上的替换是按照它们在列表中出现的顺序进行的

另一个例子: 输入列表

error_list = ['[br]', '[ex]', 'Something']
words = ['how', 'much[ex]', 'is[br]', 'the', 'fish[br]', 'noSomething', 'really']

期望的输出将是

words = ['how', 'much', 'is', 'the', 'fish', 'no', 'really']

代码:

[n[0][0] if len(n[0]) else n[1] for n in [[[w.replace(e,"") for e in error_list if e in w],w] for w in words]] 

你可以做一个漂亮的循环函数。

def replace_all(text, dic):
    for i, j in dic.iteritems():
        text = text.replace(i, j)
    return text

其中text是完整的字符串,dic是字典-每个定义都是一个字符串,将替换与术语匹配的字符串。

注意:在Python 3中,iteritems()已被items()取代


注意:Python字典没有迭代的可靠顺序。此解决方案仅在以下情况下解决您的问题:

替换的顺序无关紧要 替换者可以改变之前替换者的结果

更新:上述与插入顺序相关的语句不适用于大于或等于3.6的Python版本,因为标准字典已更改为使用插入顺序进行迭代。

例如:

d = { "cat": "dog", "dog": "pig"}
my_sentence = "This is my cat and this is my dog."
replace_all(my_sentence, d)
print(my_sentence)

可能输出#1:

"This is my pig and this is my pig."

可能的输出#2

"This is my dog and this is my pig."

一个可能的解决方法是使用OrderedDict。

from collections import OrderedDict
def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text
od = OrderedDict([("cat", "dog"), ("dog", "pig")])
my_sentence = "This is my cat and this is my dog."
replace_all(my_sentence, od)
print(my_sentence)

输出:

"This is my pig and this is my pig."

注意事项#2:如果你的文本字符串太大或字典中有很多对,效率就会很低。

下面是一个简短的例子,应该做的技巧与正则表达式:

import re

rep = {"condition1": "", "condition2": "text"} # define desired replacements here

# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.iteritems()) 
#Python 3 renamed dict.iteritems to dict.items so use rep.items() for latest versions
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)

例如:

>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'

这里有一个使用reduce的第一个解决方案的变体,如果你喜欢功能性的。:)

repls = {'hello' : 'goodbye', 'world' : 'earth'}
s = 'hello, world'
reduce(lambda a, kv: a.replace(*kv), repls.iteritems(), s)

马蒂诺的版本更好:

repls = ('hello', 'goodbye'), ('world', 'earth')
s = 'hello, world'
reduce(lambda a, kv: a.replace(*kv), repls, s)