我想使用.replace函数替换多个字符串。
我目前有
string.replace("condition1", "")
但想要一些像
string.replace("condition1", "").replace("condition2", "text")
尽管这样的语法感觉不太好
正确的做法是什么?有点像在grep/regex中,你可以用\1和\2来替换某些搜索字符串的字段
我想使用.replace函数替换多个字符串。
我目前有
string.replace("condition1", "")
但想要一些像
string.replace("condition1", "").replace("condition2", "text")
尽管这样的语法感觉不太好
正确的做法是什么?有点像在grep/regex中,你可以用\1和\2来替换某些搜索字符串的字段
当前回答
下面是一个简短的例子,应该做的技巧与正则表达式:
import re
rep = {"condition1": "", "condition2": "text"} # define desired replacements here
# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.iteritems())
#Python 3 renamed dict.iteritems to dict.items so use rep.items() for latest versions
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)
例如:
>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'
其他回答
我需要一个解决方案,其中字符串可以被替换为正则表达式, 例如,通过将多个空格字符替换为一个空格字符来帮助规范化长文本。根据其他人(包括MiniQuark和mmj)的一系列答案,我得出了以下结论:
def multiple_replace(string, reps, re_flags = 0):
""" Transforms string, replacing keys from re_str_dict with values.
reps: dictionary, or list of key-value pairs (to enforce ordering;
earlier items have higher priority).
Keys are used as regular expressions.
re_flags: interpretation of regular expressions, such as re.DOTALL
"""
if isinstance(reps, dict):
reps = reps.items()
pattern = re.compile("|".join("(?P<_%d>%s)" % (i, re_str[0])
for i, re_str in enumerate(reps)),
re_flags)
return pattern.sub(lambda x: reps[int(x.lastgroup[1:])][1], string)
它适用于其他答案中给出的例子,例如:
>>> multiple_replace("(condition1) and --condition2--",
... {"condition1": "", "condition2": "text"})
'() and --text--'
>>> multiple_replace('hello, world', {'hello' : 'goodbye', 'world' : 'earth'})
'goodbye, earth'
>>> multiple_replace("Do you like cafe? No, I prefer tea.",
... {'cafe': 'tea', 'tea': 'cafe', 'like': 'prefer'})
'Do you prefer tea? No, I prefer cafe.'
对我来说,最重要的是你也可以使用正则表达式,例如只替换整个单词,或规范化空白:
>>> s = "I don't want to change this name:\n Philip II of Spain"
>>> re_str_dict = {r'\bI\b': 'You', r'[\n\t ]+': ' '}
>>> multiple_replace(s, re_str_dict)
"You don't want to change this name: Philip II of Spain"
如果你想使用字典键作为普通字符串, 你可以在调用multiple_replace之前转义这些,例如使用下面的函数:
def escape_keys(d):
""" transform dictionary d by applying re.escape to the keys """
return dict((re.escape(k), v) for k, v in d.items())
>>> multiple_replace(s, escape_keys(re_str_dict))
"I don't want to change this name:\n Philip II of Spain"
下面的函数可以帮助在你的字典键中找到错误的正则表达式(因为来自multiple_replace的错误消息不是很明显):
def check_re_list(re_list):
""" Checks if each regular expression in list is well-formed. """
for i, e in enumerate(re_list):
try:
re.compile(e)
except (TypeError, re.error):
print("Invalid regular expression string "
"at position {}: '{}'".format(i, e))
>>> check_re_list(re_str_dict.keys())
请注意,它没有链接替换,而是同时执行它们。这使得它更有效率,而不会限制它能做什么。为了模仿链接的效果,你可能只需要添加更多的字符串替换对,并确保这些对的预期顺序:
>>> multiple_replace("button", {"but": "mut", "mutton": "lamb"})
'mutton'
>>> multiple_replace("button", [("button", "lamb"),
... ("but", "mut"), ("mutton", "lamb")])
'lamb'
我把这句话建立在fj的精彩回答上:
import re
def multiple_replacer(*key_values):
replace_dict = dict(key_values)
replacement_function = lambda match: replace_dict[match.group(0)]
pattern = re.compile("|".join([re.escape(k) for k, v in key_values]), re.M)
return lambda string: pattern.sub(replacement_function, string)
def multiple_replace(string, *key_values):
return multiple_replacer(*key_values)(string)
一针用法:
>>> replacements = (u"café", u"tea"), (u"tea", u"café"), (u"like", u"love")
>>> print multiple_replace(u"Do you like café? No, I prefer tea.", *replacements)
Do you love tea? No, I prefer café.
注意,由于替换只在一次传递中完成,“café”会变成“tea”,但不会变回“café”。
如果你需要做相同的替换多次,你可以很容易地创建一个替换函数:
>>> my_escaper = multiple_replacer(('"','\\"'), ('\t', '\\t'))
>>> many_many_strings = (u'This text will be escaped by "my_escaper"',
u'Does this work?\tYes it does',
u'And can we span\nmultiple lines?\t"Yes\twe\tcan!"')
>>> for line in many_many_strings:
... print my_escaper(line)
...
This text will be escaped by \"my_escaper\"
Does this work?\tYes it does
And can we span
multiple lines?\t\"Yes\twe\tcan!\"
改进:
将代码转换为函数 增加了多线支持 修正了逃跑的错误 容易创建一个函数,用于特定的多个替换
享受吧!: -)
我在学校作业中也做过类似的练习。这就是我的解
dictionary = {1: ['hate', 'love'],
2: ['salad', 'burger'],
3: ['vegetables', 'pizza']}
def normalize(text):
for i in dictionary:
text = text.replace(dictionary[i][0], dictionary[i][1])
return text
自己查看测试字符串上的结果
string_to_change = 'I hate salad and vegetables'
print(normalize(string_to_change))
下面是一个简短的例子,应该做的技巧与正则表达式:
import re
rep = {"condition1": "", "condition2": "text"} # define desired replacements here
# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.iteritems())
#Python 3 renamed dict.iteritems to dict.items so use rep.items() for latest versions
pattern = re.compile("|".join(rep.keys()))
text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)
例如:
>>> pattern.sub(lambda m: rep[re.escape(m.group(0))], "(condition1) and --condition2--")
'() and --text--'
另一个例子: 输入列表
error_list = ['[br]', '[ex]', 'Something']
words = ['how', 'much[ex]', 'is[br]', 'the', 'fish[br]', 'noSomething', 'really']
期望的输出将是
words = ['how', 'much', 'is', 'the', 'fish', 'no', 'really']
代码:
[n[0][0] if len(n[0]) else n[1] for n in [[[w.replace(e,"") for e in error_list if e in w],w] for w in words]]