Best way to strip punctuation from a string

It seems like there should be a simpler way than this:

import string
s = "string. With. Punctuation?"  # Sample string
# Python 2 only: string.maketrans and the two-argument translate() do not exist in Python 3
out = s.translate(string.maketrans("",""), string.punctuation)

Is there?

Current answer

In less strict cases, a one-liner can help:

''.join([c for c in s if c.isalnum() or c.isspace()])
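
To make "less strict" concrete, here is a small sketch of what this filter keeps and what it drops. The helper name keep_alnum_and_space and the sample strings are assumptions made for illustration, not part of the answer.

```python
def keep_alnum_and_space(s):
    """Keep only alphanumeric characters and whitespace (hypothetical helper name)."""
    return ''.join(c for c in s if c.isalnum() or c.isspace())

# ASCII punctuation is removed, as with string.punctuation.
print(keep_alnum_and_space("string. With. Punctuation?"))  # string With Punctuation

# Curly quotes (U+201C, U+201D) are not in string.punctuation, but they go too.
print(keep_alnum_and_space("\u201cquoted\u201d text"))  # quoted text
```

Since str.isalnum() is Unicode-aware, accented letters survive, while anything that is neither a letter, a digit, nor whitespace is discarded, including symbols that string.punctuation does not list.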

2015-10-17 23:03:59

Other answers

Not necessarily simpler, but a different approach if you are more familiar with the re family.

import re, string
s = "string. With. Punctuation?" # Sample string 
out = re.sub('[%s]' % re.escape(string.punctuation), '', s)
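
If the same substitution runs over many strings, the pattern can be compiled once and reused. A minimal sketch; the names PUNCT_RE and strip_punctuation are illustrative assumptions.

```python
import re
import string

# Character class matching any single ASCII punctuation character.
# re.escape protects characters such as ], ^, - and \ inside the class.
PUNCT_RE = re.compile('[%s]' % re.escape(string.punctuation))

def strip_punctuation(s):
    """Remove every character listed in string.punctuation."""
    return PUNCT_RE.sub('', s)

print(strip_punctuation("string. With. Punctuation?"))  # string With Punctuation
```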

2008-11-05 17:39:55

import string
myString.translate(None, string.punctuation)  # Python 2 str only
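
The two-argument form above only works on Python 2 byte strings. On Python 3 the characters to delete go into the translation table itself, through the third argument of str.maketrans. A sketch, with my_string as a placeholder value:

```python
import string

my_string = "string. With. Punctuation?"  # placeholder input

# The third argument lists characters to delete; the two empty strings map nothing.
table = str.maketrans('', '', string.punctuation)
out = my_string.translate(table)
print(out)  # string With Punctuation
```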

2010-03-08 15:19:09

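For serious natural language processing (NLP), you should let a library such as SpaCy handle punctuation through tokenization, which you can then adjust manually to suit your needs.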

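For example, how do you want to handle hyphens inside words? Special cases such as abbreviations? Opening and closing quotes? URLs? In NLP it is often useful to split a contraction such as "let's" into "let" and "'s" for further processing.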
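
These questions are why blanket deletion falls short. As a rough illustration only (a standard-library regular expression, not SpaCy's tokenizer, and nowhere near as thorough), the sketch below keeps URLs, hyphenated words, and contractions intact while leaving other punctuation out. The pattern and the name simple_tokens are assumptions made for this example.

```python
import re

# Alternatives are tried left to right: a URL first, then a word that may
# contain internal hyphens or ASCII apostrophes.
TOKEN_RE = re.compile(r"https?://\S+|\w+(?:[-']\w+)*")

def simple_tokens(text):
    """Return word-like tokens, leaving surrounding punctuation out."""
    return TOKEN_RE.findall(text)

tokens = simple_tokens("Let's see: a well-known site, https://example.com/a?b=1 (really)!")
print(tokens)
# ["Let's", 'see', 'a', 'well-known', 'site', 'https://example.com/a?b=1', 'really']
```

Even this small pattern has gaps: a URL followed directly by a comma would swallow the comma, and typographic apostrophes are not handled. A real tokenizer covers such cases.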

2022-03-31 01:53:41

Here is a one-liner for Python 3.5:

import string
"l*ots! o(f. p@u)n[c}t]u[a'ti\"on#$^?/".translate(str.maketrans({a:None for a in string.punctuation}))

2016-03-21 02:46:47

Here is a function I wrote. It is not very efficient, but it is simple, and you can add or remove any punctuation marks you like:

def stripPunc(wordList):
    """Strips punctuation from a list of words"""
    puncList = [".",";",":","!","?","/","\\",",","#","@","$","&",")","(","\""]
    # One pass over the whole list per punctuation character
    for punc in puncList:
        wordList = [word.replace(punc, '') for word in wordList]
    return wordList
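
One way to keep the "add or remove whatever punctuation you like" idea while avoiding repeated passes over the list is to build a single translation table from a configurable set of characters. A Python 3 sketch; strip_punc and DEFAULT_PUNC are names invented for this example.

```python
DEFAULT_PUNC = '.;:!?/\\,#@$&)("'  # same characters as the list above

def strip_punc(words, punc=DEFAULT_PUNC):
    """Return a new list with every character in `punc` removed from each word."""
    table = str.maketrans('', '', punc)  # third argument = characters to delete
    return [word.translate(table) for word in words]

print(strip_punc(["Hello,", "world!", "(ok)"]))  # ['Hello', 'world', 'ok']
print(strip_punc(["a-b", "c.d"], punc="-"))      # ['ab', 'c.d']
```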

2015-09-22 14:30:47
