Example:
>>> convert('CamelCase')
'camel_case'
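Current answer
Personally, I'm not sure anything that uses regular expressions in Python can be described as elegant. Most answers here are just "code golf" style RE tricks. Elegant code should be easy to understand.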
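def to_snake_case(not_snake_case):
    final = ''
    # Initialised up front so a one-character input cannot hit an unbound name.
    next_char_will_be_underscored = False
    for i in range(len(not_snake_case)):
        item = not_snake_case[i]
        if i < len(not_snake_case) - 1:
            next_char_will_be_underscored = (
                not_snake_case[i+1] == "_" or
                not_snake_case[i+1] == " " or
                not_snake_case[i+1].isupper()
            )
        # Skip a separator when the next character already produces an underscore.
        if (item == " " or item == "_") and next_char_will_be_underscored:
            continue
        elif (item == " " or item == "_"):
            final += "_"
        elif item.isupper():
            final += "_"+item.lower()
        else:
            final += item
    # startswith() also handles an empty result, where final[0] would raise.
    if final.startswith("_"):
        final = final[1:]
    return final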
>>> to_snake_case("RegularExpressionsAreFunky")
'regular_expressions_are_funky'
>>> to_snake_case("RegularExpressionsAre Funky")
'regular_expressions_are_funky'
>>> to_snake_case("RegularExpressionsAre_Funky")
'regular_expressions_are_funky'
Other answers
I think this solution is more straightforward than the previous answers:
import re

def convert(camel_input):
    words = re.findall(r'[A-Z]?[a-z]+|[A-Z]{2,}(?=[A-Z][a-z]|\d|\W|$)|\d+', camel_input)
    return '_'.join(map(str.lower, words))

# Let's test it
test_strings = [
    'CamelCase',
    'camelCamelCase',
    'Camel2Camel2Case',
    'getHTTPResponseCode',
    'get200HTTPResponseCode',
    'getHTTP200ResponseCode',
    'HTTPResponseCode',
    'ResponseHTTP',
    'ResponseHTTP2',
    'Fun?!awesome',
    'Fun?!Awesome',
    '10CoolDudes',
    '20coolDudes'
]
for test_string in test_strings:
    print(convert(test_string))
Output:
camel_case
camel_camel_case
camel_2_camel_2_case
get_http_response_code
get_200_http_response_code
get_http_200_response_code
http_response_code
response_http
response_http_2
fun_awesome
fun_awesome
10_cool_dudes
20_cool_dudes
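The regular expression matches three patterns:
[A-Z]?[a-z]+: consecutive lowercase letters, optionally preceded by a single uppercase letter.
[A-Z]{2,}(?=[A-Z][a-z]|\d|\W|$): two or more consecutive uppercase letters. The lookahead excludes the last uppercase letter when it is followed by a lowercase letter.
\d+: consecutive digits.
By using re.findall, we get a list of individual "words" that can be converted to lowercase and joined with underscores.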
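To make the three alternatives easier to follow, here is a minimal sketch that prints the intermediate word list before it is lowercased and joined. The names WORD_PATTERN and split_words are illustrative and not part of the original answer; the pattern itself is unchanged.

```python
import re

# Same pattern as in the answer, written as three adjacent string pieces.
WORD_PATTERN = re.compile(
    r'[A-Z]?[a-z]+'                       # lowercase run, optional leading capital
    r'|[A-Z]{2,}(?=[A-Z][a-z]|\d|\W|$)'   # acronym: 2+ capitals, stops before "Xy"
    r'|\d+'                               # run of digits
)

def split_words(camel_input):
    """Return the raw 'words' found, before lowercasing and joining."""
    return WORD_PATTERN.findall(camel_input)

words = split_words('getHTTP200ResponseCode')
print(words)
# ['get', 'HTTP', '200', 'Response', 'Code']
print('_'.join(word.lower() for word in words))
# get_http_200_response_code
```

One consequence of building the result from matches only: characters that none of the alternatives match are dropped silently. Punctuation is the intended case, but a lone capital is affected too, for example split_words('getAValue') gives ['get', 'Value'].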
There is an inflection library in the package index that can handle these things for you. In this case, you would be looking for inflection.underscore():
>>> import inflection
>>> inflection.underscore('CamelCase')
'camel_case'
Wow, I just stole this from Django snippets. Ref: http://djangosnippets.org/snippets/585/
Pretty elegant:
import re
camelcase_to_underscore = lambda s: re.sub(r'(?<=[a-z])[A-Z]|[A-Z](?=[^A-Z])', r'_\g<0>', s).lower().strip('_')
Example:
camelcase_to_underscore('ThisUser')
Returns:
'this_user'
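To see where this substitution places its underscores, the following sketch shows the intermediate string before lower() and strip('_') are applied. The helper name mark_boundaries is hypothetical and only used for illustration; the pattern is the one from the snippet.

```python
import re

# Pattern from the snippet: a capital preceded by a lowercase letter,
# or a capital followed by any character that is not a capital.
BOUNDARY = re.compile(r'(?<=[a-z])[A-Z]|[A-Z](?=[^A-Z])')

def mark_boundaries(name):
    """Insert '_' before each matched capital, without lowercasing."""
    return BOUNDARY.sub(r'_\g<0>', name)

for name in ('ThisUser', 'getHTTPResponseCode', 'Camel2Case'):
    marked = mark_boundaries(name)
    print(name, '->', marked, '->', marked.lower().strip('_'))
# ThisUser -> _This_User -> this_user
# getHTTPResponseCode -> get_HTTP_Response_Code -> get_http_response_code
# Camel2Case -> _Camel2_Case -> camel2_case
```

The leading underscore produced for names that start with a capital is why the one-liner ends with strip('_'). Note that digits are not treated as word boundaries here, so 'Camel2Case' becomes 'camel2_case' rather than 'camel_2_case'.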
Using a regular expression may be the shortest option, but this solution is more readable:
def to_snake_case(s):
    snake = "".join(["_"+c.lower() if c.isupper() else c for c in s])
    return snake[1:] if snake.startswith("_") else snake
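A quick check of how this character-by-character approach behaves may be useful, since it treats every capital as the start of a new word. The sketch below repeats the function so that it runs on its own; the inputs are arbitrary examples.

```python
def to_snake_case(s):
    # Same logic as above: prefix every capital with '_' and lowercase it.
    snake = "".join("_" + c.lower() if c.isupper() else c for c in s)
    # Drop the underscore created by a leading capital.
    return snake[1:] if snake.startswith("_") else snake

print(to_snake_case("CamelCase"))        # camel_case
print(to_snake_case("getHTTPResponse"))  # get_h_t_t_p_response
```

So the readability comes at a cost: runs of capitals such as acronyms are split letter by letter, which the regex-based answers handle as a single word.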
Just for fun:
>>> def un_camel(input):
...     output = [input[0].lower()]
...     for c in input[1:]:
...         if c in ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
...             output.append('_')
...             output.append(c.lower())
...         else:
...             output.append(c)
...     return str.join('', output)
...
>>> un_camel("camel_case")
'camel_case'
>>> un_camel("CamelCase")
'camel_case'
Or, for even more fun:
>>> un_camel = lambda i: i[0].lower() + str.join('', ("_" + c.lower() if c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" else c for c in i[1:]))
>>> un_camel("camel_case")
'camel_case'
>>> un_camel("CamelCase")
'camel_case'