I have a list of strings that I want to sort in natural alphabetical order.

For example, the following list is in natural order (what I want):

['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']

And here is the "sorted" version of the list above (what I get with sorted()):

['Elm11', 'Elm12', 'Elm2', 'elm0', 'elm1', 'elm10', 'elm13', 'elm9']

I am looking for a sort function that behaves like the first one.


Current answer

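Here is a more advanced solution that improves on the answers by Claudiu and Mark Byers:

It uses casefold() instead of lower() for case-insensitive string matching. You can pass an additional key lambda to select an inner element (just as you would with the ordinary sort functions). It works with list.sort, sorted, max, and so on.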

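import re

def natural_sort(key=None, _nsre=re.compile(r'([0-9]+)')):
    # Returns a key function: digit runs compare as ints, everything else case-insensitively
    return lambda x: [int(text) if text.isdigit() else text.casefold()
            for text in _nsre.split(key(x) if key else x)]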

Usage examples:

# Original solution (natural_sort must be called; it returns the key function)
data.sort(key=natural_sort())

# Select an additional key
image_files.sort(key=natural_sort(lambda x: x.original_filename))
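
As a quick check, here is a minimal self-contained sketch of the same key factory applied to the list from the question. The `records` tuples are made-up sample data, and the inner lambda is unrolled into a named function purely for readability; it is meant as an illustration, not a definitive implementation.

```python
import re

def natural_sort(key=None, _nsre=re.compile(r'([0-9]+)')):
    """Return a key function that compares digit runs numerically."""
    def sort_key(x):
        text = key(x) if key else x
        return [int(part) if part.isdigit() else part.casefold()
                for part in _nsre.split(text)]
    return sort_key

names = ['Elm11', 'Elm12', 'Elm2', 'elm0', 'elm1', 'elm10', 'elm13', 'elm9']
result = sorted(names, key=natural_sort())
largest = max(names, key=natural_sort())

# Hypothetical records: sort by the file name stored in the second field
records = [('b', 'img10.png'), ('a', 'img2.png')]
by_filename = sorted(records, key=natural_sort(lambda r: r[1]))

print(result)
print(largest)
print(by_filename)
```

Because every key here has the same shape (text, number, text), the int and str pieces never get compared with each other. Strings that start with a digit mixed with strings that start with a letter would still produce lists of the same shape, since re.split emits an empty leading string in the first case.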

Other answers

>>> import re
>>> sorted(lst, key=lambda x: int(re.findall(r'\d+$', x)[0]))
['elm0', 'elm1', 'Elm2', 'elm9', 'elm10', 'Elm11', 'Elm12', 'elm13']
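
Note that the key above looks only at the trailing digits: it ignores the text prefix, and it raises IndexError for a string that does not end in a number. A hedged sketch of a slightly more forgiving variant follows; the helper name `trailing_number` and the `default` fallback are assumptions added for illustration.

```python
import re

_TRAILING_DIGITS = re.compile(r'\d+$')

def trailing_number(s, default=-1):
    """Return the integer at the end of s, or default if there is none."""
    match = _TRAILING_DIGITS.search(s)
    return int(match.group()) if match else default

lst = ['Elm11', 'Elm12', 'Elm2', 'elm0', 'elm1', 'elm10', 'elm13', 'elm9']
result = sorted(lst, key=trailing_number)

# Strings without a trailing number sort first because of the default of -1
with_plain = sorted(['elm2', 'readme', 'elm1'], key=trailing_number)

# Limitation: the prefix is ignored, so 'b1' sorts before 'a2'
prefix_ignored = sorted(['a2', 'b1'], key=trailing_number)

print(result)
print(with_plain)
print(prefix_ignored)
```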
a = ['H1', 'H100', 'H10', 'H3', 'H2', 'H6', 'H11', 'H50', 'H5', 'H99', 'H8']
b = ''
c = []

def bubble(bad_list):  # bubble sort method
    length = len(bad_list) - 1
    is_sorted = False  # renamed so it does not shadow the built-in sorted()

    while not is_sorted:
        is_sorted = True
        for i in range(length):
            if bad_list[i] > bad_list[i+1]:
                is_sorted = False
                bad_list[i], bad_list[i+1] = bad_list[i+1], bad_list[i]  # sort the integer list
                a[i], a[i+1] = a[i+1], a[i]  # sort the main list based on the integer list index value

for a_string in a:  # extract the number in the string character by character
    for letter in a_string:
        if letter.isdigit():
            b += letter
    c.append(b)
    b = ''

print('Before sorting....')
print(a)
c = list(map(int, c))  # converting string list into number list
print(c)
bubble(c)

print('After sorting....')
print(c)
print(a)

Acknowledgments:

Bubble sort homework

How to read a string one letter at a time in Python
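
For comparison, the same ordering can probably be reached without a hand-written bubble sort by handing the extracted number to sorted() as a key. This is only a sketch: the helper name `embedded_number` is made up, and it assumes every string contains at least one digit (otherwise int('') raises ValueError).

```python
a = ['H1', 'H100', 'H10', 'H3', 'H2', 'H6', 'H11', 'H50', 'H5', 'H99', 'H8']

def embedded_number(s):
    """Join all digit characters in s and interpret them as one integer."""
    digits = ''.join(ch for ch in s if ch.isdigit())
    return int(digits)

ordered = sorted(a, key=embedded_number)
print(ordered)
```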

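Here is another version of Mark Byers' answer. It shows how to pass in the name of an attribute whose value is used as the sort key for the objects in the list.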

import re

def natural_sort(l, attrib):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    # getattr also works for properties and __slots__, unlike key.__dict__[attrib]
    alphanum_key = lambda key: [convert(c) for c in re.split(r'([0-9]+)', getattr(key, attrib))]
    return sorted(l, key=alphanum_key)

results = natural_sort(albums, 'albumid')

Here albums is a list of Album instances, and albumid is a string attribute that nominally contains numbers.
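
To make the attribute-based version concrete, here is a small runnable sketch. The `Album` dataclass, its `title` field, and the sample ids are hypothetical stand-ins for whatever class the real code uses, and the function is renamed `natural_sort_by_attr` here only to keep it distinct.

```python
import re
from dataclasses import dataclass

@dataclass
class Album:
    # Hypothetical stand-in for the real Album class
    albumid: str
    title: str

def natural_sort_by_attr(items, attrib):
    """Sort objects naturally by the string attribute named attrib."""
    def convert(text):
        return int(text) if text.isdigit() else text.lower()

    def alphanum_key(obj):
        return [convert(c) for c in re.split(r'([0-9]+)', getattr(obj, attrib))]

    return sorted(items, key=alphanum_key)

albums = [Album('a10', 'Ten'), Album('a2', 'Two'), Album('A1', 'One')]
ordered_ids = [album.albumid for album in natural_sort_by_attr(albums, 'albumid')]
print(ordered_ids)
```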

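The answers above work well for the specific example given, but they miss several useful cases in the more general natural-sort problem. I was just bitten by one of those cases, so I came up with a more thorough solution: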

import re

def natural_sort_key(string_or_number):
    """
    by Scott S. Lawton <scott@ProductArchitect.com> 2014-12-11; public domain and/or CC0 license

    handles cases where simple 'int' approach fails, e.g.
        ['0.501', '0.55'] floating point with different number of significant digits
        [0.01, 0.1, 1]    already numeric so regex and other string functions won't work (and aren't required)
        ['elm1', 'Elm2']  ASCII vs. letters (not case sensitive)
    """

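    def try_float(astring):
        try:
            return float(astring)
        except ValueError:
            # not a number (e.g. plain text or a lone '.'), keep it as a string
            return astring

    # Python 3: str replaces the Python 2 basestring
    if isinstance(string_or_number, str):
        string_or_number = string_or_number.lower()

        if len(re.findall(r'[.]\d', string_or_number)) <= 1: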
            # assume a floating point value, e.g. to correctly sort ['0.501', '0.55']
            # '.' for decimal is locale-specific, e.g. correct for the Anglosphere and Asia but not continental Europe
            return [try_float(s) for s in re.split(r'([\d.]+)', string_or_number)]
        else:
            # assume distinct fields, e.g. IP address, phone number with '.', etc.
            # caveat: might want to first split by whitespace
            # TBD: for unicode, replace isdigit with isdecimal
            return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_or_number)]
    else:
        # consider: add code to recurse for lists/tuples and perhaps other iterables
        return string_or_number
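
The floating-point case from the docstring can be illustrated with a short sketch that compares an integer-chunk key with a float-chunk key on ['0.55', '0.501']. The two helper names are made up for this illustration, and they are simplified versions of the branches above rather than the full function.

```python
import re

def int_chunk_key(s):
    """Split on digit runs and compare each run as an int."""
    return [int(c) if c.isdigit() else c.lower() for c in re.split(r'(\d+)', s)]

def float_chunk_key(s):
    """Split on runs of digits and dots and compare each run as a float."""
    def try_float(chunk):
        try:
            return float(chunk)
        except ValueError:
            return chunk
    return [try_float(c) for c in re.split(r'([\d.]+)', s.lower())]

values = ['0.55', '0.501']

# The int approach compares 55 with 501, so '0.55' wrongly comes first
by_int = sorted(values, key=int_chunk_key)

# The float approach compares 0.55 with 0.501, which gives numeric order
by_float = sorted(values, key=float_chunk_key)

print(by_int)
print(by_float)
```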

Test code and several links (on and off StackOverflow) are here: http://productarchitect.com/code/better-natural-sort.py

Feedback is welcome. This is not meant to be a definitive solution, just a step forward.
