如何在Python中实现通用树?这有内置的数据结构吗?


当前回答

您可以使用Python中的dataclasses模块创建Tree数据结构。

iter方法可用于使树可迭代,允许您通过改变yield语句的顺序来遍历树。

contains方法可用于检查树中是否存在特定值。

from dataclasses import dataclass

#               A
#              / \
#             B   C
#            / \   \
#           D   E   F
#          / \
#         G   H

@dataclass
class Node:
    data: str
    left: Node = None
    right: Node = None
    
    def __iter__(self):
        if self.left:
            yield from self.left
        
        yield self

        if self.right:
            yield from self.right

    def __contains__(self, other):
        for node in self:
            if node.data == other:
                return True
        return False
    

t = Node(
    'A', 
    Node(
        'B', 
        Node(
            'D', 
            Node('G'),
            Node('H'),
        ),
        Node('E'),
    ),  
    Node(
        'C', 
        right=Node('F'),
    ),
)
assert ('A' in t) is True
assert ('I' in t) is not True
for node in t:
    print(node.data, ' -> ', end='')
# G  -> D  -> H  -> B  -> E  -> A  -> C  -> F  -> 

其他回答

Treelib也很方便完成这项任务。文档可以在treelib找到。

from treelib import Node, Tree
tree = Tree() # creating an object
tree.create_node("Harry", "harry")  # root node 
tree.create_node("Jane", "jane", parent="harry") #adding nodes
tree.create_node("Bill", "bill", parent="harry")
tree.create_node("Diane", "diane", parent="jane")
tree.create_node("Mary", "mary", parent="diane")
tree.create_node("Mark", "mark", parent="jane")
tree.show()

Harry
├── Bill
└── Jane
    ├── Diane
    │   └── Mary
    └── Mark

并没有内置树,但是可以通过从List继承Node类型并编写遍历方法来轻松地构造一个树。如果你这样做,我发现平分法很有用。

您还可以浏览PyPi上的许多实现。

如果我没记错的话,Python标准库不包含树数据结构,原因和。net基类库不包含树数据结构是一样的:内存的局部性降低了,导致缓存丢失更多。在现代处理器上,将大量内存放入缓存通常会更快,而“指针丰富”的数据结构会抵消这种好处。

Python不像Java那样具有相当广泛的“内置”数据结构。但是,因为Python是动态的,所以很容易创建通用树。例如,二叉树可能是:

class Tree:
    def __init__(self):
        self.left = None
        self.right = None
        self.data = None

你可以这样使用它:

root = Tree()
root.data = "root"
root.left = Tree()
root.left.data = "left"
root.right = Tree()
root.right.data = "right"

如果每个节点需要任意数量的子节点,则使用子节点列表:

class Tree:
    def __init__(self, data):
        self.children = []
        self.data = data

left = Tree("left")
middle = Tree("middle")
right = Tree("right")
root = Tree("root")
root.children = [left, middle, right]

另一个基于Bruno回答的树的实现:

class Node:
    def __init__(self):
        self.name: str = ''
        self.children: List[Node] = []
        self.parent: Node = self

    def __getitem__(self, i: int) -> 'Node':
        return self.children[i]

    def add_child(self):
        child = Node()
        self.children.append(child)
        child.parent = self
        return child

    def __str__(self) -> str:
        def _get_character(x, left, right) -> str:
            if x < left:
                return '/'
            elif x >= right:
                return '\\'
            else:
                return '|'

        if len(self.children):
            children_lines: Sequence[List[str]] = list(map(lambda child: str(child).split('\n'), self.children))
            widths: Sequence[int] = list(map(lambda child_lines: len(child_lines[0]), children_lines))
            max_height: int = max(map(len, children_lines))
            total_width: int = sum(widths) + len(widths) - 1
            left: int = (total_width - len(self.name) + 1) // 2
            right: int = left + len(self.name)

            return '\n'.join((
                self.name.center(total_width),
                ' '.join(map(lambda width, position: _get_character(position - width // 2, left, right).center(width),
                             widths, accumulate(widths, add))),
                *map(
                    lambda row: ' '.join(map(
                        lambda child_lines: child_lines[row] if row < len(child_lines) else ' ' * len(child_lines[0]),
                        children_lines)),
                    range(max_height))))
        else:
            return self.name

还有一个如何使用它的例子:

tree = Node()
tree.name = 'Root node'
tree.add_child()
tree[0].name = 'Child node 0'
tree.add_child()
tree[1].name = 'Child node 1'
tree.add_child()
tree[2].name = 'Child node 2'
tree[1].add_child()
tree[1][0].name = 'Grandchild 1.0'
tree[2].add_child()
tree[2][0].name = 'Grandchild 2.0'
tree[2].add_child()
tree[2][1].name = 'Grandchild 2.1'
print(tree)

它应该输出:

                        Root node                        
     /             /                      \              
Child node 0  Child node 1           Child node 2        
                   |              /              \       
             Grandchild 1.0 Grandchild 2.0 Grandchild 2.1

嗨,你可以试试itertree(我是作者)。

该包与任何树包的方向相同,但关注点略有不同。在巨大的树(>100000个项目)上的性能要好得多,它处理迭代器具有有效的过滤机制。

>>>from itertree import *
>>>root=iTree('root')

>>># add some children:
>>>root.append(iTree('Africa',data={'surface':30200000,'inhabitants':1257000000}))
>>>root.append(iTree('Asia', data={'surface': 44600000, 'inhabitants': 4000000000}))
>>>root.append(iTree('America', data={'surface': 42549000, 'inhabitants': 1009000000}))
>>>root.append(iTree('Australia&Oceania', data={'surface': 8600000, 'inhabitants': 36000000}))
>>>root.append(iTree('Europe', data={'surface': 10523000 , 'inhabitants': 746000000}))
>>># you might use __iadd__ operator for adding too:
>>>root+=iTree('Antarktika', data={'surface': 14000000, 'inhabitants': 1100})

>>># for building next level we select per index:
>>>root[0]+=iTree('Ghana',data={'surface':238537,'inhabitants':30950000})
>>>root[0]+=iTree('Niger', data={'surface': 1267000, 'inhabitants': 23300000})
>>>root[1]+=iTree('China', data={'surface': 9596961, 'inhabitants': 1411780000})
>>>root[1]+=iTree('India', data={'surface': 3287263, 'inhabitants': 1380004000})
>>>root[2]+=iTree('Canada', data={'type': 'country', 'surface': 9984670, 'inhabitants': 38008005})    
>>>root[2]+=iTree('Mexico', data={'surface': 1972550, 'inhabitants': 127600000 })
>>># extend multiple items:
>>>root[3].extend([iTree('Australia', data={'surface': 7688287, 'inhabitants': 25700000 }), iTree('New Zealand', data={'surface': 269652, 'inhabitants': 4900000 })])
>>>root[4]+=iTree('France', data={'surface': 632733, 'inhabitants': 67400000 }))
>>># select parent per TagIdx - remember in itertree you might put items with same tag multiple times:
>>>root[TagIdx('Europe'0)]+=iTree('Finland', data={'surface': 338465, 'inhabitants': 5536146 })

创建的树可以被渲染:

>>>root.render()
iTree('root')
     └──iTree('Africa', data=iTData({'surface': 30200000, 'inhabitants': 1257000000}))
         └──iTree('Ghana', data=iTData({'surface': 238537, 'inhabitants': 30950000}))
         └──iTree('Niger', data=iTData({'surface': 1267000, 'inhabitants': 23300000}))
     └──iTree('Asia', data=iTData({'surface': 44600000, 'inhabitants': 4000000000}))
         └──iTree('China', data=iTData({'surface': 9596961,  'inhabitants': 1411780000}))
         └──iTree('India', data=iTData({'surface': 3287263, 'inhabitants': 1380004000}))
     └──iTree('America', data=iTData({'surface': 42549000, 'inhabitants': 1009000000}))
         └──iTree('Canada', data=iTData({'surface': 9984670, 'inhabitants': 38008005}))
         └──iTree('Mexico', data=iTData({'surface': 1972550, 'inhabitants': 127600000}))
     └──iTree('Australia&Oceania', data=iTData({'surface': 8600000, 'inhabitants': 36000000}))
         └──iTree('Australia', data=iTData({'surface': 7688287, 'inhabitants': 25700000}))
         └──iTree('New Zealand', data=iTData({'surface': 269652, 'inhabitants': 4900000}))
     └──iTree('Europe', data=iTData({'surface': 10523000, 'inhabitants': 746000000}))
         └──iTree('France', data=iTData({'surface': 632733, 'inhabitants': 67400000}))
         └──iTree('Finland', data=iTData({'surface': 338465, 'inhabitants': 5536146}))
     └──iTree('Antarktika', data=iTData({'surface': 14000000, 'inhabitants': 1100}))

过滤可以这样做:

>>>item_filter = Filter.iTFilterData(data_key='inhabitants', data_value=iTInterval(0, 20000000))
>>>iterator=root.iter_all(item_filter=item_filter)
>>>for i in iterator:
>>>    print(i)
iTree("'New Zealand'", data=iTData({'surface': 269652, 'inhabitants': 4900000}), subtree=[])
iTree("'Finland'", data=iTData({'surface': 338465, 'inhabitants': 5536146}), subtree=[])
iTree("'Antarktika'", data=iTData({'surface': 14000000, 'inhabitants': 1100}), subtree=[])