在b-树中,您可以将键和数据存储在内部节点和叶节点中,但在b+树中,您必须仅将数据存储在叶节点中。
在b+树中这样做有什么好处吗?
为什么不在所有地方都使用b-树而不是b+树,因为直觉上它们看起来更快?
我的意思是,为什么需要在b+树中复制键(数据)?
在b-树中,您可以将键和数据存储在内部节点和叶节点中,但在b+树中,您必须仅将数据存储在叶节点中。
在b+树中这样做有什么好处吗?
为什么不在所有地方都使用b-树而不是b+树,因为直觉上它们看起来更快?
我的意思是,为什么需要在b+树中复制键(数据)?
当前回答
B+树是一种平衡的树,其中从树的根到叶子的每条路径都是相同的长度,树的每个非叶子节点都有[n/2]到[n]个子节点,其中n对于特定的树是固定的。它包含索引页和数据页。 二叉树的每个父节点只有两个子节点,而B+树的每个父节点可以有不同数量的子节点
其他回答
下图有助于显示B+树和B树之间的区别。
B+树的优点:
Because B+ trees don't have data associated with interior nodes, more keys can fit on a page of memory. Therefore, it will require fewer cache misses in order to access data that is on a leaf node. The leaf nodes of B+ trees are linked, so doing a full scan of all objects in a tree requires just one linear pass through all the leaf nodes. A B tree, on the other hand, would require a traversal of every level in the tree. This full-tree traversal will likely involve more cache misses than the linear traversal of B+ leaves.
B树的优点:
因为B树包含每个键的数据,所以经常访问的节点可以位于更靠近根的位置,因此可以更快地访问。
In a B tree search keys and data are stored in internal or leaf nodes. But in a B+-tree data is stored only in leaf nodes. Full scan of a B+ tree is very easy because all data are found in leaf nodes. Full scan of a B tree requires a full traversal. In a B tree, data may be found in leaf nodes or internal nodes. Deletion of internal nodes is very complicated. In a B+ tree, data is only found in leaf nodes. Deletion of leaf nodes is easy. Insertion in B tree is more complicated than B+ tree. B+ trees store redundant search keys but B tree has no redundant value. In a B+ tree, leaf node data is ordered as a sequential linked list but in a B tree the leaf node cannot be stored using a linked list. Many database systems' implementations prefer the structural simplicity of a B+ tree.
由于终端节点形成了一个链表,B+树更容易进行全面扫描,而且性能更高,可以查看树索引的每一块数据。要使用B-Tree进行完整扫描,您需要进行完整的树遍历以查找所有数据。
另一方面,当您执行seek(按键查找特定数据段)时,B-Trees可以更快,特别是当树驻留在RAM或其他非块存储中时。由于可以提升树中常用的节点,因此获取数据所需的比较较少。
B+树尤其适用于基于块的存储(例如:硬盘)。考虑到这一点,你会得到几个优势,例如(从我的脑海中):
high fanout / low depth: that means you have to get less blocks to get to the data. with data intermingled with the pointers, each read gets less pointers, so you need more seeks to get to the data simple and consistent block storage: an inner node has N pointers, nothing else, a leaf node has data, nothing else. that makes it easy to parse, debug and even reconstruct. high key density means the top nodes are almost certainly on cache, in many cases all inner nodes get quickly cached, so only the data access has to go to disk.
数据库系统概念示例
B+树
相应的b -树