How to implement classic sorting algorithms in modern C++?

小编典典


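The std::sort algorithm (and its cousins std::partial_sort and std::nth_element) from the C++ Standard Library is, in most implementations, a complicated hybrid of more elementary sorting algorithms, such as selection sort, insertion sort, quick sort, merge sort, or heap sort.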

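There are many questions here and on sister sites such as https://codereview.stackexchange.com/ about bugs, complexity and other aspects of implementations of these classic sorting algorithms. Most of the offered implementations consist of raw loops, use index manipulation and concrete types, and are generally non-trivial to analyse in terms of correctness and efficiency.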

Question: how can the above-mentioned classic sorting algorithms be implemented using modern C++?

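No raw loops, but combinations of the Standard Library's algorithmic building blocks from <algorithm>
An iterator interface and templates instead of index manipulation and concrete types
C++14 style, including the full Standard Library, as well as syntactic noise reducers such as auto, template aliases, transparent comparators and polymorphic lambdas.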

Notes:

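For further references on implementations of sorting algorithms, see Wikipedia, Rosetta Code or http://www.sorting-algorithms.com/
According to Sean Parent's conventions (slide 39), a raw loop is a for loop longer than the composition of two functions with an operator. So f(g(x)); or f(x); g(x); or f(x) + g(x); are not raw loops, and neither are the loops in selection_sort and insertion_sort below.
I follow Scott Meyers's terminology and denote the current C++1y as C++14, and both C++98 and C++03 as C++98, so don't flame me for that.
As suggested in the comments by @Mehrdad, I provide four implementations as Live Examples at the end of the answer: C++14, C++11, C++98 and Boost, and C++98.
The answer itself is presented in terms of C++14 only. Where relevant, I point out the syntactic and library differences between the language versions.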


2022-04-06

1 answer

小编典典

Algorithmic building blocks

We begin by assembling the algorithmic building blocks from the Standard Library:

#include <algorithm>    // min_element, iter_swap, 
                        // upper_bound, rotate, 
                        // partition, 
                        // inplace_merge,
                        // make_heap, sort_heap, push_heap, pop_heap,
                        // is_heap, is_sorted
#include <cassert>      // assert 
#include <functional>   // less
#include <iterator>     // distance, begin, end, next

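Iterator tools such as non-member std::begin() / std::end() and std::next() are only available as of C++11. For C++98, one needs to write these oneself. There are substitutes from Boost.Range in boost::begin() / boost::end(), and from Boost.Utility in boost::next().
The std::is_sorted algorithm is only available as of C++11. For C++98, it can be implemented in terms of std::adjacent_find and a hand-written function object. Boost.Algorithm also provides boost::algorithm::is_sorted as a substitute.
The std::is_heap algorithm is only available as of C++11.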
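
The std::adjacent_find approach mentioned above can be sketched as follows. This is only an illustration written in C++98-compatible syntax: the names flipped_compare and is_sorted_98 are made up for this example and are not part of the Standard Library or Boost.

```cpp
#include <algorithm>   // adjacent_find
#include <cassert>     // assert
#include <functional>  // less
#include <vector>

// Hypothetical helper (assumed name): wraps a comparator and swaps
// the order of its two arguments.
template<class Compare>
struct flipped_compare
{
    Compare cmp;
    explicit flipped_compare(Compare c) : cmp(c) {}

    template<class T>
    bool operator()(T const& lhs, T const& rhs) const { return cmp(rhs, lhs); }
};

// A range is sorted iff no adjacent pair (a, b) satisfies cmp(b, a),
// so we search for such a pair and report success if there is none.
template<class FwdIt, class Compare>
bool is_sorted_98(FwdIt first, FwdIt last, Compare cmp)
{
    return std::adjacent_find(first, last, flipped_compare<Compare>(cmp)) == last;
}
```

Calling is_sorted_98(v.begin(), v.end(), std::less<int>()) on a vector holding 1, 2, 2, 3 should return true, since equal neighbours do not violate the ordering.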

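Syntactic goodies

C++14 provides transparent comparators of the form std::less<> that act polymorphically on their arguments. This avoids having to spell out the iterator's value type. Combined with C++11's default function template arguments, this gives a single overload for sorting algorithms that compare with < and for those that take a user-defined comparison function object.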

template<class It, class Compare = std::less<>>
void xxx_sort(It first, It last, Compare cmp = Compare{});

In C++11, one can define a reusable template alias to extract an iterator's value type, which adds minor clutter to the sort algorithms' signatures:

template<class It>
using value_type_t = typename std::iterator_traits<It>::value_type;

template<class It, class Compare = std::less<value_type_t<It>>>
void xxx_sort(It first, It last, Compare cmp = Compare{});

In C++98, one needs to write two overloads and use the verbose typename xxx<yyy>::type syntax:

template<class It, class Compare>
void xxx_sort(It first, It last, Compare cmp); // general implementation

template<class It>
void xxx_sort(It first, It last)
{
    xxx_sort(first, last, std::less<typename std::iterator_traits<It>::value_type>());
}

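Another syntactic nicety is that C++14 makes it easy to wrap user-defined comparators in polymorphic lambdas (with auto parameters that are deduced like function template arguments).
C++11 only has monomorphic lambdas, which require the above template alias value_type_t.
In C++98, one either needs to write a standalone function object or resort to the verbose std::bind1st / std::bind2nd / std::not1 type of syntax.
Boost.Bind improves on this with boost::bind and the _1 / _2 placeholder syntax.
C++11 and later also have std::find_if_not, whereas C++98 needs std::find_if with a std::not1 around a function object.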
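
To make the C++14 points above concrete, here is a small sketch that combines a defaulted std::less<> template argument with a polymorphic lambda wrapping the comparator. The function count_smaller is a hypothetical name chosen for this illustration; it is not one of the sorting algorithms of this answer.

```cpp
#include <algorithm>   // count_if
#include <cassert>     // assert
#include <functional>  // less, greater
#include <vector>

// Hypothetical example (assumed name): counts the elements that are ordered
// before `value` according to `cmp`. The single overload serves both the
// default `<` comparison and a user-defined comparator.
template<class It, class T, class Compare = std::less<>>
auto count_smaller(It first, It last, T const& value, Compare cmp = Compare{})
{
    // The polymorphic lambda deduces the element type from its auto
    // parameter, so no value_type_t alias is needed.
    return std::count_if(first, last, [&](auto const& elem) {
        return cmp(elem, value);
    });
}
```

With a vector holding 1, 5, 3, 7, the call count_smaller(v.begin(), v.end(), 4) counts 1 and 3, while passing std::greater<>{} as the comparator counts 5 and 7 instead.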

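C++ style

There is no generally accepted C++14 style yet. For better or for worse, I closely follow Scott Meyers's draft Effective Modern C++ and Herb Sutter's revamped GotW. I use the following style recommendations: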

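Herb Sutter's "Almost Always Auto" and Scott Meyers's "Prefer auto to specific type declarations" recommendations, whose brevity is unsurpassed, although their clarity is sometimes disputed.
Scott Meyers's "Distinguish () and {} when creating objects": consistently choose braced initialization {} over the good old parenthesized initialization () (in order to sidestep all most-vexing-parse issues in generic code).
Scott Meyers's "Prefer alias declarations to typedefs". For templates this is a must anyway, and using it everywhere instead of typedef saves time and adds consistency.
I use a for (auto it = first; it != last; ++it) pattern in some places, in order to allow loop invariant checking for already sorted sub-ranges. In production code, a while (first != last) loop with a ++first somewhere inside it might be slightly better.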

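Selection sort

Selection sort does not adapt to the data in any way, so its running time is always O(N²). However, selection sort has the property of minimizing the number of swaps. In applications where swapping items is expensive, selection sort may well be the algorithm of choice.

To implement it using the Standard Library, repeatedly use std::min_element to find the smallest remaining element, and std::iter_swap to swap it into place: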

template<class FwdIt, class Compare = std::less<>>
void selection_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const selection = std::min_element(it, last, cmp);
        std::iter_swap(selection, it); 
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}

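Note that selection_sort has the already processed range [first, it) sorted as its loop invariant. The minimal requirement is forward iterators, compared to the random access iterators required by std::sort.

Details omitted:

Selection sort can be optimized with an early test if (std::distance(first, last) <= 1) return; (or, for forward / bidirectional iterators: if (first == last || std::next(first) == last) return;).
For bidirectional iterators, the above test can be combined with a loop over the interval [first, std::prev(last)), because the last element is then guaranteed to be the smallest remaining element and does not require a swap.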
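
The two omitted details can be combined as in the sketch below. The name selection_sort_bidir is an assumption made for this example, and the loop invariant assertion is left out for brevity.

```cpp
#include <algorithm>   // min_element, iter_swap, is_sorted
#include <cassert>     // assert
#include <functional>  // less, greater
#include <iterator>    // next, prev
#include <list>
#include <vector>

// Sketch only (assumed name): selection sort for bidirectional iterators
// with the early exit and the shortened loop described above.
template<class BiDirIt, class Compare = std::less<>>
void selection_sort_bidir(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    // Ranges with fewer than two elements are already sorted.
    if (first == last || std::next(first) == last) return;

    // Stop one element early: once [first, stop) holds the smallest elements
    // in order, the single remaining element is necessarily in place.
    auto const stop = std::prev(last);
    for (auto it = first; it != stop; ++it) {
        auto const selection = std::min_element(it, last, cmp);
        std::iter_swap(selection, it);
    }
}
```

Because only std::next and std::prev are used to move iterators, the same function should work on a std::list as well as on a std::vector.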

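Insertion sort

Although it is one of the elementary sorting algorithms with O(N²) worst-case time, insertion sort is the algorithm of choice either when the data is nearly sorted (because it is adaptive) or when the problem size is small (because it has low overhead). For these reasons, and because it is also stable, insertion sort is often used as the recursive base case (when the problem size is small) for higher-overhead divide-and-conquer sorting algorithms, such as merge sort or quick sort.

To implement insertion_sort with the Standard Library, repeatedly use std::upper_bound to find the location where the current element needs to go, and use std::rotate to shift the remaining elements upward in the input range: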

template<class FwdIt, class Compare = std::less<>>
void insertion_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const insertion = std::upper_bound(first, it, *it, cmp);
        std::rotate(insertion, it, std::next(it)); 
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}

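Note that insertion_sort has the already processed range [first, it) sorted as its loop invariant. Insertion sort also works with forward iterators.

Details omitted:

Insertion sort can be optimized with an early test if (std::distance(first, last) <= 1) return; (or, for forward / bidirectional iterators: if (first == last || std::next(first) == last) return;) and a loop over the interval [std::next(first), last), because the first element is guaranteed to be in place and does not require a rotate.
For bidirectional iterators, the binary search to find the insertion point can be replaced with a reverse linear search using the Standard Library's std::find_if_not algorithm.

Four Live Examples (C++14, C++11, C++98 and Boost, C++98) for the fragment below: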

using RevIt = std::reverse_iterator<BiDirIt>;
auto const insertion = std::find_if_not(RevIt(it), RevIt(first), 
    [=](auto const& elem){ return cmp(*it, elem); }
).base();

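For random inputs this gives O(N²) comparisons, but it improves to O(N) comparisons for almost sorted inputs. The binary search always uses O(N log N) comparisons.
For small input ranges, the better memory locality (cache, prefetching) of a linear search might also beat a binary search (one should test this, of course).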
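
For context, the fragment above can be placed into a complete function roughly as follows. This is a sketch under the assumption that bidirectional iterators are available; insertion_sort_linear is a name invented for this example.

```cpp
#include <algorithm>   // find_if_not, rotate
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // next, reverse_iterator
#include <utility>     // pair
#include <vector>

// Sketch only (assumed name): insertion sort that locates the insertion
// point with a reverse linear search instead of std::upper_bound.
template<class BiDirIt, class Compare = std::less<>>
void insertion_sort_linear(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    if (first == last) return;
    using RevIt = std::reverse_iterator<BiDirIt>;
    // The first element is trivially in place, so start at the second one.
    for (auto it = std::next(first); it != last; ++it) {
        // Walk backwards over the sorted prefix while *it is ordered before
        // the visited element; base() converts the stopping point back into
        // a forward iterator just past the last element to keep in front.
        auto const insertion = std::find_if_not(RevIt(it), RevIt(first),
            [=](auto const& elem){ return cmp(*it, elem); }
        ).base();
        std::rotate(insertion, it, std::next(it));
    }
}
```

Since the search stops at the first element that is not ordered after the current one, equal elements keep their relative order, so this variant should remain stable like the std::upper_bound version.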

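Quick sort

When carefully implemented, quick sort is robust and has O(N log N) expected complexity, but its O(N²) worst-case complexity can be triggered with adversarially chosen input data. When a stable sort is not needed, quick sort is an excellent general-purpose sort.

Even in its simplest versions, quick sort is quite a bit more complicated to implement using the Standard Library than the other classic sorting algorithms. The approach below uses a few iterator utilities to locate the middle element of the input range [first, last) as the pivot, then uses two calls to std::partition (each O(N)) to three-way partition the input range into segments of elements that are smaller than, equal to, and larger than the selected pivot, respectively. Finally, the two outer segments, with elements smaller than and larger than the pivot, are sorted recursively: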

template<class FwdIt, class Compare = std::less<>>
void quick_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;
    auto const pivot = *std::next(first, N / 2);
    auto const middle1 = std::partition(first, last, [=](auto const& elem){ 
        return cmp(elem, pivot); 
    });
    auto const middle2 = std::partition(middle1, last, [=](auto const& elem){ 
        return !cmp(pivot, elem);
    });
    quick_sort(first, middle1, cmp); // assert(std::is_sorted(first, middle1, cmp));
    quick_sort(middle2, last, cmp);  // assert(std::is_sorted(middle2, last, cmp));
}

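However, quick sort is rather tricky to get correct and efficient, as each of the above steps has to be carefully checked and optimized for production-level code. In particular, for O(N log N) complexity, the pivot has to produce a balanced partition of the input data, which cannot be guaranteed in general for an O(1) pivot, but which can be guaranteed if the pivot is set to the median of the input range, found in O(N).

Details omitted:

The above implementation is particularly vulnerable to special inputs, e.g. it has O(N²) complexity for the "organ pipe" input 1, 2, 3, ..., N/2, ..., 3, 2, 1 (because the middle element is always larger than all other elements).
Median-of-3 pivot selection from randomly chosen elements of the input range guards against almost sorted inputs, for which the complexity would otherwise deteriorate to O(N²).
3-way partitioning (separating elements smaller than, equal to and larger than the pivot) as shown by the two calls to std::partition is not the most efficient O(N) algorithm to achieve this result.
For random access iterators, a guaranteed O(N log N) complexity can be achieved through median pivot selection using std::nth_element(first, middle, last), followed by recursive calls to quick_sort(first, middle, cmp) and quick_sort(middle, last, cmp).
This guarantee comes at a cost, however, because the constant factor of the O(N) complexity of std::nth_element can be larger than that of an O(1) median-of-3 pivot followed by an O(N) call to std::partition (which is a cache-friendly single forward pass over the data).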
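
The median-pivot variant described in the last two points might look like the sketch below. The name quick_sort_median is an assumption made for this example. Note also that the C++ standard only requires std::nth_element to be linear on average, so the O(N log N) bound depends on the implementation providing a linear worst case.

```cpp
#include <algorithm>   // nth_element, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // distance, next
#include <vector>

// Sketch only (assumed name): quick sort with the median as pivot.
template<class RandomIt, class Compare = std::less<>>
void quick_sort_median(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;
    auto const middle = std::next(first, N / 2);
    // After this call *middle is the element that would be there in sorted
    // order, nothing in [first, middle) is ordered after it, and nothing in
    // [middle, last) is ordered before it.
    std::nth_element(first, middle, last, cmp);
    // Both halves have at least one element fewer than N, so the recursion
    // terminates, and the split is always balanced.
    quick_sort_median(first, middle, cmp);
    quick_sort_median(middle, last, cmp);
}
```

Here std::nth_element does the partitioning as a side effect of finding the median, so no separate call to std::partition is needed.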

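Merge sort

If using O(N) extra space is of no concern, then merge sort is an excellent choice: it is the only stable O(N log N) sorting algorithm among those covered here.

It is simple to implement using Standard Library algorithms: use a few iterator utilities to locate the middle of the input range [first, last) and combine the two recursively sorted segments with std::inplace_merge: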

template<class BiDirIt, class Compare = std::less<>>
void merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;                   
    auto const middle = std::next(first, N / 2);
    merge_sort(first, middle, cmp); // assert(std::is_sorted(first, middle, cmp));
    merge_sort(middle, last, cmp);  // assert(std::is_sorted(middle, last, cmp));
    std::inplace_merge(first, middle, last, cmp); // assert(std::is_sorted(first, last, cmp));
}

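Merge sort requires bidirectional iterators, the bottleneck being std::inplace_merge. Note that when sorting linked lists, merge sort requires only O(log N) extra space (for the recursion). The latter algorithm is implemented by std::list<T>::sort in the Standard Library.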
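
The insertion sort section mentions that insertion sort often serves as the recursive base case of divide-and-conquer sorts. A possible sketch of that idea for merge sort follows. The namespace sketch, the name hybrid_merge_sort and the cutoff of 16 elements are all assumptions for illustration; a suitable cutoff would have to be measured.

```cpp
#include <algorithm>   // upper_bound, rotate, inplace_merge, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // distance, next
#include <numeric>     // iota
#include <vector>

namespace sketch {

// Same algorithm as insertion_sort above, repeated to keep this example
// self-contained (the loop invariant assertion is omitted).
template<class FwdIt, class Compare>
void insertion_sort(FwdIt first, FwdIt last, Compare cmp)
{
    for (auto it = first; it != last; ++it) {
        auto const insertion = std::upper_bound(first, it, *it, cmp);
        std::rotate(insertion, it, std::next(it));
    }
}

// Merge sort that hands small ranges to insertion sort.
template<class BiDirIt, class Compare = std::less<>>
void hybrid_merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    auto const cutoff = 16;  // assumed value, not tuned
    auto const N = std::distance(first, last);
    if (N <= cutoff) {
        sketch::insertion_sort(first, last, cmp);
        return;
    }
    auto const middle = std::next(first, N / 2);
    hybrid_merge_sort(first, middle, cmp);
    hybrid_merge_sort(middle, last, cmp);
    std::inplace_merge(first, middle, last, cmp);
}

}   // namespace sketch
```

Both building blocks are stable, so the hybrid should be stable as well.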

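Heap sort

Heap sort is simple to implement and performs an O(N log N) in-place sort, but it is not stable.

The first loop, the O(N) "heapify" phase, puts the array into heap order. The second loop, the O(N log N) "sortdown" phase, repeatedly extracts the maximum and restores heap order. The Standard Library makes this extremely straightforward: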

template<class RandomIt, class Compare = std::less<>>
void heap_sort(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    std::make_heap(first, last, cmp); // assert(std::is_heap(first, last, cmp));
    std::sort_heap(first, last, cmp); // assert(std::is_sorted(first, last, cmp));
}

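In case you consider it "cheating" to use std::make_heap and std::sort_heap, you can go one level deeper and write those functions yourself in terms of std::push_heap and std::pop_heap, respectively: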

namespace lib {

// NOTE: is O(N log N), not O(N) as std::make_heap
template<class RandomIt, class Compare = std::less<>>
void make_heap(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last;) {
        std::push_heap(first, ++it, cmp); 
        assert(std::is_heap(first, it, cmp));           
    }
}

template<class RandomIt, class Compare = std::less<>>
void sort_heap(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    for (auto it = last; it != first;) {
        std::pop_heap(first, it--, cmp);
        assert(std::is_heap(first, it, cmp));           
    } 
}

}   // namespace lib

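The Standard Library specifies both push_heap and pop_heap as having O(log N) complexity. Note, however, that the outer loop over the range [first, last) results in O(N log N) complexity for lib::make_heap, whereas std::make_heap has only O(N) complexity. For the overall O(N log N) complexity of heap_sort this does not matter.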
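
For comparison, an O(N) heap construction can be sketched with the classic bottom-up "sift down" technique. This is not how any particular Standard Library implements std::make_heap. The names sift_down and make_heap_linear are assumptions, and the helper contains a raw loop in the sense defined in the question.

```cpp
#include <algorithm>   // iter_swap, is_heap, sort_heap, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // distance
#include <vector>

namespace sketch {

// Restores the max-heap property for the subtree rooted at index `root`
// within the first `n` elements, assuming both child subtrees are heaps.
template<class RandomIt, class Diff, class Compare>
void sift_down(RandomIt first, Diff root, Diff n, Compare cmp)
{
    for (;;) {
        auto largest = root;
        auto const left = 2 * root + 1;
        auto const right = left + 1;
        if (left < n && cmp(first[largest], first[left])) largest = left;
        if (right < n && cmp(first[largest], first[right])) largest = right;
        if (largest == root) return;
        std::iter_swap(first + root, first + largest);
        root = largest;
    }
}

// Bottom-up construction: sift down every internal node, starting from the
// last one. Most nodes sit near the leaves, which gives O(N) work overall.
template<class RandomIt, class Compare = std::less<>>
void make_heap_linear(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    auto const n = std::distance(first, last);
    for (auto i = n / 2; i > 0; --i) {
        sift_down(first, i - 1, n, cmp);
    }
}

}   // namespace sketch
```

The resulting range satisfies std::is_heap with the same comparator, so it can be passed on to std::sort_heap or lib::sort_heap.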

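Testing

Here are four Live Examples (C++14, C++11, C++98 and Boost, C++98) testing all five algorithms on a variety of inputs (not meant to be exhaustive or rigorous). Note the large differences in LOC: C++11/C++14 need around 130 LOC, C++98 and Boost 190 (+50%) and C++98 more than 270 (+100%).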
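
The Live Examples themselves are not reproduced here. As a rough idea of what such a test could look like, the sketch below compares a sorting function against std::sort on a handful of inputs. The helper name sorts_like_std and the chosen inputs are assumptions, and like the original tests it is neither exhaustive nor rigorous.

```cpp
#include <algorithm>   // all_of, sort
#include <cassert>     // assert
#include <vector>

// Sketch only (assumed name): returns true if sort_fn(first, last) produces
// the same result as std::sort on every sample input.
template<class Sort>
bool sorts_like_std(Sort sort_fn)
{
    std::vector<std::vector<int>> const inputs = {
        {}, {1}, {2, 1}, {1, 2, 3, 4, 5}, {5, 4, 3, 2, 1},
        {3, 1, 4, 1, 5, 9, 2, 6, 5, 3},   // duplicates, no particular order
        {1, 2, 3, 4, 3, 2, 1}             // "organ pipe"
    };
    // Each input is copied into the lambda parameter so it can be sorted.
    return std::all_of(inputs.begin(), inputs.end(), [&](std::vector<int> v) {
        auto expected = v;
        std::sort(expected.begin(), expected.end());
        sort_fn(v.begin(), v.end());
        return v == expected;
    });
}
```

Any of the algorithms above can be checked by wrapping it in a polymorphic lambda, for example sorts_like_std([](auto first, auto last){ heap_sort(first, last); }).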

2022-04-06