我相信有一种方法可以找到长度为n的O(n)无序数组中第k大的元素。也可能是期望O(n)之类的。我们该怎么做呢?


当前回答

下面是一个随机化快速选择的c++实现。这个想法是随机选择一个主元。为了实现随机分区,我们使用一个随机函数rand()来生成l和r之间的索引,将随机生成索引处的元素与最后一个元素交换,最后调用以最后一个元素为枢轴的标准分区过程。

#include<iostream>
#include<climits>
#include<cstdlib>
using namespace std;

int randomPartition(int arr[], int l, int r);

// This function returns k'th smallest element in arr[l..r] using
// QuickSort based method.  ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
    // If k is smaller than number of elements in array
    if (k > 0 && k <= r - l + 1)
    {
        // Partition the array around a random element and
        // get position of pivot element in sorted array
        int pos = randomPartition(arr, l, r);

        // If position is same as k
        if (pos-l == k-1)
            return arr[pos];
        if (pos-l > k-1)  // If position is more, recur for left subarray
            return kthSmallest(arr, l, pos-1, k);

        // Else recur for right subarray
        return kthSmallest(arr, pos+1, r, k-pos+l-1);
    }

    // If k is more than number of elements in array
    return INT_MAX;
}

void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Standard partition process of QuickSort().  It considers the last
// element as pivot and moves all smaller element to left of it and
// greater elements to right. This function is used by randomPartition()
int partition(int arr[], int l, int r)
{
    int x = arr[r], i = l;
    for (int j = l; j <= r - 1; j++)
    {
        if (arr[j] <= x) //arr[i] is bigger than arr[j] so swap them
        {
            swap(&arr[i], &arr[j]);
            i++;
        }
    }
    swap(&arr[i], &arr[r]); // swap the pivot
    return i;
}

// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
    int n = r-l+1;
    int pivot = rand() % n;
    swap(&arr[l + pivot], &arr[r]);
    return partition(arr, l, r);
}

// Driver program to test above methods
int main()
{
    int arr[] = {12, 3, 5, 7, 4, 19, 26};
    int n = sizeof(arr)/sizeof(arr[0]), k = 3;
    cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
    return 0;
}

上述解的最坏情况时间复杂度仍为O(n2)。在最坏的情况下,随机函数可能总是选择一个角元素。上述随机化QuickSelect的期望时间复杂度为Θ(n)

其他回答

在那个('第k大元素数组')上快速谷歌返回这个:http://discuss.joelonsoftware.com/default.asp?interview.11.509587.17

"Make one pass through tracking the three largest values so far." 

(它是专门为3d最大)

这个答案是:

Build a heap/priority queue.  O(n)
Pop top element.  O(log n)
Pop top element.  O(log n)
Pop top element.  O(log n)

Total = O(n) + 3 O(log n) = O(n)

c++标准库几乎完全有这个函数调用nth_element,尽管它确实会修改数据。它有线性运行时间,O(N),它也做部分排序。

const int N = ...;
double a[N];
// ... 
const int m = ...; // m < N
nth_element (a, a + m, a + N);
// a[m] contains the mth element in a

虽然不是很确定O(n)复杂度,但肯定在O(n)和nLog(n)之间。也肯定更接近于O(n)而不是nLog(n)函数是用Java编写的

public int quickSelect(ArrayList<Integer>list, int nthSmallest){
    //Choose random number in range of 0 to array length
    Random random =  new Random();
    //This will give random number which is not greater than length - 1
    int pivotIndex = random.nextInt(list.size() - 1); 

    int pivot = list.get(pivotIndex);

    ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
    ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();

    //Split list into two. 
    //Value smaller than pivot should go to smallerNumberList
    //Value greater than pivot should go to greaterNumberList
    //Do nothing for value which is equal to pivot
    for(int i=0; i<list.size(); i++){
        if(list.get(i)<pivot){
            smallerNumberList.add(list.get(i));
        }
        else if(list.get(i)>pivot){
            greaterNumberList.add(list.get(i));
        }
        else{
            //Do nothing
        }
    }

    //If smallerNumberList size is greater than nthSmallest value, nthSmallest number must be in this list 
    if(nthSmallest < smallerNumberList.size()){
        return quickSelect(smallerNumberList, nthSmallest);
    }
    //If nthSmallest is greater than [ list.size() - greaterNumberList.size() ], nthSmallest number must be in this list
    //The step is bit tricky. If confusing, please see the above loop once again for clarification.
    else if(nthSmallest > (list.size() - greaterNumberList.size())){
        //nthSmallest will have to be changed here. [ list.size() - greaterNumberList.size() ] elements are already in 
        //smallerNumberList
        nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
        return quickSelect(greaterNumberList,nthSmallest);
    }
    else{
        return pivot;
    }
}

我实现了在n个未排序元素中寻找第k个最小值的动态规划,特别是竞赛方法。执行时间为O(n + klog(n))。所使用的机制在维基百科关于选择算法的页面上被列为方法之一(如上面的帖子之一所示)。你可以阅读算法,也可以在我的博客页面“查找k个最小值”上找到代码(java)。此外,逻辑可以对列表进行部分排序——在O(klog(n))时间内返回第一个K min(或max)。

虽然代码提供了第k个最小值的结果,但可以使用类似的逻辑来查找O(klog(n))中的第k个最大值,忽略创建比赛树的前期工作。

我会这样做:

initialize empty doubly linked list l
for each element e in array
    if e larger than head(l)
        make e the new head of l
        if size(l) > k
            remove last element from l

the last element of l should now be the kth largest element

您可以简单地存储指向链表中第一个和最后一个元素的指针。它们只在更新列表时更改。

更新:

initialize empty sorted tree l
for each element e in array
    if e between head(l) and tail(l)
        insert e into l // O(log k)
        if size(l) > k
            remove last element from l

the last element of l should now be the kth largest element