Remove duplicates from a List<T> in C#

Does anyone have a quick way to de-duplicate a generic List in C#?

Current answer

Simply initialize a HashSet with a List of the same type:

var noDupes = new HashSet<T>(withDupes);

Or, if you want a List returned:

var noDupsList = new HashSet<T>(withDupes).ToList();
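
The two one-liners above can be wrapped into a small runnable sketch. The class name `HashSetDedupDemo`, the helper `Dedupe`, and the sample data are made up for illustration. Note that `ToList()` needs `using System.Linq;`, and that `HashSet<T>` does not document an enumeration order, so it is safer not to rely on the result keeping the original order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashSetDedupDemo
{
    // Hypothetical helper: copies the input into a HashSet, then back into a List.
    public static List<T> Dedupe<T>(IEnumerable<T> withDupes)
    {
        return new HashSet<T>(withDupes).ToList();
    }

    public static void Main()
    {
        var withDupes = new List<string> { "a", "b", "a", "c", "b" };
        List<string> noDupes = Dedupe(withDupes);
        Console.WriteLine(noDupes.Count); // 3 distinct values
    }
}
```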

2009-11-24 20:05:03

Other answers

In Java (I assume C# is more or less the same):

list = new ArrayList<T>(new HashSet<T>(list));

If you really want to change the original list:

List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);

To preserve order, simply replace HashSet with LinkedHashSet.
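
C# has no built-in LinkedHashSet, so a rough C# counterpart of the in-place, order-preserving variant might look like the sketch below. This is a swapped-in idiom, not a translation of the Java code: it relies on `HashSet<T>.Add` returning false for items already present, and on `List<T>.RemoveAll` visiting items from first to last (which is how the current implementation behaves, though treat that as an assumption). The names `InPlaceDedup` and `RemoveDuplicates` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceDedup
{
    // Removes later duplicates from the list itself, keeping first occurrences in order.
    public static void RemoveDuplicates<T>(List<T> list)
    {
        var seen = new HashSet<T>();
        // Add returns false when the item was already seen, so RemoveAll drops every repeat.
        list.RemoveAll(item => !seen.Add(item));
    }

    public static void Main()
    {
        var list = new List<int> { 3, 1, 3, 2, 1 };
        RemoveDuplicates(list);
        Console.WriteLine(string.Join(", ", list)); // 3, 1, 2
    }
}
```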

2008-09-06 19:29:41

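All the answers either copy the list, create a new list, use slow functions, or are just painfully slow.

To my knowledge, this is the fastest and cheapest method (it is also backed by a very experienced programmer who specializes in real-time physics optimization).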

// Duplicates will be noticed after a sort O(nLogn)
list.Sort();

// Track the last value we kept. A flag is used instead of a sentinel such as -1,
// which would wrongly drop a real -1 in the list. Assumes list is a List<int>.
bool hasLast = false;
int lastItem = 0;
int currItem;

int size = list.Count;

// Store the index pointing to the last item we want to keep in the list
int last = size - 1;

// Travel the items from last to first O(n)
for (int i = last; i >= 0; --i)
{
    currItem = list[i];

    // If this item was the same as the previous one, we don't want it
    if (hasLast && currItem == lastItem)
    {
        // Overwrite the duplicate with the last kept item. It is half of a swap, but we don't need the other half
        list[i] = list[last];

        // Reduce the last index, we don't want that one anymore
        last--;
    }

    // A new item, we store it and continue
    else
    {
        lastItem = currItem;
        hasLast = true;
    }
}

// We now have an unsorted list with the duplicates at the end.

// Remove the last items just once
list.RemoveRange(last + 1, size - last - 1);

// Sort again O(n logn)
list.Sort();

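The final cost is:

n log n + n + n log n = n + 2n log n = O(n log n), which is pretty nice.

A note on RemoveRange: since we cannot set the list's Count directly and want to avoid the Remove function, I don't know the exact speed of this operation, but I guess it is the fastest way.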
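
As a variation on the idea above (not the author's exact code), the sketch below sorts once and then compacts the unique values toward the front with a read index and a write index, so the list stays sorted and the second sort is not needed. It is generic over any `IComparable<T>` and assumes the list contains no null elements. The names `SortedDedup` and `SortAndDedupe` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDedup
{
    // Sorts the list, then compacts unique values to the front in one pass.
    // Assumption: the list holds no null elements (CompareTo is called on them).
    public static void SortAndDedupe<T>(List<T> list) where T : IComparable<T>
    {
        if (list.Count < 2) return;

        list.Sort(); // O(n log n); equal values are now adjacent

        int write = 1; // next slot for a value that has not been kept yet
        for (int read = 1; read < list.Count; read++)
        {
            // Keep the value only if it differs from the last kept one
            if (list[read].CompareTo(list[write - 1]) != 0)
            {
                list[write] = list[read];
                write++;
            }
        }

        // Drop the leftover tail in a single call
        list.RemoveRange(write, list.Count - write);
    }

    public static void Main()
    {
        var list = new List<int> { 5, -1, 3, -1, 5, 5 };
        SortAndDedupe(list);
        Console.WriteLine(string.Join(", ", list)); // -1, 3, 5
    }
}
```

The total cost is one sort plus one linear pass, so it stays O(n log n), and values such as -1 need no special handling because no sentinel is involved.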

2019-05-28 14:55:51

This is easy to do with a HashSet.

List<int> listWithDuplicates = new List<int> { 1, 2, 1, 2, 3, 4, 5 };
HashSet<int> hashWithoutDuplicates = new HashSet<int>(listWithDuplicates);
List<int> listWithoutDuplicates = hashWithoutDuplicates.ToList();

2021-06-20 15:09:22

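As kronoz said, in .NET 3.5 you can use Distinct().

In .NET 2 you can mimic it (note that HashSet<T> and var themselves only arrived with .NET 3.5 and C# 3, so on 2.0 you would substitute something like a Dictionary<T, bool>):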

public IEnumerable<T> DedupCollection<T> (IEnumerable<T> input) 
{
    var passedValues = new HashSet<T>();

    // Relatively simple dupe check alg used as example
    foreach(T item in input)
        if(passedValues.Add(item)) // True if item is new
            yield return item;
}

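This can be used to dedupe any collection, and it returns the values in their original order.

Filtering a collection (which is what both Distinct() and this sample do) is usually much quicker than removing items from it.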
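
For comparison, here is a minimal sketch of the Distinct() route mentioned above, including the overload that takes an `IEqualityComparer<T>`. The class name `DistinctDemo`, the helper `DistinctIgnoreCase`, and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Hypothetical helper: treats strings that differ only by case as duplicates.
    public static List<string> DistinctIgnoreCase(IEnumerable<string> input)
    {
        return input.Distinct(StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 1, 2, 3, 4, 5 };
        List<int> unique = numbers.Distinct().ToList();
        Console.WriteLine(unique.Count); // 5

        var names = new List<string> { "Ann", "ann", "Bob" };
        Console.WriteLine(DistinctIgnoreCase(names).Count); // 2
    }
}
```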

2008-09-07 09:44:26
