谁有一个快速的方法去重复在c#的泛型列表?


当前回答

在。net 2.0中还有另一种方法

    static void Main(string[] args)
    {
        List<string> alpha = new List<string>();

        for(char a = 'a'; a <= 'd'; a++)
        {
            alpha.Add(a.ToString());
            alpha.Add(a.ToString());
        }

        Console.WriteLine("Data :");
        alpha.ForEach(delegate(string t) { Console.WriteLine(t); });

        alpha.ForEach(delegate (string v)
                          {
                              if (alpha.FindAll(delegate(string t) { return t == v; }).Count > 1)
                                  alpha.Remove(v);
                          });

        Console.WriteLine("Unique Result :");
        alpha.ForEach(delegate(string t) { Console.WriteLine(t);});
        Console.ReadKey();
    }

其他回答

根据删除重复,我们必须应用下面的逻辑,所以它将以快速的方式删除重复。

public class Program
{

    public static void Main(string[] arges)
    {
        List<string> cities = new List<string>() { "Chennai", "Kolkata", "Mumbai", "Mumbai","Chennai", "Delhi", "Delhi", "Delhi", "Chennai", "Kolkata", "Mumbai", "Chennai" };
        cities = RemoveDuplicate(cities);

        foreach (var city in cities)
        {
            Console.WriteLine(city);
        }
    }

    public static List<string> RemoveDuplicate(List<string> cities)
    {
        if (cities.Count < 2)
        {
            return cities;
        }

        int size = cities.Count;
        for (int i = 0; i < size; i++)
        {
            for (int j = i+1; j < size; j++)
            {
                if (cities[i] == cities[j])
                {
                    cities.RemoveAt(j);
                    size--;
                    j--;
                }
            }
        }
        return cities;
    }
}

所有的答案要么复制列表,要么创建一个新列表,要么使用慢函数,要么就是慢得令人痛苦。

据我所知,这是我所知道的最快和最便宜的方法(同时,还得到了一个非常有经验的实时物理优化程序员的支持)。

// Duplicates will be noticed after a sort O(nLogn)
list.Sort();

// Store the current and last items. Current item declaration is not really needed, and probably optimized by the compiler, but in case it's not...
int lastItem = -1;
int currItem = -1;

int size = list.Count;

// Store the index pointing to the last item we want to keep in the list
int last = size - 1;

// Travel the items from last to first O(n)
for (int i = last; i >= 0; --i)
{
    currItem = list[i];

    // If this item was the same as the previous one, we don't want it
    if (currItem == lastItem)
    {
        // Overwrite last in current place. It is a swap but we don't need the last
       list[i] = list[last];

        // Reduce the last index, we don't want that one anymore
        last--;
    }

    // A new item, we store it and continue
    else
        lastItem = currItem;
}

// We now have an unsorted list with the duplicates at the end.

// Remove the last items just once
list.RemoveRange(last + 1, size - last - 1);

// Sort again O(n logn)
list.Sort();

最终成本为:

nlogn + n + nlogn = n + 2nlogn = O(nlogn)非常漂亮。

关于RemoveRange注意事项: 由于我们不能设置列表的计数并避免使用Remove函数,我不知道这个操作的确切速度,但我猜这是最快的方法。

在Java中(我认为c#或多或少是相同的):

list = new ArrayList<T>(new HashSet<T>(list))

如果你真的想改变原来的列表:

List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);

为了保持顺序,只需将HashSet替换为LinkedHashSet。

使用Linq的Union方法。

注意:这个解决方案不需要了解Linq,只需要知道它存在。

Code

首先将以下内容添加到类文件的顶部:

using System.Linq;

现在,你可以使用下面的方法从一个名为obj1的对象中删除重复项:

obj1 = obj1.Union(obj1).ToList();

注意:将obj1重命名为对象的名称。

它是如何工作的

Union命令列出两个源对象的每个条目中的一个。由于obj1都是源对象,这将把obj1减少为每个条目中的一个。 ToList()返回一个新的List。这是必要的,因为像Union这样的Linq命令将结果返回为IEnumerable结果,而不是修改原来的List或返回一个新的List。

如果你有两个类Product和Customer,我们想从它们的列表中删除重复的项

public class Product
{
    public int Id { get; set; }
    public string ProductName { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string CustomerName { get; set; }

}

您必须在下面的表单中定义一个泛型类

public class ItemEqualityComparer<T> : IEqualityComparer<T> where T : class
{
    private readonly PropertyInfo _propertyInfo;

    public ItemEqualityComparer(string keyItem)
    {
        _propertyInfo = typeof(T).GetProperty(keyItem, BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
    }

    public bool Equals(T x, T y)
    {
        var xValue = _propertyInfo?.GetValue(x, null);
        var yValue = _propertyInfo?.GetValue(y, null);
        return xValue != null && yValue != null && xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        var propertyValue = _propertyInfo.GetValue(obj, null);
        return propertyValue == null ? 0 : propertyValue.GetHashCode();
    }
}

然后,你可以删除列表中重复的项目。

var products = new List<Product>
            {
                new Product{ProductName = "product 1" ,Id = 1,},
                new Product{ProductName = "product 2" ,Id = 2,},
                new Product{ProductName = "product 2" ,Id = 4,},
                new Product{ProductName = "product 2" ,Id = 4,},
            };
var productList = products.Distinct(new ItemEqualityComparer<Product>(nameof(Product.Id))).ToList();

var customers = new List<Customer>
            {
                new Customer{CustomerName = "Customer 1" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
            };
var customerList = customers.Distinct(new ItemEqualityComparer<Customer>(nameof(Customer.Id))).ToList();

这段代码通过Id删除重复项,如果你想通过其他属性删除重复项,你可以更改名称(YourClass.DuplicateProperty)和名称(Customer.CustomerName),然后通过CustomerName属性删除重复项。