
Items that appear in the first list but not in the second
Items that appear in the second list but not in the first


var onlyInList = list.Where(i => !list2.Contains(i)).ToList();
var onlyInList2 = list2.Where(i => !list.Contains(i)).ToList();

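But this doesn't perform as well as I'd like. Any ideas for making it faster and less resource-intensive, since I have a lot of lists to process?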


var firstNotSecond = list1.Except(list2).ToList();
var secondNotFirst = list2.Except(list1).ToList();

I suspect there are approaches that would be marginally faster than this, but even this will be far faster than the O(N * M) approach.
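To make the complexity difference concrete, here is a minimal sketch of a hash-based difference: one pass to load the second list into a HashSet, then one pass over the first list with roughly constant-time lookups, so about O(N + M) instead of O(N * M). It illustrates the idea only and is not meant as the library's actual implementation; DifferenceSketch and OnlyInFirst are made-up names.

```csharp
using System.Collections.Generic;

public static class DifferenceSketch
{
    // Returns the distinct items of `first` that do not occur in `second`.
    public static List<T> OnlyInFirst<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // One pass over `second` to build the lookup set.
        var seen = new HashSet<T>(second);
        var result = new List<T>();
        foreach (var item in first)
        {
            // Add returns false when the item is already in the set, so this
            // skips items present in `second` as well as repeats within `first`.
            if (seen.Add(item))
                result.Add(item);
        }
        return result;
    }
}
```

Calling DifferenceSketch.OnlyInFirst(list1, list2) and then swapping the arguments gives both directions of the difference, at the cost of building one set per call.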


return !firstNotSecond.Any() && !secondNotFirst.Any();




var inListButNotInList2 = list.Except(list2);
var inList2ButNotInList = list2.Except(list);


var first10 = inListButNotInList2.Take(10);



var difList = list1.Where(a => !list2.Any(a1 => a1.id == a.id))
            .Union(list2.Where(a => !list1.Any(a1 => a1.id == a.id)));


public class EquatableList<T> : List<T>, IEquatable<EquatableList<T>> where T : IEquatable<T>
{
    /// <summary>
    /// True, if this contains element with equal property-values
    /// </summary>
    /// <param name="element">element of Type T</param>
    /// <returns>True, if this contains element</returns>
    public new Boolean Contains(T element)
    {
        return this.Any(t => t.Equals(element));
    }

    /// <summary>
    /// True, if list is equal to this
    /// </summary>
    /// <param name="list">list</param>
    /// <returns>True, if instance equals list</returns>
    public Boolean Equals(EquatableList<T> list)
    {
        if (list == null) return false;
        return this.All(list.Contains) && list.All(this.Contains);
    }
}


List<string> list1 = new List<string> { "a.dll", "b1.dll" };
List<string> list2 = new List<string> { "A.dll", "b2.dll" };

var firstNotSecond = list1.Except(list2, StringComparer.OrdinalIgnoreCase).ToList();
var secondNotFirst = list2.Except(list1, StringComparer.OrdinalIgnoreCase).ToList();





    //Method to compare two list of string
    private List<string> Contains(List<string> list1, List<string> list2)
    {
        List<string> result = new List<string>();

        // Items only in list1, followed by items only in list2 (case-insensitive)
        result.AddRange(list1.Except(list2, StringComparer.OrdinalIgnoreCase));
        result.AddRange(list2.Except(list1, StringComparer.OrdinalIgnoreCase));

        return result;
    }


string.Join("",List1) != string.Join("", List2)


var list3 = list1.Where(l => list2.ToList().Contains(l));


var set1 = new HashSet<T>(list1);
var set2 = new HashSet<T>(list2);
var areEqual = set1.SetEquals(set2);


using System;
using System.Collections.Generic;
using System.Linq;

namespace YourProject.Extensions
{
    public static class ListExtensions
    {
        public static bool SetwiseEquivalentTo<T>(this List<T> list, List<T> other)
            where T: IEquatable<T>
        {
            if (list.Except(other).Any())
                return false;
            if (other.Except(list).Any())
                return false;
            return true;
        }
    }
}



public sealed class Car : IEquatable<Car>
{
    public Price Price { get; }
    public List<Component> Components { get; }

    public override bool Equals(object obj)
        => obj is Car other && Equals(other);

    public bool Equals(Car other)
        => other != null
            && Price == other.Price
            && Components.SetwiseEquivalentTo(other.Components);

    public override int GetHashCode()
        => Components.Aggregate(
            0, // seed, so that the accumulator is an int
            (code, next) => code ^ next.GetHashCode()); // Bitwise XOR
}


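Note that how we write GetHashCode matters a great deal. To implement IEquatable correctly, Equals and GetHashCode must operate on the instance's properties in a logically compatible way.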

Two lists with the same contents are still different objects, and will produce different hash codes. Since we want these two lists to be treated as equal, we must let GetHashCode produce the same value for each of them. We can accomplish this by delegating the hashcode to every element in the list, and using the standard bitwise XOR to combine them all. XOR is order-agnostic, so it doesn't matter if the lists are sorted differently. It only matters that they contain nothing but equivalent members.
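A small sketch of the order-agnostic hash described above, in case it helps to see it in isolation. HashSketch and OrderAgnosticHash are hypothetical names. The Distinct() call is my own assumption and not part of the answer: XOR cancels out pairs of equal elements, so without it a list with a repeated element could hash differently from a list that is set-wise equal to it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HashSketch
{
    // Combines the hash codes of the distinct elements with XOR.
    // The seed 0 makes the accumulator an int and gives an empty sequence
    // the hash code 0. XOR is commutative, so element order has no effect.
    public static int OrderAgnosticHash<T>(IEnumerable<T> items)
        => items.Distinct().Aggregate(
            0,
            (code, next) => code ^ (next == null ? 0 : next.GetHashCode()));
}
```

With this, new[] { "x", "y", "z" } and new[] { "z", "x", "y", "x" } produce the same value, which matches what a set-wise Equals would report for them.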




tmp = []

for i in range(len(x)) and range(len(y)):
    if x[i]>y[i]:

Enumerable.SequenceEqual Method: Determines whether two sequences are equal according to an equality comparer. MS.Docs

Enumerable.SequenceEqual(list1, list2);



IEqualityComparer Interface: Defines methods to support the comparison of objects for equality. MS.Docs for IEqualityComparer
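As a rough illustration of how such a comparer plugs into Except, the sketch below compares elements by a single key. Item, ItemIdComparer and ComparerSketch are hypothetical names invented for this example; only IEqualityComparer<T> and the Except overload that accepts a comparer come from the framework.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, used only for illustration.
public sealed class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Treats two items as equal when their Id values match.
public sealed class ItemIdComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    // Must agree with Equals: equal Ids give equal hash codes.
    public int GetHashCode(Item obj) => obj is null ? 0 : obj.Id.GetHashCode();
}

public static class ComparerSketch
{
    // Ids of the items in `first` that have no Id match in `second`.
    public static List<int> IdsOnlyInFirst(IEnumerable<Item> first, IEnumerable<Item> second)
        => first.Except(second, new ItemIdComparer()).Select(i => i.Id).ToList();
}
```

The same comparer instance can also be passed to SequenceEqual or to the HashSet constructor, so the equality rule lives in one place.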

While Jon Skeet's answer is excellent advice for everyday practice with a small to moderate number of elements (up to a few million), it is nevertheless not the fastest approach and not very resource efficient. An obvious drawback is that getting the full difference requires two passes over the data (even three if the equal elements are of interest as well). Clearly, this can be avoided by a customized reimplementation of the Except method, but creating a hash set still requires a lot of memory and computing the hashes takes time.

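For very large data sets (in the billions of elements) it usually pays off to consider the particular circumstances. Here are some ideas that might provide some inspiration: If the elements can be compared (which is almost always the case in practice), then sorting the lists and applying the following zip approach is worth considering: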

/// <returns>The elements of the specified (ascendingly) sorted enumerations that are
/// contained only in one of them, together with an indicator,
/// whether the element is contained in the reference enumeration (-1)
/// or in the difference enumeration (+1).</returns>
public static IEnumerable<Tuple<T, int>> FindDifferences<T>(IEnumerable<T> sortedReferenceObjects,
    IEnumerable<T> sortedDifferenceObjects, IComparer<T> comparer)
{
    var refs  = sortedReferenceObjects.GetEnumerator();
    var diffs = sortedDifferenceObjects.GetEnumerator();
    // Note: enumeration stops as soon as either input is exhausted;
    // elements remaining in the other input are not emitted.
    bool hasNext = refs.MoveNext() && diffs.MoveNext();
    while (hasNext)
    {
        int comparison = comparer.Compare(refs.Current, diffs.Current);
        if (comparison == 0)
        {
            // insert code that emits the current element if equal elements should be kept
            hasNext = refs.MoveNext() && diffs.MoveNext();
        }
        else if (comparison < 0)
        {
            yield return Tuple.Create(refs.Current, -1);
            hasNext = refs.MoveNext();
        }
        else
        {
            yield return Tuple.Create(diffs.Current, 1);
            hasNext = diffs.MoveNext();
        }
    }
}


const int N = <Large number>;
const int omit1 = 231567;
const int omit2 = 589932;
IEnumerable<int> numberSequence1 = Enumerable.Range(0, N).Select(i => i < omit1 ? i : i + 1);
IEnumerable<int> numberSequence2 = Enumerable.Range(0, N).Select(i => i < omit2 ? i : i + 1);
var numberDiffs = FindDifferences(numberSequence1, numberSequence2, Comparer<int>.Default);

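Benchmarking on my computer with N = 1M gave the following result:

| Method   | Mean      | Error    | StdDev   | Ratio | Gen 0     | Gen 1     | Gen 2     | Allocated  |
|----------|-----------|----------|----------|-------|-----------|-----------|-----------|------------|
| DiffLinq | 115.19 ms | 0.656 ms | 0.582 ms | 1.00  | 2800.0000 | 2800.0000 | 2800.0000 | 67110744 B |
| DiffZip  | 23.48 ms  | 0.018 ms | 0.015 ms | 0.20  | -         | -         | -         | 720 B      |

For N = 100M:

| Method   | Mean     | Error    | StdDev   | Ratio | Gen 0      | Gen 1      | Gen 2      | Allocated    |
|----------|----------|----------|----------|-------|------------|------------|------------|--------------|
| DiffLinq | 12.146 s | 0.0427 s | 0.0379 s | 1.00  | 13000.0000 | 13000.0000 | 13000.0000 | 8589937032 B |
| DiffZip  | 2.324 s  | 0.0019 s | 0.0018 s | 0.19  | -          | -          | -          | 720 B        |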


A few further comments: The speed of the comparison function is clearly relevant for the overall performance, so it may be beneficial to optimize it. The flexibility to do so is a benefit of the zipping approach. Furthermore, parallelization seems more feasible to me, although by no means easy and maybe not worth the effort and the overhead. Nevertheless, a simple way to speed up the process by roughly a factor of 2 is to split each list into two halves (if that can be done efficiently) and compare the parts in parallel, one processing from front to back and the other in reverse order.
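One thing to watch with the zip approach: the loop ends when either sorted input runs out, so trailing elements of the longer input are never reported. Below is a hedged variant that drains both tails. SortedDiffSketch and FindAllDifferences are names I made up, and the code assumes both inputs are sorted ascending according to the same comparer.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDiffSketch
{
    // Yields (element, -1) for elements found only in sortedFirst and
    // (element, +1) for elements found only in sortedSecond.
    public static IEnumerable<Tuple<T, int>> FindAllDifferences<T>(
        IEnumerable<T> sortedFirst, IEnumerable<T> sortedSecond, IComparer<T> comparer)
    {
        using (var first = sortedFirst.GetEnumerator())
        using (var second = sortedSecond.GetEnumerator())
        {
            bool hasFirst = first.MoveNext();
            bool hasSecond = second.MoveNext();
            while (hasFirst && hasSecond)
            {
                int comparison = comparer.Compare(first.Current, second.Current);
                if (comparison == 0)
                {
                    // Equal elements are paired off one-to-one and skipped.
                    hasFirst = first.MoveNext();
                    hasSecond = second.MoveNext();
                }
                else if (comparison < 0)
                {
                    yield return Tuple.Create(first.Current, -1);
                    hasFirst = first.MoveNext();
                }
                else
                {
                    yield return Tuple.Create(second.Current, 1);
                    hasSecond = second.MoveNext();
                }
            }
            // Whatever is left in either input has no counterpart in the other.
            while (hasFirst)
            {
                yield return Tuple.Create(first.Current, -1);
                hasFirst = first.MoveNext();
            }
            while (hasSecond)
            {
                yield return Tuple.Create(second.Current, 1);
                hasSecond = second.MoveNext();
            }
        }
    }
}
```

Because equal elements are paired one-to-one, duplicates are treated with multiset semantics here, which differs from the set semantics of Except.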

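I compared 3 different methods for comparing different data sets. The tests below create a string collection containing all the numbers from 0 to length - 1, then another collection with the same range but containing only the even numbers. I then pick the odd numbers out of the first collection.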


public void TestExcept()
{
    WriteLine($"Except {DateTime.Now}");
    int length = 20000000;
    var dateTime = DateTime.Now;
    var array = new string[length];
    for (int i = 0; i < length; i++)
    {
        array[i] = i.ToString();
    }
    Write("Populate set processing time: ");
    WriteLine(DateTime.Now - dateTime);
    var newArray = new string[length/2];
    int j = 0;
    for (int i = 0; i < length; i+=2)
    {
        newArray[j++] = i.ToString();
    }
    dateTime = DateTime.Now;
    Write("Count of items: ");
    // Count the items of array that are not in newArray (the odd numbers)
    WriteLine(array.Except(newArray).Count());
    Write("Count processing time: ");
    WriteLine(DateTime.Now - dateTime);
}


Except 2021-08-14 11:43:03 AM
Populate set processing time: 00:00:03.7230479
2021-08-14 11:43:09 AM
Count of items: 10000000
Count processing time: 00:00:02.9720879


public void TestHashSet()
{
    WriteLine($"HashSet {DateTime.Now}");
    int length = 20000000;
    var dateTime = DateTime.Now;
    var hashSet = new HashSet<string>();
    for (int i = 0; i < length; i++)
    {
        hashSet.Add(i.ToString());
    }
    Write("Populate set processing time: ");
    WriteLine(DateTime.Now - dateTime);
    var newHashSet = new HashSet<string>();
    for (int i = 0; i < length; i+=2)
    {
        newHashSet.Add(i.ToString());
    }
    dateTime = DateTime.Now;
    Write("Count of items: ");
    // HashSet Add returns true if item is added successfully (not previously existing)
    WriteLine(hashSet.Where(s => newHashSet.Add(s)).Count());
    Write("Count processing time: ");
    WriteLine(DateTime.Now - dateTime);
}


HashSet 2021-08-14 11:42:43 AM
Populate set processing time: 00:00:05.6000625
Count of items: 10000000
Count processing time: 00:00:01.7703057


public void TestLoadingHashSet()
{
    int length = 20000000;
    var array = new string[length];
    for (int i = 0; i < length; i++)
    {
       array[i] = i.ToString();
    }
    var dateTime = DateTime.Now;
    var hashSet = new HashSet<string>(array);
    Write("Time to load hashset: ");
    WriteLine(DateTime.Now - dateTime);
}

> TestLoadingHashSet()
Time to load hashset: 00:00:01.1918160


public void TestContains()
{
    WriteLine($"Contains {DateTime.Now}");
    int length = 20000000;
    var dateTime = DateTime.Now;
    var array = new string[length];
    for (int i = 0; i < length; i++)
    {
        array[i] = i.ToString();
    }
    Write("Populate set processing time: ");
    WriteLine(DateTime.Now - dateTime);
    var newArray = new string[length/2];
    int j = 0;
    for (int i = 0; i < length; i+=2)
    {
        newArray[j++] = i.ToString();
    }
    dateTime = DateTime.Now;
    Write("Count of items: ");
    WriteLine(array.Where(a => !newArray.Contains(a)).Count());
    Write("Count processing time: ");
    WriteLine(DateTime.Now - dateTime);
}


Contains 2021-08-14 11:19:44 AM
Populate set processing time: 00:00:03.1046998
2021-08-14 11:19:49 AM
Count of items: Hosting process exited with exit code 1.
(Didn't complete. Killed it after 14 minutes)


On my machine, Linq Except ran about 1 second slower than using HashSets (n = 20,000,000). Using Where and Contains ran for a very long time.


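- Unique data
- Make sure to override GetHashCode for class types (correctly)
- May need up to 2x the memory if you make a copy of the data set, depending on implementation
- HashSet is optimized for cloning other HashSets using the IEnumerable constructor, but converting other collections to a HashSet is slower (see the special test above)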
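On the unique-data point: a HashSet collapses repeated values, so set-based checks consider { 1, 1, 2 } and { 1, 2 } to be the same. If the number of occurrences matters, counting with a dictionary is one option. The sketch below is only an illustration under that assumption; MultisetSketch and SameElementsWithCounts are hypothetical names.

```csharp
using System.Collections.Generic;

public static class MultisetSketch
{
    // True when both sequences contain the same elements the same number of
    // times, regardless of order. Null elements are not supported, because
    // Dictionary keys cannot be null.
    public static bool SameElementsWithCounts<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var counts = new Dictionary<T, int>();
        foreach (var item in first)
        {
            // n is 0 when the item has not been seen yet.
            counts.TryGetValue(item, out int n);
            counts[item] = n + 1;
        }
        foreach (var item in second)
        {
            // An item that is missing or already used up means a mismatch.
            if (!counts.TryGetValue(item, out int n))
                return false;
            if (n == 1) counts.Remove(item);
            else counts[item] = n - 1;
        }
        // Anything left over occurs more often in `first` than in `second`.
        return counts.Count == 0;
    }
}
```

This still makes a single pass over each list, but it needs memory proportional to the number of distinct elements in the first list.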


var list1 = new List<int> { 1, 2, 3 };
var list2 = new List<int> { 1, 2, 3, 4 };
if (list1.Except(list2).Count() + list2.Except(list1).Count() == 0)
    Console.WriteLine("same sets");


if (list1 != null && list2 != null && list1.Select(x => list2.SingleOrDefault(y => y.propertyToCompare == x.propertyToCompare && y.anotherPropertyToCompare == x.anotherPropertyToCompare) != null).All(found => found))
   return true;


if (list1 != null && list2 != null && list1.Select(x => list2.Any(y => y.propertyToCompare == x.propertyToCompare && y.anotherPropertyToCompare == x.anotherPropertyToCompare)).All(found => found))
   return true;


public static class ListTools
{
    public enum RecordUpdateStatus
    {
        Added = 1,
        Updated = 2,
        Deleted = 3
    }

    public class UpdateStatu<T>
    {
        public T CurrentValue { get; set; }
        public RecordUpdateStatus UpdateStatus { get; set; }
    }

    public static List<UpdateStatu<T>> CompareList<T>(List<T> currentList, List<T> inList, string uniqPropertyName)
    {
        var res = new List<UpdateStatu<T>>();

        // Items present only in inList
        res.AddRange(inList.Where(a => !currentList.Any(x => x.GetType().GetProperty(uniqPropertyName).GetValue(x)?.ToString().ToLower() == a.GetType().GetProperty(uniqPropertyName).GetValue(a)?.ToString().ToLower()))
            .Select(a => new UpdateStatu<T>
            {
                CurrentValue = a,
                UpdateStatus = RecordUpdateStatus.Added,
            }));

        // Items present only in currentList
        res.AddRange(currentList.Where(a => !inList.Any(x => x.GetType().GetProperty(uniqPropertyName).GetValue(x)?.ToString().ToLower() == a.GetType().GetProperty(uniqPropertyName).GetValue(a)?.ToString().ToLower()))
            .Select(a => new UpdateStatu<T>
            {
                CurrentValue = a,
                UpdateStatus = RecordUpdateStatus.Deleted,
            }));

        // Items present in both lists
        res.AddRange(currentList.Where(a => inList.Any(x => x.GetType().GetProperty(uniqPropertyName).GetValue(x)?.ToString().ToLower() == a.GetType().GetProperty(uniqPropertyName).GetValue(a)?.ToString().ToLower()))
            .Select(a => new UpdateStatu<T>
            {
                CurrentValue = a,
                UpdateStatus = RecordUpdateStatus.Updated,
            }));

        return res;
    }
}


Both Jon Skeet's and miguelmpn's answers are good. It depends on whether the order of the list elements matters:

// take order into account
bool areEqual1 = Enumerable.SequenceEqual(list1, list2);

// ignore order
bool areEqual2 = !list1.Except(list2).Any() && !list2.Except(list1).Any();