我可能有一个像下面这样的数组:
[1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
或者,实际上,任何类似类型的数据部分的序列。我要做的是确保每个相同的元素只有一个。例如,上面的数组将变成:
[1, 4, 2, 6, 24, 15, 60]
请注意,删除了2、6和15的重复项,以确保每个相同的元素中只有一个。Swift是否提供了一种容易做到这一点的方法,还是我必须自己做?
我可能有一个像下面这样的数组:
[1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
或者,实际上,任何类似类型的数据部分的序列。我要做的是确保每个相同的元素只有一个。例如,上面的数组将变成:
[1, 4, 2, 6, 24, 15, 60]
请注意,删除了2、6和15的重复项,以确保每个相同的元素中只有一个。Swift是否提供了一种容易做到这一点的方法,还是我必须自己做?
当前回答
func removeDublicate (ab: [Int]) -> [Int] {
var answer1:[Int] = []
for i in ab {
if !answer1.contains(i) {
answer1.append(i)
}}
return answer1
}
用法:
let f = removeDublicate(ab: [1,2,2])
print(f)
其他回答
受https://www.swiftbysundell.com/posts/the-power-of-key-paths-in-swift的启发,我们可以声明一个更强大的工具,它能够过滤任何keyPath上的唯一性。感谢Alexander关于复杂性的各种答案的评论,下面的解决方案应该是最优的。
Non-mutating解决方案
我们扩展了一个函数,它可以过滤任意keyPath上的唯一性:
extension RangeReplaceableCollection {
/// Returns a collection containing, in order, the first instances of
/// elements of the sequence that compare equally for the keyPath.
func unique<T: Hashable>(for keyPath: KeyPath<Element, T>) -> Self {
var unique = Set<T>()
return filter { unique.insert($0[keyPath: keyPath]).inserted }
}
}
注意:在你的对象不符合RangeReplaceableCollection,但符合Sequence的情况下,你可以有这个额外的扩展,但返回类型将始终是一个数组:
extension Sequence {
/// Returns an array containing, in order, the first instances of
/// elements of the sequence that compare equally for the keyPath.
func unique<T: Hashable>(for keyPath: KeyPath<Element, T>) -> [Element] {
var unique = Set<T>()
return filter { unique.insert($0[keyPath: keyPath]).inserted }
}
}
使用
如果我们想要元素本身的唯一性,如问题中所示,我们使用keyPath \.self:
let a = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
let b = a.unique(for: \.self)
/* b is [1, 4, 2, 6, 24, 15, 60] */
如果我们想要其他东西的唯一性(比如对象集合的id),那么我们使用我们选择的keyPath:
let a = [CGPoint(x: 1, y: 1), CGPoint(x: 2, y: 1), CGPoint(x: 1, y: 2)]
let b = a.unique(for: \.y)
/* b is [{x 1 y 1}, {x 1 y 2}] */
变异的解决方案
我们扩展了一个变异函数,它能够过滤任意keyPath上的唯一性:
extension RangeReplaceableCollection {
/// Keeps only, in order, the first instances of
/// elements of the collection that compare equally for the keyPath.
mutating func uniqueInPlace<T: Hashable>(for keyPath: KeyPath<Element, T>) {
var unique = Set<T>()
removeAll { !unique.insert($0[keyPath: keyPath]).inserted }
}
}
使用
如果我们想要元素本身的唯一性,如问题中所示,我们使用keyPath \.self:
var a = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
a.uniqueInPlace(for: \.self)
/* a is [1, 4, 2, 6, 24, 15, 60] */
如果我们想要其他东西的唯一性(比如对象集合的id),那么我们使用我们选择的keyPath:
var a = [CGPoint(x: 1, y: 1), CGPoint(x: 2, y: 1), CGPoint(x: 1, y: 2)]
a.uniqueInPlace(for: \.y)
/* a is [{x 1 y 1}, {x 1 y 2}] */
为此,我做了一个尽可能简单的扩展。
extension Array where Element: Equatable {
func containsHowMany(_ elem: Element) -> Int {
return reduce(0) { $1 == elem ? $0 + 1 : $0 }
}
func duplicatesRemoved() -> Array {
return self.filter { self.containsHowMany($0) == 1 }
}
mutating func removeDuplicates() {
self = self.duplicatesRemoved(()
}
}
您可以使用duplicatesRemoved()来获取一个新数组,它的重复元素将被删除,或者使用removeduplicate()来改变自身。看到的:
let arr = [1, 1, 1, 2, 2, 3, 4, 5, 6, 6, 6, 6, 6, 7, 8]
let noDuplicates = arr.duplicatesRemoved()
print(arr) // [1, 1, 1, 2, 2, 3, 4, 5, 6, 6, 6, 6, 6, 7, 8]
print(noDuplicates) // [1, 2, 3, 4, 5, 6, 7, 8]
arr.removeDuplicates()
print(arr) // [1, 2, 3, 4, 5, 6, 7, 8]
我创建了一个时间复杂度为o(n)的高阶函数。另外,像map这样的功能可以返回您想要的任何类型。
extension Sequence {
func distinct<T,U>(_ provider: (Element) -> (U, T)) -> [T] where U: Hashable {
var uniqueKeys = Set<U>()
var distintValues = [T]()
for object in self {
let transformed = provider(object)
if !uniqueKeys.contains(transformed.0) {
distintValues.append(transformed.1)
uniqueKeys.insert(transformed.0)
}
}
return distintValues
}
}
如果你把两个扩展都放在你的代码中,更快的Hashable版本将在可能的情况下使用,Equatable版本将用作备用版本。
public extension Sequence where Element: Hashable {
/// The elements of the sequence, with duplicates removed.
/// - Note: Has equivalent elements to `Set(self)`.
@available(
swift, deprecated: 5.4,
message: "Doesn't compile without the constant in Swift 5.3."
)
var firstUniqueElements: [Element] {
let getSelf: (Element) -> Element = \.self
return firstUniqueElements(getSelf)
}
}
public extension Sequence where Element: Equatable {
/// The elements of the sequence, with duplicates removed.
/// - Note: Has equivalent elements to `Set(self)`.
@available(
swift, deprecated: 5.4,
message: "Doesn't compile without the constant in Swift 5.3."
)
var firstUniqueElements: [Element] {
let getSelf: (Element) -> Element = \.self
return firstUniqueElements(getSelf)
}
}
public extension Sequence {
/// The elements of the sequences, with "duplicates" removed
/// based on a closure.
func firstUniqueElements<Hashable: Swift.Hashable>(
_ getHashable: (Element) -> Hashable
) -> [Element] {
var set: Set<Hashable> = []
return filter { set.insert(getHashable($0)).inserted }
}
/// The elements of the sequence, with "duplicates" removed,
/// based on a closure.
func firstUniqueElements<Equatable: Swift.Equatable>(
_ getEquatable: (Element) -> Equatable
) -> [Element] {
reduce(into: []) { uniqueElements, element in
if zip(
uniqueElements.lazy.map(getEquatable),
AnyIterator { [equatable = getEquatable(element)] in equatable }
).allSatisfy(!=) {
uniqueElements.append(element)
}
}
}
}
如果顺序不重要,那么你总是可以使用这个Set初始化式。
像函数式程序员一样思考:)
要根据元素是否已经出现来筛选列表,需要索引。可以使用enumeration获取索引,并使用map返回值列表。
let unique = myArray
.enumerated()
.filter{ myArray.firstIndex(of: $0.1) == $0.0 }
.map{ $0.1 }
这保证了秩序。如果你不介意顺序,那么Array(Set(myArray))的现有答案更简单,可能更有效。
更新:一些关于效率和正确性的注意事项
一些人对效率进行了评论。我肯定是先写正确而简单的代码,然后再找出瓶颈,尽管我知道这是否比Array(Set(Array))更清楚是有争议的。
这个方法比Array(Set(Array))慢很多。正如评论中所指出的,它确实保持了顺序,并对非Hashable的元素起作用。
然而,@Alain T的方法也保持了秩序,也快得多。所以除非你的元素类型是不可哈希的,或者你只是需要一个快速的一行,那么我建议采用他们的解决方案。
以下是MacBook Pro(2014)在Xcode 11.3.1 (Swift 5.1)发布模式下的一些测试。
profiler函数和两个比较方法:
func printTimeElapsed(title:String, operation:()->()) {
var totalTime = 0.0
for _ in (0..<1000) {
let startTime = CFAbsoluteTimeGetCurrent()
operation()
let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
totalTime += timeElapsed
}
let meanTime = totalTime / 1000
print("Mean time for \(title): \(meanTime) s")
}
func method1<T: Hashable>(_ array: Array<T>) -> Array<T> {
return Array(Set(array))
}
func method2<T: Equatable>(_ array: Array<T>) -> Array<T>{
return array
.enumerated()
.filter{ array.firstIndex(of: $0.1) == $0.0 }
.map{ $0.1 }
}
// Alain T.'s answer (adapted)
func method3<T: Hashable>(_ array: Array<T>) -> Array<T> {
var uniqueKeys = Set<T>()
return array.filter{uniqueKeys.insert($0).inserted}
}
以及少量的测试输入:
func randomString(_ length: Int) -> String {
let letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
return String((0..<length).map{ _ in letters.randomElement()! })
}
let shortIntList = (0..<100).map{_ in Int.random(in: 0..<100) }
let longIntList = (0..<10000).map{_ in Int.random(in: 0..<10000) }
let longIntListManyRepetitions = (0..<10000).map{_ in Int.random(in: 0..<100) }
let longStringList = (0..<10000).map{_ in randomString(1000)}
let longMegaStringList = (0..<10000).map{_ in randomString(10000)}
给出输出:
Mean time for method1 on shortIntList: 2.7358531951904296e-06 s
Mean time for method2 on shortIntList: 4.910230636596679e-06 s
Mean time for method3 on shortIntList: 6.417632102966309e-06 s
Mean time for method1 on longIntList: 0.0002518167495727539 s
Mean time for method2 on longIntList: 0.021718120217323302 s
Mean time for method3 on longIntList: 0.0005312927961349487 s
Mean time for method1 on longIntListManyRepetitions: 0.00014377200603485108 s
Mean time for method2 on longIntListManyRepetitions: 0.0007293639183044434 s
Mean time for method3 on longIntListManyRepetitions: 0.0001843773126602173 s
Mean time for method1 on longStringList: 0.007168249964714051 s
Mean time for method2 on longStringList: 0.9114790915250778 s
Mean time for method3 on longStringList: 0.015888616919517515 s
Mean time for method1 on longMegaStringList: 0.0525397013425827 s
Mean time for method2 on longMegaStringList: 1.111266262292862 s
Mean time for method3 on longMegaStringList: 0.11214958941936493 s