Say I have an array of strings:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
How would I get rid of the duplicates?
Say I have an array of strings:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
How would I get rid of the duplicates?
You can use the array function contains(_:) to check if an element is already part of the array, but that is fairly slow, and for large arrays it won’t perform well. (1.) Better to copy the entries into a Set and use Set operations to find and remove the duplicates. Sets are optimized to make testing for set membership fast, so if aSet.contains(item) is a lot faster than if anArray.contains(item).
If you don't care about preserving the order of your items, you can simply copy your array into a set and then back to an array. However, that does mean that the items in the resulting array will be in a different order.
A function to remove duplicates from an array of strings, while preserving the order, might look like this:
func uniqueElementsFrom(array: [String]) -> [String] {
//Create an empty Set to track unique items
var set = Set<String>()
let result = array.filter {
guard !set.contains($0) else {
//If the set already contains this object, return false
//so we skip it
return false
}
//Add this item to the set since it will now be in the array
set.insert($0)
//Return true so that filtered array will contain this item.
return true
}
return result
}
If you call it with code like this:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
let uniqueStrings = uniqueElementsFrom(array:arrayOfStrings)
print("Unique elements from \(arrayOfStrings) = \n” +
“\(uniqueStrings)")
The output would be
Unique elements from ["a", "b", "a", "c", "a", "d"] =
[“a”, "b", "c", "d"]
However, that function only works with arrays of strings. It would be good if we could write a function that could remove duplicates from any kind of array.
This is a job for Generics. There is a catch however. Sets can only contain objects that conform to the Hashable protocol, since Sets use hashes to make testing for set membership faster.
We can rewrite the uniqueElementsFrom(array:) function to take any array that conforms to the Hashable protocol using Generics. That code looks like this:
func uniqueElementsFrom<T: Hashable>(array: [T]) -> [T] {
var set = Set<T>()
let result = array.filter {
guard !set.contains($0) else {
return false
}
set.insert($0)
return true
}
return result
}
The <T: Hashable> bit after the function name says "The rest of this function will refer to a type T which is unspecified. The only thing you can be sure of is that the type T will conform to the Hashable protocol."
This form of the uniqueElementsFrom(array:) function will work on any array who’s elements are Hashable.
(1.) For arrays, contains(_:) has O(n) performance, and so looping through an array, testing the array to see if it contains each new element with contains(_:) has performance that's almost O(n^2), which is really, really bad for anything but a small array. I'm pretty sure that Set's contains(_:) function has constant time performance, so the whole process would have O(n) performance.