Priority Set in Swift 6
The Priority Set is a data structure that highly resembles a priority queue, but it enforces uniqueness of the elements contained within. Swift 6 doesn’t natively contain a priority queue though a collections library does have a min-max heap that can act as a priority queue. This gave me the opportunity to design my own implementation of a priority queue that better reflects what I need from it, which ultimately yielded the priorty set.
In the traditional conception of a priority queue, the priority of an element is an intrinsic property of the element itself. However, this does not hold for all uses of priority sets, particularly when it is used for the A* algorithm, where the priority value for a given node actually depends more on the context of the node in the graph as opposed to any intrinsic value of the node itself. Additionally, as the context of the node is often only partially known at the time it is added to the priority queue, it’s priority may be updated between when it was originally added to the queue and when it is dequeued. The priority set is built with this purpose in mind - handling all the involved complexity internally.
So, what does the priority set look like? Let’s dive into it. The full source code is available on my github (TODO: Insert link here)
Implementation
In this priority set, elements are added to the set with an upfront separation between the element itself and its
priority value - prioSet.add(element, withPriority: priority). Internally, these are sorted into four data
structures. The first, _priority, is an array of priority values implementing a heap, which is used to enforce the
priority ordering on the priority values in the set. The second, _elements, is a dictionary where each priority
value points to its corresponding array of elements. This dictionary also resolves conflicts when multiple elements
have the same priority by using an array as a queue to prioritize elements added to the priority set first. The
third, _reverse, is a reverse dictionary, where each element points to its respective priority value. This allows
easy querying and updating of the priority value. The last, _index, is a dictionary that stores the indices of
each priority value in the priority array, which is also used for easier and faster updating of priority values.
public typealias Priority_Definition<Priority> = (_ p1: Priority, _ p2: Priority) -> (Bool)
public struct PrioritySet<Element: Hashable, Priority: Hashable> {
internal var _prio_function: Priority_Definition<Priority>
internal var _elements = [Priority: [Element]]()
internal var _priority = [Priority]()
internal var _reverse = [Element: Priority] ()
internal var _index = [Priority: Int]()
public init(_ prio_func: @escaping Priority_Definition<Priority>) {
self._prio_function = prio_func
}
The internal workings of the priority set rely on the _priority array acting as a heap. In order to do so, the
_priority array needs to be sifted using the user-provided priority function, which serves to define how two
priority values are compared. The Priority_Definition typealias is exposed so that conforming to the expected
format is easier. Allowing the user to specify the priority function allows for the priority set to be applicable
in multiple contexts where the way things are prioritized differs, or where the type of the priority value is
nonstandard. This way, the canonical priority set can be used in many situations that would call for either a
min-heap or a max-heap.
internal mutating func _sift_up(_ index: Int) {
var id = index
var parent = (id - 1) / 2
while id > 0 && _prio_function(_priority[id], _priority[parent]) {
_swap(id, parent)
id = parent
parent = (id - 1) / 2
}
}
internal mutating func _sift_down(_ index: Int) {
var id = index
while 2 * id + 1 < _priority.count {
let left = 2 * id + 1
let right = 2 * id + 2
var j = left
if right < _priority.count && _prio_function(_priority[right], _priority[left]) {
j = right
}
if _prio_function(_priority[j], _priority[id]) {
_swap(id, j)
id = j
}
else {
break
}
}
}
internal mutating func _swap(_ a: Int, _ b: Int) {
_index[_priority[a]] = b
_index[_priority[b]] = a
let temp = _priority[a]
_priority[a] = _priority[b]
_priority[b] = temp
}
I think the only thing that is special about the sift code compared to most priority queue implementations I have seen is that I have avoided the more typical recursive definitions of these functions. I’m not a huge fan of recursion where loops suffice, in part because recursion can blow up a stack trace when an error occurs at scale, and because it is easier to miss termination conditions when using recursion. Otherwise, there isn’t anything special here.
That is not the case for the _swap function, but only because extra care needs to be taken due to the _index
dictionary. As such, the index positions for the two priorities to be swapped also need to be swapped.
Moving on, here’s a bunch of small functions and variables that make life easier when using the priority set:
public var isEmpty: Bool {
_priority.isEmpty
}
public var count: Int {
_reverse.keys.count
}
public var unordered: [Element] {
Array(_reverse.keys)
}
public var top: Element? {
if _priority.isEmpty {
return nil
}
return _elements[_priority[0]]!.first
}
public func priority(of element: Element) -> Priority? {
_reverse[element]
}
In an earlier version of this code, the count property returned the length of the _priority array as that
represented the size of the underlying heap, but as multiple elements can have the same priority value, this would
under-count the number of elements in the set. As every element has its own entry in the _reverse dictionary, this
better reflects the size of the priority set.
The unordered property is actually unordered. Swift dictionaries are not ordered collections and do not have order
for their keys. I call this out as it may be tempting to assume that the returned order is related to the internal
workings of the priority set, however this is not the case. Constructing any particular order of the elements in the
priority set would be a linear time operation, as opposed to the constant time required for this unordered definition.
The priority(of:) function is exposed for convenience, removing the need for an external record of the priority
values of contained elements, and may be particularly useful when using the priority value of an element for
computation. However, it is not needed when working with the priority set, as the relevant checks are already
performed internally, as I will discuss below. This function can also be used to test membership if needed.
public mutating func add(_ element: Element, withPriority priority: Priority, replacing toReplace: Bool = false, updating toUpdate: Bool = true) {
if let _ = _reverse[element] {
if toReplace {
replace(element, withPriority: priority)
}
else if toUpdate {
update(element, withPriority: priority)
}
}
else {
if let _ = _elements[priority] {
_elements[priority]!.append(element)
}
else {
_elements[priority] = [element]
_priority.append(priority)
_index[priority] = _priority.count - 1
_sift_up(_priority.count - 1)
}
_reverse[element] = priority
}
}
The add(element: withPriority:) function is the first spot where the priority set begins to get a bit more
complex than a standard priority queue. This is for two reasons: first because of the additional data structures
contained within the priority set, and secondly because of the addition of the update and replace functions. The
function header contains two optional parameters with default arguments that allow for external code to enforce to
certain degrees that the added element has specified priority. If the element already exists in the priority queue,
the add function will redirect to the replace or update functions depending on those optional parameters (this is
the result of the first if block).
For example, in A*, it is common to calculate a fscore for a node, then add that node (with its associated priority) to a priority queue if the fscore is smaller than any previously found fscore for that node. Depending on the implementation of the priority queue, this can result in duplicate elements with different priorities being added to the queue. Using a priority set with the updating parameter in the add function call being True, the priority set will ensure that only one copy of the node with the smallest found fscore will exist on the heap, obviating the need for those external checks.
Even if the element did not already exist in the priority set, there is a possibility that its priority value is being used for another element. As such, the priority set only adds the priority value to the heap if it is not already present. If it is present, the element is added to the end of the array its priority points to, which allows for older elements with the same priority to be prioritized in a queue-like fashion.
public mutating func pop() -> Element? {
if let out = top {
remove(out)
return out
}
return nil
}
public mutating func remove(_ element: Element) {
if let priority = _reverse[element] {
if _elements[priority]!.count == 1 {
_elements[priority] = nil
let priority_index = _index[priority]!
_swap(priority_index, _priority.count - 1)
_priority.removeLast()
_sift_down(priority_index)
_index[priority] = nil
}
else{
_elements[priority]!.remove(at: _elements[priority]!.firstIndex(of: element)!)
}
_reverse[element] = nil
}
}
Conceptually, the pop() and remove(element:) functions are much simpler than the add function. The pop function
removes and returns the prioritized element in the set, using the remove function. The remove function removes the
specified element. Neither function needs bother with updating or replacing priorities, so the only complexity regards
whether the removed element is the only element with its priority. If it is not, the only action that needs taking
is removing it from the list of elements for that priority in _elements.
public mutating func update(_ element: Element, withPriority priority: Priority, adding toAdd: Bool = true) {
if let old_priority = _reverse[element] {
if _prio_function(priority, old_priority) {
replace(element, withPriority: priority)
}
}
else if toAdd {
add(element, withPriority: priority)
}
}
public mutating func replace(_ element: Element, withPriority priority: Priority, adding toAdd: Bool = true) {
if let old_priority = _reverse[element] {
let old_index = _index[old_priority]!
if _elements[old_priority]!.count == 1 {
_elements[old_priority] = nil
if _elements[priority] == nil {
_elements[priority] = [element]
_priority[old_index] = priority
_index[priority] = _index[old_priority]
if _prio_function(priority, old_priority) {
_sift_up(old_index)
}
else {
_sift_down(old_index)
}
}
else {
_elements[priority]!.append(element)
_swap(old_index, _priority.count - 1)
_priority.removeLast()
_sift_down(old_index)
}
_index[old_priority] = nil
}
else {
_elements[old_priority]!.remove(at: _elements[old_priority]!.firstIndex(of: element)!)
if _elements[priority] == nil {
_elements[priority] = [element]
_priority.append(priority)
_index[priority] = _priority.count - 1
_sift_up(_priority.count - 1)
}
else {
_elements[priority]!.append(element)
}
}
_reverse[element] = priority
}
else if toAdd {
add(element, withPriority: priority)
}
}
The update(element: withPriority:) and replace(element: withPriority) functions both serve to change the
priority value of an element that already exists in the priority set (though both have an optional parameter that is
by default True to add the element to the priority set if it is not in the set). The difference between the two is
that the update function will only update the priority if the new priority is more prioritized than the prior
priority (e.g., a higher priority when acting as a max-heap, or a smaller priority when acting as a min-heap),
whereas the replace function will replace the priority regardless of its new or prior value.
The replace function is the most complicated function in the priority set, as the exact operations that it needs perform depend on whether the new or prior priorities had or would have multiple elements. In the case that neither have/would have multiple elements, it is most efficient to replace the priority value in the heap in place and to sift up or down depending on whether it is larger or smaller than the prior priority. Otherwise, it is necessary to add or remove the priority from the heap as/if necessary.
Complexity
In general usage, the priority set inherits the same or better running time as compared to a priority queue. However, the running time can depend on how many elements share the same priority. If no elements share the same priority, then add, pop, remove, update, and replace all take O(logn) time. This matches the time complexity of a traditional priority queue. However, if all elements share the same priority, the priority set instead inherits the time complexity of a queue, with the add and pop functions taking O(1) time, and the remove, update, and replace functions taking O(n). However, in this case, you would be better served just using a queue, as you would not need to the priority aspect of the priority set.
In terms of spatial complexity, the priority set take O(n) space, though in practice is larger than many other O(n) collections.
Sample Use
For a practical demonstration of how to use the priority set, I will use it to solve Leetcode 692. The priority set is not the best solution (its not even a good one) for this leetcode problem, but the problem is sufficient to illustrate how the priority set can be used.
The basic premise behind leetcode 692 is that there is a list of words, and the task to find and return the k most frequent of those words. When multiple words have the same number of instances, they must be returned in lexicographical order.
In order to give priority to both the number of instances and, should that be equal, the lexicographic order, we define the priority set in the following way:
import PrioritySet
var words = ["i","love","leetcode","i","love","coding"]
var k = 3
struct Pair: Hashable {
var i = 0
var s = ""
}
var heap = PrioritySet<String, Pair>({(p1: Pair, p2: Pair) -> (Bool) in
return p1.i > p2.i || (p1.i == p2.i && p1.s < p2.s)
})
Now we just iterate through the list of provided words and add them to the priorty set.
for word in words {
if var prio = heap.priority(of: word) {
prio.i += 1
heap.update(word, withPriority: prio)
}
else {
var prio = Pair()
prio.i = 1
prio.s = word
heap.add(word, withPriority: prio)
}
}
Then just pop the first k words and return that.
var out = [String]()
for _ in 0..<k {
out.append(heap.pop()!)
}
print(out)
// -> ["i", "love", "coding"]
Now, this isn’t the best way to use the priority set here (actually, the priority set provides no advantages over a normal priority queue here), but it does serve as a good example for how to use it, particularly highlighting the update function not present in a typical priority queue.
Now, this is a terrible way to do counting. For a real solution, don’t use the priority queue’s update function to count the number of instances of each word (as this requires O(logn) per update operation). Instead, put the counts into a dictionary, then iterate through the keys of the dictionary to add to the priority set, like below:
var count = [String: Int]()
for word in words {
if let _ = count[word] {
count[word]! += 1
}
else {
count[word] = 1
}
}
for word in count.keys {
var prio = pair()
prio.i = count[word]!
prio.s = word
heap.add(word, withPriority: prio)
}
For me, writing this post, this does make my priority set feel a bit silly, as this better approach uses none of the features of the priority set. But I will be posting a better use case soon-ish, which is the A* algorithm.