Overflowing on deinit in Swift

Swift, like many reference-counted languages, exhibits a subtle footgun with object destruction - nested objects can easily trigger stack overflows.

The trap goes something like this. Take an object like this:

indirect enum E {
    case leaf
    case wrapped(E)
}

func willDie() {
    var e = E.leaf
    for _ in 1...1_000_000 {
        e = .wrapped(e)
    }
}

Now e is a value type here, but because we are using the indirect keyword, under the hood we get a heap-allocated box. This gets referenced counted. So we can think of the whole object as being reference counted.

When e falls out of scope, its reference count drops to 0, and destroy is called on it. This destroy drops the refcount of the wrapped object before exiting, which calls the destroy of the wrapped object, and so the cycle continues. It’s pretty easy to see this cause stack overflows with lists of reasonably tame lengths.

Note that actually, the compiler optimises away simple cases like this. But you can reproduce the issue by doing something like this:

indirect enum F {
    case leaf
    case wrapped(F, Int)
}

func willDie() {
    var e = F.leaf
    for _ in 1...1_000_000 {
        e = .wrapped(e, 1)
    }
}

The situation is non-trivial to resolve. Clearly, we want to move this operation to the heap by making it iterative, rather than recursive. Since we’re dealing with Automatic Reference Counting, we are going to want to play some tricks with placeholder references. Putting this together gets sticky because of some additional fiddly parts of Swift.

Core idea

At its most basic, the idea here is that we block the destroys from cascading by holding a strong reference to any children that exist. This prevents the refcount of the child dropping to 0, so destroy doesn’t have to destroy anything else.

extension E {
    var children: [E] {
        switch self {
        case .leaf: []
        case let .wrapped(e): [e]
        }
    }

// in `E`'s declaration...
deinit {
    var stack = e.children

    while let curr = stack.popLast() {
        // take strong references on any children
        // for our example, this is just the single wrapped
        // node (if present)
        stack.append(curr.children)

        // `curr` gets `deinit`d here
    }
}

Now this is great in theory - the idea is that once curr drops out of scope in the loop, it is no longer referenced anywhere (enums aren’t really reference types in Swift, so this is true). curr’s deinit gets called, but because we hold strong references to each of its children, there is no cascading. So we’ve unrolled the deinit calls to happen on the heap. Great!

There’s a problem though. You can’t define a deinit on an enum. At least, not a generic one. Or a Copyable one. Or a ~Copyable one. This is a bit of a spanner!

There is a solution. We can box the nested values in a class, and define the deinit on that. This means we have to be extra careful about managing our dropping conditions, but it’s completely possible.

Here’s a sketch:

enum E {
    case leaf
    case wrapped(Box)

    static func wrapped(_ e: Self) -> Self {
        .wrapped(Box(e))
    }

    final class Box {
        private var ref: E?
        var value: E { ref! }

        init(_ e: E) {
            self.ref = e
        }

        private func dropSubrefs() {
            ref = nil
        }

        deinit {
            var stack: [Box] = [self]

            while let box = stack.popLast() {
                if case let .some(.wrapped(nextBox)) = box.ref {
                    stack.append(nextBox)
                    box.dropSubrefs()
                }
            }
        }
    }
}