Reflecting on A Bad Stack
Oct 03, 2025

After finishing the first chapter of Too Many Lists, here are some of the things I’ve learned:
- Implementing the first iteration of our linked list.
- How the null pointer optimization can reduce memory overhead.
- Why the default `Drop` implementation on the `Box` type can cause a stack overflow for our `List`.
First iteration of our linked list
Here’s the good way to design “A Bad Stack”:

```rust
struct List {
    head: Link,
}

enum Link {
    Empty,
    More(Box<Node>),
}

struct Node {
    element: i32,
    next: Link,
}
```
Why not just use a simple List enum?
The bad way:

```rust
pub enum List {
    Empty,
    Element(i32, Box<List>),
}
```
This is problematic for two reasons:

- Non-uniform node layout: only the first element of the list is stack-allocated; the rest of the elements are allocated on the heap. This makes operations like splitting/merging lists inefficient because we have to copy data between the stack and the heap.
- Unnecessary heap allocation: the last node allocates an `Empty` variant on the heap, occupying 16 bytes: 8 bytes for `Box<List>`, 4 bytes for `i32`, and 4 bytes of padding.
Read the First Layout for a detailed explanation.
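The 16-byte claim is easy to check with `size_of`. Here’s a small sketch I added (the enum is renamed `BadList` so it doesn’t clash with the good `List` above; the number assumes a 64-bit target):

```rust
// Mirrors the "bad way" enum above, renamed to avoid clashing with List.
#[allow(dead_code)]
pub enum BadList {
    Empty,
    Element(i32, Box<BadList>),
}

fn main() {
    // 8 bytes for Box<BadList> + 4 bytes for i32 + 4 bytes of padding,
    // so even the heap-allocated Empty at the end of the list costs 16 bytes.
    assert_eq!(std::mem::size_of::<BadList>(), 16);
    println!("BadList: {} bytes", std::mem::size_of::<BadList>());
}
```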
Null pointer optimization
Every enum has to store a tag to specify which variant of the enum its bits represent. Rust can, however, figure out which variant you have without storing the tag if:

- You have an enum with two variants
- One variant is empty and the other one is of a type that can never be all zeros

Here are some types that can never be all zeros:

- References like `&T` and `&mut T`, because null references aren’t allowed in safe Rust
- Owned pointers like `Box<T>`
- Types that internally contain a pointer that cannot be null, like `String` and `Vec<T>`
- Reference-counted pointers like `Rc<T>` and `Arc<T>`
For our `Link` enum, this optimization works since we have two variants where one is empty and the other is `Box<Node>`, which can never be all zeros.
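The effect on `Link` can be verified directly (a small check I added; the types are copied from the layout above):

```rust
#[allow(dead_code)]
struct Node {
    element: i32,
    next: Link,
}

#[allow(dead_code)]
enum Link {
    Empty,
    More(Box<Node>),
}

fn main() {
    // Thanks to the null pointer optimization, Link needs no separate tag:
    // Empty is represented by the all-zero (null) pointer bit pattern.
    assert_eq!(std::mem::size_of::<Link>(), std::mem::size_of::<Box<Node>>());
    println!("Link: {} bytes", std::mem::size_of::<Link>()); // 8 on 64-bit targets
}
```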
I found another example which really solidified my understanding:
```rust
pub enum EnumOne {
    A,
    B(String),
}

pub enum EnumTwo {
    A,
    B(i32),
}

pub fn main() {
    println!("\nWith null pointer optimization:");
    println!("  String: {} bytes", std::mem::size_of::<String>());
    println!("  EnumOne: {} bytes", std::mem::size_of::<EnumOne>());

    println!("\nWithout null pointer optimization:");
    println!("  i32: {} bytes", std::mem::size_of::<i32>());
    println!("  EnumTwo: {} bytes", std::mem::size_of::<EnumTwo>());
}
```
If we try to run this, we’ll get:

```
❯ cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.03s
     Running `target/debug/too-many-lists`

With null pointer optimization:
  String: 24 bytes
  EnumOne: 24 bytes

Without null pointer optimization:
  i32: 4 bytes
  EnumTwo: 8 bytes
```
Since an `i32` can be all zeros, Rust needs a separate tag to distinguish variant `A` from `B(0)`, which is why `EnumTwo` needs 8 bytes: 4 bytes for the `i32`, 1 byte for the tag, and 3 bytes of padding.
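The same rule explains the familiar `Option` sizes (a small check I added): `Option<&T>` is pointer-sized because the reference can never be null, while `Option<i32>` needs a tag.

```rust
fn main() {
    // &i32 can never be null, so Option<&i32> needs no extra tag:
    // None is represented by the all-zero bit pattern.
    assert_eq!(std::mem::size_of::<Option<&i32>>(), std::mem::size_of::<&i32>());

    // i32 can legitimately be zero, so Option<i32> stores a separate tag:
    // 4 bytes for the i32, 1 byte for the tag, 3 bytes of padding.
    assert_eq!(std::mem::size_of::<Option<i32>>(), 8);
}
```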
Overflowing boxes
The Box type recursively drops its contents before deallocating itself. This can cause stack overflow for large lists.
Here is Rust’s `Drop` implementation for the `Box` type:

```rust
#[stable(feature = "rust1", since = "1.0.0")]
unsafe impl<#[may_dangle] T: ?Sized, A: Allocator> Drop for Box<T, A> {
    #[inline]
    fn drop(&mut self) {
        // the T in the Box is dropped by the compiler before the destructor is run
        let ptr = self.0;

        unsafe {
            let layout = Layout::for_value_raw(ptr.as_ptr());
            if layout.size() != 0 {
                self.1.deallocate(From::from(ptr.cast()), layout);
            }
        }
    }
}
```
Consider this list:

```
Box<Node A> → Box<Node B> → Box<Node C> → Box<Node D>
```
The drop sequence for this list looks like:

- Drop `Box<Node A>` → compiler drops `Node A` → `Node A` drops its next field, which drops `Box<Node B>`
- Drop `Box<Node B>` → compiler drops `Node B` → `Node B` drops its next field, which drops `Box<Node C>`
- This goes on recursively for each node…
It’s problematic because the stack will look something like this:

```
drop Box<Node A>
  drop Node A                ← drops contents first
    drop Box<Node B>
      drop Node B            ← still waiting
        drop Box<Node C>
          drop Node C        ← still waiting
            drop Box<Node D>
              drop Node D    ← finally done!
              deallocate D
            return
          deallocate C
        return
      deallocate B
    return
  deallocate A
return
```
Each Box must stay on the stack until all subsequent boxes are dropped, because it still needs to deallocate its own memory afterward.
Now if we try to run it for a large list:

```rust
#[test]
fn test_stack_overflow() {
    let mut list = List::new();
    for i in 0..100_000 {
        list.push(i);
    }
    // the recursive drop kicks in when `list` goes out of scope here
}
```
This causes a stack overflow:

```
❯ cargo test
     Running unittests src/lib.rs (target/debug/deps/too_many_lists-108401a94152d12f)

running 1 test

thread 'first::test::test_stack_overflow' has overflowed its stack
fatal runtime error: stack overflow, aborting
```
To fix this, we’ll have to implement the `Drop` trait ourselves so that it works iteratively. For that we’ll use `mem::replace` to unlink each node before it is dropped, so that dropping one node never recurses into the next.
```rust
impl Drop for List {
    fn drop(&mut self) {
        // Replace the original value of self.head with Link::Empty and
        // take ownership of the original value. We get back the original
        // value of self.head WITHOUT dropping it.
        let mut link = mem::replace(&mut self.head, Link::Empty);

        // Iteratively drop each node
        while let Link::More(mut node) = link {
            // Take ownership of the next link, leaving Empty in its place
            link = mem::replace(&mut node.next, Link::Empty);
            // node goes out of scope and gets dropped here, but since
            // node.next is now Empty, it doesn't trigger a recursive drop
        }
    }
}
```
The stack looks much better now:

```
drop Box<Node A>
  drop Node A (next = Empty)
  deallocate A
return
drop Box<Node B>
  drop Node B (next = Empty)
  deallocate B
return
drop Box<Node C>
  drop Node C (next = Empty)
  deallocate C
return
drop Box<Node D>
  drop Node D (next = Empty)
  deallocate D
return
```
We don’t get a stack overflow anymore!

```
❯ cargo test
     Running unittests src/lib.rs (target/debug/deps/too_many_lists-108401a94152d12f)

running 1 test
test first::test::test_stack_overflow ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s
```
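Putting it all together, here’s a self-contained sketch of the final state (my reconstruction: `new` and `push` aren’t shown above, so I’ve written them the same way the chapter does, with `mem::replace` swapping the old head into the new node):

```rust
use std::mem;

pub struct List {
    head: Link,
}

enum Link {
    Empty,
    More(Box<Node>),
}

struct Node {
    #[allow(dead_code)]
    element: i32,
    next: Link,
}

impl List {
    pub fn new() -> Self {
        List { head: Link::Empty }
    }

    // Push to the front: the old head becomes the new node's next.
    pub fn push(&mut self, element: i32) {
        let node = Box::new(Node {
            element,
            next: mem::replace(&mut self.head, Link::Empty),
        });
        self.head = Link::More(node);
    }
}

impl Drop for List {
    fn drop(&mut self) {
        // Unlink each node before it drops, so drops never recurse.
        let mut link = mem::replace(&mut self.head, Link::Empty);
        while let Link::More(mut node) = link {
            link = mem::replace(&mut node.next, Link::Empty);
        }
    }
}

fn main() {
    let mut list = List::new();
    for i in 0..100_000 {
        list.push(i);
    }
    // list is dropped here, one node at a time, without a deep stack
    println!("dropped 100_000 nodes iteratively");
}
```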
And that’s chapter one! Next up is An Ok Stack.