Data Structures

Arrays, Lists, Stacks & Queues

The foundational linear data structures that underpin nearly every program. Understanding their memory layout, time complexities, and trade-offs is essential for writing efficient code.

01 / Arrays & Dynamic Arrays

Contiguous Memory & Random Access

An array stores elements in a contiguous block of memory. Given the base address and element size, any element can be located in O(1) time using simple arithmetic: addr = base + index * size. This makes arrays the fastest structure for random access.

However, inserting or deleting in the middle requires shifting elements, costing O(n). Arrays are also cache-friendly because sequential elements sit next to each other in memory, exploiting spatial locality for fast iteration.
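A minimal sketch of both points, using a Python list (itself a dynamic array of object pointers); the `element_addr` helper is a hypothetical illustration of the address arithmetic, not a real runtime API:

```python
# O(1) index arithmetic vs O(n) middle insertion, sketched with a Python list.
arr = [10, 20, 30, 40, 50]

# Random access: one arithmetic step, independent of list length.
assert arr[3] == 40

# Conceptually, the runtime computes addr = base + index * size.
def element_addr(base: int, index: int, size: int) -> int:
    return base + index * size

assert element_addr(1000, 3, 8) == 1024  # e.g. 8-byte slots

# Middle insertion shifts every later element one slot right: O(n).
arr.insert(1, 15)
assert arr == [10, 15, 20, 30, 40, 50]
```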

Array Memory Layout: [ arr[0] | arr[1] | arr[2] | arr[3] | arr[4] ] - contiguous in memory, O(1) access by index

Dynamic Arrays

A dynamic array (Java ArrayList, C++ vector, Python list) wraps a fixed-size array and resizes when capacity is exceeded. The typical strategy is to double the capacity, yielding amortized O(1) appends. Individual resize operations copy all elements (O(n)), but they happen infrequently enough that the average cost per append stays constant.

Doubling Strategy
When the array is full, allocate a new array of 2x the size and copy all elements over. Over n appends, total copy work is roughly 1 + 2 + 4 + ... + n = 2n, so amortized cost per append is O(1). Growth factors of 1.5x (used by some implementations) trade slightly more frequent resizes for less wasted space.
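The doubling strategy can be sketched as a minimal Python class (the class name and internals are illustrative, not any library's real implementation):

```python
class DynamicArray:
    """Minimal dynamic-array sketch: fixed backing store, doubling on overflow."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity   # fixed-size backing array

    def append(self, value):
        if self._size == self._capacity:       # full: double and copy, O(n)
            self._capacity *= 2
            new_data = [None] * self._capacity
            for i in range(self._size):
                new_data[i] = self._data[i]
            self._data = new_data
        self._data[self._size] = value         # common case: O(1)
        self._size += 1

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._data[index]

    def __len__(self):
        return self._size

d = DynamicArray()
for i in range(10):
    d.append(i)
assert len(d) == 10 and d[7] == 7
assert d._capacity == 16   # capacities visited: 1, 2, 4, 8, 16
```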
Operation           Static Array    Dynamic Array
Access by index     O(1)            O(1)
Append (end)        N/A (fixed)     O(1) amortized
Insert (middle)     O(n)            O(n)
Delete (middle)     O(n)            O(n)
Search (unsorted)   O(n)            O(n)
Space overhead      None            Up to 2x
02 / Linked Lists

Pointer-Based Sequences

A linked list stores each element in a separate node, connected by pointers. Unlike arrays, nodes can be scattered across memory. This makes insertion and deletion at a known position O(1) (just re-wire pointers), but searching for a node is O(n) since you must traverse from the head.
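A sketch of both properties, assuming a minimal `Node` class with a value and a `next` pointer:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build the list 1 -> 2 -> 4
head = Node(1, Node(2, Node(4)))

# O(1) insertion after a known node: just re-wire pointers.
known = head.next                      # the node holding 2
known.next = Node(3, known.next)       # list is now 1 -> 2 -> 3 -> 4

# O(n) search: must walk from the head.
def find(head, target):
    cur = head
    while cur is not None:
        if cur.value == target:
            return cur
        cur = cur.next
    return None

values = []
cur = head
while cur:
    values.append(cur.value)
    cur = cur.next
assert values == [1, 2, 3, 4]
assert find(head, 3) is not None and find(head, 99) is None
```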

Singly Linked List: head → node → node → tail → null

Variants

Singly Linked

Each node has a next pointer. Traversal is one-way. Simplest form, least memory overhead per node.

Doubly Linked

Each node has next and prev pointers. Enables O(1) deletion of a given node and backward traversal.

Circular

The tail's next points back to the head. Useful for round-robin scheduling and ring-based algorithms.
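The doubly linked variant's O(1) deletion can be sketched as follows (`DNode` and `unlink` are illustrative names):

```python
class DNode:
    """Doubly linked node: prev/next pointers enable O(1) unlink."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def unlink(node):
    # Re-wire both neighbors around the node: O(1), no traversal needed.
    if node.prev:
        node.prev.next = node.next
    if node.next:
        node.next.prev = node.prev
    node.prev = node.next = None

a, b, c = DNode("a"), DNode("b"), DNode("c")
a.next, b.prev = b, a
b.next, c.prev = c, b

unlink(b)   # delete the middle node given only a pointer to it
assert a.next is c and c.prev is a
```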

Cache Unfriendly
Because nodes are heap-allocated at arbitrary addresses, linked list traversal causes frequent cache misses. For sequential access patterns, arrays will significantly outperform linked lists even when the Big-O complexity is the same.
03 / Stacks & Queues

Restricted-Access Structures

Stacks and queues limit how you interact with the data, enforcing a specific order of access. This restriction is their strength: it makes them simple, efficient, and exactly right for a large class of problems.

Stack (LIFO)

Last In, First Out. Three operations, all O(1): push (add to top), pop (remove from top), and peek (view top without removing). Typically implemented with a dynamic array or a singly linked list.

Stack Operations: push/pop ↓↑ at the top - [ top | mid | bottom ]

Use cases: function call stack, expression evaluation (infix to postfix), DFS traversal, undo/redo functionality, matching parentheses, backtracking algorithms.
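One of those use cases, matching parentheses, sketched with a Python list as the stack (`append` is push, `pop` is pop):

```python
def balanced(s: str) -> bool:
    """Check matching brackets with a stack."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)                      # push opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False                      # mismatched or unmatched closer
    return not stack                              # leftover openers are unmatched

assert balanced("(a[b]{c})")
assert not balanced("(]")
assert not balanced("((")
```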

Queue (FIFO)

First In, First Out. enqueue adds to the back, dequeue removes from the front, both O(1). Best implemented with a linked list or a circular buffer to avoid O(n) shifts that a naive array approach would require.

Queue Operations: dequeue ← [ front | mid | back ] ← enqueue

Use cases: BFS traversal, task scheduling, print queues, message buffers, rate limiting (sliding window).
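The BFS use case, sketched with `collections.deque` (the sample `graph` is made up for illustration):

```python
from collections import deque

def bfs_order(graph, start):
    """Visit nodes level by level; the FIFO queue drives the order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()             # O(1) dequeue from the front
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)     # O(1) enqueue at the back
    return order

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
assert bfs_order(graph, "A") == ["A", "B", "C", "D"]
```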

04 / Deques, Priority Queues & Ring Buffers

Specialized Variants

Deque (Double-Ended Queue)

Supports insertion and removal at both ends in O(1). Can serve as both a stack and a queue. Python's collections.deque and Java's ArrayDeque use a circular buffer internally. Useful for sliding window problems and work-stealing schedulers.
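A classic sliding-window use of a deque: the maximum of each window in O(n), using a monotonically decreasing deque of indices (a common interview pattern, sketched here, not a library API):

```python
from collections import deque

def window_max(values, k):
    """Max of each length-k window in O(n) total."""
    dq = deque()          # indices; the values at those indices are decreasing
    result = []
    for i, v in enumerate(values):
        if dq and dq[0] <= i - k:             # drop index that slid out
            dq.popleft()
        while dq and values[dq[-1]] <= v:     # drop smaller tail values
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            result.append(values[dq[0]])      # front always holds the window max
    return result

assert window_max([1, 3, -1, -3, 5, 3, 6, 7], 3) == [3, 3, 5, 5, 6, 7]
```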

Priority Queue

Elements are dequeued by priority, not insertion order. The standard implementation is a binary heap, giving O(log n) insert and O(log n) extract-min/max. Used in Dijkstra's algorithm, Huffman coding, event-driven simulations, and OS task schedulers.
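Python's standard `heapq` module exposes the binary min-heap directly; the task names below are made up for illustration:

```python
import heapq

# Min-heap of (priority, task) pairs: heapq keeps the smallest pair at index 0.
tasks = []
heapq.heappush(tasks, (3, "write report"))    # O(log n) insert
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (2, "review PR"))

assert tasks[0] == (1, "fix outage")          # O(1) peek at the minimum
order = [heapq.heappop(tasks)[1] for _ in range(len(tasks))]   # O(log n) extracts
assert order == ["fix outage", "review PR", "write report"]
```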

Operation   Queue         Deque            Priority Queue (Heap)
Insert      O(1) back     O(1) both ends   O(log n)
Remove      O(1) front    O(1) both ends   O(log n) min/max
Peek        O(1) front    O(1) both ends   O(1) min/max
Search      O(n)          O(n)             O(n)

Ring Buffer (Circular Buffer)

A fixed-size array where the write position wraps around using modular arithmetic: pos = (pos + 1) % capacity. When the buffer is full, the oldest data is overwritten. No memory allocation after initialization.
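A minimal sketch of that wrap-around logic (the class and method names are illustrative):

```python
class RingBuffer:
    """Fixed-capacity circular buffer; writes overwrite the oldest data when full."""

    def __init__(self, capacity):
        self._buf = [None] * capacity    # the only allocation, done up front
        self._cap = capacity
        self._read = 0                   # index of the oldest element
        self._count = 0

    def write(self, value):
        pos = (self._read + self._count) % self._cap    # wrap-around arithmetic
        self._buf[pos] = value
        if self._count == self._cap:                    # full: advance read head,
            self._read = (self._read + 1) % self._cap   # overwriting the oldest
        else:
            self._count += 1

    def read(self):
        if self._count == 0:
            raise IndexError("buffer empty")
        value = self._buf[self._read]
        self._read = (self._read + 1) % self._cap
        self._count -= 1
        return value

rb = RingBuffer(3)
for v in [1, 2, 3, 4]:      # the 4th write wraps and overwrites the oldest (1)
    rb.write(v)
assert [rb.read(), rb.read(), rb.read()] == [2, 3, 4]
```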

Ring Buffer (wrap-around): [ R | D | D | W | _ | _ ]   R = read head, W = write head, D = data, _ = empty

Use cases: I/O buffers, audio/video streaming, logging systems, producer-consumer patterns. Ring buffers are favored in real-time systems because they have zero allocation overhead and predictable memory usage.

05 / Advanced Linear Structures

Bloom Filters, Skip Lists & More

Bloom Filter

A probabilistic structure that tests set membership. Uses multiple hash functions mapping to a bit array. Can report false positives but never false negatives. Used in databases, caches, and spell checkers to avoid expensive lookups.
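A toy sketch of the idea, deriving k hash positions from salted SHA-256 digests over an m-bit array (the parameters and class are illustrative, not a production filter):

```python
import hashlib

class BloomFilter:
    """Sketch: k hash functions over an m-bit array."""

    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = 0                   # a Python int used as a bit array

    def _positions(self, item):
        for i in range(self.k):         # salt each hash with its index
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
bf.add("bob")
assert bf.might_contain("alice")    # no false negatives, ever
assert bf.might_contain("bob")
# A "not in set" answer is always correct; an "in set" answer may
# occasionally be a false positive when unrelated items share bits.
```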

Skip List

A layered linked list with express lanes. Each level skips over nodes, giving O(log n) search, insert, and delete on average. A randomized alternative to balanced BSTs, used in Redis sorted sets and LevelDB.

LRU Cache

Combines a doubly linked list with a hash map. The list maintains access order; the map gives O(1) lookups. On access, move the node to the front. On eviction, remove from the tail. O(1) for all operations.
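The same pattern can be sketched with `collections.OrderedDict`, which is itself a hash map layered over a doubly linked list, so `move_to_end` and `popitem` are the O(1) re-wire and tail eviction described above:

```python
from collections import OrderedDict

class LRUCache:
    """LRU sketch: OrderedDict = hash map over a doubly linked list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)           # O(1): move node to the MRU end
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)    # O(1): evict from the LRU end

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
assert cache.get("a") == 1      # "a" becomes most recently used
cache.put("c", 3)               # evicts "b", the least recently used
assert cache.get("b") is None
assert cache.get("c") == 3
```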

Disjoint Set (Union-Find)

Tracks elements partitioned into disjoint sets. Supports find (which set?) and union (merge two sets). With path compression and union by rank, both operations run in near-O(1) amortized time. Key to Kruskal's MST algorithm.
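Both optimizations fit in a short sketch (here `find` uses path halving, a common variant of path compression):

```python
class DisjointSet:
    """Union-Find with path compression (halving) and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False                     # already in the same set
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                 # attach shorter tree under taller
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

ds = DisjointSet(5)
ds.union(0, 1)
ds.union(1, 2)
assert ds.find(0) == ds.find(2)
assert ds.find(3) != ds.find(0)
```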

LRU Cache Pattern
The doubly-linked-list + hash-map LRU cache is one of the most frequently asked interview questions. The key insight: the hash map stores pointers directly into the linked list, so you can locate and move any node in O(1) without traversal.
06 / Choosing the Right Structure

Trade-offs & Decision Guide

There is no universally best data structure. The right choice depends on your access pattern, frequency of operations, memory constraints, and whether you need ordering.

Need                               Best Choice             Why
Fast random access                 Array / Dynamic array   O(1) index, cache-friendly
Frequent insert/delete at ends     Deque / Linked list     O(1) at both ends
Frequent insert/delete in middle   Linked list             O(1) at known position
LIFO ordering                      Stack                   O(1) push/pop
FIFO ordering                      Queue / Ring buffer     O(1) enqueue/dequeue
Priority-based ordering            Priority queue (heap)   O(log n) insert/extract
Fixed-size streaming buffer        Ring buffer             Zero alloc, wrap-around
Set membership (probabilistic)     Bloom filter            Space-efficient, no false negatives
Recently-used cache                LRU cache               O(1) access + eviction
Dynamic connectivity               Union-Find              Near O(1) union/find
Practical Reality
In practice, arrays and dynamic arrays dominate. Their cache-friendliness often makes them faster than linked lists even for operations where linked lists have better Big-O complexity. Always benchmark with real data before choosing a pointer-heavy structure over a contiguous one.

Test Yourself

Question 01
What is the time complexity of accessing an element by index in an array?
Arrays use contiguous memory, so the address of any element can be computed directly as base + index * element_size, giving constant-time access.
Question 02
Why does a dynamic array achieve amortized O(1) appends despite occasional O(n) resizes?
By doubling, each resize copies all n elements, but the next resize won't happen until n more appends. The total copy cost over n appends sums to about 2n, so the amortized cost per operation is O(1).
Question 03
Which linked list variant allows O(1) deletion of a given node (when you have a pointer to it)?
A doubly linked list stores both prev and next pointers, so you can re-wire both neighbors in O(1) without needing to traverse from the head to find the predecessor.
Question 04
Which data structure is best suited for implementing DFS on a graph?
DFS explores as deep as possible before backtracking, which matches the LIFO behavior of a stack. BFS, in contrast, uses a queue (FIFO).
Question 05
What is the time complexity of extracting the minimum element from a binary min-heap?
The minimum is at the root (O(1) to find), but after removing it you must restore the heap property by sifting down, which takes O(log n) in the worst case.
Question 06
A ring buffer is full. What happens when a new element is written?
A ring buffer has a fixed size and uses modular arithmetic to wrap around. When full, writing advances the write pointer, overwriting the oldest data at the read position.
Question 07
A Bloom filter reports that an element is NOT in the set. What can you conclude?
Bloom filters can produce false positives (says "maybe in set" when it isn't) but never false negatives. If it says "not in set," that is guaranteed to be correct.
Question 08
What two data structures are combined to build an O(1) LRU cache?
The hash map provides O(1) key lookup, returning a pointer directly into the doubly linked list. The list maintains recency order, and nodes can be moved to the front or removed from the tail in O(1).
Question 09
Why do arrays often outperform linked lists in practice, even for operations where linked lists have better Big-O?
CPU caches exploit spatial locality: when one array element is accessed, nearby elements are loaded into cache automatically. Linked list nodes are scattered in memory, causing frequent cache misses that dominate actual runtime.
Question 10
What is the amortized time complexity of find and union in a Disjoint Set with path compression and union by rank?
With both path compression and union by rank, the amortized cost per operation is O(α(n)), where α is the inverse Ackermann function. For all practical input sizes, α(n) ≤ 4, making it effectively constant.