The Fenwick Tree, also known as the Binary Indexed Tree (BIT), is a popular and elegant data structure that maintains the prefix sums of a dynamic array¹. With this data structure we can update values in the original array and still answer prefix-sum queries. Both operations run in logarithmic time.

More precisely, the Fenwick tree solves the following problem.

We have an array \(A[1,n]\) of integers, and we would like to support the following operations:

  • sum(i) returns the sum of the elements in \(A[1..i]\);
  • add(i, v) adds the value \(v\) to the entry \(A[i]\).

The Fenwick tree efficiently handles these queries in \(\Theta(\log n)\) time while using linear space. In fact, the Fenwick tree is an implicit data structure, which means it requires only \(O(1)\) additional space in addition to the space needed to store the input data (the array \(A\) in our case).

In our description, we are going to use the following array \(A\) as a running example. Notice that we are using one-based indexing for the array.


Two Trivial Solutions

Let’s describe two trivial solutions for the problem above.

The first solution simply stores \(A\) as it is. This way, sum(i) is solved by scanning the array in \(\Theta(n)\) time, and add(i, v) is solved in \(O(1)\) time.

The second solution, instead, stores the prefix-sums of \(A\). This way, sum(i) is solved in \(O(1)\) time, and add(i, v) is solved by modifying all the entries up to position \(i\) in \(\Theta(n)\) time.

The sum/add query time tradeoffs of these solutions are clearly unsatisfactory.
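
For concreteness, here is a minimal sketch of the second solution (the struct and method names are ours, and we use 0-based indexing): sum just reads a stored prefix sum in \(O(1)\) time, while add must update every prefix sum from position \(i\) onwards.

struct PrefixSums {
    prefix: Vec<i64>, // prefix[i] = A[0] + A[1] + ... + A[i]
}

impl PrefixSums {
    fn new(a: &[i64]) -> Self {
        let mut prefix = Vec::with_capacity(a.len());
        let mut running = 0;
        for &v in a {
            running += v;
            prefix.push(running);
        }
        Self { prefix }
    }

    fn sum(&self, i: usize) -> i64 {
        self.prefix[i] // O(1)
    }

    fn add(&mut self, i: usize, v: i64) {
        // Every prefix sum from position i onwards changes: Θ(n) in the worst case.
        for p in &mut self.prefix[i..] {
            *p += v;
        }
    }
}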


Fenwick Tree, Level by Level

The Fenwick Tree provides better tradeoffs for this problem. In our description, we will gradually introduce this data structure by constructing it level by level.

To start, let’s simplify the original problem slightly. In this variant, we’ll focus on solving sum queries only for positions that are powers of \(2\), like positions \(1\), \(2\), \(4\), and \(8\) in our array \(A\). The solution of this variant will be the first level of our Fenwick Tree.

The idea for solving this relaxed variant is to sparsify the second trivial solution above, storing only the prefix sums of positions that we need for queries. The figure below illustrates this solution as a tree, with a fictitious root node named \(0\) and child nodes named \(1\), \(2\), \(4\), and \(8\), each storing the sum up to the corresponding power of \(2\). Additionally, below every node, we provide the range of positions it covers. For instance, node \(4\) covers positions in the range \([1, 4]\).

We address the queries of the simplified problem as follows:

  • The sum(i) query is straightforward. We simply access node \(i\). Of course, this only works for indexes \(i\) that are a power of \(2\).

  • For the add(i, v) query, we need to add \(v\) to all nodes covering ranges that include position \(i\). For example, for the query add(3, 10), we add the value \(10\) to nodes \(4\) and \(8\). In general, we first find the smallest power of \(2\) greater than or equal to \(i\), let’s call it \(j\). Then, we add \(v\) to nodes \(j, 2j, 2^2j, 2^3j, \ldots\).

Observe that sum takes constant time and add takes \(\Theta(\log n)\) time. Hooray! We are within our target time complexity. Now, can we extend this solution to support sum queries on more positions?
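
A possible sketch of this first level alone is the following (the names FirstLevel and level are ours; for simplicity, we assume that \(n\) is a power of \(2\) and use 1-based positions).

struct FirstLevel {
    n: usize,
    level: Vec<i64>, // level[k] stores the prefix sum up to position 2^k
}

impl FirstLevel {
    fn with_len(n: usize) -> Self {
        assert!(n.is_power_of_two());
        Self {
            n,
            level: vec![0; n.trailing_zeros() as usize + 1],
        }
    }

    /// sum(i) works only when i is a power of 2: just read the stored value.
    fn sum(&self, i: usize) -> i64 {
        assert!(i.is_power_of_two() && i <= self.n);
        self.level[i.trailing_zeros() as usize]
    }

    /// add(i, v): update every node j, 2j, 4j, ..., where j is the smallest
    /// power of 2 greater than or equal to i.
    fn add(&mut self, i: usize, v: i64) {
        let mut j = i.next_power_of_two();
        while j <= self.n {
            self.level[j.trailing_zeros() as usize] += v;
            j *= 2;
        }
    }
}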

We observe that we’re not currently supporting queries for positions within the ranges between consecutive powers of \(2\): for instance, positions in the range \([5,7]\), which fall between \(2^2\) and \(2^3\). But wait! Enabling queries for this subarray is just a smaller instance of our original problem. Therefore, we can apply the same strategy by adding a new level to our tree. If the subarray is \(A[l..r]\), the new level will support the sum(i) query for any \(i\) such that \(i-l+1\) is a power of \(2\).

Our two-level tree can now handle sum(i) queries also for positions that are the sum of two powers of \(2\). Why? Consider a position \(i\) expressed as \(2^{k'}+2^{k}\), where \(k'>k\). We can decompose the range \([1,i]\) into two subranges: \([1,2^{k'}]\) and \([2^{k'}+1,2^{k'}+2^{k}=i]\). Both of these subranges are covered by nodes in our tree. Specifically, range \([1,2^{k'}]\) is covered by node \(2^{k'}\) at the first level, while \([2^{k'}+1,2^{k'}+2^{k}=i]\) is covered by node \(i\) at the second level.

For example, let’s consider the query sum(5). We can handle this in our two-level tree because \(5=2^2+2^0\). Consequently, the range \([1,5]\) is divided into \([1,4]\) and \([5,5]\), and the result (which is \(6\)) is obtained by summing the values of nodes \(2^2=4\) and \(2^2+2^0=5\).

Which positions are still not supported for sum queries? Positions that are neither powers of \(2\) nor the sum of two powers of \(2\). In our example, with \(n=8\), only position \(7=2^2+2^1+2^0\) falls into this category. So, what do we do next? We add a new level to our tree to support queries for positions that are the sum of three powers of \(2\).

That’s all. This is the Fenwick tree for the array \(A\). Now, let’s make some observations:

  1. While we’ve represented our solution as a tree, it can also be represented as an array \(FT\) of size \(n+1\), as shown in the figure above.
  2. We no longer require the original array \(A\) because any of its entries \(A[i]\) can be obtained as \(A[i] = \text{sum}(i) - \text{sum}(i-1)\). This is why the Fenwick tree is an implicit data structure.
  3. Let \(h = \lfloor \log n \rfloor + 1\), which is the length of the binary representation of any position in the range \([1,n]\). Since any position can be expressed as the sum of at most \(h\) powers of \(2\), the tree has no more than \(h\) levels. In fact, the number of levels is either \(h\) or \(h-1\), depending on the value of \(n\).

Now, let’s delve into the details of how to solve our sum and add queries on a Fenwick tree.


Answering a sum query

Let’s start by discussing the sum(i) query. Based on the previous discussion, solving this query involves beginning at node \(i\) and traversing up the tree to reach node \(0\). Thus, sum takes time proportional to the height of the tree, resulting in a time complexity of \(\Theta(\log n)\).

For a running example, let’s take the case where \(i=7\). We start at node \(7\) and move to its parent (node \(6\)), its grandparent (node \(4\)), and stop at its great-grandparent (the fictitious node \(0\)), summing their values along the way. This works because the ranges of these nodes (\([1,4]\), \([5,6]\), and \([7,7]\)) collectively cover the queried range \([1,7]\).

It’s important to note that answering a sum query becomes straightforward if we were allowed to store the tree’s structure. However, a significant part of the Fenwick tree’s elegance lies in the fact that storing the tree is not actually necessary. This is because we can efficiently navigate from a node to its parent using a few bit-tricks. This is the reason why the Fenwick tree is also referred to as the Binary Indexed Tree.


Compute the Parent of a Node

We want to compute the parent of a node, and we want to do it quickly and without representing the structure of the tree.

Let’s examine the binary representations of the IDs of the nodes involved in answering the previous query.

Can you spot a pattern? Surprisingly, the binary representation of a node’s parent can be obtained by removing the trailing one (i.e., the rightmost bit set to 1) from the binary representation of the node itself.

Let’s explore why this method works.

Suppose we have a node \(i\), and its range is \([j,i]\) for some \(j\). Its children are the nodes \(i+2^0\), \(i+2^1\), \(i+2^2\), and so on, for every power \(2^k\) smaller than the trailing one of \(i\); they span the ranges \([i+1, i+2^0]\), \([i+1, i+2^1]\), \([i+1, i+2^2]\), and so forth. The binary representation of any of these children is identical to that of \(i\), except for the addition of a trailing one (due to the term \(2^k\)).

Now, we need a clever bit-trick to efficiently obtain the parent of a node. Based on our previous discussion, it’s evident that we need a way to remove the trailing one from the binary representation of a node \(i\). The trailing one can be isolated by computing \(k = i {\tt \&} -i\). Thus, \(i-k\) is the parent of \(i\).

This works because negative numbers are represented in two’s complement form: the two’s complement of a number is obtained by taking the bitwise complement of the number and then adding one to it.

For instance, if we have the binary number \(7\) as 0111, its two’s complement, which represents \(-7\), is 1001.

The key property of the two’s complement is that it inverts all the bits to the left of the trailing one, while the trailing one itself (and the zeros to its right) remain unchanged. Thus, when we compute the bitwise AND of a number and its two’s complement, only the trailing one survives. Therefore, the final subtraction \(i-k\) effectively cancels out this bit from \(i\), as required.

For example, take \(i = 6 = {\tt 0110}\). Its two’s complement is \(-6 = {\tt 1010}\), so \(k = 6 {\tt \&} -6 = {\tt 0010} = 2\), and the parent of node \(6\) is \(6 - 2 = 4\).
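
To make this concrete, here is a tiny sketch (the helper name is ours) that lists the nodes visited while answering a sum query by repeatedly moving to the parent.

/// Returns the (1-based) nodes visited by sum(i): i, its parent, its grandparent, ...
fn nodes_visited_by_sum(mut i: usize) -> Vec<usize> {
    let mut nodes = Vec::new();
    while i != 0 {
        nodes.push(i);
        i -= i & i.wrapping_neg(); // remove the trailing one: move to the parent
    }
    nodes
}

// sum(7) visits nodes 7, 6, and 4, and then stops at the fictitious node 0:
// assert_eq!(nodes_visited_by_sum(7), vec![7, 6, 4]);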


Performing an add

Now, let’s consider the operation add(i, v). We need to add the value \(v\) to each node whose range includes position \(i\).

Certainly, node \(i\) is one of these nodes since its range ends at \(i\). Additionally, the right siblings of node \(i\) also encompass the position \(i\) in their ranges. This is because siblings share the same starting position, and right siblings have increasing sizes. The right siblings of the parent of node \(i\), the right siblings of the grandparent, and so on also contain position \(i\).

It might seem like we have to modify a large number of nodes. However, a simple observation reveals that this number is at most \(\log n\). This is because, each time we move from a node to its right sibling or to the right sibling of its parent, the size of the covered range at least doubles. And a range cannot double more than \(\log n\) times.

The figure below shows in red the nodes to modify for the operation add(5, _).

Now that we know which nodes to modify for add(i,_), let’s discuss how to compute these nodes with bit-tricks.

Continuing the above example, starting from \(i=5\), the next node to modify is its right sibling, node \(6\). Let’s take a closer look at their binary representations.

Can you spot a pattern?

It seems that we need to isolate the trailing one in \(5\), which is 0001, and add it to \(5\) to obtain \(6\). Is this always the correct approach?

Let’s try it with another node. The right sibling of the parent of \(6\) (and, therefore, of \(5\)) is \(8\).

The trailing one in \(6\) is 0010 (i.e., \(2\)) and \(6+2=8\). Cool!

Why is this method correct? The binary representation of a node and its siblings match, except for the position of the trailing one. When we move from a node to its right sibling, this trailing one shifts one position to the left. Adding the trailing one to the node’s ID accomplishes the required shift, as we saw when we added its trailing one to \(5\).

Now, consider the ID of a node that is the last child of its parent. In this case, the trailing one and the next set bit to its left are adjacent. To obtain the right sibling of the parent, we need to remove the trailing one and shift that next bit one position to the left.

Thankfully, this effect is once again achieved by adding the trailing one to the node’s ID.

The time complexity of the add query is \(\Theta(\log n)\): each time we move to the right sibling of the current node or to the right sibling of one of its ancestors, the trailing one in the binary representation shifts at least one position to the left. This can occur at most \(\lfloor \log n \rfloor +1\) times.
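
To make this navigation concrete, here is a small sketch (the helper name is ours) that lists the nodes touched by an add operation, using the same trailing-one trick described above.

/// Returns the (1-based) nodes modified by add(i, _) in a tree over n positions.
fn nodes_touched_by_add(mut i: usize, n: usize) -> Vec<usize> {
    let mut nodes = Vec::new();
    while i <= n {
        nodes.push(i);
        i += i & i.wrapping_neg(); // add the trailing one: right sibling (of the node or of an ancestor)
    }
    nodes
}

// With n = 8, add(5, _) touches nodes 5, 6, and 8:
// assert_eq!(nodes_touched_by_add(5, 8), vec![5, 6, 8]);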


Fenwick Tree in Rust

Here, we present a minimal Rust implementation of a Fenwick tree. In this non-generic implementation, we’ve arbitrarily chosen to use i64 as the type for the values. While we’ve transitioned to 0-based indexing for queries, internally, we still use the 1-based indexing to maintain consistency with the notes.

For a more advanced implementation, one may want to allow generic value types and move away from the 1-based indexing. Additionally, there are various potential optimizations to enhance its performance. For more details, refer to Practical trade-offs for the prefix-sum problem.

#[derive(Debug)]
pub struct FenwickTree {
    tree: Vec<i64>,
}

impl FenwickTree {
    pub fn with_len(n: usize) -> Self {
        Self {
            tree: vec![0; n + 1],
        }
    }

    pub fn len(&self) -> usize {
        self.tree.len() - 1
    }

    /// Indexing is 0-based, even if internally we use 1-based indexing
    pub fn add(&mut self, i: usize, delta: i64) {
        let mut i = i + 1; 
        assert!(i < self.tree.len());

        while i < self.tree.len() {
            self.tree[i] += delta;
            i = Self::next_sibling(i);
        }
    }

    /// Indexing is 0-based, even if internally we use 1-based indexing
    pub fn sum(&self, i: usize) -> i64 {
        let mut i = i + 1;  

        assert!(i < self.tree.len());
        let mut sum = 0;
        while i != 0 {
            sum += self.tree[i];
            i = Self::parent(i);
        }

        sum
    }

    pub fn range_sum(&self, l: usize, r: usize) -> i64 {
        self.sum(r) - if l == 0 { 0 } else { self.sum(l - 1) }
    }

    fn isolate_trailing_one(i: usize) -> usize {
        if i == 0 {
            0
        } else {
            1 << i.trailing_zeros()
        }
    }

    fn parent(i: usize) -> usize {
        i - Self::isolate_trailing_one(i)
    }

    fn next_sibling(i: usize) -> usize {
        i + Self::isolate_trailing_one(i)
    }
}
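
As a quick sanity check, here is a possible usage sketch; the input values are arbitrary and chosen only for illustration.

fn main() {
    let a: [i64; 8] = [3, 1, 4, 1, 5, 9, 2, 6];

    // Build the tree with n add operations, in Θ(n log n) time overall.
    let mut ft = FenwickTree::with_len(a.len());
    for (i, &v) in a.iter().enumerate() {
        ft.add(i, v);
    }

    assert_eq!(ft.sum(3), 3 + 1 + 4 + 1); // prefix sum of the first 4 values
    ft.add(2, 10);                        // add 10 to the value at index 2
    assert_eq!(ft.range_sum(2, 4), (4 + 10) + 1 + 5);
}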


Applications of Fenwick Tree

We now present three problems that can be solved with a Fenwick tree.


Counting Inversions in an Array

We are given an array \(A[1 .. n]\) of \(n\) positive integers. If \(1 \leq i < j \leq n\) and \(A[i] > A[j]\), then the pair \((i, j)\) is called an inversion of \(A\).

The goal is to count the number of inversions of \(A\).

We assume that the largest integer \(M\) in array \(A\) is in \(O(n)\). This assumption is important because we’re using a Fenwick Tree of size \(M\) and building such a data structure takes \(\Theta(M)\) time and space. If, on the other hand, \(M\) is too large, we need to sort array \(A\) and replace each element with its rank in the sorted array.

Then, we use a Fenwick tree on an array \(B\) with \(M\) elements, initially all set to \(0\). We scan array \(A\) from left to right. When processing \(A[j]\), we add \(1\) at position \(A[j]\) of \(B\). The number of elements larger than \(A[j]\) that we’ve already processed can be calculated with range_sum(A[j]+1, M).

The running time is \(\Theta(n\log n)\). It’s worth noting that there is another popular solution with the same time complexity, which utilizes a variant of Merge Sort.

A Rust implementation is as follows.

pub fn counting_inversions(a: &[u64]) -> usize {
    if a.is_empty() {
        return 0;
    }

    let max = *a.iter().max().unwrap() as usize;
    let mut ft = FenwickTree::with_len(max + 1);

    let mut count: usize = 0;
    for &e in a {
        count += ft.range_sum((e + 1) as usize, max) as usize;
        ft.add(e as usize, 1);
    }

    count
}
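
For instance, counting_inversions(&[2, 3, 1]) returns \(2\), since the two inversions are the pairs \((1, 3)\) and \((2, 3)\).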


Nested Segments

This problem is from CodeForces.

We are given \(n\) segments: \([l_1, r_1], [l_2, r_2], \ldots, [l_n, r_n]\) on a line. There are no coinciding endpoints among the segments.

The task is to determine and report the number of other segments each segment contains.

We can restate the problem as follows: For the \(i\)-th segment, we want to count the number of segments \(j\) such that the following conditions hold: \(l_i < l_j\) and \(r_j < r_i\).

This problem can be solved by employing the sweep line algorithm and a Fenwick tree. First, we build the Fenwick tree by adding \(1\) at each position that corresponds to the right endpoint of a segment. This way, sum(r) reports the number of segments that end in the range \([1,r]\).

Next, we let a sweep line process the segments in increasing order of their left endpoints. When we process the segment \([l_i,r_i]\), we compute sum\((r_i-1)\) as the result for the current segment. Before moving to the next segment, we add \(-1\) at position \(r_i\) to remove the contribution of the right endpoint of the current segment.

The claim is that sum\((r_i-1)\) is the number of segments contained in \([l_i,r_i]\). This is because all the segments that start before \(l_i\) have already been processed, and their right endpoints have been removed from the Fenwick tree. Therefore, sum\((r_i-1)\) is the number of segments that start after \(l_i\) and end before \(r_i\).
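
A possible Rust sketch of this sweep, reusing the FenwickTree defined above, is the following. The function name and signature are ours, and we assume that the endpoints are distinct positive integers small enough to index the tree directly (otherwise, compress the coordinates first).

/// For each segment (in input order), returns how many other segments it contains.
fn nested_segments(segments: &[(usize, usize)]) -> Vec<i64> {
    let max_r = segments.iter().map(|&(_, r)| r).max().unwrap_or(0);
    let mut ft = FenwickTree::with_len(max_r + 1);

    // Add 1 at the right endpoint of every segment.
    for &(_, r) in segments {
        ft.add(r, 1);
    }

    // Sweep the segments by increasing left endpoint.
    let mut order: Vec<usize> = (0..segments.len()).collect();
    order.sort_by_key(|&i| segments[i].0);

    let mut answer = vec![0; segments.len()];
    for i in order {
        let (_, r) = segments[i];
        // Segments still in the tree start after l_i; those ending before r_i
        // are exactly the ones nested in the current segment.
        answer[i] = ft.sum(r - 1);
        ft.add(r, -1); // remove the right endpoint of the current segment
    }
    answer
}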


Update the Array

This problem is from SPOJ.

We are given an array \(A[1,n]\), with all entries initially set to \(0\), and we would like to support two operations:

  • access(i) returns \(A[i]\);
  • range_update(l, r, v) updates the entries in \(A[l..r]\) by adding \(v\).

In this solution, we utilize a Fenwick tree on an array \(B[1,n]\), initially populated with zeros. For a range_update(l, r, v), we add \(v\) to \(B[l]\) and subtract \(v\) from \(B[r+1]\). This ensures that \(A[i]\) equals the prefix sum of \(B\) up to position \(i\).

A Rust implementation of this solution is the following.

#[derive(Debug)]
struct UpdateArray {
    ft: FenwickTree,
}

impl UpdateArray {
    pub fn with_len(n: usize) -> Self {
        Self {
            ft: FenwickTree::with_len(n),
        }
    }

    pub fn len(&self) -> usize {
        self.ft.len()
    }

    pub fn access(&self, i: usize) -> i64 {
        self.ft.sum(i)
    }

    pub fn range_update(&mut self, l: usize, r: usize, v: i64) {
        assert!(l <= r);
        assert!(r < self.ft.len());

        self.ft.add(l, v);
        if r + 1 < self.ft.len() {
            self.ft.add(r + 1, -v);
        }
    }
}
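
A brief usage sketch (with arbitrary values and 0-based indexing) could be:

fn update_array_example() {
    let mut ua = UpdateArray::with_len(8);
    ua.range_update(2, 5, 7);  // A[2..=5] += 7
    ua.range_update(4, 7, -1); // A[4..=7] -= 1
    assert_eq!(ua.access(3), 7);
    assert_eq!(ua.access(4), 6);
    assert_eq!(ua.access(7), -1);
}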


Dynamic Prefix-sums with Range Update

In the previous problem, the range update was paired only with the access(i) operation. In this section, we want to support range_update(l, r, v) together with sum(i), which makes the problem harder. Notice that access(i) is still immediately supported as sum(i) - sum(i-1).

More formally, the problem is as follow.

Given an array \(A[1,n]\) of integers, we would like to support the following operations.

  • sum(i) returns \(\sum_{k=1}^i A[k]\);
  • range_update(l, r, v) updates the entries in \(A[l..r]\) by adding \(v\).

We notice that the add operation of the original Fenwick tree is just a special case of the range_update operation. Moreover, as mentioned above, the access(i) operation is also supported with two sum operations.

It’s evident that we can solve a range_update(l, r, v) with \(r-l+1\) add queries in \(\Theta((r-l+1)\log n)\) time. However, our goal is to achieve a time complexity of \(\Theta(\log n)\). This time complexity is independent of the size of the updated range and, therefore, much better than the previous one for large ranges.

The solution to this problem utilizes two Fenwick trees. Therefore, we double the space usage to support the more powerful range_update(l, r, v) operation.

Initially, we consider a flawed solution using a single Fenwick tree, denoted as \(FT_1\). To fix the issues in this solution, we introduce a second Fenwick tree, denoted as \(FT_2\).

In our initial approach, we follow a strategy similar to the one used in the ‘Update the Array’ problem. For a range_update(l, r, v), we modify \(FT_1\) by adding \(v\) at position \(l\) and subtracting \(v\) at position \(r+1\). When querying sum(i), we multiply the result from \(FT_1\) by \(i\). This approach, however, leads to errors in the results.

Let’s consider a range_update(l, r, v) operation on a brand-new Fenwick tree. The correct results for a query sum(i) after the update are the following.

  • If \(1 \leq i < l\), sum(i) is \(0\).
  • If \(l \leq i \leq r\), sum(i) is \(v(i-l+1)\).
  • If \(r < i\), sum(i) is \(v(r-l+1)\).

Instead, the results returned by our implementation of sum(i) are the following.

  • If \(1 \leq i < l\), sum(i) is \(0\);
  • If \(l \leq i \leq r\), sum(i) is \(v\cdot i = v (l-1) + v(i-l+1)\);
  • If \(r < i\), sum(i) is \((v-v)i = 0\).

Our initial implementation reports the correct results for \(1 \leq i < l\) but introduces errors in other cases. Specifically, for \(l\leq i \leq r\), it includes an additional term \(v(l-1)\), while it erroneously reports \(0\) instead of the correct value \(v(r-l+1)\) in the latter case.

To address these errors, we introduce a second Fenwick tree, denoted as \(FT_2\), which will keep track of these discrepancies. When we perform a range_update(l, r, v), we add \(-v(l-1)\) to position \(l\) and \(v\cdot r\) to position \(r+1\) in \(FT_2\).

This revised approach ensures that the result of sum(i) can be expressed as \(a\cdot i + b\), where \(a\) represents the sum up to \(i\) in \(FT_1\) and \(b\) is the sum up to \(i\) in \(FT_2\).

The value of \(b\) from the second Fenwick tree corrects the errors present in the flawed solution. Specifically:

  • For \(1 \leq i < l\), \(b\) equals \(0\).
  • For \(l \leq i \leq r\), \(b\) equals \(-v(l-1)\).
  • For \(r < i\), \(b\) is \(v\cdot r - v(l-1) = v(r-l+1)\).

The Rust implementation of Fenwick tree with range update is as follows.

#[derive(Debug)]
struct RangeUpdate {
    ft1: FenwickTree,
    ft2: FenwickTree,
}

impl RangeUpdate {
    pub fn with_len(n: usize) -> Self {
        Self {
            ft1: FenwickTree::with_len(n),
            ft2: FenwickTree::with_len(n),
        }
    }

    pub fn len(&self) -> usize {
        self.ft1.len()
    }

    pub fn sum(&self, i: usize) -> i64 {
        self.ft1.sum(i) * i as i64 + self.ft2.sum(i)
    }

    pub fn access(&self, i: usize) -> i64 {
        self.sum(i) - if i == 0 { 0 } else { self.sum(i - 1) }
    }

    pub fn add(&mut self, i: usize, v: i64) {
        self.range_update(i, i, v)
    }

    pub fn range_update(&mut self, l: usize, r: usize, v: i64) {
        self.ft1.add(l, v);

        self.ft2.add(l, -v * (l as i64 - 1));

        if r + 1 < self.len() {
            self.ft1.add(r + 1, -v);
            self.ft2.add(r + 1, v * r as i64);
        }
    }
}
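
A small sketch that checks the three cases discussed above (again with arbitrary values and 0-based indexing) could be:

fn range_update_example() {
    let mut ru = RangeUpdate::with_len(10);
    ru.range_update(3, 6, 5); // add 5 to A[3..=6]

    assert_eq!(ru.sum(2), 0);     // i < l
    assert_eq!(ru.sum(4), 5 * 2); // l <= i <= r: v * (i - l + 1)
    assert_eq!(ru.sum(9), 5 * 4); // r < i: v * (r - l + 1)
    assert_eq!(ru.access(6), 5);
}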


Further Readings


Exercises

These notes are for the “Competitive Programming and Contests” course at Università di Pisa.

  1. For an introduction to (static) prefix sums and their applications, take a look at ‘The Power of Prefix Sums’ post.