Binary Indexed Tree (Fenwick Tree)


The Binary Indexed Tree (BIT), also known as the Fenwick Tree, is a very useful data structure with many practical applications. It is designed for one particular kind of problem. Suppose we have an array of elements and we need to perform two types of queries frequently:
  1. Change the value of any element in the list.
  2. Get the sum of all the elements in any range within the given list.
The naive solution has time complexity O(1) for query 1 and O(n) for query 2. Suppose we make m queries. The worst case (when all queries are of type 2) has time complexity O(n * m). Binary Indexed Trees are easy to code and have worst-case time complexity O(m log n), i.e. each query of either type takes at most O(log n).

Now, let us consider the elements of an array as frequencies, e.g. arr[1]=5, arr[2]=6, etc. They can have any values; the term frequency is used only as notation.

Let us consider an initial array of size N (any valid size within the integer range). We can easily construct another array of the same size N that holds the cumulative values, i.e. its entry at index i is arr[1] + arr[2] + ... + arr[i].
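As a rough illustration of the cumulative array described above, here is a minimal sketch in C. The function name build_cumulative and the 1-based layout (index 0 unused) are assumptions made for this example only. With such an array a range sum is cheap, but changing one element forces every later entry to be recomputed, which is the cost the BIT avoids.

```c
#include <stddef.h>

/* Fill cum[1..n] with the running totals of arr[1..n].
 * Both arrays are 1-based and must hold n + 1 entries; index 0 is unused. */
void build_cumulative(const int *arr, long long *cum, size_t n)
{
    cum[0] = 0;
    for (size_t i = 1; i <= n; i++)
        cum[i] = cum[i - 1] + arr[i];
}
```

For arr[1]=5 and arr[2]=6, this gives cum[1]=5 and cum[2]=11.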


The Binary Indexed Tree works with the help of a responsibility array. It is an array of size >= N in which each index holds a specific value (called the responsibility sum).
Let us declare int res[N];

Basic Idea

We know that each positive integer can be represented as a sum of powers of 2, e.g.
  • 9 = 8+1   ----------> 1001 (binary notation)
  • 37 = 32+4+1 ----------> 00100101 (binary notation)
  • 15 = 8+4+2+1 ----------> 1111 (binary notation)
We can see that the number of binary digits of 15, and hence the maximum number of 1s in its binary notation, is ceil(log2(15)) = 4, and similarly for 37 and 9,
i.e. for 37, ceil(log2(37)) = 6
     for 9, ceil(log2(9)) = 4

So the number of binary digits of any number N is about log2(N) (exactly floor(log2(N)) + 1). This is exactly the property that the BIT makes use of.

Similarly, instead of storing the cumulative frequencies for the entire array, we can store in particular indexes the sum of only some of the frequencies (not all the values from the start up to that index). The purpose of this will become clear shortly.

  • Let idx be some index (not value) of the array of size N (therefore 1<=idx<=N).
  • Let r be the position of the last occurrence of 1 in the binary notation of idx, reading from left to right, with positions counted from 0 at the rightmost bit (see the example below). Therefore 0<=r<=log2(N).
The responsibility range is the range from (idx-2^r+1) to idx (inclusive), and the responsibility array holds, at idx, the sum of all the entries of the frequency array in that range.

Let us see an example. Let idx=12 (1100); hence r=2, and the range of indexes of the frequency array covered by res[12] is [12-(2^2)+1 to 12], i.e. [9-12]. Hence res[12] = arr[9]+arr[10]+arr[11]+arr[12].
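The responsibility range can also be computed directly. The following is a small sketch in C; the helper names lowbit and range_start are hypothetical, and it relies on the expression idx & -idx, which yields 2^r for a positive idx.

```c
/* Value of the last 1 bit of idx, i.e. 2^r in the notation above.
 * Assumes idx >= 1. */
static int lowbit(int idx)
{
    return idx & -idx;
}

/* First index covered by res[idx]; the range always ends at idx itself. */
static int range_start(int idx)
{
    return idx - lowbit(idx) + 1;
}
```

For idx=12 this gives lowbit(12)=4 and range_start(12)=9, matching the range [9-12]. An odd index such as 13 has r=0 and covers only itself.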

Reading Cumulative value at any position. 

In order to read the cumulative value at any index, e.g. 13, we add res[idx] and then remove the last 1 bit (i.e. bit r) from idx, repeating while idx > 0.

13th index (1101) = res[13] + res[12] + res[8]
                    1101    + 1100    + 1000 (here we remove the last 1 bit at each iteration)
Similarly, taking res[6]=9 and res[4]=11 as example values, the cumulative value of the 6th index (0110) = res[6]+res[4] = 9+11 = 20,
i.e. 0110 = 6
     0100 = 4

Reading Logic

Now we need a fast way of finding r at each step. This can be done easily using
val = (num & -num); where num is the current number and val is the value of its last 1 bit (i.e. 2^r). Subtracting val from num then sets the rth bit to 0.

The proof is as follows:

Let num be the integer whose last 1 digit (i.e. bit r) we want to isolate. In binary notation num can be written as a1b, where a represents the binary digits before the last 1 and b represents the zeroes after it. Below, ~x denotes x with every bit flipped.

In two's complement, -num is equal to ~(a1b) + 1 = (~a)0(~b) + 1. Since b consists of all zeroes, ~b consists of all ones. So we have
-num = ~(a1b) + 1
= (~a)0(~b) + 1
= (~a)0~(0...0) + 1
= (~a)0(1...1) + 1
= (~a)1(0...0)
= (~a)1b.
Now we can easily isolate the last 1 digit by using the bitwise AND operator (& in C++ and Java) on num and -num:

        a1b
&    (~a)1b
= (0...0)1(0...0)
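The identity can be tried out with a few lines of C. This is only a sketch: the function names are made up for the example, and it uses unsigned values and writes -num as ~num + 1, as in the proof, so that the arithmetic is well defined for every input.

```c
/* Isolate the last 1 bit of num: returns 2^r, or 0 when num is 0. */
static unsigned lowest_set_bit(unsigned num)
{
    return num & (~num + 1u);   /* ~num + 1 is the two's complement negation */
}

/* Set the last 1 bit of num to 0, the step used when reading the tree. */
static unsigned clear_lowest_set_bit(unsigned num)
{
    return num - lowest_set_bit(num);
}
```

Starting from 13 (1101), repeated clearing visits 12 (1100), then 8 (1000), then 0.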

Below is the code for querying the cumulative value at any index.
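A minimal sketch of such a query function in C is given here. It assumes a 1-based responsibility array passed in as a parameter, and it is named bit_read rather than read only to avoid clashing with the POSIX read() function; both choices are assumptions of this example.

```c
/* Cumulative value of the frequencies at indexes 1..idx.
 * res is the 1-based responsibility array; idx = 0 yields 0. */
long long bit_read(const long long *res, int idx)
{
    long long sum = 0;
    while (idx > 0) {
        sum += res[idx];
        idx -= idx & -idx;   /* remove the last 1 bit */
    }
    return sum;
}
```

With res[6]=9 and res[4]=11, bit_read(res, 6) adds res[6] and res[4] and returns 20.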

The number of iterations in this function is the number of 1 bits in idx, which is at most about log N.
Time complexity: O(log N).
Code length: Up to ten lines.

So, in order to get the sum between two points a and b, we compute read(b)-read(a-1) (we use a-1 so that the ath value is included), and to get the actual value at any index b we compute read(b)-read(b-1).
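The two formulas above could be wrapped up as follows. This is a sketch under the same assumptions as before: a 1-based responsibility array res, and hypothetical names bit_read (standing in for read), range_sum and value_at.

```c
/* Cumulative value of indexes 1..idx (the read operation of the text). */
static long long bit_read(const long long *res, int idx)
{
    long long sum = 0;
    for (; idx > 0; idx -= idx & -idx)
        sum += res[idx];
    return sum;
}

/* Sum of the frequencies at indexes a..b inclusive, with 1 <= a <= b. */
long long range_sum(const long long *res, int a, int b)
{
    return bit_read(res, b) - bit_read(res, a - 1);
}

/* Actual frequency stored at index b. */
long long value_at(const long long *res, int b)
{
    return range_sum(res, b, b);
}
```

Each call costs two reads, so both operations stay within O(log N).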

Change frequency at some position and update tree

The idea is to update the tree at all indexes that are responsible for the frequency whose value we are changing. When reading the cumulative frequency at some index, we repeatedly removed the last 1 bit. When adding val to some frequency, we add val to the value at the current index (the starting index is always the one whose frequency is changed), add the last 1 bit to the index, and continue while the index is less than or equal to N.
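One way this update could look in C is sketched below. The name bit_update and the choice to pass the array and its largest index n as parameters are assumptions of this example.

```c
/* Add val to the frequency at index idx and to every responsibility sum
 * that covers it. res is 1-based with valid indexes 1..n.
 * idx must be at least 1: index 0 has no 1 bit and would never advance. */
void bit_update(long long *res, int n, int idx, long long val)
{
    while (idx <= n) {
        res[idx] += val;
        idx += idx & -idx;   /* add the last 1 bit to move to the next covering index */
    }
}
```

For n=8, updating index 5 (0101) touches res[5], res[6] (0110) and res[8] (1000), and nothing else.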

Change frequency at some given range with the same value and update the array

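When incrementing the frequencies in the range [start..end] by the same value, we store differences in the tree: we increment the difference at index start and decrement the difference at index end+1. This is like saying: increment all frequencies in the range [start..infinity] and decrement all frequencies in the range [end+1..infinity]. The actual value at an index is then the cumulative sum of the differences up to that index.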
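The range update could be sketched in C as follows. The tree here holds differences rather than frequencies, and the names diff, range_add and point_value are hypothetical; the two helpers are the usual update and read routines under assumed names.

```c
/* Add val at idx in a 1-based tree with indexes 1..n (no-op when idx > n). */
static void bit_update(long long *tree, int n, int idx, long long val)
{
    for (; idx <= n; idx += idx & -idx)
        tree[idx] += val;
}

/* Cumulative sum of indexes 1..idx. */
static long long bit_read(const long long *tree, int idx)
{
    long long sum = 0;
    for (; idx > 0; idx -= idx & -idx)
        sum += tree[idx];
    return sum;
}

/* Add val to every frequency in [start..end], with 1 <= start <= end <= n. */
void range_add(long long *diff, int n, int start, int end, long long val)
{
    bit_update(diff, n, start, val);      /* +val from start onwards */
    bit_update(diff, n, end + 1, -val);   /* cancel it again after end */
}

/* Current value at idx: the cumulative sum of the differences up to idx. */
long long point_value(const long long *diff, int idx)
{
    return bit_read(diff, idx);
}
```

Starting from an all-zero tree with n=8, range_add(diff, 8, 3, 6, 5) makes point_value return 5 for indexes 3 to 6 and 0 elsewhere.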

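The BIT is initialised in an array called res[], where N = number of elements + 1 (the indexes start at 1) and arr is the initial frequency array. One simple way to do this is to start from zeroes and add each arr[i] in turn with the update routine.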
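A possible initialisation along these lines is sketched below. The name bit_build is hypothetical, n is the number of elements (so both arrays hold n + 1 entries), and building by repeated updates costs O(N log N), which is usually acceptable.

```c
/* Add val at idx in a 1-based tree with indexes 1..n. */
static void bit_update(long long *res, int n, int idx, long long val)
{
    for (; idx <= n; idx += idx & -idx)
        res[idx] += val;
}

/* Build res[1..n] from the frequencies arr[1..n]; index 0 of both is unused. */
void bit_build(long long *res, const int *arr, int n)
{
    for (int i = 0; i <= n; i++)
        res[i] = 0;
    for (int i = 1; i <= n; i++)
        bit_update(res, n, i, arr[i]);
}
```

For the frequencies 1, 2, ..., 8 this produces res[4]=10 (indexes 1 to 4), res[6]=11 (indexes 5 to 6) and res[8]=36 (indexes 1 to 8).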

Java Fast Input/Output

Most of the time I write my code in Java, and it is a bit frustrating to see that one cannot attain the same speed of execution as in languages like C and C++. However, Java is very feature-rich in terms of library support, user base and ease of OOP development. Thanks to its large and well-organized collection of libraries, it is often the programming language of choice for professionals.

Java runs on a virtual machine and is generally slower than C and other languages compiled to native code; however, it can still be optimized to the point where its runtime differs from C by a factor of less than 2, and a major part of that optimization can be achieved in the input/output section itself.

People often use the Scanner class in Java for input (be it reading from a file or the console); however, Scanner is one of the slowest ways of taking input.

There are two better classes for this:

  • BufferedReader
  • BufferedInputStream

BufferedReader provides automatic buffer management and also the very useful readLine() method (it takes an entire line as a String, similar to Scanner's nextLine()). The BufferedReader class is often sufficient when we deal with String inputs.
Given below is a small code snippet which demonstrates how to use the BufferedReader to take inputs as String.

BufferedInputStream is the fastest way to take input; however, we need to do our own buffer management, and there is no predefined readLine() method.

So, I am going to create the following using BufferedInputStream.

  •  function readInt() //reads the next Integer skipping the newline and spaces
  •  function readLong() //reads the next Long skipping the newline and spaces
  •  function readString() //reads the next String till it gets a newLine("\n") 

Also, I have used the StringBuilder class here. StringBuilder is supposedly the fastest way to build and manipulate Strings in Java, compared with regular concatenation. However, StringBuilder is not thread-safe; those who want a thread-safe version can use the StringBuffer class instead.
Below I have embedded the code for the custom class I have created and added suitable comments.

Fast Input/Output in C

This is my first post, and I would like to begin it with one of the most crucial and overlooked parts of a program, i.e. the IO (input/output).
Most of us are content with using scanf, printf, getc and putchar in C.
For programs with a large number of numerical inputs (int, long) in C we often use scanf("%d"), which can be even slower than reading the input as a string using scanf("%s") or gets() and then converting it with the atoi function. (Note that gets() is unsafe and was removed in C11; fgets() is the usual replacement.)

Many programmers have never heard of the function getchar_unlocked() in C.

  •  getchar_unlocked() is provided by POSIX in <stdio.h> (it is not part of standard C and may be implemented as a macro), and it has much less overhead than getchar().
  • getchar_unlocked() is not thread-safe and does not have many safety checks. Although it is not generally recommended to use a function which is not thread-safe, it can be used safely if we just need to take input and do not have a multi-threaded program (as is usual in programming competitions, where speed is the primary concern).
Below is the code snippet which demonstrates how to use getchar_unlocked() to develop fast input routines for reading int, long, etc.
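A minimal sketch of such input routines follows. The names read_long and read_int are assumptions of this example, the feature-test macro is there because getchar_unlocked() comes from POSIX rather than standard C, and the sketch does no overflow checking and returns 0 if the input ends before a number is found.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* makes getchar_unlocked() visible */
#endif
#include <stdio.h>

/* Read the next integer from stdin, skipping spaces, newlines and any
 * other characters that cannot start a number. */
static inline long long read_long(void)
{
    int c = getchar_unlocked();
    while (c != EOF && c != '-' && (c < '0' || c > '9'))
        c = getchar_unlocked();

    int negative = 0;
    if (c == '-') {
        negative = 1;
        c = getchar_unlocked();
    }

    long long n = 0;
    while (c >= '0' && c <= '9') {
        n = n * 10 + (c - '0');
        c = getchar_unlocked();
    }
    return negative ? -n : n;
}

/* Same routine for values that fit in an int. */
static inline int read_int(void)
{
    return (int)read_long();
}
```

A typical use is int n = read_int(); followed by a loop that calls read_int() n times.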

PS: I have used the inline keyword here. Inlining asks the compiler to substitute the body of the function at the places where it is called, which can make the code a bit faster because the overhead of a function call is avoided.

Normally printf is quite fast for output; however, for writing int or long values, the function below is a tad faster. Here we use putchar_unlocked() to output a character; it is the similar thread-unsafe version of putchar() and is faster.
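An output routine of this kind could be sketched as follows. The names format_long and write_long are hypothetical; the digits are first collected in a small buffer and then written one character at a time with putchar_unlocked(), which like getchar_unlocked() is a POSIX function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* makes putchar_unlocked() visible */
#endif
#include <stdio.h>

/* Write the decimal form of n into buf (no terminating '\0') and return
 * its length. buf must have room for at least 20 characters. */
static inline int format_long(long long n, char *buf)
{
    char tmp[20];
    int t = 0, len = 0;
    /* Take the magnitude as unsigned so the most negative value is safe. */
    unsigned long long u = n < 0 ? 0ULL - (unsigned long long)n
                                 : (unsigned long long)n;
    do {
        tmp[t++] = (char)('0' + u % 10);   /* digits come out in reverse */
        u /= 10;
    } while (u > 0);
    if (n < 0)
        buf[len++] = '-';
    while (t > 0)
        buf[len++] = tmp[--t];
    return len;
}

/* Print n to stdout without a trailing newline. */
static inline void write_long(long long n)
{
    char buf[20];
    int len = format_long(n, buf);
    for (int i = 0; i < len; i++)
        putchar_unlocked(buf[i]);
}
```

Calling write_long(x) and then putchar_unlocked('\n') prints one value per line.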