This results in the following encoding:
0 101 0 111 0 110 0
This encoding requires 13 bits instead of the 14 bits required by the uncompressed version, so we are still saving some space. However, unlike the last code
we considered, this one is unambiguous. Notice that if a code starts with 0, it consumes 1 bit; if a code starts with 1, it is 3 bits long. This gives us a deterministic
algorithm for placing spaces in the encoded stream.
In the “Exercises” section, you will prove that there is no such thing as an unambiguous code that can compress every possible input; some inputs will get bigger. This is why it is so important to know something about what kind of data we
want to encode. In our example, we notice that the number 0 appears frequently,
and we can use that fact to reduce the amount of space that the encoded version
requires. Entropy measures the predictability of the input. In our case, the input
seems somewhat predictable, because the number 0 is more likely to appear than
other numbers. We leverage this entropy to produce a usable code for our purposes.