Levenshtein coding

Levenshtein coding is a universal code encoding the non-negative integers developed by Vladimir Levenshtein.[1][2]

Encoding

The code of zero is "0"; to code a positive number:

Initialize the step count variable C to 1.
Write the binary representation of the number without the leading "1" to the beginning of the code.
Let M be the number of bits written in step 2.
If M is not 0, increment C, repeat from step 2 with M as the new number.
Write C "1" bits and a "0" to the beginning of the code.

The code begins:

Number	Encoding	Implied probability
0	`0`	1/2

1	`10`	1/4

2	`110 0`	1/16
3	`110 1`	1/16

4	`1110 0 00`	1/128
5	`1110 0 01`	1/128
6	`1110 0 10`	1/128
7	`1110 0 11`	1/128

8	`1110 1 000`	1/256
9	`1110 1 001`	1/256
10	`1110 1 010`	1/256
11	`1110 1 011`	1/256
12	`1110 1 100`	1/256
13	`1110 1 101`	1/256
14	`1110 1 110`	1/256
15	`1110 1 111`	1/256

16	`11110 0 00 0000`	1/4096
17	`11110 0 00 0001`	1/4096

To decode a Levenshtein-coded integer:

Count the number of "1" bits until a "0" is encountered.
If the count is zero, the value is zero, otherwise
Start with a variable N, set it to a value of 1 and repeat count minus 1 times:
Read N bits, prepend "1", assign the resulting value to N

The Levenshtein code of a positive integer is always one bit longer than the Elias omega code of that integer. However, there is a Levenshtein code for zero, whereas Elias omega coding would require the numbers to be shifted so that a zero is represented by the code for one instead.

Example code

Encoding

void levenshteinEncode(char* source, char* dest)
{
    IntReader intreader(source);
    BitWriter bitwriter(dest);
    while (intreader.hasLeft())
    {
        int num = intreader.getInt();
        if (num == 0)
            bitwriter.outputBit(0);
        else
        {
            int c = 0;
            BitStack bits;
            do {
                int m = 0;
                for (int temp = num; temp > 1; temp>>=1)  // calculate floor(log2(num))
                    ++m;
                for (int i=0; i < m; ++i)
                    bits.pushBit((num >> i) & 1);
                num = m;
                ++c;
            } while (num > 0);
            for (int i=0; i < c; ++i)
                bitwriter.outputBit(1);
            bitwriter.outputBit(0);
            while (bits.length() > 0)
                bitwriter.outputBit(bits.popBit());
        }
    }
}

Decoding

void levenshteinDecode(char* source, char* dest)
{
    BitReader bitreader(source);
    IntWriter intwriter(dest);
    while (bitreader.hasLeft())
    {
        int n = 0;
        while (bitreader.inputBit())     // potentially dangerous with malformed files.
            ++n;
        int num;
        if (n == 0)
            num = 0;
        else
        {
            num = 1;
            for (int i = 0; i < n-1; ++i)
            {
                int val = 1;
                for (int j = 0; j < num; ++j)
                    val = (val << 1) | bitreader.inputBit();
                num = val;
            }
        }
        intwriter.putInt(num);           // write out the value
    }
    bitreader.close();
    intwriter.close();
}

References

"1968 paper by V. I. Levenshtein (in Russian)" (PDF).
David Salomon (2007). Variable-length codes for data compression. Springer. p. 80. ISBN 978-1-84628-958-3.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

[1] "1968 paper by V. I. Levenshtein (in Russian)" (PDF).

[2] David Salomon (2007). Variable-length codes for data compression. Springer. p. 80. ISBN 978-1-84628-958-3.

Levenshtein coding

Encoding

Example code

Encoding

Decoding

See also

References