Arithmetic coding increases sequence length

Giuseppe Esposito on 25 Jun 2018
Edited: Michael Montouchet on 2 Oct 2025 at 15:59
Hi all, I'm using the function "arithenco(seq,counts)" to compress a sequence of 1s, 2s, 3s and 4s of length 65536. The corresponding counts (number of occurrences of each symbol) are [1991,7759,52117,3669], so symbol 3 has a high probability of occurring and I would expect a compression gain from the arithmetic code. But this doesn't happen: the function outputs a code of size 66424 (longer than the original). How is this possible? Thank you for your attention.
1 Comment
Michael Montouchet on 2 Oct 2025 at 15:56
Edited: Michael Montouchet on 2 Oct 2025 at 15:58
To write your sequence of 1, 2, 3, 4 as a binary sequence with a fixed-length code, you need at least 2 bits per symbol. The space required for the input is therefore 2 * 65536 = 131072 bits.
The arithmetic code returned by arithenco is a vector of 0s and 1s, so each of its elements takes 1 bit. The space required for the code is 1 * 66424 = 66424 bits.
So you compressed the input into a code that is about 51% of the original size, i.e. nearly half. Comparing 66424 with 65536 directly is misleading, because the first number counts bits and the second counts 4-ary symbols.
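The size comparison above can be checked with a few lines of MATLAB. This is only a sketch: the counts and the code length 66424 are taken from the question, while the variable names and the shuffled stand-in sequence are assumptions (the original sequence was not posted). The last part runs only if arithenco and arithdeco from the Communications Toolbox are on the path.

```matlab
% Counts reported in the question for symbols 1..4
counts = [1991 7759 52117 3669];
n = sum(counts);                            % total number of symbols (65536)

% Fixed-length binary baseline: ceil(log2(4)) = 2 bits per symbol
rawBits = n * ceil(log2(numel(counts)));

% Length of the arithenco output reported in the question (1 bit per element)
codeBits = 66424;

% Empirical entropy: lower bound in bits per symbol for a static model
p = counts / n;
H = -sum(p .* log2(p));
idealBits = n * H;

fprintf('raw: %d bits, code: %d bits, ratio: %.3f\n', ...
        rawBits, codeBits, codeBits / rawBits);
fprintf('entropy: %.4f bits/symbol, ideal: %.0f bits\n', H, idealBits);

% Optional round trip (needs the Communications Toolbox).
% The stand-in sequence has the same counts but an assumed, random order.
if exist('arithenco', 'file') && exist('arithdeco', 'file')
    seq  = repelem(1:4, counts);            % each symbol repeated counts(k) times
    seq  = seq(randperm(n));                % shuffle the stand-in sequence
    code = arithenco(seq, counts);          % binary vector
    dseq = arithdeco(code, counts, n);      % decode n symbols
    fprintf('encoded length: %d bits, lossless: %d\n', ...
            numel(code), isequal(seq(:), dseq(:)));
end
```

With these counts the entropy comes out at roughly 1.013 bits per symbol, so the ideal size is roughly 66406 bits. The reported 66424 bits is therefore already very close to the best a static model can do; the code cannot get much shorter than about one bit per input symbol for this distribution.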


