
Compression Mode 07 — Frequency Map (since v2.0.0)

This compression mode applies frequency-based byte substitution to compress strings with many repeated characters.

It replaces the most frequent characters with compact single-byte indices and uses an escape sequence for all other characters.

How It Works

  1. Count character frequencies in the input string.
  2. Select up to 254 of the most frequent characters.
  3. Store them as a frequency table prefix at the beginning of the compressed payload.
  4. Encode the input as a byte stream:
    • frequent character → 1-byte index (0–253)
    • any other character → escape byte 0xFF followed by the character's UTF-16 bytes
  5. Pack bytes into UTF-16 characters (2 bytes per character).
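
The steps above can be sketched in TypeScript as follows. This is a minimal illustration, not the library's implementation: the function names (buildTable, encodeBytes, packBytes) are hypothetical, and several details that this section does not specify are assumptions. Specifically, it assumes that characters are UTF-16 code units, that an escaped character is written as its two code-unit bytes in big-endian order, and that an odd-length byte stream is padded with a zero byte. The header, splitter, and table prefix are left out.

```typescript
// Hypothetical sketch: names and byte layout are assumptions, not the library's API.
const ESCAPE_BYTE = 0xff; // marks a character that is not in the table
const MAX_TABLE_SIZE = 254; // table indices 0..253

// Steps 1-2: count code-unit frequencies and keep the most frequent ones.
function buildTable(input: string): string[] {
  const counts = new Map<string, number>();
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1]) // descending by count
    .slice(0, MAX_TABLE_SIZE)
    .map((entry) => entry[0]);
}

// Step 4: table characters become a 1-byte index, everything else is escaped.
function encodeBytes(input: string, table: string[]): number[] {
  const indexOf = new Map<string, number>();
  table.forEach((ch, i) => indexOf.set(ch, i));
  const bytes: number[] = [];
  for (let i = 0; i < input.length; i++) {
    const slot = indexOf.get(input.charAt(i));
    if (slot !== undefined) {
      bytes.push(slot);
    } else {
      const unit = input.charCodeAt(i);
      // Assumption: high byte first, then low byte.
      bytes.push(ESCAPE_BYTE, unit >> 8, unit & 0xff);
    }
  }
  return bytes;
}

// Step 5: pack two bytes into each UTF-16 code unit.
function packBytes(bytes: number[]): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 2) {
    const hi = bytes[i] || 0;
    // Assumption: pad a trailing odd byte with zero. A real format would
    // also have to record the byte count so the padding can be removed.
    const lo = bytes[i + 1] || 0;
    out += String.fromCharCode((hi << 8) | lo);
  }
  return out;
}
```

Under these assumptions, encodeBytes("ab", ["a"]) yields [0, 0xFF, 0x00, 0x62]: index 0 for "a", then the escape byte and the two bytes of "b".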

A splitter string is inserted between the header and the encoded body and is chosen dynamically to avoid collisions with the input.
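
One way to choose a collision-free splitter is to walk a fixed list of candidates and take the first one that does not occur in the input. The sketch below assumes such a list exists and that the header stores the index of the chosen entry. The candidate strings and the function name pickSplitterIndex are hypothetical, since this section does not list the actual splitters.

```typescript
// Hypothetical candidate list: the actual splitter strings are not given in this section.
const SPLITTER_CANDIDATES: string[] = ["|", "~", "^", "\u0001\u0002"];

// Returns the index of the first candidate that does not occur in the input,
// or -1 if every candidate collides with it.
function pickSplitterIndex(
  input: string,
  candidates: string[] = SPLITTER_CANDIDATES
): number {
  return candidates.findIndex((candidate) => input.indexOf(candidate) === -1);
}
```

With the candidates above, an input of "a|b~c" selects index 2, because the first two candidates already appear in the input.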

Header Character Usage

Name      Usage
Code #1   07
Code #2   Splitter index
Code #3   default
i?        false
o?        false
s?        false
b?        default