Compression Mode 05: Character Encoding (since v1.0.0)

This mode compresses the input string with one of JSSC's internal character encodings: it remaps the input's UTF-16 characters to the JSSC encoding that fits them best, then emits the result as UTF-16 again.

Each JSSC encoding contains 256 symbols, so every encoded symbol fits into 8 bits. One UTF-16 code unit (16 bits) can therefore hold two JSSC-encoded symbols.
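The packing of two 8-bit symbols into one 16-bit code unit can be sketched as follows. This is a minimal illustration, not JSSC's actual code: `packPair` and `unpackPair` are hypothetical helper names, and placing the first symbol in the high byte is an assumption taken from the walkthrough on this page.

```javascript
// Hypothetical helpers for illustration only; not part of the JSSC API.

// Pack two 8-bit symbols into one 16-bit UTF-16 code unit.
// Assumption: the first symbol goes into the high byte.
function packPair(first, second) {
  return String.fromCharCode(((first & 0xff) << 8) | (second & 0xff));
}

// Recover the two 8-bit symbols from a single code unit.
function unpackPair(ch) {
  const unit = ch.charCodeAt(0);
  return [unit >> 8, unit & 0xff];
}

console.log(packPair(0x48, 0x65)); // "䡥" (U+4865)
console.log(unpackPair("䡥")); // [72, 101], i.e. 0x48 and 0x65
```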

Step-by-Step Walkthrough

Input: Hello, World! Привет, Мир! (It says "Hello, World!" in English and Russian.)

  1. Select the best-fitting JSSC character encoding. For this input, that is JSSCENRU.
  2. Internally remap each input character to its code in that encoding (hex; _ stands for a space):
    • H → 48
    • e → 65
    • l → 6C
    • l → 6C (Note that JSSC input RLE may put 2 (32) here instead of l (6C).)
    • o → 6F
    • , → 2C
    • _ → 20
    • W → 57
    • o → 6F
    • r → 72
    • l → 6C
    • d → 64
    • ! → 21
    • _ → 20
    • П → 8F
    • р → B0
    • и → A8
    • в → A2
    • е → A5
    • т → B2
    • , → 2C
    • _ → 20
    • М → 8C
    • и → A8
    • р → B0
    • ! → 21
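  3. Emit as UTF-16, packing each pair of symbols into one 16-bit code unit:
    • 48 65 → 4865 (䡥)
    • 6C 6C → 6C6C (汬)
    • 6F 2C → 6F2C (漬)
    • 20 57 → 2057 (⁗)
    • 6F 72 → 6F72 (潲)
    • 6C 64 → 6C64 (汤)
    • 21 20 → 2120 (℠)
    • 8F B0 → 8FB0 (辰)
    • A8 A2 → A8A2 (ꢢ)
    • A5 B2 → A5B2 (ꖲ)
    • 2C 20 → 2C20 (Ⱐ)
    • 8C A8 → 8CA8 (貨)
    • B0 21 → B021 (뀡)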

Output: 䡥汬漬⁗潲汤℠辰ꢢꖲⰠ貨뀡
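
The walkthrough above can be reproduced with a short sketch. It is an illustration under stated assumptions, not JSSC's implementation: `TABLE`, `encodeSketch`, and `decodeSketch` are hypothetical names, and the table holds only the 17 symbols listed in step 2 rather than the full JSSCENRU encoding. Encoding selection, input RLE, and any extra data JSSC may attach to its real output are left out, and the sketch rejects odd symbol counts because padding is not described here.

```javascript
// Partial symbol table copied from step 2 of the walkthrough.
// Assumption: this is only an excerpt, not the complete JSSCENRU table.
const TABLE = new Map([
  ["H", 0x48], ["e", 0x65], ["l", 0x6c], ["o", 0x6f], [",", 0x2c],
  [" ", 0x20], ["W", 0x57], ["r", 0x72], ["d", 0x64], ["!", 0x21],
  ["\u041F", 0x8f], // П
  ["\u0440", 0xb0], // р
  ["\u0438", 0xa8], // и
  ["\u0432", 0xa2], // в
  ["\u0435", 0xa5], // е
  ["\u0442", 0xb2], // т
  ["\u041C", 0x8c], // М
]);

// Remap each character to its 8-bit code, then pack pairs into code units.
function encodeSketch(input) {
  const codes = Array.from(input, (ch) => {
    const code = TABLE.get(ch);
    if (code === undefined) throw new Error(`no mapping for "${ch}"`);
    return code;
  });
  if (codes.length % 2 !== 0) {
    throw new Error("odd symbol count: padding is not covered by this sketch");
  }
  let out = "";
  for (let i = 0; i < codes.length; i += 2) {
    out += String.fromCharCode((codes[i] << 8) | codes[i + 1]);
  }
  return out;
}

// Reverse the process: split each code unit into two codes and look them up.
function decodeSketch(packed) {
  const reverse = new Map(Array.from(TABLE, ([ch, code]) => [code, ch]));
  let out = "";
  for (let i = 0; i < packed.length; i++) {
    const unit = packed.charCodeAt(i);
    out += reverse.get(unit >> 8) + reverse.get(unit & 0xff);
  }
  return out;
}

const input = "Hello, World! Привет, Мир!";
const packed = encodeSketch(input);
console.log(packed); // "䡥汬漬⁗潲汤℠辰ꢢꖲⰠ貨뀡"
console.log(input.length, packed.length); // 26 13
console.log(decodeSketch(packed) === input); // true
```

The 26 input characters become 13 code units, which matches the output shown in the walkthrough.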