Compression Mode 05: Character Encoding (since v1.0.0)
This mode compresses the input string with JSSC's internal Character Encodings: it remaps each UTF-16 character to a symbol in the best-fitting JSSC encoding, then emits the result back as UTF-16.
Each JSSC encoding contains 256 symbols, so each encoded symbol fits into 8 bits. Two JSSC-encoded symbols are therefore packed into one UTF-16 code unit (16 bits) of output, which roughly halves the length of the string.
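The packing of two 8-bit symbols into one 16-bit code unit can be sketched as follows. This is a minimal illustration, not JSSC's actual code: the function names `packSymbols` and `unpackSymbols` are hypothetical, and padding an odd trailing symbol with 0x00 is an assumption (the real padding rule may differ). The high-byte-first order matches the walkthrough, where 48 and 65 become 4865.

```javascript
// Hypothetical helper: pack 8-bit symbols pairwise into UTF-16 code units.
// The first symbol of each pair becomes the high byte, the second the low byte.
function packSymbols(symbols) {
  let out = "";
  for (let i = 0; i < symbols.length; i += 2) {
    const hi = symbols[i];
    // Assumption: an odd trailing symbol is padded with 0x00.
    const lo = i + 1 < symbols.length ? symbols[i + 1] : 0x00;
    out += String.fromCharCode((hi << 8) | lo);
  }
  return out;
}

// Hypothetical inverse: split every code unit back into two 8-bit symbols.
function unpackSymbols(text) {
  const symbols = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    symbols.push(unit >> 8, unit & 0xff);
  }
  return symbols;
}

const packed = packSymbols([0x48, 0x65, 0x6c, 0x6c]); // H, e, l, l
console.log(packed);                // "䡥汬"
console.log(unpackSymbols(packed)); // [72, 101, 108, 108]
```

Four symbols become two code units, which is where the size reduction comes from.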
Step-by-Step Walkthrough
Input: Hello, World! Привет, Мир! (It says "Hello, World!" in English and Russian.)
- Select the best-fitting JSSC Character Encoding: in this case, JSSCENRU.
- Internally remap input characters to JSSC encoding (hex; _ denotes a space):
  H→48 e→65 l→6C l→6C o→6F ,→2C _→20
  W→57 o→6F r→72 l→6C d→64 !→21 _→20
  П→8F р→B0 и→A8 в→A2 е→A5 т→B2 ,→2C _→20
  М→8C и→A8 р→B0 !→21
  Note that JSSC input RLE may put 2 (32) in place of the second l (6C) of Hello.
- Emit as UTF-16:
  4865→䡥 6C6C→汬 6F2C→漬 2057→⁗ 6F72→潲 6C64→汤 2120→℠
  8FB0→辰 A8A2→ꢢ A5B2→ꖲ 2C20→Ⱐ 8CA8→貨 B021→뀡
Output: 䡥汬漬⁗潲汤℠辰ꢢꖲⰠ貨뀡
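The walkthrough above can be reproduced with a short sketch. It is an illustration under stated assumptions, not JSSC's implementation: `JSSCENRU_PARTIAL`, `remap`, and `emit` are hypothetical names, the table holds only the Cyrillic entries shown in the walkthrough (the full JSSCENRU table is not reproduced here), ASCII characters are assumed to keep their ASCII codes as they do in the example, and the optional input RLE step is not applied.

```javascript
// Partial table: only the non-ASCII mappings that appear in the walkthrough.
const JSSCENRU_PARTIAL = new Map([
  ["\u041f", 0x8f], // П
  ["\u041c", 0x8c], // М
  ["\u0440", 0xb0], // р
  ["\u0438", 0xa8], // и
  ["\u0432", 0xa2], // в
  ["\u0435", 0xa5], // е
  ["\u0442", 0xb2], // т
]);

// Step 2: remap each input character to an 8-bit symbol.
function remap(input) {
  return Array.from(input, (ch) => {
    const code = ch.charCodeAt(0);
    // Assumption: ASCII characters map to their own code, as in the walkthrough.
    if (code < 0x80) return code;
    const symbol = JSSCENRU_PARTIAL.get(ch);
    if (symbol === undefined) {
      throw new RangeError(`No mapping for ${ch} in the partial table`);
    }
    return symbol;
  });
}

// Step 3: emit pairs of symbols as UTF-16 code units (high byte first).
function emit(symbols) {
  let out = "";
  for (let i = 0; i < symbols.length; i += 2) {
    const lo = i + 1 < symbols.length ? symbols[i + 1] : 0x00; // assumed padding
    out += String.fromCharCode((symbols[i] << 8) | lo);
  }
  return out;
}

const input = "Hello, World! Привет, Мир!";
const output = emit(remap(input));
console.log(output);                      // 䡥汬漬⁗潲汤℠辰ꢢꖲⰠ貨뀡
console.log(input.length, output.length); // 26 13
```

The 26-character input becomes 13 UTF-16 code units, matching the Output line of the walkthrough.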