KOI character encodings

Content sourced from Wikipedia, licensed under CC BY-SA 3.0.

KOI (Russian: КОИ, from код обмена информацией, kod obmena informatsiey, “code for information exchange”) is a family of code pages used to encode Cyrillic text. A key feature is that text stays readable if the eighth (most significant) bit of each byte is stripped, which helps when data travels through systems that only handle 7-bit characters. This works because the Cyrillic characters are placed in a special order: each letter is 128 code points away from the Latin letter it sounds closest to. The drawback is that this arrangement does not match the alphabetical order of any language, so sorting requires lookup tables.
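The two properties above can be illustrated with a minimal Python sketch. It assumes KOI8-R (Python's built-in "koi8_r" codec) as a representative member of the family; the helper names strip_high_bit and alphabetical_key, the sample words, and the 32-letter alphabet string (ё omitted) are hypothetical choices made for this illustration only. In KOI8-R the letter case is swapped after stripping, so lowercase Cyrillic comes out as uppercase Latin and vice versa.

```python
# Assumption: KOI8-R stands in for the KOI family; helper names are illustrative.
RUSSIAN_ALPHABET = "абвгдежзийклмнопрстуфхцчшщъыьэюя"  # ё omitted for brevity


def strip_high_bit(text):
    """Encode text as KOI8-R, clear bit 7 of every byte, and read it as ASCII."""
    raw = text.encode("koi8_r")
    return bytes(b & 0x7F for b in raw).decode("ascii")


# Lookup table: KOI8-R byte of each lowercase letter -> its alphabetical rank.
RANK = {ch.encode("koi8_r")[0]: i for i, ch in enumerate(RUSSIAN_ALPHABET)}


def alphabetical_key(word):
    """Sort key that maps each KOI8-R byte to its rank in the Russian alphabet."""
    return [RANK[b] for b in word.lower().encode("koi8_r")]


words = ["юг", "арбуз", "цирк", "банан"]
by_bytes = sorted(words, key=lambda w: w.encode("koi8_r"))  # raw code-point order
by_alphabet = sorted(words, key=alphabetical_key)           # order via lookup table

print(strip_high_bit("Привет"))  # pRIWET
print(by_bytes)                  # ['юг', 'арбуз', 'банан', 'цирк']
print(by_alphabet)               # ['арбуз', 'банан', 'цирк', 'юг']
```

Sorting by raw bytes puts ю first because it shares a column with the Latin letter it replaces rather than sitting at the end of the alphabet, which is why a rank table of this kind is needed.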

KOI encodings are built from ASCII by mapping Latin letters to phonetically similar Cyrillic letters. This approach goes back to early systems like Russian Morse code and MTK-2 telegraph code.

KOI-7, introduced in 1967, is the original 7-bit code page and does not include lowercase letters. Its Cyrillic letters follow the order of the corresponding Latin letters, and all other code points match ASCII (though the dollar sign may be replaced by ¤).

KOI-8, standardized in 1974 as GOST 19768, is an 8-bit extension of ASCII. It originally included 32 lowercase and 31 uppercase Russian letters. Over time, many KOI-8 derivatives appeared (often called KOI8 or KOI-8).

Several variants defined by standards do not follow the classic KOI-8 layout, such as GOST R 34.303-92 (KOI-8 V1, ISO-IR-153) and KOI-8 N1/N2 (variants of Code page 866).

DKOI is an EBCDIC-based KOI used on ES EVM mainframes, with multiple standards and variants.

Some encodings called KOI actually define Latin alphabets rather than Cyrillic.

This page was last edited on 3 February 2026, at 08:37 (CET).