Compact Encoding Detection (CED for short) is a library written in C++ that scans given raw bytes and detects the most likely text encoding.
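To make "scanning raw bytes" concrete, here is a toy sketch that sorts a byte string into pure ASCII, plausibly UTF-8, or unknown. It is not CED's algorithm and says nothing about how CED works internally. The names ToyEncoding and ToyDetect are hypothetical and exist only for this illustration. The UTF-8 check is deliberately simplified and does not reject every ill-formed sequence (for example, overlong three-byte forms).

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this illustration only (not part of CED).
enum class ToyEncoding { kAscii, kUtf8, kUnknown };

// Toy detector: walks the bytes once and checks UTF-8 lead/continuation
// structure. Assumption: anything that is not ASCII or structurally valid
// UTF-8 is reported as kUnknown.
ToyEncoding ToyDetect(const std::string& bytes) {
  bool saw_multibyte = false;
  std::size_t i = 0;
  while (i < bytes.size()) {
    const unsigned char c = static_cast<unsigned char>(bytes[i]);
    std::size_t extra = 0;  // continuation bytes expected after this byte
    if (c < 0x80) {
      extra = 0;  // 7-bit ASCII
    } else if (c >= 0xC2 && c <= 0xDF) {
      extra = 1;  // two-byte sequence
    } else if (c >= 0xE0 && c <= 0xEF) {
      extra = 2;  // three-byte sequence
    } else if (c >= 0xF0 && c <= 0xF4) {
      extra = 3;  // four-byte sequence
    } else {
      return ToyEncoding::kUnknown;  // stray continuation or invalid lead byte
    }
    if (i + extra >= bytes.size()) {
      return ToyEncoding::kUnknown;  // sequence truncated at end of input
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char cc = static_cast<unsigned char>(bytes[i + k]);
      if ((cc & 0xC0) != 0x80) {
        return ToyEncoding::kUnknown;  // not a continuation byte
      }
    }
    if (extra > 0) {
      saw_multibyte = true;
    }
    i += extra + 1;
  }
  return saw_multibyte ? ToyEncoding::kUtf8 : ToyEncoding::kAscii;
}
```

A structural check like this cannot tell apart the many single-byte and legacy multi-byte encodings, which is the harder problem a dedicated detector addresses.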

Basic usage:

#include <cstring>  // strlen

#include "compact_enc_det/compact_enc_det.h"

const char* text = "Input text";
bool is_reliable;
int bytes_consumed;

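Encoding encoding = CompactEncDet::DetectEncoding(
        text, strlen(text),
        nullptr, nullptr, nullptr,  // no URL, HTTP charset, or meta charset hint
        UNKNOWN_ENCODING,           // no encoding hint
        UNKNOWN_LANGUAGE,           // no language hint
        CompactEncDet::WEB_CORPUS,
        false,                      // do not ignore 7-bit mail encodings
        &bytes_consumed,
        &is_reliable);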

How to build

You need CMake to build the package. After unzipping the source code, run the build script to build everything automatically. The script also downloads the Google Test framework needed to build the unit test.

$ cd compact_enc_det
$ ./
$ bin/ced_unittest

On Windows, run cmake . to download the test framework and generate project files for Visual Studio.

D:\packages\compact_enc_det> cmake .