python-chardet
Port variant v14
Summary Universal character encoding detector (3.14)
Package version 7.4.3
Homepage https://github.com/chardet/chardet
Keywords python
Maintainer Python Automaton
License Not yet specified
Other variants v13
Ravenports Buildsheet | History
Ravensource Port Directory | History
Last modified 22 MAY 2026, 14:41:05 UTC
Port created 30 MAY 2017, 20:17:50 UTC
Subpackage Descriptions
single

# chardet

Universal character encoding detector.

[![License: 0BSD]](LICENSE) [Documentation] [codecov]

chardet 7 is a ground-up, 0BSD-licensed rewrite of [chardet]. Same package name, same public API: a drop-in replacement for chardet 5.x/6.x, just much faster and more accurate. Python 3.10+, zero runtime dependencies, works on PyPy.

[Read more details about the rewrite process.]

## Why chardet 7?

**99.3% accuracy** on 2,517 test files. **47x faster** than chardet 6.0.0 and **1.5x faster** than charset-normalizer 3.4.6. **Language detection** for every result. **MIME type detection** for binary files. **0BSD licensed.**

|                        | chardet 7.4.0 (mypyc) | chardet 6.0.0 | [charset-normalizer] 3.4.6 |
| ---------------------- | :-------------------: | :-----------: | :------------------------: |
| Accuracy (2,517 files) | **99.3%**             | 88.2%         | 85.4%                      |
| Speed                  | **551 files/s**       | 12 files/s    | 376 files/s                |
| Language detection     | **95.7%**             | 40.0%         | 59.2%                      |
| Peak memory            | **52.9 MiB**          | 29.5 MiB      | 78.8 MiB                   |
| Streaming detection    | **yes**               | yes           | no                         |
| Encoding era filtering | **yes**               | no            | no                         |
| Encoding filters       | **yes**               | no            | yes                        |
| MIME type detection    | **yes**               | no            | no                         |
| Supported encodings    | 99                    | 84            | 99                         |
| License                | 0BSD                  | LGPL          | MIT                        |

[charset-normalizer]: https://github.com/jawah/charset_normalizer

## Installation

```bash
pip install chardet
```

## Quick Start

```python
import chardet

chardet.detect(b"Python is a great programming language for beginners and experts alike.")
# {'encoding': 'ascii', 'confidence': 1.0, 'language': 'en', 'mime_type': 'text/plain'}

# UTF-8 English with accented characters
chardet.detect("The naïve approach doesn't always work in complex systems.".encode("utf-8"))
# {'encoding': 'utf-8', 'confidence': 0.84, 'language': 'en', 'mime_type': 'text/plain'}

# Japanese EUC-JP
chardet.detect("日本語の文字コード検出テストです。このテキストはEUC-JPでエンコードされています。正しく検出できるか確認します。".encode("euc-jp"))
# {'encoding': 'EUC-JP', 'confidence': 1.0, 'language': 'ja', 'mime_type': 'text/plain'}

# Get all candidate encodings ranked by confidence
text = "Le café est une boisson très populaire en France et dans le monde entier."
results = chardet.detect_all(text.encode("windows-1252"))
for r in results[:4]:
    print(r["encoding"], round(r["confidence"], 2))
# Windows-1252 0.32
# iso8859-15 0.32
# ISO-8859-1 0.32
# MacRoman 0.31
```

### Streaming Detection

For large files or network streams, use `UniversalDetector` to feed data incrementally:

```python
from chardet import UniversalDetector

detector = UniversalDetector()
with open("unknown.txt", "rb") as f:
    for line in f:
        detector.feed(line)
```
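The streaming example above stops after the feed loop. The following is a minimal sketch of how such a loop is commonly finished, not the upstream README's own continuation. It assumes the detector behaves like the `UniversalDetector` interface known from earlier chardet releases: a `feed()` method, a boolean `done` attribute, a `close()` method, and a `result` dictionary. The helper name `detect_stream` and its `chunk_size` parameter are hypothetical and exist only for illustration.

```python
from typing import Any, BinaryIO


def detect_stream(detector: Any, stream: BinaryIO, chunk_size: int = 4096) -> dict:
    """Feed `stream` to `detector` in chunks and return its result dict.

    Assumption: `detector` offers feed(bytes), a boolean `done` attribute,
    close(), and a `result` dict, as chardet's UniversalDetector is
    understood to do. This helper is illustrative, not part of chardet.
    """
    while not detector.done:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of stream reached before the detector was sure
            break
        detector.feed(chunk)
    detector.close()  # finalize the guess from whatever data was seen
    return detector.result
```

With chardet installed, this could be called as `detect_stream(UniversalDetector(), open("unknown.txt", "rb"))`. Reading fixed-size chunks rather than lines avoids loading one very long line into memory, and checking `done` lets the loop stop early once the detector is confident.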
Configuration Switches (platform-specific settings discarded)
PY313 OFF Build using Python 3.13
PY314 ON Build using Python 3.14
Package Dependencies by Type
Build (only) python314:dev:std
python-pip:single:v14
autoselect-python:single:std
Build and Runtime python314:primary:std
Download groups
main mirror://PYPIWHL/8c/6c/0a40afdb50a0fe041ab95553b835a8160b6cf0e81edf2ae2fe9f5224cbf9
Distribution File Information
1173b74051570cf08099d7429d92e4882d375ad4217f92a6e5240ccfb26f231e 626562 python-src/chardet-7.4.3-py3-none-any.whl
Ports that require python-chardet:v14
python-encutils:v14 Text file encoding detection functions (3.14)