CJK Unified Ideographs Extension I

Range	U+2EBF0..U+2EE5F (624 code points)
Plane	SIP
Scripts	Han
Assigned	622 code points
Unused	2 reserved code points
Unicode version history
15.1 (2023)	622 (+622)
	Note: [1][2]

CJK Unified Ideographs Extension I is a Unicode block comprising CJK Unified Ideographs that appeared in drafts of an amendment to China's GB 18030 standard circulated in 2022 and 2023, and that were fast-tracked into Unicode in 2023.

Background

Unlike most other sets of CJK unified ideographs, Extension I was not prepared and submitted by the Ideographic Research Group (IRG).

GB 18030 is a mandatory national standard of the People's Republic of China (PRC). It defines a Unicode Transformation Format which retains compatibility with existing data in the earlier GBK and EUC-CN character encodings, and specifies particular Unicode characters which devices sold in China must support.[3] Its 2022 edition, GB 18030-2022, changed a number of required characters to map to standard Unicode code points, rather than to private use area code points.
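To illustrate what it means for GB 18030 to act as a Unicode Transformation Format, the following minimal sketch encodes the first Extension I code point using Python's built-in `gb18030` codec and compares the result with a hand-written version of the linear four-byte mapping used for supplementary-plane code points. The helper name `gb18030_four_byte` is hypothetical, and the byte layout it assumes (lead bytes starting at 0x90, digit bytes 0x30 to 0x39, third bytes 0x81 to 0xFE) is taken from general knowledge of GB 18030 rather than from the sources cited here.

```python
def gb18030_four_byte(code_point: int) -> bytes:
    """Sketch of the linear four-byte mapping for U+10000..U+10FFFF.

    Assumption: supplementary code points are numbered consecutively
    starting from the byte sequence 90 30 81 30 (for U+10000).
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary-plane code points are handled")
    index = code_point - 0x10000
    index, fourth = divmod(index, 10)    # fourth byte cycles through 0x30..0x39
    index, third = divmod(index, 126)    # third byte cycles through 0x81..0xFE
    first, second = divmod(index, 10)    # second byte cycles through 0x30..0x39
    return bytes([0x90 + first, 0x30 + second, 0x81 + third, 0x30 + fourth])


FIRST_EXTENSION_I = 0x2EBF0  # first code point of the block

# Encode with the standard-library codec and decode again (round trip).
encoded = chr(FIRST_EXTENSION_I).encode("gb18030")
decoded = encoded.decode("gb18030")

# The hand-written mapping should agree with the codec.
manual = gb18030_four_byte(FIRST_EXTENSION_I)
```

Because the mapping is purely arithmetic for supplementary planes, an encoder of this kind can represent Extension I characters without a lookup table; whether a given system can display them is a separate matter of font support.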

In late 2022, the PRC published a draft of a further amendment to GB 18030 for public consultation. This draft would have placed 897 new sinographic characters in Plane 10 (hexadecimal: 0A), a yet-untitled astral Unicode plane.[4] This was motivated by a "strong need of citizen real-name certification in China".[5] Since it would impact ISO/IEC 10646 (the Universal Coded Character Set, the ISO standard synchronised with Unicode), the draft was circulated in ISO/IEC JTC 1/SC 2, the ISO subcommittee responsible for ISO 10646. The Chinese national body maintained that "ISO/IEC 10646 do not specify the purpose of the 0A plane", which ISO 10646 denotes as "reserved for future standardization", and that this use was therefore "not inappropriate".[4]

However, the intent of ISO 10646 was for Plane 10 to be reserved for future allocation by ISO 10646 and Unicode via their usual ballot process, not for it to be allocated unilaterally by national standards bodies. Experts and other national bodies therefore criticised the proposed move as one which would "destabilize the synchronization" between GB 18030 and ISO/IEC 10646 (and thus Unicode), and which would make it impossible to conform to both with a single implementation,[4] effectively forking Unicode.

As an alternative, the repertoire (eventually reduced to 622 characters after expert review) was fast-tracked into Unicode version 15.1 in the CJK Unified Ideographs Extension I block.[4] The CJK Unified Ideographs Extension D block was cited as a precedent, since it comprised a repertoire of urgently needed characters (UNCs) from IRG member bodies, whereas the IRG working-set initially slated to become Extension D would instead become Extension E.[6] For compactness, the block was allocated to the available space in the Supplementary Ideographic Plane after CJK Unified Ideographs Extension F, as opposed to on the Tertiary Ideographic Plane after CJK Unified Ideographs Extension H; this means that the CJK extension blocks are no longer in alphabetical order by extension letter.[7] Following this, the draft GB 18030 amendment was modified to use the Extension I code points.[5]
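The consequence for implementers can be sketched as follows: code that looks up a CJK extension by code point should order or search blocks by their ranges, not by extension letter. In this minimal sketch, only the Extension I range is taken from this article; the ranges given for Extensions F, G and H are assumptions based on the Unicode 15.1 block list, and the names `CJK_EXTENSIONS` and `extension_of` are hypothetical.

```python
# (extension letter, first code point, last code point)
# Only the Extension I range is stated in this article; the F, G and H
# ranges are assumptions taken from the Unicode 15.1 block list.
CJK_EXTENSIONS = [
    ("F", 0x2CEB0, 0x2EBEF),
    ("G", 0x30000, 0x3134F),
    ("H", 0x31350, 0x323AF),
    ("I", 0x2EBF0, 0x2EE5F),
]

# Ordering by letter and ordering by position in the code space now differ.
by_letter = [letter for letter, _, _ in sorted(CJK_EXTENSIONS)]
by_code_point = [
    letter for letter, _, _ in sorted(CJK_EXTENSIONS, key=lambda ext: ext[1])
]


def extension_of(code_point: int):
    """Return the extension letter for a code point, or None if not listed."""
    for letter, first, last in CJK_EXTENSIONS:
        if first <= code_point <= last:
            return letter
    return None


def plane_of(code_point: int) -> int:
    """Plane number: 2 is the Supplementary Ideographic Plane, 3 the Tertiary."""
    return code_point >> 16
```

With these definitions, `by_letter` is F, G, H, I while `by_code_point` is F, I, G, H, and `extension_of(0x2EBF0)` falls in Extension I on plane 2 even though Extensions G and H sit on plane 3.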

The Extension I characters make up the "GIDC23" Unihan source,[8] defined as sourced from the "ID system of the Ministry of Public Security of China, 2023".[9]

Block

CJK Unified Ideographs Extension I[1][2]
Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+2EBFx	𮯰	𮯱	𮯲	𮯳	𮯴	𮯵	𮯶	𮯷	𮯸	𮯹	𮯺	𮯻	𮯼	𮯽	𮯾	𮯿
U+2EC0x	𮰀	𮰁	𮰂	𮰃	𮰄	𮰅	𮰆	𮰇	𮰈	𮰉	𮰊	𮰋	𮰌	𮰍	𮰎	𮰏
U+2EC1x	𮰐	𮰑	𮰒	𮰓	𮰔	𮰕	𮰖	𮰗	𮰘	𮰙	𮰚	𮰛	𮰜	𮰝	𮰞	𮰟
U+2EC2x	𮰠	𮰡	𮰢	𮰣	𮰤	𮰥	𮰦	𮰧	𮰨	𮰩	𮰪	𮰫	𮰬	𮰭	𮰮	𮰯
U+2EC3x	𮰰	𮰱	𮰲	𮰳	𮰴	𮰵	𮰶	𮰷	𮰸	𮰹	𮰺	𮰻	𮰼	𮰽	𮰾	𮰿
U+2EC4x	𮱀	𮱁	𮱂	𮱃	𮱄	𮱅	𮱆	𮱇	𮱈	𮱉	𮱊	𮱋	𮱌	𮱍	𮱎	𮱏
U+2EC5x	𮱐	𮱑	𮱒	𮱓	𮱔	𮱕	𮱖	𮱗	𮱘	𮱙	𮱚	𮱛	𮱜	𮱝	𮱞	𮱟
U+2EC6x	𮱠	𮱡	𮱢	𮱣	𮱤	𮱥	𮱦	𮱧	𮱨	𮱩	𮱪	𮱫	𮱬	𮱭	𮱮	𮱯
U+2EC7x	𮱰	𮱱	𮱲	𮱳	𮱴	𮱵	𮱶	𮱷	𮱸	𮱹	𮱺	𮱻	𮱼	𮱽	𮱾	𮱿
U+2EC8x	𮲀	𮲁	𮲂	𮲃	𮲄	𮲅	𮲆	𮲇	𮲈	𮲉	𮲊	𮲋	𮲌	𮲍	𮲎	𮲏
U+2EC9x	𮲐	𮲑	𮲒	𮲓	𮲔	𮲕	𮲖	𮲗	𮲘	𮲙	𮲚	𮲛	𮲜	𮲝	𮲞	𮲟
U+2ECAx	𮲠	𮲡	𮲢	𮲣	𮲤	𮲥	𮲦	𮲧	𮲨	𮲩	𮲪	𮲫	𮲬	𮲭	𮲮	𮲯
U+2ECBx	𮲰	𮲱	𮲲	𮲳	𮲴	𮲵	𮲶	𮲷	𮲸	𮲹	𮲺	𮲻	𮲼	𮲽	𮲾	𮲿
U+2ECCx	𮳀	𮳁	𮳂	𮳃	𮳄	𮳅	𮳆	𮳇	𮳈	𮳉	𮳊	𮳋	𮳌	𮳍	𮳎	𮳏
U+2ECDx	𮳐	𮳑	𮳒	𮳓	𮳔	𮳕	𮳖	𮳗	𮳘	𮳙	𮳚	𮳛	𮳜	𮳝	𮳞	𮳟
U+2ECEx	𮳠	𮳡	𮳢	𮳣	𮳤	𮳥	𮳦	𮳧	𮳨	𮳩	𮳪	𮳫	𮳬	𮳭	𮳮	𮳯
U+2ECFx	𮳰	𮳱	𮳲	𮳳	𮳴	𮳵	𮳶	𮳷	𮳸	𮳹	𮳺	𮳻	𮳼	𮳽	𮳾	𮳿
U+2ED0x	𮴀	𮴁	𮴂	𮴃	𮴄	𮴅	𮴆	𮴇	𮴈	𮴉	𮴊	𮴋	𮴌	𮴍	𮴎	𮴏
U+2ED1x	𮴐	𮴑	𮴒	𮴓	𮴔	𮴕	𮴖	𮴗	𮴘	𮴙	𮴚	𮴛	𮴜	𮴝	𮴞	𮴟
U+2ED2x	𮴠	𮴡	𮴢	𮴣	𮴤	𮴥	𮴦	𮴧	𮴨	𮴩	𮴪	𮴫	𮴬	𮴭	𮴮	𮴯
U+2ED3x	𮴰	𮴱	𮴲	𮴳	𮴴	𮴵	𮴶	𮴷	𮴸	𮴹	𮴺	𮴻	𮴼	𮴽	𮴾	𮴿
U+2ED4x	𮵀	𮵁	𮵂	𮵃	𮵄	𮵅	𮵆	𮵇	𮵈	𮵉	𮵊	𮵋	𮵌	𮵍	𮵎	𮵏
U+2ED5x	𮵐	𮵑	𮵒	𮵓	𮵔	𮵕	𮵖	𮵗	𮵘	𮵙	𮵚	𮵛	𮵜	𮵝	𮵞	𮵟
U+2ED6x	𮵠	𮵡	𮵢	𮵣	𮵤	𮵥	𮵦	𮵧	𮵨	𮵩	𮵪	𮵫	𮵬	𮵭	𮵮	𮵯
U+2ED7x	𮵰	𮵱	𮵲	𮵳	𮵴	𮵵	𮵶	𮵷	𮵸	𮵹	𮵺	𮵻	𮵼	𮵽	𮵾	𮵿
U+2ED8x	𮶀	𮶁	𮶂	𮶃	𮶄	𮶅	𮶆	𮶇	𮶈	𮶉	𮶊	𮶋	𮶌	𮶍	𮶎	𮶏
U+2ED9x	𮶐	𮶑	𮶒	𮶓	𮶔	𮶕	𮶖	𮶗	𮶘	𮶙	𮶚	𮶛	𮶜	𮶝	𮶞	𮶟
U+2EDAx	𮶠	𮶡	𮶢	𮶣	𮶤	𮶥	𮶦	𮶧	𮶨	𮶩	𮶪	𮶫	𮶬	𮶭	𮶮	𮶯
U+2EDBx	𮶰	𮶱	𮶲	𮶳	𮶴	𮶵	𮶶	𮶷	𮶸	𮶹	𮶺	𮶻	𮶼	𮶽	𮶾	𮶿
U+2EDCx	𮷀	𮷁	𮷂	𮷃	𮷄	𮷅	𮷆	𮷇	𮷈	𮷉	𮷊	𮷋	𮷌	𮷍	𮷎	𮷏
U+2EDDx	𮷐	𮷑	𮷒	𮷓	𮷔	𮷕	𮷖	𮷗	𮷘	𮷙	𮷚	𮷛	𮷜	𮷝	𮷞	𮷟
U+2EDEx	𮷠	𮷡	𮷢	𮷣	𮷤	𮷥	𮷦	𮷧	𮷨	𮷩	𮷪	𮷫	𮷬	𮷭	𮷮	𮷯
U+2EDFx	𮷰	𮷱	𮷲	𮷳	𮷴	𮷵	𮷶	𮷷	𮷸	𮷹	𮷺	𮷻	𮷼	𮷽	𮷾	𮷿
U+2EE0x	𮸀	𮸁	𮸂	𮸃	𮸄	𮸅	𮸆	𮸇	𮸈	𮸉	𮸊	𮸋	𮸌	𮸍	𮸎	𮸏
U+2EE1x	𮸐	𮸑	𮸒	𮸓	𮸔	𮸕	𮸖	𮸗	𮸘	𮸙	𮸚	𮸛	𮸜	𮸝	𮸞	𮸟
U+2EE2x	𮸠	𮸡	𮸢	𮸣	𮸤	𮸥	𮸦	𮸧	𮸨	𮸩	𮸪	𮸫	𮸬	𮸭	𮸮	𮸯
U+2EE3x	𮸰	𮸱	𮸲	𮸳	𮸴	𮸵	𮸶	𮸷	𮸸	𮸹	𮸺	𮸻	𮸼	𮸽	𮸾	𮸿
U+2EE4x	𮹀	𮹁	𮹂	𮹃	𮹄	𮹅	𮹆	𮹇	𮹈	𮹉	𮹊	𮹋	𮹌	𮹍	𮹎	𮹏
U+2EE5x	𮹐	𮹑	𮹒	𮹓	𮹔	𮹕	𮹖	𮹗	𮹘	𮹙	𮹚	𮹛	𮹜	𮹝
Notes
1.^ As of Unicode version 15.1
2.^ The two empty cells at the end of the last row (U+2EE5E and U+2EE5F) are non-assigned code points
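The figures quoted for the block (624 code points, 622 assigned, 2 reserved) follow directly from the range boundaries. The short sketch below recomputes them and rebuilds the chart rows in the same tab-separated layout; the constant and function names are hypothetical, and the last assigned code point U+2EE5D is taken from the History table.

```python
BLOCK_FIRST = 0x2EBF0    # start of the block
BLOCK_LAST = 0x2EE5F     # end of the block
ASSIGNED_LAST = 0x2EE5D  # last assigned code point as of Unicode 15.1

block_size = BLOCK_LAST - BLOCK_FIRST + 1
assigned_count = ASSIGNED_LAST - BLOCK_FIRST + 1
reserved = [f"U+{cp:04X}" for cp in range(ASSIGNED_LAST + 1, BLOCK_LAST + 1)]


def chart_row(row_start: int) -> str:
    """Build one chart row: a label such as 'U+2EBFx' followed by 16 cells.

    Cells for reserved code points are left empty.
    """
    cells = [
        chr(cp) if cp <= ASSIGNED_LAST else ""
        for cp in range(row_start, row_start + 16)
    ]
    return f"U+{row_start >> 4:X}x\t" + "\t".join(cells)


# One row per 16 code points, from U+2EBFx through U+2EE5x.
rows = [chart_row(start) for start in range(BLOCK_FIRST, BLOCK_LAST + 1, 16)]
```

Printing `rows` line by line reproduces the layout of the chart above, provided the output font covers Extension I.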

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension I block:

Version	Final code points[a]	Count	L2 ID	WG2 ID	IRG ID	Document
15.1	U+2EBF0..2EE5D	622	L2/23-011			Lunde, Ken (2023-01-11), "18) GB 18030-2022 Amendment", CJK & Unihan Group Recommendations for UTC #174 Meeting
			L2/23-057	N5201	N2591	Draft GB 18030-2022 Amendment Feedback & Recommendations, 2023-02-03
			L2/23-100			GB 18030-2022 Amendment, Draft 2 + Disposition of Comments, Draft 1, 2023-04-10
			L2/23-082			Lunde, Ken (2023-04-22), "02 and 03", CJK & Unihan Group Recommendations for UTC #175 Meeting
			L2/23-106	N5214		Lunde, Ken (2023-04-24), "The Alternate Proposal—Unicode Version 15.1", Proposal to provisionally assign or accept 603 urgently-needed ideographs
			L2/23-076			Constable, Peter (2023-05-01), "E.4.2 Proposal to provisionally assign or accept 603 urgently-needed ideographs", UTC #175 Minutes
			L2/23-114R	N5214R2		Lunde, Ken (2023-07-05), Proposal to encode 622 urgently needed ideographs in UCS
			L2/23-115			Constable, Peter (2023-05-01), USNB Comments on Draft 2 of GB 18030-2022 Amendment 1 and recommendation for ISO/IEC 10646:2020 Amendment 2
			L2/23-154	N5238		Revision of 622 UNCs of China (Feedback on WG2 N5214), 2023-06-30
			L2/23-163			Lunde, Ken (2023-07-11), "01", CJK & Unihan Group Recommendations for UTC #176 Meeting
			L2/23-157			Constable, Peter (2023-07-31), "E.1 Section 1) CJK Unified Ideographs Extension I", UTC #176 Minutes
a. Proposed code points and character names may differ from final code points and names.

References

1. "Unicode character database". The Unicode Standard. Retrieved 2023-09-12.
2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-09-12.
3. Kaplan, Michael S (2013-03-28). "You call it GB18030, I call it UTF-GBK..." Sorting it all out.
4. United States National Body (2023-05-01). "USNB Comments on Draft 2 of GB 18030-2022 Amendment 1 and recommendation for ISO/IEC 10646:2020 Amendment 2" (PDF). ISO/IEC JTC1/SC2 N4852, WG2 N5222; UTC L2/23-115.
5. China National Body (2023-10-13). "IRG #61 Activity Report" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2623; UTC L2/23-240.
6. Lunde, Ken (2023-04-22). "03) L2/23-100: GB 18030-2022 Amendment, Draft 2 + Disposition of Comments, Draft 1" (PDF). CJK & Unihan Group Recommendations for UTC #175 Meeting. UTC L2/23-082.
7. "CJK/Unihan Changes". Unicode 15.1.0. Unicode Consortium. 2023-09-12. To keep the CJK block ranges as compact as possible, Extension I has been added to Plane 2, instead of directly after Extension H on Plane 3. Implementers should also check that their code does not assume that CJK extensions all occur in alphabetic order by the extension letter.
8. "CJK Unified Ideographs Extension I" (PDF). The Unicode Standard, Version 15.1. Unicode Consortium. 2023.
9. Lunde, Ken; Cook, Richard, eds. (2023-09-01). "kIRG_GSource". Unicode Han Database (Unihan). Unicode 15.1.0. UAX #38.
