Modern Chinese characters

Modern Chinese characters (traditional Chinese: 現代漢字; simplified Chinese: 现代汉字; pinyin: xiàndài hànzì) are the Chinese characters used in modern languages, especially Standard Mandarin Chinese.[1]

While traditional scholarship[2][3] has paid substantial attention to the historical development of Chinese characters and their associated writing systems, the study of modern Chinese characters covers a variety of further aspects, including orthography, phonology, and semantics, as well as collation and organization, statistical analysis, computer processing, and pedagogy.[4][5]

According to Ethnologue,[6] "Mandarin Chinese is the largest language in the world, if you count only native speakers. If you count both native and non-native speakers, English is the largest (with Mandarin being the 2nd largest)." Mandarin is written in modern Chinese characters.[7][8]


Historical development

Since maturing as a complete writing system, Chinese characters have had an uninterrupted history of development over more than 3,000 years, passing through a series of script stages leading to the modern written forms,[9] as illustrated by the development of the character 馬; 马; 'horse':

[Figure: forms of the character in the Oracle, Bronze, Big seal, Seal, Clerical, Regular, and Simplified stages]

In 1980, Zhou Youguang, known as the "father of pinyin", published a paper entitled "Introduction to the Studies of Modern Chinese Characters", in which he detailed aspects of the numbers, orders, forms, sounds, meanings, and pedagogy of modern characters.[10] His paper was followed by Gao Jiaying's "A Brief Discussion on the Establishment of Modern Chinese Character Studies"[11] and other related writings on the subject.[12] At least five textbooks have been published in this area.[4][5][1][13][14]

Regional varieties

Chinese characters were originally invented for writing the Chinese language, and were later employed for other East Asian languages, developing as part of a shared orthographic tradition. For ordinary and historical purposes, Simplified characters are primarily used in mainland China, Singapore, and Malaysia, and Traditional characters in Taiwan, Hong Kong, and Macau; Chinese characters are also used as kanji in Japan, hanja in Korea, and chữ Hán in Vietnam.[15] For example, the Traditional character 廣 ('wide, broad') has a Simplified form of 广 and a shinjitai kanji form of 広.


In contrast with the Latin alphabet used to write many languages, including English, Chinese characters have a number of distinctive properties:[16]

  • There are tens of thousands of different characters.
  • A character has a two-dimensional block structure.
  • A character may have dozens of strokes.
  • In most cases, a character denotes a morpheme.[17]
  • Characters are monosyllabic, normally with one character per syllable.[18]
  • Texts written in Chinese characters are intelligible to readers of different dialects and of different historical periods.



Modern characters include:[19]

  • Received characters as standardized during the Ming, known as jiu zixing, accounting for about 75% of modern characters, e.g. 日; 'the Sun', 月; 'the Moon', 金; 'metal', 木; 'wood', 水; 'water', 火; 'fire', and 土; 'earth';
  • Newly coined characters, about 2.7% of the total number, e.g. 氨; 'ammonia', 碘; 'iodine', 乒乓; 'ping pong', 'table tennis', and the characters for 'do not' and 'pot';
  • Repurposed ancient characters, with pronunciations and meanings differing from the ancient ones, e.g. 她; 'she', and a character used in 旮旯; 'corner';
  • Simplified forms, often derived from variants already in common use, about 20%, e.g. 汉语; 漢語; 'Chinese language', 学习; 學習; 'study';
  • Modern dialect characters, such as the Cantonese characters included in the Hong Kong Supplementary Character Set.

Number and sets

Due to the dynamic development of languages, there is no definite number of modern Chinese characters. However, a reasonable estimate can be made by surveying the character sets of the relevant standard lists and influential dictionaries in the countries and regions where Chinese characters are used.[20]

Mainland China

The important standards in the People's Republic of China include the List of Frequently Used Characters in Modern Chinese (现代汉语常用字表), totalling 3,500 characters,[21] and the List of Commonly Used Characters in Modern Chinese (现代汉语通用字表), with 7,000 characters, including the 3,500 characters of the previous list.[22] The current standard is the Table of General Standard Chinese Characters, which was released by the State Council in June 2013 to replace the previous two lists and some other standards. It includes 8,105 characters of the Simplified Chinese writing system: 3,500 primary, 3,000 secondary, and 1,605 tertiary. In addition, it lists 2,574 Traditional characters and 1,023 variants.[23] Also relevant are the character sets of Xinhua Zidian[24] and Xiandai Hanyu Cidian,[25] the most popular modern Chinese character dictionary and word dictionary, respectively. Each includes over 13,000 characters, comprising Simplified characters, Traditional characters, and some variants.

A college graduate who is literate in written Chinese knows between three and four thousand characters. Specialists in classical literature or history, who would often encounter characters no longer in use, are estimated to have a working vocabulary of between 5,000 and 6,000 characters.[26]


Taiwan

In Taiwan, there are the Chart of Standard Forms of Common National Characters, with 4,808 characters, and the Chart of Standard Forms of Less-Than-Common National Characters (次常用國字標準字體表; Cì chángyòng guózì biāozhǔn zìtǐ biǎo), with 6,341 characters. Both lists were released by the Ministry of Education, and together they contain 11,149 characters of the Traditional Chinese writing system.

Hong Kong

In Hong Kong, there is the List of Graphemes of Commonly-Used Chinese Characters for elementary and junior secondary education, totalling 4,762 characters. This list was released by the Education Bureau and is very influential in educational circles.


Japan

In Japan, there are the jōyō kanji (frequently used Chinese characters designated by the Japanese Ministry of Education, comprising 2,136 characters) and the jinmeiyō kanji (for use in personal names, currently comprising 983 characters).


Korea

In Korea, there are the Basic Hanja for educational use (漢文敎育用基礎漢字, a subset of 1,800 hanja defined in 1972 by a South Korean educational standard), and the Table of Hanja for Personal Name Use (人名用追加漢字表), published by the Supreme Court of Korea in March 1991.[27] The list expanded gradually, and by 2015 there were 8,142 hanja permitted for use in Korean names.[28]

Taking all the character sets mentioned above into consideration, the total number of modern Chinese characters in the world is over 10,000, probably around 15,000.[29][30] Such an estimate is not unduly rough, given that Unicode contains over 90,000 Chinese characters (CJK Unified Ideographs) in total, and that there would be more if every Chinese character that has ever appeared were included.[31]


Chinese character frequencies are calculated from corpus data. A corpus is a collection of texts representative of one or more languages. The frequency of a character is the ratio of the number of its occurrences in the corpus to the total number of characters in the corpus. The formula for calculating frequency is

$$F_i = \frac{n_i}{N} \times 100\%,$$

where $n_i$ is the number of times a certain ($i$th) Chinese character appears in the corpus, and $N$ is the total number of characters in the corpus.[32]
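The formula can be illustrated with a minimal Python sketch. The function name `char_frequencies` and the toy corpus are assumptions made for illustration only; whitespace is skipped here, and the treatment of punctuation is left open because the text does not specify it.

```python
from collections import Counter


def char_frequencies(corpus: str) -> dict[str, float]:
    """Return the percentage frequency F_i = n_i / N * 100 of each character."""
    # Assumption: whitespace does not count as a character of the corpus.
    chars = [ch for ch in corpus if not ch.isspace()]
    total = len(chars)  # N
    if total == 0:
        return {}
    counts = Counter(chars)  # n_i for every distinct character
    return {ch: n / total * 100 for ch, n in counts.items()}


# Toy corpus of six characters (hypothetical example text).
freqs = char_frequencies("我的书 我的笔")
```

In this toy corpus, 我 and 的 each occur twice among six characters, so each has a frequency of one third, and the frequencies of all characters sum to 100%.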


The first person to make a statistical study of the frequency of Chinese characters was Chen Heqin (陳鶴琴).[33] In the 1920s, he and his assistants spent two years manually counting the characters in a corpus of 554,478 characters, and obtained 4,261 different characters with frequency information. They then compiled the book Applied Lexis of Vernacular Chinese (語體文應用字彙).[34] The 10 most frequently used characters in their corpus are, in descending order of frequency, 的; 'of', 不; 'no', 'not', 一; 'one', 'a(n)', 了; 'PERF', 是; 'to be', 我; 'I/me', 上; 'on', 'up', 他; 'he/him', 有; 'to have', and 人; 'person'.

2001 frequency survey

In 2001, the Chinese University of Hong Kong published a number of frequency lists on the Web,[35] entitled "Hong Kong, Mainland China and Taiwan Chinese Frequency: a Trans-regional Diachronic Survey". The frequency data came from a large corpus with a number of sub-corpora representing the Chinese language in the three regions of Hong Kong, Mainland China and Taiwan, and in the two time periods of the 1960s and the 1980s/1990s. Each sub-corpus includes about 5,000 different characters, as shown by their frequency lists.

From the data of these frequency lists, some important and interesting features of Chinese can be discovered:

  1. The same three characters are the most frequently used across the regions and time periods of the corpora, and one of them ranks first in all the frequency lists.
  2. The 10 most frequently-used characters across the three regions and two time periods are very consistent. That means a frequently-used character in one region or period is very likely to be frequently-used in another region or period.
  3. The 100 most frequently used characters in the 1980s/1990s cover (i.e., have an accumulated frequency of) 41.00% of the Hong Kong texts of that period, 41.34% of the Mainland texts, and 41.88% of the Taiwan texts. That is more than 4 out of every 10 characters in each of the three regions.
  4. The 1,000 most frequently used characters in the 1980s/1990s cover 89.25% of the Hong Kong texts of that period, 90.26% of the Mainland texts, and 88.74% of the Taiwan texts.

Survey by the Chinese government

Large-scale surveys by the Ministry of Education and the State Language Commission of the PRC over the years have shown that the use of Chinese characters and words follows a strong distribution pattern. The number of characters used in modern Chinese has been stable at about 10,000 for a number of years. The numbers of most frequently used characters needed to reach coverage rates of 80%, 90%, and 99% are about 590, 960, and 2,400 respectively.[36]
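A coverage rate of this kind is the accumulated frequency of the top-ranked characters. The following Python sketch shows one way such a figure might be computed. The function name and the ten-character toy corpus are assumptions for illustration and do not reproduce the methodology of the surveys.

```python
from collections import Counter


def chars_needed_for_coverage(corpus: str, target: float) -> int:
    """Smallest number of top-ranked characters whose accumulated
    frequency reaches `target` (a fraction between 0 and 1)."""
    counts = Counter(ch for ch in corpus if not ch.isspace())
    total = sum(counts.values())
    covered = 0
    # most_common() yields characters in descending order of frequency.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return rank
    return len(counts)


# Hypothetical toy corpus: 的 x4, 一 x3, 是 x2, 人 x1.
toy = "的的的的一一一是是人"
top_for_half = chars_needed_for_coverage(toy, 0.5)
```

In the toy corpus the single most frequent character covers 40% of the text, so two characters are needed to reach 50% coverage.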

Chinese character frequency is essential to quantitative research on the Chinese language, and has been applied to language teaching, dictionary compilation, word list compilation, Chinese character information processing, etc.[37]


The orders or sorting methods of Chinese dictionaries are traditionally divided into three categories: form-based orders, sound-based orders and meaning-based orders.[38] In modern Chinese, people also use frequency orders.


In form-based orders, characters and words are sorted according to various features of the forms or shapes of Chinese characters. Compared with sound-based orders, form-based orders have the advantages of (a) allowing lookup of characters and words without knowing their pronunciations, and (b) effective collation of large character sets without support from other sorting methods. There are two subcategories of form-based orders: stroke-based orders and component-based orders, the latter including radical-based orders, etc.[39][40]


There are two major sound representation systems for Standard Chinese: pinyin and bopomofo. Accordingly, there is a pinyin alphabetical order and a bopomofo-based order.[41]


Meaning-based orders, also called semantics-based orders, arrange characters and words in a hierarchical structure of semantic categories.[42]


Frequency orders sort Chinese characters by their frequency of use, normally in descending order, so that the most frequently used character is at the top of the list. A frequency list is created from a text corpus. In corpus linguistics, the frequency of a character is the ratio, expressed as a percentage, of its number of occurrences in the corpus to the total number of characters in the corpus.[32]

Orders of words

A Chinese word consists of one or more characters. Single-character words can be sorted by a character order, and multi-character words can be sorted character by character in a similar way.[43]
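Character-by-character sorting can be sketched in Python as follows. The character order `"一人大山"` and the word list are hypothetical illustrations, and `make_sort_key` is an assumed helper name. Any character order (stroke-based, sound-based, or frequency-based) could be substituted.

```python
def make_sort_key(char_order: str):
    """Build a sort key that compares words character by character
    according to the given character order."""
    rank = {ch: i for i, ch in enumerate(char_order)}
    unknown = len(rank)  # characters outside the order sort last

    def key(word: str) -> list[int]:
        return [rank.get(ch, unknown) for ch in word]

    return key


order = "一人大山"  # hypothetical character order
words = ["大山", "人", "大人", "一", "山人"]
sorted_words = sorted(words, key=make_sort_key(order))
```

Because Python compares lists element by element, a word that is a prefix of another word sorts before it, which matches the usual dictionary convention.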


Modern Chinese characters appear in the form of square blocks. There are three layers or levels of structural units of Chinese characters: strokes, components, and whole characters.[44][lower-alpha 1] For example, 字; 'character' has two components, each of which is composed of three strokes:

字 = 宀 (㇔㇔㇇) + 子 (㇇㇚㇐).


Strokes (笔画; 筆劃; bǐhuà) are the smallest writing units of Chinese characters. When writing a Chinese character, the trace of a dot or a line left on the writing material (such as paper) from pen-down to pen-up is called a stroke.[46]

Stroke number is the number of strokes of a Chinese character. It varies: for example, the characters 一 and 乙 have only one stroke, while the character 齉 has 36 strokes, and 龘 (composed of three 龍; 'dragon') consists of 48 strokes.[47]

Stroke forms refer to the shapes of strokes. The stroke forms of a standard Chinese character set can be classified into a stroke table (or stroke list); for instance, the Unicode CJK strokes list has 36 types of strokes.[48]

Stroke order is the order in which strokes are written to form a Chinese character; for example, the stroke order of 千 is ㇓, ㇐, ㇑.[49]
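The Unicode stroke list can be inspected with Python's standard `unicodedata` module. The sketch below assumes that the 36 strokes mentioned above correspond to the code points U+31C0 to U+31E3 of the CJK Strokes block. Later Unicode versions may assign further code points in that block, and the helper name `stroke_name` is illustrative.

```python
import unicodedata

# Assumption: the 36 stroke types are the code points U+31C0..U+31E3.
STROKES = [chr(cp) for cp in range(0x31C0, 0x31E4)]


def stroke_name(stroke: str) -> str:
    """Return the Unicode character name of a stroke, if the local
    Unicode database knows it."""
    return unicodedata.name(stroke, "<unnamed>")


# Print each stroke with its code point and Unicode name.
for s in STROKES:
    print(f"U+{ord(s):04X} {s} {stroke_name(s)}")
```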


Chinese characters are composed of components, which are in turn composed of strokes.[50] In most cases, a component is larger than a stroke (i.e., consists of more than one stroke) and smaller than the whole character (i.e., combines with other components to form a character). For example, the character 件 has two components, 亻 and 牛, each with more than one stroke: 亻 (㇓㇑) and 牛 (㇓㇐㇐㇑). In the special case of one-stroke characters, such as 一 and 乙, the stroke is at once a component and a character.

Chinese character component analysis divides a character into its components. There are two ways of dividing Chinese characters: hierarchical dividing and plane dividing. Hierarchical dividing separates the character layer by layer, from larger to smaller components, until the primitive components are reached. Plane dividing separates out the primitive components all at once.[51]

A component that can independently form a character is a character component, or a component of independent character formation (成字部件). For example, 目 forms a character on its own and is also a component of 泪. A component that cannot independently form a character is a non-character component, or a component of dependent character formation (非成字部件). For example, 氵 in 泪.[50]

A component that cannot be (further) divided into smaller components by the rules is a primitive component, or basic component (基础部件; 基礎部件). Primitive components are the final-level components of hierarchical dividing. For example, the components 宀 and 子 in the character 字. A component composed of two or more primitive components is a compound component (合成部件).[52]
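The difference between hierarchical and plane dividing can be sketched in Python. The decomposition table `DECOMP` below is a hypothetical illustration and is not taken from any official component standard. It assumes that 湖 divides into 氵 and 胡, and that 胡 is a compound component dividing into 古 and 月.

```python
# Hypothetical first-level decompositions, for illustration only.
DECOMP = {
    "湖": ["氵", "胡"],
    "胡": ["古", "月"],
}


def hierarchical(char: str) -> tuple:
    """Divide layer by layer, returning a nested (component, subtrees) tree."""
    return (char, [hierarchical(part) for part in DECOMP.get(char, [])])


def plane(char: str) -> list[str]:
    """Divide all at once, returning only the primitive components."""
    parts = DECOMP.get(char)
    if not parts:
        return [char]  # not in the table: treated as primitive
    primitives: list[str] = []
    for part in parts:
        primitives.extend(plane(part))
    return primitives


tree = hierarchical("湖")
flat = plane("湖")
```

Under these assumptions, the leaves of the hierarchical tree are exactly the components returned by plane dividing; the tree additionally records the intermediate compound component 胡.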

Whole characters

Whole characters (汉字整字; 漢字整字; hànzì zhěngzì) lie at the final level of the stroke–component–character composition of Chinese characters.[53] A non-decomposable character (独体字) consists of one primitive component, which is directly formed by strokes and cannot be decomposed into smaller components.[54] A decomposable character (合体字) can be broken down into multiple components.

The structure of a Chinese character is the pattern or rule in which the character is formed by its (first level) components.[52] Chinese character structures include:

  • Single-component structure (i.e. a non-decomposable character): the character is formed by a single primitive component.
  • Left-right structure: the character is formed by a component on the left and another one on the right.
  • Up-down structure: the character is formed by a component above another component.
  • Surrounding structure: one component is completely or partially surrounded by another component.

Popular typefaces of modern Chinese characters include Song (宋体; 宋體) or Ming (明体; 明體), Fangsong (仿宋体; 仿宋體), those inspired by regular script (楷体; 楷體) and by clerical script (隶体; 隸體), as well as sans-serif (黑体; 黑體; 'black form') and Wei (魏体; 魏體).[55]

In Chinese, in addition to the international points system, a unique 'number' (字号; 字號) system is used for character sizes. For example, the Simplified Chinese version of Microsoft Word allows setting font sizes by either points or numbers.[56]


The Standard Chinese pronunciation of Chinese characters is based on the Beijing dialect of Mandarin.[57]

Normally a Chinese character is read with one syllable. Some Chinese characters have more than one pronunciation (polyphonic characters). Some syllables correspond to more than one character (homophonic characters).[58]

Polyphonic characters

Polyphonic characters are those with two or more pronunciations, as opposed to monophonic characters, which have only one.

A polyphonic monosemous character (多音同義字) has two or more pronunciations with the same meaning. For example, the English word 'ton' is transliterated as 吨; 噸, for which two pronunciations, dūn and dùn, coexist in some old dictionaries, both with the meaning 'ton'. Since 吨 is both a character and a word, it is a polyphonic monosemous word as well as a polyphonic monosemous character.

In December 1985, the PRC announced the Table of Mandarin Words with Variant Pronunciation (普通话异读词审音表) to define the standard pronunciations for polyphonic monosemous characters.[59] In Taiwan, there is a similar official standard for Mandarin words with variant sounds, where pronunciations are expressed in bopomofo instead of pinyin.

A polyphonic polysemous character (多音多義字) has two or more pronunciations, and different pronunciations represent different meanings. For example, the character 长; 長 is pronounced cháng with the meaning 'long', or zhǎng with the meaning 'grow'. The simplified character 脏 is pronounced zāng when derived from 髒; 'dirty', or zàng when derived from 臟; 'internal organs'. The pronunciation of such characters is determined by context.

Because polyphonic polysemous characters may hinder the learning and application of Chinese characters, it has been proposed that their number be reduced. There are two main methods:[60]

  • Change pronunciation. A common approach is to replace rare and less frequent readings with frequent ones, and to replace ancient pronunciations with present-day ones.
  • Change form. This means having some of the sounds and meanings expressed by other characters.


Homophonic characters (同音字) are those sharing the same pronunciation, as opposed to heterophonic characters (simplified Chinese: 异音字; traditional Chinese: 異音字). Homophonic characters are understood either narrowly, as having identical initials, finals, and tones, or more broadly, as merely having identical initials and finals, with tones possibly differing. For example, 馬、瑪、碼、螞 are all pronounced mǎ, while characters whose readings differ only in tone are homophones only in the broader sense. Usually, homophony in characters is understood in the narrow sense.[61]

Homophonic characters are widespread in Mandarin: there are around 1,300 possible syllables when tonal distinctions are included, and only about 400 when tones are excluded. Meanwhile, the written language has more than 10,000 characters, for an average of about 7.5 characters mapped to each syllable.[62]
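The narrow and broad senses of homophony can be illustrated by grouping characters by their readings. In the Python sketch below, the table `READINGS` is a small assumed data set (only the four characters read mǎ come from the example above), and `group_homophones` is an illustrative name.

```python
from collections import defaultdict

# Assumed toy data: character -> (toneless syllable, tone number).
READINGS = {
    "馬": ("ma", 3), "瑪": ("ma", 3), "碼": ("ma", 3), "螞": ("ma", 3),
    "媽": ("ma", 1), "麻": ("ma", 2), "罵": ("ma", 4),
}


def group_homophones(readings, ignore_tone=False):
    """Group characters by reading; with ignore_tone=True the broad
    sense of homophony (same initial and final) is used."""
    groups = defaultdict(list)
    for char, (syllable, tone) in readings.items():
        key = syllable if ignore_tone else (syllable, tone)
        groups[key].append(char)
    return dict(groups)


narrow = group_homophones(READINGS)
broad = group_homophones(READINGS, ignore_tone=True)
```

With this data, the narrow grouping yields four groups, one per tone, while the broad grouping merges all seven characters under the toneless syllable "ma".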

Zhou Youguang (1993) introduced two ways homophones have been historically reduced:[63][64]

  • Differentiate character pronunciations without changing the word. For example: 癌症; 'cancer' was originally pronounced yánzhèng, later changed to áizhèng due to confusion with 炎症; yánzhèng; 'inflammation';
  • Differentiate words and pronunciation. For example: 期終; qízhōng; 'end-term' was confused with 期中; qízhōng; 'mid-term', and later the synonym 期末; qímò; 'end-term' began to be used instead.


There are two systems for phonetic notation of Chinese characters.

  • Bopomofo: for example, 香港; ㄒㄧㄤ ㄍㄤˇ; 'Hong Kong';
  • Pinyin: for example, 香港; Xiānggǎng.

In pinyin, either diacritics (mā) or numbers (ma1) may be used to mark tones. The Jyutping system for Cantonese uses numbers, e.g. 香港; hoeng1gong2.
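The correspondence between the two tone-marking styles can be sketched in Python using Unicode normalization. The function name `to_numbered` is illustrative, and the sketch makes two assumptions: the input is a single syllable, and a syllable without a tone mark (neutral tone) is returned unnumbered.

```python
import unicodedata

# Combining marks for the four tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def to_numbered(syllable: str) -> str:
    """Convert one pinyin syllable from diacritic to numbered tone marking."""
    tone = None
    kept = []
    # NFD splits a letter such as 'ǎ' into 'a' plus a combining caron.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # NFC recombines what remains, so 'u' + diaeresis becomes 'ü' again.
    base = unicodedata.normalize("NFC", "".join(kept))
    return base if tone is None else f"{base}{tone}"


numbered = [to_numbered(s) for s in ["xiāng", "gǎng"]]
```

The diaeresis of ü is not in `TONE_MARKS`, so it survives the conversion and only the tone mark is replaced by a digit.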

Kun'yomi are readings of kanji using native Japanese words mapped to the meanings of borrowed Chinese characters. Characters have also been borrowed with on'yomi, readings based on borrowed Sino-Japanese pronunciations. For example, when the Chinese character 山; shān; 'mountain' was borrowed in Japan, people read it either with the native kun'yomi pronunciation yama or with the Sino-Japanese on'yomi pronunciation san.[65] Similar phenomena also appear in Mandarin and English, such as the abbreviation i.e. being read aloud as 'that is'. Qiu Xigui called this 同義換讀; 'synonymous reading'.[66]


In modern Chinese, a character may represent a word, a morpheme in a compound word, or alternatively a meaningless syllable combined with some other syllables or characters to form a morpheme.[67] In a language, morphemes are the minimal units of meaning.[68] Some characters have only one meaning, some have multiple meanings, and some characters largely share the same meaning.[69]

Monosemous and polysemous characters

A character with only one meaning is a monosemous character, and a character with two or more meanings is a polysemous character. According to statistics from the Chinese Character Information Dictionary, among the 7,785 mainland standard Chinese characters in the dictionary, there are 4,139 monosemous characters, 3,053 polysemous characters and 593 meaningless characters. [70]

The meaning people assigned to a character when it was created is the 'original meaning' (本义; 本義) of the character. For example, the original meaning of 兵; bīng is 'weapon'; it is a character with multiple semantic components (会意字; 會意字),[71] depicting a 斤; jīn; 'cutting knife' being held with both hands.

The meaning developed from the original meaning of a character through association is the 'extended meaning' (引申义; 引申義). For example, 'soldier' (as in 士兵) is an extended meaning of 兵.

The meaning added through the loan of homonymous sounds is the 'phonetic-loan meaning' (假借义; 假借義). For example, the original meaning of 其; qí is 'dustpan'; its use as the pronoun 'his', 'her', 'its' results from phonetic loan.


Synonym characters are a group of Chinese characters that have the same or similar meanings. The characters in a synonym group often differ in frequency of use and word-formation ability, and there are some (subtle) differences in meaning and emotional coloring. Knowledge of synonym characters helps students write Chinese more correctly and express meanings more accurately.[72] For example:

Both 面 and 臉 have the meaning of 'face', but there are some differences.[73] Generally, 面 is not used as an independent word in Mandarin, but only in multi-character compounds, for example 見面; 'to meet', 面目; 'face and eyes', 面紅耳赤; 'red face', 面黃肌瘦; 'yellow face, with thin muscles'. The 面 in these words cannot be replaced by 臉. In contrast, 臉 can usually be used alone in Mandarin as a word in its own right, as well as in compounds and phrases such as 臉譜; 'facial makeup', 花臉; 'painted face', 娃娃臉; 'baby face', 圓臉; 'round face', 方臉; 'square face', and 一張可愛的臉; 'a cute face'. The 臉 in these words cannot be replaced by 面.

Meanings of characters and words

The meaning of a single-character word is its character meaning. The meaning of a multicharacter word is generally derived from the meanings of the characters. The relationships between the meaning of a compound word and of its characters are categorized as follows: [74]

  1. Synonyms (A + B = A = B), such as 聲音; 'sound' = 聲; 'sound' = 音; 'sound'.
  2. Synthetic meaning (A + B = AB), such as 品德; 'moral character' = 品 + 德; 'morality'.
  3. Expanded meaning (A + B = AB + ε), such as 景物; 'scenery' from 景; 'situation' + 物; 'thing'.
  4. Partial meaning (A + B = A or B, but not the other), for example 國家; 'country' = 國; 'country' but ≠ 家; 'family', and 容易; 'easy' = 易; 'easy' but ≠ 容; 'countenance'.
  5. Complementary meaning (A + B = ε), for example 東西; 'thing', 'stuff' is not 東; 'east' + 西; 'west'.

According to sampling statistics, categories 2 and 3 account for 89.7% of the compound words.


In the analysis of internal structures, Chinese characters are decomposed into internal structural components in relations with the sound and meaning of the characters.[75]

Traditional classification

In Shuowen Jiezi, Xu Shen proposed six categories (六書; liùshū; 'Six Writings') of Chinese characters, including [76]

  1. Pictograms (象形; xiàngxíng; 'form imitation'): single-semantic-component characters which are drawings of the objects they represent.
  2. Simple ideograms (指事; zhǐshì; 'indication'): characters that express an abstract idea with an iconic form.
  3. Compound ideographs (會意; huìyì; 'joined meaning'): characters that combine two or more semantic components to indicate the meaning of the character.
  4. Phono-semantic compound characters (形声; 形聲; xíngshēng; 'form and sound'): characters that consist of phonetic components and semantic components.
  5. Derivative cognates (轉注; 转注; zhuǎnzhù; 'reciprocal meaning'): pairs of characters that had similar Old Chinese pronunciations and may have had the same etymological root.
  6. Rebus (phonetic loan) characters (假借; jiǎjiè; 'borrowing, making use of'): characters "borrowed" to write another morpheme which is pronounced the same or nearly the same.

Modern classification

The traditional Six Writings pre-supposed that every internal component can either represent the sound or meaning of the character. But, after the long evolution of Chinese writing systems, quite a few components can no longer effectively play the roles and have become pure form components. From the internal structure point of view, modern Chinese characters are composed of semantic components, phonetic components and pure form components. And they have formed seven categories of modern Chinese characters: [77][78]

Semantic component characters are composed of semantic components and include [79] [80]

  • Pictograms, such as 田 (field), 井 (well), 門 (door).
  • Simple ideograms, such as 一 (one), 二 (two), 刃 (blade).
  • Compound ideographs. For example, 拿 (take): 合 (close) 手 (hands) together to take; 掰 (break apart): 分 (Separate) something with two 手 (hands); 从 (follow): one 人 (person) follows another person; 泪 (tears): 氵(water) from 目 (eyes).
  • Special methods, such as 叵 (can not): turn 可 (can) to the opposite (right) side; 冇 (none, not have): 有 (have) taken away "二" (contents).

Phonetic component characters are composed of phonetic components. [79] For example,

  • Phonetic-loan, for example, character "花" (flower) is borrowed to mean "spending".
  • Used in a transliterated foreign word, e.g., the characters in words "打" (dá, dozen) and "馬達" (mǎdá, motor).
  • Multi-phonetic component characters, for example, "新" (xīn) was originally a semantic-phonetic character, but its modern meaning of "new" has nothing to do with the original semantic component of "斤" (jīn, 0.5 kg), but the sounds are similar. In this way, "新" (xīn) then has two phonetic components: "亲" (qīn) and "斤" (jīn).

Pure form characters are composed of form components, which neither represent the sound nor the meaning of the characters.[81] For example:

  • 日 (sun): The 日 character in modern regular script is no longer of round shape.
  • 广 (wide): The traditional Chinese character is "廣".
  • 鹿 (deer): The oracle bone form resembled a deer.

Semantic-phonetic characters, also called "phono-semantic characters", consist of semantic components and phonetic components. [82] There are six combinations:

  1. Left meaning (semantic) and right sound (phonetic), such as 肝 (gān, liver), 惊 (jīng, fear), 湖 (hú, lake);
  2. Right meaning and left sound, such as 鵡 (wǔ, parrot), 剛 (gāng, firm), 甥 (shēng, nephew);
  3. Upper meaning and lower sound: 霖 (lín, rain), 茅 (máo, grass) and 竿 (gān, pole);
  4. Lower meaning and upper sound: 盂 (yú, bowl), 岱 (dài, Mount Tai), 鯊 (shā, shark);
  5. Outer meaning and inner sounds: 癢 (yǎng, itch), 園 (yuán, garden), 衷 (zhōng, heart), 座 (zuò, seat), 旗 (qí, flag);
  6. Inner meaning and outer sound: 辮 (biàn, braid), 悶 (mèn, dull), 摹 (mó, imitation).

Semantic-form characters are composed of semantic components and pure form components.[83] Many of these characters were originally semantic-phonetic characters. Due to subsequent changes in the pronunciation of the phonetic components or the characters, the phonetic components could not effectively represent the pronunciation of the character and became pure form. For example: [84]

  • 布 (bù, cloth): used to have the semantic component 巾 (scarf) and the phonetic component 父 (fù); in the modern form, the phonetic component is no longer 父.
  • 急 (jí, urgent): used to have semantic 心 (heart) and phonetic 及 (jí). Now the upper component no longer looks like 及.
  • 鸡 (jī, chicken), not read as 又 (yòu).

Phonetic-form characters are composed of phonetic components and pure form components.[85] They mostly came from ancient semantic-phonetic characters, where the semantic components lost their functions and became pure form. For example,

  • 球 (qiú, ball): Originally refers to a kind of beautiful jade, with semantic component 王(玉, jade). Later, it was borrowed to represent a ball, and then extended to any round three-dimensional object, and 王(jade) became a pure form component, while 求 (qiú) remains a phonetic component.
  • 笨 (bèn, stupid): Originally referred to the inner white layer of bamboo, with semantic component 竹 (bamboo) and phonetic component 本 (běn). Later, the character was borrowed by sound to mean stupid.
  • 华: This is a simplified character with phonetic component 化 and pure form component 十.

Semantic-phonetic-form characters consist of the three kinds of components. For example, [81]

  • 岸 (àn, bank, shore), originally had semantic component ⿱山厂 and phonetic 干 (gàn). In modern Chinese, ⿱山厂 is not a character or radical with a sound or meaning, but 山 can still express meaning, while 厂 remains a pure form component.
  • 聽 (tīng, listen), semantic 耳 (ear) and phonetic 壬 (tǐng). In modern Chinese characters, the right part has become a pure form component.

Semantic-phonetic-form characters are very rare, and the examples above are not fully persuasive. Whether they can be justified as a separate internal structural category remains to be studied further. (If they are not counted as a category, the classification above can also be called the "New Six Writings".)

According to Yang,[83] among the 3,500 frequently used Chinese characters examined in their experiment, semantic component characters form the smallest group, accounting for about 5%; pure form component characters account for about 18%; and semantic-form and phonetic-form characters account for about 19%. The largest group is semantic-phonetic characters, accounting for about 58%.



The historical milestones of Chinese character simplification include: [86] [87]

In 1909, Lu Feikui published the article "Vulgar Chinese Characters Should Be Used in General Education" (普通教育當采用俗體字). The May Fourth Movement further promoted Chinese character simplification.

In August 1935, the Ministry of Education of China in Nanjing published the "List of the First Batch of Simplified Chinese Characters" (第一批簡體字表), which included 324 characters.

In January, 1956, the Chinese Character Simplification Scheme was approved by the State Council of China.

In May, 1964, the General list of simplified characters (簡化字總表) was published. A revised version was published in 1986.

In June 2013, the Table of General Standard Chinese Characters was released by the State Council of China. It includes 8,105 characters of the Simplified Chinese writing system. In addition, there are 2,574 corresponding Traditional characters and 1,023 variants.


There are four main sources of simplified characters: [88]

  1. Ancient characters, such as: 云 (雲, cloud), 礼 (禮, etiquette), 后 (後, after)
  2. Simplified Chinese characters popular in the society, such as: 体 (軆, body), 声 (聲, sound), 铁 (鐵, iron).
  3. Cursive regularized characters, for example: 书 (書, book), 为 (爲, for), 东 (東, east).
  4. Newly coined characters, for example: 国 (國, country), 拥 (擁, support), 护 (護, protect).


The main methods of simplifying Chinese characters are as follows.[89][90]

One method is omission, in which some components of the character are dropped, for example:

  • Omit one side, such as 録→录, 號→号, 雲→云, 麗→丽;
  • Omit both sides, such as 術→术, 裏→里;
  • Omit a corner, such as 際→际, 墾→垦;
  • Keep a corner, such as 聲→声, 醫→医;
  • Omit inside, such as 廣→广, 奮→奋;
  • Omit outside, such as 開→开;
  • Omit strokes, such as 減→减, 淨→净, 鹵→卤;
  • Others, such as 匯→汇, 齒→齿, 瘧→疟, 滅→灭.


Another method is to change the form of the original character, for example:

  • Change one or both components of a semantic-phonetic character, such as 驚→惊, 護→护, 響→响, 鐘/鍾→钟.
  • Change to semantic-phonetic characters, such as 竄→窜, 郵→邮, 樁→桩.
  • Change components of multi-semantic characters, such as 塵→尘, 筆→笔.
  • Change to multi-semantic characters, such as 簾→帘, 體→体, 竈→灶.
  • Keep the outline (regularized cursive script), such as 龜→龟, 報→报, 肅→肃, 傘→伞, 齊→齐, 車→车, 堯→尧, 樂→乐, 發→发.
  • Symbolize components, such as 僅→仅, 漢→汉, 鄧→邓, 區→区, 師→师.
  • Simplify radicals, such as 訁→讠 (說話談... → 说话谈...), 飠→饣 (飲饃餓... → 饮馍饿...), 釒→钅 (鋼鐵銅... → 钢铁铜...).
  • Others, such as 舊→旧, 靈→灵, 辦→办.
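Some of the pairs listed above can be collected into a small lookup table, sketched below in Python. In Unicode each Chinese character is a single code point, so even a systematic change such as 訁→讠 cannot be applied by replacing part of a character; the sketch therefore maps whole characters. The table `TRAD_TO_SIMP` and the function `simplify` are hypothetical names used for illustration, and the table holds only examples quoted in this section, so it is nowhere near a complete conversion table.

```python
# A tiny traditional-to-simplified lookup table built only from pairs
# quoted in the text; real conversion tables are far larger.
TRAD_TO_SIMP = {
    "說": "说", "話": "话", "談": "谈",  # radical 訁 → 讠
    "飲": "饮", "饃": "馍", "餓": "饿",  # radical 飠 → 饣
    "鋼": "钢", "鐵": "铁", "銅": "铜",  # radical 釒 → 钅
    "驚": "惊", "護": "护", "響": "响",  # changed components
    "塵": "尘", "筆": "笔", "漢": "汉",  # other form changes
}


def simplify(text: str) -> str:
    """Replace each character that has a table entry; leave others as-is."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


print(simplify("說話"))  # 说话
print(simplify("鋼筆"))  # 钢笔
```

Characters without an entry pass through unchanged, so text that is already simplified, or not Chinese at all, is left as it is.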


Characters can also be simplified by homophonic substitution, in which the whole character is usually replaced with a character of the same or similar sound. For example:

  • 穀→谷 (gǔ), 後→后 (hòu), 幾→几 (jǐ), 闆→板 (bǎn);
  • 隻 (zhī) → 只 (zhǐ); 發 (fā), 髮 (fà) → 发 (fā, fà).
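One consequence of these substitutions is that several traditional characters can share one simplified form, so a simplified character does not always identify a single traditional character. The Python sketch below builds a reverse index from the pairs above to show this. The names `MERGES`, `PREEXISTING`, and `traditional_candidates` are hypothetical, and the `PREEXISTING` set, which lists replacement characters that were already in use in their own right, is an assumption added for illustration.

```python
from collections import defaultdict

# Traditional → simplified pairs quoted in the text for this method.
MERGES = {
    "穀": "谷", "後": "后", "幾": "几", "闆": "板",
    "隻": "只", "發": "发", "髮": "发",
}

# Assumption for illustration: these replacement characters were already
# in use as characters in their own right, so each can also stand for itself.
PREEXISTING = {"谷", "后", "几", "板", "只"}

# Build the reverse index: simplified character → possible traditional forms.
_reverse = defaultdict(list)
for trad, simp in MERGES.items():
    _reverse[simp].append(trad)
for simp in PREEXISTING:
    _reverse[simp].append(simp)


def traditional_candidates(simplified_char: str) -> list:
    """Return every traditional character the given character may stand for."""
    return list(_reverse.get(simplified_char, [simplified_char]))


print(traditional_candidates("发"))  # ['發', '髮']
print(traditional_candidates("后"))  # ['後', '后']
```

A character with more than one candidate cannot be converted back from its form alone; the surrounding words have to be taken into account.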


  1. In some applications, there are smaller configuration units, e.g., stroke segments, turning points, and pixels.[45]


  1. Yin 2007.
  2. Qiu 2000.
  3. Chen 2021.
  4. Su 2014.
  5. Yang 2008.
  6. "What is the most spoken language?".
  7. Arcodia 2021, pp. 62–71.
  8. Zhou 2003.
  9. Qiu 2013, pp. 45–101.
  10. Zhou 1980.
  11. Gao 1985.
  12. Su 2014, pp. 29–30.
  13. Gao 1993.
  14. Zhang 1992.
  15. Su 2014, pp. 19–21.
  16. Peking University 2004, pp. 145–148.
  17. Norman 1988, pp. 74–75.
  18. Norman 1988, p. 74.
  19. Su 2014, pp. 51–52.
  20. Su 2014, p. 47.
  21. 现代汉语常用字表 [List of Frequently Used Characters in Modern Chinese], Ministry of Education of the People's Republic of China, 26 Jan 1988. Archived 2016-11-13 at the Wayback Machine.
  22. 现代汉语通用字表 [List of Commonly Used Characters in Modern Chinese], Ministry of Education of the People's Republic of China, 26 Jan 1988. Archived 2016-11-23 at the Wayback Machine.
  23. 国务院关于公布《通用规范汉字表》的通知. (in Chinese). State Council of the People's Republic of China. 5 June 2013.
  24. Language Institute 2020.
  25. Language Institute 2016.
  26. Norman 1988, p. 73.
  27. National Academy of the Korean Language (1991) Archived March 19, 2016, at the Wayback Machine
  28. '인명용(人名用)' 한자 5761→8142자로 대폭 확대. Chosun Ilbo (in Korean). 2014-10-20. Retrieved 2017-08-23.
  29. Su 2014, p. 51.
  30. Lecture notes for the subject "Modern Chinese Characters and Information Technology", Dept. of Chinese and Bilingual Studies, Hong Kong Polytechnic University, by Dr. Zhang Xiaoheng, June 12, 2017.
  31. "UAX #38: Unicode Han Database (Unihan)".
  32. Su 2014, p. 34.
  33. Su 2014, p. 35.
  34. Chen 1928.
  35. "Chinese Character Frequency Statistics for Hong Kong, Mainland China and Taiwan - A Trans-Regional, Diachronic Survey: 香港、大陸、台灣 - 跨地區、跨年代漢語常用字頻統計".
  36. National Language Commission 2013.
  37. Su 2014, p. 42.
  38. Su 2014, pp. 183–207.
  39. Zhan 2008, pp. 19–24.
  40. Wang 2003, pp. 20–27.
  41. Wang 2003, pp. 27–28.
  42. Wang 2003, pp. 29–31.
  43. Su 2014, pp. 201–202.
  44. Peking University 2004, pp. 148–152.
  45. Zhang 2013.
  46. Su 2014, pp. 74–75.
  47. National Language Commission 1999.
  49. Su 2014, pp. 82–84.
  50. National Language Commission 2009, p. 1.
  51. Su 2014, p. 86.
  52. National Language Commission 2009, p. 2.
  53. Su 2014, p. 94.
  54. National Language Commission 2009a, p. 1.
  55. Li 2013, p. 62.
  56. Zhang 2006.
  57. Peking University 2004, p. 169.
  58. Su 2014, pp. 160–161.
  59. National Language Commission 1985.
  60. Su 2014, pp. 172–175.
  61. Su 2014, p. 176.
  62. Peking University 2004, p. 172.
  63. Zhou 1993.
  64. Su 2014, p. 180.
  65. "Kanji".
  66. Qiu 2013, pp. 210–211.
  67. Yang 2008, p. 169.
  68. Fromkin 1993, p. 41.
  69. Yang 2008, pp. 170–172.
  70. Li 1988, p. 1112.
  71. Qiu 2013, p. 124.
  72. Su 1994, pp. 128–129.
  73. Su 1994, p. 129.
  74. Yang 2008, pp. 173–174.
  75. Li 2013, pp. 122–124.
  76. Qiu 2013, pp. 102–108.
  77. Yin 2007, pp. 97–100.
  78. Su 2014, pp. 102–111.
  79. Yin 2007, p. 98.
  80. Su 2014, pp. 103–105.
  81. Yin 2007, p. 100.
  82. Yin 2007, p. 99.
  83. Yang 2008, p. 147.
  84. Su 2014, pp. 107–108.
  85. Su 2014, p. 109.
  86. Su 2014, pp. 120–126.
  87. Li 2013, pp. 300–302.
  88. Su 2014, p. 127.
  89. Su 2014, pp. 127–128.
  90. Li 2013, p. 304.

Works cited

  • Arcodia, Giorgio; Basciano, Bianca (2021). Chinese Linguistics. Oxford: Oxford University Press. ISBN 978-0-19-884784-7.
  • Chen, Heqin (陳鶴琴) (1928). 語體文應用字彙 (Applied Lexis of Vernacular Chinese) (in Chinese). Beijing: Shangwu (The Commercial Press).
  • Chen, Mao (陳茂仁) (2021). 文字學概論 (Introduction to Chinese Writing) (in Chinese). Taipei: 新學林 (New Xuelin).
  • Fromkin, Victoria; Rodman, Robert (1993). An Introduction to Language (5th ed.). Orlando, USA: Harcourt Brace Jovanovich College Publishers. ISBN 0-03-075379-1.
  • Fu, Yonghe (傅永和) (1994). 规范汉字 (Standardizing Chinese Characters) (in Chinese). Beijing: 语文出版社 (Chinese Language Press).
  • Fu, Yonghe (傅永和) (1999). 中文信息处理 (Chinese Information Processing) (in Chinese) (3rd ed.). Guangzhou: 广东教育出版社 (Guangdong Education Press). ISBN 978-7-5406-4080-4.
  • Gao, Jiaying (高家鶯, 范可育) (1985). "建立现代汉字学刍议 (A brief discussion on the establishment of modern Chinese character studies)". Journal of Shanghai Normal University (上海师范大学学报). 1985 (4).
  • Gao, Jiaying (高家鶯, 范可育, 费锦昌) (1993). 現代漢字學 (Modern Chinese Characters) (in Chinese). Beijing: 高等教育出版社 (Higher Education Press). ISBN 7040040670.
  • Language Institute, Chinese Academy of Social Sciences (2016). 现代汉语词典 (Modern Chinese Dictionary) (in Chinese) (7th ed.). Beijing: Commercial Press. ISBN 978-7-100-12450-8.
  • Language Institute, Chinese Academy of Social Sciences (2020). 新华字典 (Xinhua Dictionary) (in Chinese) (12th ed.). Beijing: Commercial Press. ISBN 978-7-100-17093-2.
  • Li, Dasui (李大遂) (2013). 简明实用汉字学 (Concise and Practical Chinese Characters) (in Chinese) (3rd ed.). Beijing: Peking University Press. ISBN 978-7-301-21958-4.
  • Li, Gongyi (李公宜, 劉如水, eds.) (1988). 漢字信息字典 (Chinese Character Information Dictionary) (in Chinese). Beijing: 科学出版社 (Science Press). ISBN 7-03-000862-6.
  • National Language Commission, Ministry of Education, China (1985). Table of Mandarin Words with Variant Pronunciation (普通話异讀詞審音表) (PDF). Beijing: National Language Commission. Retrieved September 15, 2023.
  • National Language Commission, Ministry of Education, China (1999). GB13000.1字符集汉字字序(笔画序)规范 (Standard of GB13000.1 Character Set Chinese Character Order (Stroke-Based Order)) (PDF) (in Chinese). Shanghai Education Press. ISBN 7-5320-6674-6.
  • National Language Commission, Ministry of Education, China (2009a). Specification of the Undecomposable Characters Commonly Used in the Modern Chinese (现代常用独体字规范) (PDF). Beijing: National Language Commission. Retrieved September 8, 2023.
  • National Language Commission, Ministry of Education, China (2009). Specification of Common Modern Chinese Character Components and Component Names (现代常用字部件及部件名称规范) (PDF). Beijing: National Language Commission. Retrieved September 3, 2023.
  • National Language Commission, Ministry of Education, China (2013). 2012年中國語言生活狀况報告 (Report on Language Life in China 2012) (in Chinese). Beijing: Shangwu (The Commercial Press).
  • Norman, Jerry (1988). Chinese. Cambridge: Cambridge University Press. ISBN 978-0-521-29653-3.
  • Peking University, Modern Chinese Language Teaching and Research Office (2004). Modern Chinese (现代汉语) (in Chinese). Beijing: Commercial Press. ISBN 7-100-00940-5.
  • Qiu, Xigui (2000). Chinese writing. Translated by Gilbert L. Mattos; Jerry Norman. Berkeley: Society for the Study of Early China and The Institute of East Asian Studies, University of California. ISBN 978-1-55729-071-7. (English translation of Wénzìxué Gàiyào 文字學概要, Shangwu, 1988.)
  • Qiu, Xigui (裘锡圭) (2013). 文字学概要 (Chinese Writing) (in Chinese) (2nd ed.). Beijing: 商务印书馆 (Commercial Press). ISBN 978-7-100-09369-9.
  • Su, Peicheng (苏培成) (1994). 现代汉字学纲要 (Essentials of Modern Chinese Characters, Chapter 6) (in Chinese). Beijing: Peking University Press. ISBN 7-301-02597-1.
  • Su, Peicheng (苏培成) (2014). 现代汉字学纲要 (Essentials of Modern Chinese Characters) (in Chinese) (3rd ed.). Beijing: 商务印书馆 (The Commercial Press, Shangwu). ISBN 978-7-100-10440-1.
  • Wang, Ning (王寧,鄒曉麗) (2003). 工具書 (Reference Books) (in Chinese). Hong Kong: 和平圖書有限公司. ISBN 962-238-363-7.
  • Yang, Runlu (杨润陆) (2008). 现代汉字学 (Modern Chinese Characters) (in Chinese). Beijing: Beijing Normal University Press. ISBN 978-7-303-09437-0.
  • Yin, Jiming (殷寄明, 汪如东); et al. (2007). 现代汉语文字学 (Modern Chinese Writing) (in Chinese). Shanghai: 复旦大学出版社 (Fudan University Press). ISBN 978-7-309-05525-2.
  • Zhan, Deyou (詹德优); et al. (2008). 中文工具書使用法 (How to use Chinese reference books) (in Chinese). Beijing: Commercial Press. ISBN 978-7-100-01510-3.
  • Zhang, Jingxian (张静贤) (1992). Modern Chinese Character Tutorial (现代汉字教程) (in Chinese). Beijing: Modern Press.
  • Zhang, Xiaoheng (张小衡) (2006). "The Number, Point and Metric Systems of Font Size (字形的 '号制' '点制' 与 '米制')". Computer Engineering and Applications (计算机工程与应用). 42 (10): 175–177, 215.
  • Zhang, Xiaoheng (张小衡); Li, Xiaotong (李笑通) (2013). 一二三笔顺检字手册 (Handbook of the YES Sorting Method) (in Chinese). Beijing: 语文出版社 (The Language Press). ISBN 978-7-80241-670-3.
  • Zhou, Youguang (周有光) (1980). "现代汉字学发凡 (Introduction to the studies of modern Chinese character)". Language Modernization (语文现代化). Knowledge Press (知识出版社). 2 (1980).
  • Zhou, Youguang (周有光) (1993). "傳聲時代的語言 (Language in the Age of Sound Transmission)". China Education News (中国教育报). 27 October 1993.
  • Zhou, Youguang (2003). The Historical Evolution of Chinese Languages and Scripts. Translated by Zhang Liqing. Columbus: National East Asian Languages Resource Center, Ohio State University. ISBN 978-0-87415-349-1.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.