Tangut
Tangut Language
According to Shi (2020), the Western Xia was a significant imperial dynasty in medieval China, with a population predominantly composed of the Dangxiang-Qiangic people. It was ruled by ten emperors over the course of its 190-year recorded history. The early phase of Tangut history was characterized by a standoff with Northern Song and Liao, and the later phase by a standoff with Southern Song and Jin. In each case, the three powers formed a three-kingdom dynamic. Western Xia held a large and stably controlled territory, maintained detailed institutional laws, and is known for both its cultural prosperity and its largely successful military campaigns. The Tangut people created their own script for their language, both later known as Tangut, which made possible the wealth of historical and cultural records of Western Xia that we have today. Eventually the power of the Tangut people was destroyed by the Mongols, and the Dangxiang ethnic group was forgotten by history during the Yuan and Ming dynasties, so Tangut became an extinct language and the texts written in Tangut disappeared.
Through phonological, lexical, and morphological comparisons by Lai et al. (2020), it has been shown that Tangut should be classified as a West Gyalrongic language, making it the only documented medieval Gyalrongic language. In terms of phonology, Tangut shares the bilabial reflex found in East Gyalrongic and exhibits innovative changes and omissions of preinitials not found in East Gyalrongic languages. Lexical comparisons reveal that common bisyllabic words not attested in East Gyalrongic languages provide strong evidence for the subgrouping of Tangut in West Gyalrongic, with eight examples identified. Conversely, no common bisyllabic words are exclusively shared by East Gyalrongic and Tangut. Morphologically, three pieces of evidence support the placement of Tangut within West Gyalrongic, and syntactically, Tangut shares innovative case and modal markers with West Gyalrongic.
Tangut Script
Tangut script is a functionally ideographic script, in which meaning is strongly involved in the correspondence between language and script. The syllables in the table are romanized following the reconstruction in Gong (2020), and the meanings are from Jiǎnmíng Xià-Hàn Zìdiǎn, digitized by CCAMC.
| Syll. | Character | Abstract Shape | Meanings |
|---|---|---|---|
| luˣ | | | stove |
| | | | warehouse |
| | | | to play |
| | | | moist |
| lǔˣ | | | five |
| | | | rope |
| | | | holy |
| | | | skilful |
The table above shows that Tangut is also cognitively ideographic. For example, in ‘warehouse’ and ‘moist’, the upper component is the semantic component (‘to hide’ and ‘water’ respectively) and the lower component is the phonetic component. In ‘rope’, both the left and right components are semantic components.
Using the cognition described in canonical Tangut literature such as Yīntóng and Wénhǎi, it is easy to make connections to Tangut characters, identify referential and constructive relationships, and generalize the productive phonetic components and semantic components to derive the abstract shape (see the table above for expressions) of each Tangut character. Tangut characters with the same abstract shape are thus recognized as unifiable.
Tangut Character Set
In May 2007, Richard Cook submitted the first proposal for encoding the Tangut script to WG2 (WG2 N3297). Its code charts summarized the repertoires of Tóngyīn Yánjiū (as W-source), Xià-Hàn Zìdiǎn (as X-source) and Xīxiàwén Zhèngzì Yánjiū (as Y-source), and added a column of Mojikyō glyphs for the unified repertoire (as Z-source).
These multi-column code charts were subsequently extended with additional literature (WG2 N4522), including: Seikabun Shōjiten, Грамматика Тангутского Языка, Wénhǎi Yánjiū, Словарь Тангутского (Си Ся) Языка, “Wǔyīn Qiēyùn yǔ Wénhǎi Bǎoyùn Bǐjiào Yánjiū” and Xià-Hàn Zìdiǎn. The extended code charts contain 6,126 lines: the first line corresponds to the Tangut iteration mark (U+16FE0), and the remaining 6,125 lines correspond to 6,125 characters (U+17000..U+187EC), which form the Tangut character set in Unicode 9.0.
In addition, five Tangut characters (U+187ED..U+187F1) proposed in WG2 N4724 are encoded in Unicode 11.0, six Tangut characters (U+187F2..U+187F7) proposed in WG2 N4896 are encoded in Unicode 12.0, and nine Tangut characters (U+18D00..U+18D08) proposed in WG2 N5064 are encoded in Unicode 13.0.
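The counts quoted above can be cross-checked directly from the code point ranges. The following is a minimal Python sketch; the names `TANGUT_CHARACTER_RANGES`, `range_size` and `counts_by_version` are illustrative and not taken from any proposal, and the ranges are copied from the text rather than read from the Unicode Character Database.

```python
# Tangut character ranges as listed in the text:
# (first code point, last code point, Unicode version of encoding).
TANGUT_CHARACTER_RANGES = [
    (0x17000, 0x187EC, "9.0"),
    (0x187ED, 0x187F1, "11.0"),
    (0x187F2, 0x187F7, "12.0"),
    (0x18D00, 0x18D08, "13.0"),
]


def range_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


def counts_by_version(ranges):
    """Map each Unicode version to the number of characters it added."""
    return {version: range_size(first, last) for first, last, version in ranges}


counts = counts_by_version(TANGUT_CHARACTER_RANGES)
total = sum(counts.values())

for version, n in counts.items():
    print(f"Unicode {version}: {n} characters")
print(f"Total through Unicode 13.0: {total}")
```

Under these ranges, the sizes come out to 6,125, 5, 6 and 9, matching the figures in the text, for a total of 6,145 Tangut characters through Unicode 13.0.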
For Tangut components, in September 2008 Michael Everson and Andrew West listed the indexing components from nine documents and summarized them into 802 characters, proposed in WG2 N3495. Subsequently, by further considering the components used in the IDS for Tangut characters, this list was reorganized into 753 components, proposed in WG2 N4636, with another two components proposed in WG2 N4667. The 755 proposed components (U+18800..U+18AF2) thus form the Tangut component character set in Unicode 9.0.
In addition, seven Tangut components (U+18AF3..U+18AF9) proposed in WG2 N4957 and six Tangut components (U+18AFA..U+18AFF) proposed in WG2 N5064 are encoded in Unicode 13.0.
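As a rough illustration of how these ranges might be used in practice, the sketch below classifies each code point in a string as a Tangut character, a Tangut component, the iteration mark, or something else. It assumes only the ranges stated in the text (through Unicode 13.0); the function names `classify` and `summarize` are hypothetical, and a production tool would more likely consult the Unicode Character Database.

```python
from collections import Counter

# Ranges taken from the text, merged where they are contiguous.
TANGUT_ITERATION_MARK = 0x16FE0
CHARACTER_RANGES = [(0x17000, 0x187F7), (0x18D00, 0x18D08)]
COMPONENT_RANGES = [(0x18800, 0x18AFF)]  # 755 + 7 + 6 = 768 components


def in_ranges(cp, ranges):
    """True if code point cp falls inside any inclusive (first, last) range."""
    return any(first <= cp <= last for first, last in ranges)


def classify(ch):
    """Classify a single character by the Tangut ranges listed above."""
    cp = ord(ch)
    if cp == TANGUT_ITERATION_MARK:
        return "iteration mark"
    if in_ranges(cp, CHARACTER_RANGES):
        return "character"
    if in_ranges(cp, COMPONENT_RANGES):
        return "component"
    return "other"


def summarize(text):
    """Count how many code points of each class occur in text."""
    return Counter(classify(ch) for ch in text)


sample = "\U00017000\U00018800\U00016FE0A"
print(summarize(sample))
```

Keeping characters and components in separate range lists mirrors the way the two sets were proposed and encoded separately.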
Typefaces
Before the Tangut script was encoded, a number of Tangut fonts already existed, such as the fonts designed by Jing Yongshi, Han Xiaomang, Liu Changqing, Gong Huangcheng & Lin Yingjin, and the Mojikyō project. The common feature of these fonts is that they map Tangut glyphs to CJK codepoints, so when using them in word processing applications, it is necessary to constantly switch fonts or the text will appear as CJK characters. In addition, these fonts are not compatible with each other, and if the fonts are not set correctly, the wrong characters will be displayed.
Below are two Tangut typefaces that I was involved in developing or am familiar with. The first is stylistically aligned with Noto Serif CJK SC, and the second is stylistically aligned with Ryūmin.
Noto Serif Tangut
This typeface is the product of the redesign project for Noto Serif Tangut. Since the initial version of Noto Serif Tangut maintained the same stroke characteristics as Noto Serif CJK SC, the redesign did not consider changing the style.
Matching Design. The typeface has been designed to fit Noto Serif CJK SC on a deeper level. On the one hand, the face ratio (the ratio of the face width to the body width) of the glyphs in Noto Serif Tangut has been readjusted based on the face ratio of Noto Serif CJK SC, which in turn adjusts the size and position of the space occupied by the glyphs. On the other hand, based on the second center line distance (the distance between the center lines of the primary components) measured from Noto Serif CJK SC, the relative size and position of the components in Noto Serif Tangut have been readjusted to ensure that the glyphs are of the same height and visual size.
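The face ratio defined above is a simple quotient, so the adjustment can be sketched numerically. The Python snippet below is only an illustration: the function names and all numeric values are hypothetical, and it does not reproduce the design team's actual workflow or measurements.

```python
def face_ratio(face_width, body_width):
    """Ratio of the face width to the body width, as defined in the text."""
    if body_width <= 0:
        raise ValueError("body_width must be positive")
    return face_width / body_width


def scale_to_match(face_width, body_width, target_ratio):
    """Uniform scale factor that brings a glyph's face ratio to target_ratio.

    Assumption for illustration: the glyph is scaled about its center while
    the body width stays fixed.
    """
    return target_ratio / face_ratio(face_width, body_width)


# Hypothetical values in font units (1000 units per em).
body = 1000
tangut_face = 900          # assumed current face width of a Tangut glyph
reference_ratio = 0.945    # assumed face ratio measured from the reference font

print(face_ratio(tangut_face, body))
print(scale_to_match(tangut_face, body, reference_ratio))
```

With these assumed numbers, the glyph would need to be enlarged by about 5% for its face ratio to match the reference.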
Glyph Correction. Based on the references, the missing strokes and components are added, incorrect stroke avoidance is modified, incorrect components are corrected, and incorrect structures are adjusted. In addition, the design team incorporated suggestions from the UTC and WG2 proposals, as well as ongoing feedback from the Jia Changye team and other scholars.
Visual Optimization. Drawing on experience from Han ideograph type design, the design team adjusted the relations between strokes to give them more room and make the glyphs more spacious. The white space of Tangut characters is adjusted so that the color of the glyphs is uniform and consistent. In addition, the relative positions and relative lengths of some strokes are adjusted to ensure the legibility of slanted strokes while keeping the correct cognition.
Innovative Design. According to the Tangut literature, the Tangut script has unique components and a unique structure compared to the Han ideographs, which is reflected in the type design. In addition, for some components that do not have a corresponding stroke shape in the Songti style, the design team made several revisions to the glyphs to finally decide on the appropriate design.
AraTangut
This typeface is a custom typeface designed for the paper “On the Environment in which a ‘Dot’ Appears in Tangut Characters and on its Function” written by Shintaro Arakawa. In the paper, Ryūmin R-KL and Midashi Go MB31 were chosen as the CJK typefaces, Adobe Caslon Pro for Latin, and Times New Roman for Cyrillic. Therefore, this typeface matches the design of Ryūmin.
Special Shapes. While keeping the thickness and straightness of the strokes consistent with Ryūmin Regular, this typeface adapts the characteristics of the strokes to Tangut calligraphy. For example, the in-strokes (qǐbǐ, the beginning part of a stroke) and out-strokes (shōubǐ, the ending part of a stroke) of hidariharai (piě, throw), migiharai (nà, press) and L-gata (shùwāngōu, vertical curve hook) are not inherited from Minchō style, but from Tangut calligraphy.
Special Strokes. This typeface carries out special design for the unique strokes in Tangut script, including hidariharai-orihane (piědiǎngōu, throw dot hook), ro-gata (héngzhéwāngōu, horizontal vertical curve hook), ku-gata (piěnà, throw press) and fufufu-gata (héngzhézhézhézhépiě, horizontal vertical horizontal vertical horizontal throw).
Special Components. Since the paper requires the presentation of the same component (corresponding to the same codepoint) in different relative positions and avoidance situations, this typeface provides a rich set of Tangut hen (piān, left component), aida (jiān, middle component), tsukuri (páng, right component) and kanmuri (guān, upper component).
This typeface creates a unique and ethnic flavor between Minchō and Tangut calligraphy, and works well and harmoniously with Kanji and Kana in Japanese typesetting environments. It arranges a variety of diagonal strokes evenly, contributing to a balanced white space and improving the reading experience. AraTangut contributes to the revitalization of language studies in Japan and encourages the production of typefaces for other scripts.