The Kanjimori Database
About the Database
The Kanjimori database contains composition data for thousands of Chinese characters with an emphasis on those used in Japanese (Kanji). It's a "living database," meaning it's constantly being updated and refined as new evidence for a character's origin emerges. Check out the database report below for insights based on our data!
Database Report
Sources
The Kanjimori database is in many ways a culmination of community efforts to better understand the origins of Chinese characters. Some key sources used in constructing the database include:
Kanji Composition
- Kadokawa Shin Jigen (角川新字源) – Kadokawa Shin Jigen is a Japanese kanji dictionary that includes historical forms and likely origins for each character. This book was an important reference for kanji origins, particularly as a source of perspectives on less well-understood kanji. The authors include Tamaki Ogawa (小川 環樹), Taichirō Nishida (西田 太一郎), Tadashi Akatsuka (赤塚 忠), Tetsuji Atsuji (阿辻 哲次), Takeshi Kamatani (釜谷 武志), and Yūko Kizu (木津 祐子).
- Kanji no Taikei (漢字の体系) – Kanji no Taikei is a modernized Japanese kanji dictionary written by Shizuka Shirakawa (白川 静) that organizes characters by shared components. This book was a helpful resource for identifying overlooked Japanese-relevant characters that contain a given component. Kanji entries include historical forms as well as brief explanations of character origins. We highly recommend this book, as it embodies the same philosophy of learning kanji through composition that Kanjimori advocates.
- Wiktionary – The Wiktionary team has made phenomenal progress towards documenting the composition and origin of Chinese characters. Their entries were often the starting point for our investigations. Key Wiktionary contributors for Chinese character data include .
- Hanziyuan – Hanziyuan is a site cataloguing historical forms of Chinese characters. It's a convenient reference for seeing character evolution over time and was often used to confirm that historical components did indeed exist in some version of a character. Key contributors to Hanziyuan include Richard Sears, Ann Wu, and Dixin Yan.
- Zdic – Zdic is an online Chinese dictionary that includes historical and variant forms for many characters. As with Hanziyuan, Zdic is a great resource for reconstructing character evolution over time.
Words & Definitions
- KANJIDIC – KANJIDIC is a kanji database that was referenced for many kanji properties including stroke counts, readings, and meanings. The database was created by Jim Breen and is currently maintained by the Electronic Dictionary Research and Development Group (EDRDG).
- JMdictDB – JMdictDB is a Japanese dictionary database and is the source for most word definitions on Kanjimori. Like KANJIDIC, JMdictDB is the work of Jim Breen and is maintained by the Electronic Dictionary Research and Development Group (EDRDG).
We sincerely thank the authors and contributors of these sources for their help in making Kanjimori possible!
Public Dataset
Public distributions of the data are currently unavailable while Kanjimori is under development, but the data will be made public with regular releases in the near future. Check back later for updates!