Speak Good Chinese
Wordlists

If you want to look up words:

MDBG Chinese-English dictionary

To assist you and your students, we provide wordlists taken from some common Chinese learning methods. You can open them via the Open List button on the Settings page. The wordlists will be installed in the wordlists directory.

Currently, on OS X and Linux the wordlists should have the .sgc extension for SpeakGoodChinese. If your browser renames the extension to .zip, rename the files so that they have the .sgc extension again. Wordlists with the .zip extension will be loaded, but this feature is still experimental.
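If a browser has renamed several downloads at once, the renaming can be scripted. The following is only a minimal sketch in Python, not part of SpeakGoodChinese. The function name restore_sgc_extension is hypothetical, and the sketch assumes the folder you point it at contains nothing but downloaded wordlists, because it renames every .zip file it finds there.

```python
from pathlib import Path


def restore_sgc_extension(directory):
    """Rename every *.zip file in `directory` to *.sgc and return the new paths.

    Hypothetical helper: it assumes all .zip files in the folder are wordlists.
    """
    renamed = []
    for path in sorted(Path(directory).glob("*.zip")):
        target = path.with_suffix(".sgc")
        if target.exists():
            # Never overwrite a wordlist that already has the right name.
            continue
        path.rename(target)
        renamed.append(target)
    return renamed
```

For example, restore_sgc_extension("Downloads/wordlists") would turn a file named name.zip in that folder into name.sgc; the folder path here is only an illustration.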

On Windows, you should unpack the word lists (.sgc or .zip) into a folder with the name of the wordlist. You can then install them either by name or by opening an audio file (e.g., name.wav), wordlist.txt, or wordlist.Table inside that folder. (This will work on Mac and Linux too.)

Plain text wordlists (e.g., name.txt) or full tables (e.g., name.Table or name.tsv) can also directly be opened using the 'Open List' button on the Settings page.

How to create word lists

To load, or read, a new word list go to the Settings page and click on the Open List button. You can then select the file that contains the word list. It will automatically open and be stored in your local word list directory.

Very simple word lists

The simplest way to create a word list is to create a pure text file (.txt, or DOS file). The file should have the name of the word list and the extension should be .txt. Each line should contain a single pinyin word.

Example: SGC2example1.txt
chi1
ta1
jin1tian1
can1ting1
huan1ying2
zhong1guo2
duo1shao3
zhi3you3

Simple word lists with characters and translations

Characters and translations can be added to the simple, text-based word list described above. Again, create a pure text file (.txt, or DOS file) with the .txt extension. Each line should start with a pinyin word. Follow the pinyin with a tab or ';' character and the characters that correspond to the pinyin, then again a tab or ';' character and the translation in free text.

Example: SGC2example2.txt
chi1;吃;to eat
ta1;他;he, she, it
jin1tian1;今天;today
can1ting1;餐厅;restaurant
huan1ying2;欢迎;to welcome
zhong1guo2;中国;China
duo1shao3;多少;how many
zhi3you3;只有;only

Audio files can be added in the same manner. Just add an item with the extension of an audio file to the line (between tabs or ';' characters). For instance, chi1.mp3 will be interpreted as an audio example.

These simple word lists do not contain information about the order and nature of the items. There can be errors in distinguishing characters, translations, and audio examples.
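To make the source of such errors concrete, here is a minimal sketch in Python of how a line could be split and its items guessed. It is not SpeakGoodChinese's actual code. The function name parse_simple_line, the classification rules (audio by file extension, characters by the presence of a CJK ideograph, everything else as translation), and the use of '-' for a missing item are assumptions made for illustration only.

```python
import re

# Audio extensions named on this page; the matching rule itself is an assumption.
AUDIO_EXTENSIONS = (".wav", ".flac", ".mp3", ".spx")
# Matches any character in the basic CJK Unified Ideographs block.
CJK = re.compile(r"[\u4e00-\u9fff]")


def parse_simple_line(line):
    """Split one word list line on tabs or ';' and guess what each item is."""
    fields = [field.strip() for field in re.split(r"[\t;]", line.strip())]
    entry = {"Pinyin": fields[0], "Character": "-", "Sound": "-", "Translation": "-"}
    for item in fields[1:]:
        if not item:
            continue
        if item.lower().endswith(AUDIO_EXTENSIONS):
            entry["Sound"] = item
        elif CJK.search(item):
            entry["Character"] = item
        else:
            # Anything that is neither audio nor CJK is taken as the translation.
            entry["Translation"] = item
    return entry
```

With rules like these, a translation that itself contains a Chinese character, or one that happens to end in ".mp3", would be put in the wrong slot. A table with named columns avoids this kind of guessing.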

Word list tables

Simple word lists based on text files can lead to errors in the values of the items. Large lists with both characters and translations should be constructed as tab-separated-values (tsv) tables. These can be exported (Save as) from a spreadsheet or database program. The file extension of such a file should be .Table or .txt.

A SpeakGoodChinese wordlist table is a tab-separated-values table that starts with a header line which contains the column headers Pinyin, Character, Sound, Translation. Then the column values are written on a line with tabs separating them.

Example: SGC2example3.tsv

Pinyin	Character	Sound	Translation
chi1	吃	-	to eat
ta1	他	-	he, she, it
jin1tian1	今天	-	today
can1ting1	餐厅	-	restaurant
huan1ying2	欢迎	-	to welcome
zhong1guo2	中国	-	China
duo1shao3	多少	-	how many
zhi3you3	只有	-	only
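If you prefer to generate such a table with a script instead of a spreadsheet, the following Python sketch shows one way to do it. The function name format_wordlist_table is hypothetical. The sketch assumes that '-' marks an empty cell, as in the Sound column of the example, and that stray tabs inside a value should be replaced by spaces so they cannot shift the columns.

```python
HEADER = ["Pinyin", "Character", "Sound", "Translation"]


def format_wordlist_table(rows):
    """Return the text of a tab-separated wordlist table.

    `rows` is a list of dicts keyed by the column headers; missing or empty
    values are written as '-' (an assumption based on the example table).
    """
    lines = ["\t".join(HEADER)]
    for row in rows:
        values = [(row.get(column) or "-").replace("\t", " ") for column in HEADER]
        lines.append("\t".join(values))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    words = [
        {"Pinyin": "chi1", "Character": "吃", "Translation": "to eat"},
        {"Pinyin": "jin1tian1", "Character": "今天", "Translation": "today"},
    ]
    # Save with UTF-8 encoding so the characters survive.
    with open("wordlist.Table", "w", encoding="utf-8", newline="") as handle:
        handle.write(format_wordlist_table(words))
```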

Word lists with associated audio files

These word list distributions are simple ZIP files with the name <list name>.sgc. They contain a list of all the words in pinyin, in one of the formats discussed above. The word list file inside the archive must be named either wordlist.txt or wordlist.Table. Other names are not allowed.

The sound files should be named <pinyin word>.ext, where <pinyin word> is the pinyin transcription (e.g., sheng1zi4) and ext is the sound file extension (e.g., wav). Note that SpeakGoodChinese uses Praat to process the sound files, so only those sound files recognized by Praat can be used (see Praat: Read from file...). SpeakGoodChinese will use WAV (.wav), Flac (.flac), MP3 (.mp3), and Speex (.spx) files as examples if they are present.

Don't forget to include a LICENSE.txt file with the copyright and licensing information. If you use one of the Creative Commons licenses or the GNU GPL, we might be willing to put your list on our website.
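Packaging a prepared folder into an .sgc file can be done with any ZIP tool. As an illustration, here is a minimal Python sketch using the standard zipfile module. The function name build_sgc is hypothetical, and the sketch assumes that all files sit at the top level of the archive, which this page does not state explicitly.

```python
import zipfile
from pathlib import Path

AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".spx"}
LIST_NAMES = ("wordlist.txt", "wordlist.Table")


def build_sgc(source_dir, list_name):
    """Zip the word list, LICENSE.txt and audio files from `source_dir`.

    Writes <list_name>.sgc next to `source_dir` and returns its path.
    Assumption: files are stored at the top level of the archive.
    """
    source = Path(source_dir)
    if not any((source / name).is_file() for name in LIST_NAMES):
        raise FileNotFoundError(
            "expected wordlist.txt or wordlist.Table in " + str(source)
        )
    target = source.parent / (list_name + ".sgc")
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.iterdir()):
            wanted = (
                path.name in LIST_NAMES
                or path.name == "LICENSE.txt"
                or path.suffix.lower() in AUDIO_EXTENSIONS
            )
            if path.is_file() and wanted:
                archive.write(path, arcname=path.name)
    return target
```

Files that are neither the word list, the license, nor a supported audio type are left out of the archive. The sketch does not check that each audio file name matches a pinyin word in the list.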

It is possible to use SpeakGoodChinese to record example audio files for wordlists. See the Articles page for instructions. The recorded files have file names that start with the intended pinyin spelling of the word, followed by other details (e.g., a time stamp). Select the recordings you consider good enough as examples and rename them to <pinyin>.wav (replace <pinyin> with the target word, e.g., ni3hao3). Copy them to the target wordlist directory for inclusion in the distribution.