Friday, April 5, 2013

Disambiguating different senses of a Chinese word

Chinese is a rich language with a long history, and this leads to words that have many different meanings.  That is, the same character or combination of characters has different meanings in different contexts.  There are two cases of this: (a) homonymy, where the words are truly different, and (b) polysemy, where there are multiple related senses of the same word.  Chinese differs a little from languages like English, which use more-or-less phonetic spellings, because the same character may have several different pronunciations.  In fact, 'spelling' does not really apply to Chinese.  Determining the correct meaning of a word in a particular sentence or other context is called word-sense disambiguation.

In this post I will describe how to use the www.chinesenotes.com dictionary to find the different meanings of Chinese words.  Chinese words with the same character(s) may differ in traditional form, pronunciation, sense, and meaning.

Let's look at an example, the character 是 (pinyin: shì), which is one of the most common words in Chinese and most often means 'is.'  There are six different meanings found for this character, as shown in the screenshot below.


Word-sense disambiguation for 是



The character is entered into the search text field (1, in red) and the Search button is clicked, showing the results below.  The simplified and traditional characters are shown (2, 3).  In this case the simplified and traditional forms are the same, but there are often differences, with one simplified character mapping to several traditional characters.  An example is the character 台 (pinyin: tái).  This simplified character is equivalent to the traditional characters 台, 臺, 檯, and 颱, depending on the context.
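The one-to-many relationship between simplified and traditional characters can be sketched in a few lines of Python.  This is only an illustration: the table and the function names are made up for this example and are not the data format or code used by www.chinesenotes.com.

```python
# Hypothetical lookup table for illustration only; it covers just the
# two characters discussed in this post.
SIMPLIFIED_TO_TRADITIONAL = {
    "是": ["是"],
    "台": ["台", "臺", "檯", "颱"],
}


def traditional_candidates(simplified: str) -> list[str]:
    """Return the traditional forms that a simplified character may map to."""
    # Assumption: a character missing from the table is the same in both forms.
    return SIMPLIFIED_TO_TRADITIONAL.get(simplified, [simplified])


def is_ambiguous(simplified: str) -> bool:
    """True when context is needed to pick the right traditional form."""
    return len(traditional_candidates(simplified)) > 1


if __name__ == "__main__":
    for ch in ("是", "台"):
        print(ch, traditional_candidates(ch), is_ambiguous(ch))
```

A character such as 台 is flagged as ambiguous, so a converter would need to look at the surrounding words before choosing a traditional form.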

The Mandarin pronunciation is shown under the Pinyin column (4).  Again, this example has the same pronunciation for each meaning, but this is not always the case.  The character 行, for example, has the possible pronunciations xíng, háng, hàng, and hèng, depending on context.

The English translation of the word is given under the English column (5).  The different English words separated by '/' (6) represent different approximations to the same meaning of the Chinese word.  There is rarely a one-to-one relationship between Chinese and English words, so usually several English words are given.  This also aids in translation by providing the translator with a few choices.  Different lines represent different words.  For example, the first meaning of 是 is 'is' and the second is 'is precisely.'  These are different but related senses.  An example of the first may be: 你好，我是王。(Hello, I am Wang.)  An example of the second may be: A: 我觉得你不是王。(I don't think that you are Wang.)  B: 我是王。(I am Wang.)  If you look at a good Chinese-Chinese dictionary you will probably find about 20 different entries for 是.  However, many of them are even more closely related than these two.

The grammatical function of the word, shown under the Grammar column (7), can provide more information to help disambiguate the word.  For example, the fourth sense of 是 is a pronoun, meaning 'this' or 'that.'  Clearly, this is a totally different meaning.  If you click on the simplified Chinese text, the detailed information for that meaning shows that it is a classical Chinese word.  In fact, it was not until the modern Chinese era that 是 had the meaning 'is.'

Finally, there are notes (8) to help explain and distinguish the meanings and senses.
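To make the columns described above concrete, here is a minimal Python sketch of how the senses of a word could be represented and narrowed down by grammar.  It is not how the www.chinesenotes.com dictionary is implemented.  The Sense class, the function names, and the sample rows are assumptions for illustration; in particular, the 'verb' label on the first two senses is my own guess, and only three of the six senses are included.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    """One row of a search result: a single sense of a word (hypothetical schema)."""
    simplified: str
    traditional: str
    pinyin: str
    english: str      # alternative glosses separated by '/'
    grammar: str
    notes: str = ""


# Sample rows loosely based on the senses of 是 discussed in this post.
# The grammar label 'verb' is an assumption made for this sketch.
SENSES = [
    Sense("是", "是", "shì", "is", "verb"),
    Sense("是", "是", "shì", "is precisely", "verb"),
    Sense("是", "是", "shì", "this / that", "pronoun", "classical Chinese"),
]


def lookup(word: str) -> list[Sense]:
    """Return every sense whose simplified or traditional form matches."""
    return [s for s in SENSES if word in (s.simplified, s.traditional)]


def filter_by_grammar(word: str, grammar: str) -> list[Sense]:
    """Narrow the candidate senses using the part of speech."""
    return [s for s in lookup(word) if s.grammar == grammar]


def english_glosses(sense: Sense) -> list[str]:
    """Split the '/'-separated English field into individual glosses."""
    return [g.strip() for g in sense.english.split("/")]


if __name__ == "__main__":
    for sense in filter_by_grammar("是", "pronoun"):
        print(sense.pinyin, english_glosses(sense), sense.notes)
```

Knowing from the sentence that 是 is being used as a pronoun leaves a single candidate here.  The notes field then plays the same role as the notes column: it explains why that sense differs from the others.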

Thursday, April 4, 2013

Finding useful information about a Chinese word

The www.chinesenotes.com dictionary has lots of useful information about Chinese words.  In this post I am going to explain some of the basic information returned.  Even though this information is basic, it is not in any other online Chinese-English dictionary that I am aware of.

Let's look at an example.  Try looking up the Chinese word 句子 (pinyin: jùzi, English: sentence) in basic mode.  A screenshot is shown below.

Screenshot of result returned from a simple search

The input is shown near the number 1 (in red).  The result shown in large type is the simplified Chinese text.  The traditional text is also shown (2).  In this case no separate traditional text is shown, which means that the traditional form is the same as the simplified.  In other words, 句子 is the same in both simplified and traditional text.

The grammar (3) is shown beneath the traditional text.  The grammar text is a hyperlink that leads to more detail about the word's part of speech.  In this case the word is a noun, and that information may be enough for you.  The link is more useful when the part of speech has no equivalent in English grammar.


Synonyms (4) are shown below the grammar.  In this case, there is one synonym (语句).  Each synonym is a hyperlink, so you can go directly to its entry.  For words with many different meanings, the synonym relationships are not as obvious.  For these words the synonyms, if there are any, are described in the notes.

The topic (5) is the general area of vocabulary that the word belongs to.  In this case the word relates to the 'Language' topic.  The topic can help differentiate the meaning of a word and the context that it is most likely to be used in.  This word is most likely to be used when we are talking about language.

A measure word is similar to a classifier in English, but measure words are not optional in Modern Chinese.  They must be used in front of nouns, and the correct measure word must be used.  The appropriate measure word is noted in (6) with a hyperlink to the definition.

An example sentence is given in (7).  This can be very useful if you are trying to use this word in your own Chinese text composition.