


Official Kanji Count Increasing Due To Electronics 284

JoshuaInNippon writes "Those who have studied Japanese know how imposing kanji, or Chinese characters, can be in learning the language. There is an official list of 1,945 characters that one is expected to understand to graduate from a Japanese high school or be considered fluent. For the first time in 29 years, that list is set to change, increasing by nearly 10% to 2,136 characters. 196 are being added, and five deleted. The added characters are ones believed to be common in everyday use, but considered harder to write by hand and therefore overlooked in previous editions of the official list. Japanese officials seem to have recognized that with the advent and spread of computers in daily life, writing in Japanese has been simplified dramatically. Changing the phonetic spelling of a word to its correct kanji only requires a couple of presses of a button, rather than memorizing an elaborate series of brush strokes. At the same time, the barrage of words that people see has increased, thereby increasing the necessity to understand them. Computers have simplified the task of writing in Japanese, but have inadvertently complicated the lives of Japanese language learners. (If you read Japanese and are interested in more details on specific changes, has some information!)"
This discussion has been archived. No new comments can be posted.


  • by RingDev ( 879105 ) on Wednesday June 09, 2010 @04:54PM (#32515962) Homepage Journal

    Is it just me, or is having your language based on a character set that requires computer rendering for most people to be able to communicate clearly somewhat asinine?

    No disrespect to those that practice the art of cartography, but for day to day communication... wow.


  • by Monkeedude1212 ( 1560403 ) on Wednesday June 09, 2010 @05:09PM (#32516164) Journal

    Or you could just, you know, use English to utterly butcher the representation of any foreign word

    I think the Scots are worse for it. Have you ever heard them say Edinburgh?

  • Re:UTF-8 (Score:5, Interesting)

    by JustinOpinion ( 1246824 ) on Wednesday June 09, 2010 @05:13PM (#32516206)
    The usual explanation given is that people were injecting Unicode characters as part of trolling attempts to break Slashdot's layout. So trolls were doing things like using right-to-left control characters to spoof their comment score. See this comment, which explains the situation and links to some examples. Slashdot reacted by blocking anything not in the basic character set.

    Frankly this is an unsatisfying answer. Or rather an unsatisfying solution. It seems like it wouldn't take that long for a developer to go through some of the Unicode set and build a whitelist and/or blacklist that was comprehensive enough to allow us geeks to use useful symbols (currency, micro, Greek letters, etc.) without allowing damaging characters.

    It seems like many of Slashdot's anti-trolling features (e.g. trying to prevent allcaps or ASCII art) are somewhat misguided. Nowadays the moderation is pretty good, such that troll comments are basically buried. You may as well let regular posters with good karma post in caps or use ASCII art if that's what their post requires (e.g. posting some calculations that use lots of symbols and few words ends up being flagged unnecessarily).

    All that to say that Slashdot could presumably fix these things, but apparently they have little interest in doing so.
  • by dbet ( 1607261 ) on Wednesday June 09, 2010 @05:17PM (#32516280)
    Hmm, how "official" are Merriam and Webster? Or any dictionary? These are guides that help people understand new words; they're not necessarily the boss of the English language. OTOH, what the Cultural Center is doing with Kanji does seem somewhat official.
  • by angus77 ( 1520151 ) on Wednesday June 09, 2010 @05:36PM (#32516524)

    To be pedantic, Hiragana and Katakana glyphs are the equivalent of English syllables.

    To be extra pedantic, they're not necessarily syllables, but morae.

    For example, "o" is a one-mora syllable on its own, whereas "oo" is also one syllable, but containing two morae (two beats to one syllable). "Oto" would then be both two morae and two syllables.
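The mora counts above can be checked with a rough sketch. It assumes the word is written entirely in kana and that every kana contributes one mora except the small glide and vowel kana, which merge with the kana before them. The `SMALL_KANA` set and the `count_morae` helper are illustrative names, and the set is not claimed to be exhaustive.

```python
# Small kana that attach to the preceding kana and add no mora of their own
# (for example the small "yo" in "kyo"). Illustrative, not exhaustive.
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def count_morae(kana: str) -> int:
    """Count morae in a string assumed to be written entirely in kana.

    The sokuon (small tsu), the moraic n, and the long-vowel mark each
    count as one mora, so they are deliberately not in SMALL_KANA.
    """
    return sum(1 for ch in kana if ch not in SMALL_KANA)


print(count_morae("お"))    # "o"
print(count_morae("おお"))  # "oo"
print(count_morae("おと"))  # "oto"
```

Counting syllables is harder, since it needs to know which adjacent morae group together; this sketch only counts beats.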

  • WTF (Score:5, Interesting)

    by NemosomeN ( 670035 ) on Wednesday June 09, 2010 @06:07PM (#32516930) Journal
    Ok, the characters listed aren't difficult, or uncommon, they just aren't "official." The real issue here is, why the hell does have more features than Click an external link, and there's an interstitial offering a direct, Google cache, and web archive (Way Back Machine) link. Seriously, bring this to .org. And add Coral cache to both, I know it's got an l AND an r in it, but it could still benefit .jp.
  • Re:UTF-8 (Score:3, Interesting)

    by KiloByte ( 825081 ) on Wednesday June 09, 2010 @06:09PM (#32516950)

    There's absolutely no reason to not allow every single printable character, perhaps excluding RTL or combining chars if you're paranoid. A white/blacklist made by hand would be counterproductive; character classification functions are there for a reason.
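As a rough illustration of the classification approach, here is a minimal sketch using Python's standard `unicodedata` module. The `sanitize` function, its `paranoid` flag, and the choice of rejected categories are assumptions made for this example; nothing here reflects how Slashcode actually works.

```python
import unicodedata

# General categories dropped in this sketch: control (Cc), format (Cf, which
# includes the bidi embedding/override controls), surrogate (Cs),
# private use (Co), and unassigned (Cn).
REJECTED = {"Cc", "Cf", "Cs", "Co", "Cn"}
# Combining marks, dropped only when paranoid=True.
COMBINING = {"Mn", "Mc", "Me"}


def sanitize(text: str, paranoid: bool = False) -> str:
    """Keep printable characters, deciding by Unicode general category."""
    kept = []
    for ch in text:
        if ch in "\n\t":
            # Newline and tab are category Cc but harmless in a comment body.
            kept.append(ch)
            continue
        category = unicodedata.category(ch)
        if category in REJECTED:
            continue
        if paranoid and category in COMBINING:
            continue
        kept.append(ch)
    return "".join(kept)


print(sanitize("5 \u00b5m costs \u20ac3 \u202eevil"))
```

One caveat with this choice: category Cf also contains the zero-width joiner and non-joiner, which some scripts need for correct rendering, so a production filter would probably want a narrower rule than "drop all of Cf".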

  • by Yuan-Lung ( 582630 ) on Wednesday June 09, 2010 @06:20PM (#32517110)

    Not really, Kanji have "ON" and "KUN" readings. One is for full words, others is to mix with other kanjis and make other words. Forgot which is which, but in many cases kanji can serve the same use as kana.

    On'yomi is the original pronunciation of the Chinese character, usually used for proper names and nouns. Kun'yomi is when the character is retrofitted into a Japanese word, usually used as verbs. They don't really 'serve the same use as kana'. Using the proper kanji instead of spelling it out with kana provides more definition, but hides the pronunciation.

  • And it is an epic fail, that this retarded excuse is used.
    The characters that cause such things are a well-known set. Like the control (<32) characters in ASCII.
    If you filter them, you’re good.
    And if you are smart, you can even check for RTL/LTR/etc characters, and add a character to the end that fixes it. Or do it like a pro, and just force LTR via CSS for the element surrounding UTF-8 user input. So people can comment in RTL languages too.

    There. Done.

    That lame excuse only works on non-professionals. If you can’t handle UTF-8 you’re not one.
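The "add a character to the end that fixes it" idea could look roughly like the sketch below. It tracks the explicit bidi openers in a comment and appends whatever closers are still owed, so a stray override cannot leak into the rest of the page. The `close_bidi` name is hypothetical, and the matching is deliberately simpler than the full Unicode Bidirectional Algorithm: unmatched closers are ignored, and a PDI does not implicitly close embeddings opened inside an isolate.

```python
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING, closes an embedding or override
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE, closes an isolate

EMBEDDINGS = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO
ISOLATES = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI


def close_bidi(text: str) -> str:
    """Append the closers needed to terminate any open bidi controls."""
    owed = []  # closers still owed, innermost last
    for ch in text:
        if ch in EMBEDDINGS:
            owed.append(PDF)
        elif ch in ISOLATES:
            owed.append(PDI)
        elif ch in (PDF, PDI) and owed and owed[-1] == ch:
            owed.pop()
    # Close the innermost opener first.
    return text + "".join(reversed(owed))


fixed = close_bidi("Score:5 \u202eevil")
print(fixed.endswith(PDF))
```

Forcing the direction in CSS on the element that wraps user input, as the comment suggests, addresses the same leak from the outside; the two approaches can be combined.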

  • by Goaway ( 82658 ) on Wednesday June 09, 2010 @06:41PM (#32517350) Homepage

    Good job stripping out anything that isn't ASCII, Slashcode. What is this, the eighties?

    Let's try the long way around.

    What's the one meaning of [] ?

  • The characters that cause such things are a well-known set.

    The set could be extended in a future version of Unicode.

    Like the control (<32) characters in ASCII.

    And like an additional block of control characters (0x80-0x9F) was added in the ISO 8859 encodings.
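One way to avoid hard-coding ranges that a later standard extends is to ask the Unicode database for each character's general category. The sketch below uses Python's standard `unicodedata` module; `is_control_like` is an illustrative name, and the result depends on the Unicode version bundled with the interpreter, which `unicodedata.unidata_version` reports.

```python
import unicodedata


def is_control_like(ch: str) -> bool:
    """True for control (Cc) and format (Cf) characters."""
    return unicodedata.category(ch) in {"Cc", "Cf"}


# Version of the Unicode database this interpreter was built with.
print(unicodedata.unidata_version)

# C0 control, C1 control, a bidi isolate control, and an ordinary letter.
for ch in ("\x1f", "\x85", "\u2066", "A"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_control_like(ch))
```

The check covers both the C0 range below 32 and the 0x80-0x9F block without naming either, but it is still only as current as the database it ships with.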

  • by homejapan ( 1250680 ) <info.homejapan@com> on Wednesday June 09, 2010 @08:33PM (#32518484) Homepage

    I cringe a bit every time a story like this pops up. Here come the myths, the misinformation, the wild exaggerations... Life was easier before the "anime/manga" fans took up their little obsession.

    Well, let's be positive: This is a learning & teaching experience, right? So for the interested, a bit of debunking about Japanese:

    1) "Kanji" is not a language.
    I know, I haven't seen anyone on this page make that mistake, so I'm not pointing a finger at anyone here. Just at people out there who do think "kanji" is the name of the language – like Steve Jobs in his keynote a couple days ago. I had to write a debunking: []

    2) Japanese does NOT use "three writing systems". (That claim does appear on this page.)
    Japanese uses ONE writing system. Precisely one. No more, no less. It contains multiple character sets, including Chinese characters (aka kanji), home-grown "kana" phonetic characters (with two variants, hiragana & katakana), punctuation & typographic symbols (including some from European languages), and Arabic numerals. Those all combine to form exactly ONE writing system.

    It's nothing special. English uses multiple character sets, including Latin letters (with two variants, upper case & lower case), punctuation & typographic symbols, and Arabic numerals. All of which combine to form ONE writing system.

    I haven't written a post on this one yet, but definitely need to. That "three writing systems" is a really common misconception. (Comment by Moridineas is very much on the right track, pointing out that the jumble of features and origins found in the Japanese writing system is just the normal way human language rolls.)

    3) "OMG Japanese is so hard." Well, that's purely opinion, so I won't say it's right or wrong or a misconception or anything. I'll just add that there are learners with precisely the opposite opinion: I call it a wonderfully easy language to learn! There are plenty of reasons; see [] .

    Lots more linguistic debunking at my site. But I'll refrain from further boring the good people here.

    So, anyway. Fascinating stuff, and actually it's nice to see so many people take an interest. Let's just watch the exaggerations and stick to reality. (Yeah, like that'll happen. Who am I kidding? : )

  • by treeves ( 963993 ) on Wednesday June 09, 2010 @08:45PM (#32518610) Homepage Journal
    Actually, very few words in Japanese consist of a SINGLE Kanji character. And foreign-derived words like terebi from television (and like leet and haxxor would be) are always written with Katakana, not Kanji.
