Registrations Now Accepted For Asian Domain Names
Eric Sun was among the first to point out that as of Thursday evening, VeriSign has begun accepting Chinese, Japanese and Korean domain names. "This increases the possible characters from 37 (26 letters, 10 numerals, and hyphen) to 40,282. Find more information [see this AP story]." snrsamy points to the same story as featured on C|Net. jamie suggests reading the technical lowdown at VeriSign.
Re:We need a net Pat Buchanan (Score:1)
Come on, we invented it, we populated it, we control it, and now the Asian hordes are trying to subvert it.
Let them make their own internet.
Not to mention some of the domain names may belong to Al Gore
Re:TLD (Score:1)
Re:Quick (maybe stupid) question... (Score:1)
However, Chinese is commonly known to be more concise than English or other languages with a small character set. There are thousands of commonly used characters, each of which has the function of a word in English. Many characters have more than one meaning, and their combination (2 characters in most cases) makes new words. And don't forget the amazing flexibility in the grammar system (e.g. fewer stop words like "the")! We are not even talking about ancient Chinese, which is much SHORTER.
Give me any sentence with more than 10 English words (with no words like Yugoslavia of course), and I guarantee to re-write it in Chinese in less space.
You see, this is the basic rule of information. You increase the complexity of the encoding scheme, you get more density.
How complex is this? Well, I have to say that the 12 years of Chinese class are a painful memory.
Re:Not a troll, but... (Score:1)
Re:What a lot of whining! (Score:1)
> I dunno, just a guess, but maybe someone's already thought of this? Perhaps...
It's easy to enter kanji if you are using Internet Explorer - just visit Windows Update and download the Japanese Input Method Editor update, and you'll be able to type kanji in your browser (using romaji I think). I don't know how you do with Mozilla...
Extra RS-232 pins (unserious) (Score:1)
They already did [iu.edu]:
pin 14 STD Secondary transmit data
pin 16 SRD Secondary receive data
(also pin 19 SRTS Secondary RTS, pin 13 SCTS Secondary CTS, etc.)
These pins can be used to double the amount of data sent through your RS-232 cable, which would be useful if you decided to (say) switch from 8-bit characters to 16-bit characters.
It's not an RS-232 cable unless it has all 20 wires!!! (-: (-:
Re:Big5 or Unicode (Score:3)
- This solution is only for web browsers. It requires a special version of a web browser, or a plugin, to be able to use the new encoding scheme. It won't work for email, ftp, telnet, gopher, etc, unless a special version of the program is written.
- DNS doesn't break. DNS still uses ASCII. This scheme uses RACE to encode the multi-lingual character set into ASCII. NSI will put a small prefix at the start of the domain name to identify it as multi-lingual (for example eq- would be found at the start of the domain name. The exact prefix has not yet been released to prevent squatters from snapping them up.)
- The special browsers will detect the prefix, and translate the ASCII gibberish into the specified multi-lingual character set. The browser also does the conversion back to ASCII to allow a DNS lookup.
- WHOIS does not/will not support this. You can only use WHOIS with the ASCII encoded gibberish.
- This is not supported by the IETF. This is a custom solution implemented by NSI. But it looks like they are going to be WAY behind schedule in actually rolling this out.
- They are accepting registrations right now, but none of these names will resolve for at least a month, probably much longer. In other words, the system isn't usable yet, but NSI can collect money.
- The IETF is working on their own, probably completely incompatible system, to do the same thing.
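The "detect the prefix and translate" step in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the prefix is taken to be "bq--" (as in the sample name quoted elsewhere in this thread), the payload is assumed to be base32 with one leading "row" byte followed by one low byte per character, and the function names are made up for this sketch. The other cases the draft describes are deliberately left out.

```python
import base64

RACE_PREFIX = "bq--"  # assumption: the final prefix had not been announced


def race_decode_label(label, prefix=RACE_PREFIX):
    """Return readable text for a prefixed label, or the label unchanged."""
    if not label.lower().startswith(prefix):
        return label  # plain ASCII label: nothing to translate
    body = label[len(prefix):].upper()
    body += "=" * (-len(body) % 8)  # b32decode expects padded input
    raw = base64.b32decode(body)
    if not raw:
        raise ValueError("empty payload")
    row, lows = raw[0], raw[1:]
    if row == 0xD8 or 0xFF in lows:
        # Escapes and uncompressed names are outside this sketch.
        raise ValueError("only the simple single-row case is sketched here")
    # Rebuild each 16-bit character from the shared row and its low byte.
    return "".join(chr((row << 8) | low) for low in lows)


def decode_hostname(hostname):
    """Translate every dot-separated label that carries the prefix."""
    return ".".join(race_decode_label(part) for part in hostname.split("."))


print(decode_hostname("bq--gcrmxyi.com"))
```

A resolver never sees the decoded form; only the ASCII label goes out in the DNS query, which is why the rest of the network does not need to change.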
The United States lost the Vietnam War. (Score:1)
Re:What a lot of whining! (Score:2)
Blah. Spare us your arrogant anti-English/US attitude.
Fact is, it is convenient to be able to block certain top-level country codes at the business gateway (or ISP) in order to cut down on spam.
Incidentally, someone's connection to the Asian population is most likely NOT through spam, since most spam coming from Asian top-levels is actually just U.S. spam--either routed through someone else's mail system, or with spoofed headers.
Re:RFC (Score:1)
This is nothing more than an attempt by NSI to open another huge revenue stream without any consideration for the effect it will have on the Internet, or the long-term interests of the Internet community. After all, they see an untapped potential market and a chance to dominate it by jumping in before the standards are developed that would allow others to participate. Now their competitors will have to follow their lead or risk losing the market, and the standards process will have been neatly circumvented. The cost is borne by the Internet community, and the benefits are reaped by NSI.
Why did I vote for Nader? Now I remember...
Re:RFC (Score:2)
Input String: UTF-8
Prepared String: UTF-8
Registration String: RACE, bq--gcrmxyi
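For anyone wondering how a string like bq--gcrmxyi comes about, here is a minimal sketch of the simplest case as I read the draft: when every character shares the same upper byte ("row"), emit that row byte once, then each character's low byte, then base32 the result and add the prefix. The function name is hypothetical, and the multi-row, escape, and uncompressed cases are not handled. Decoding the sample by hand under this assumption gives three katakana characters, which the code reproduces.

```python
import base64


def race_encode_single_row(text, prefix="bq--"):
    """Encode text whose characters all share one upper byte (sketch only)."""
    code_points = [ord(ch) for ch in text]
    if not code_points:
        raise ValueError("nothing to encode")
    if any(cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF for cp in code_points):
        raise ValueError("sketch handles plain 16-bit characters only")
    rows = {cp >> 8 for cp in code_points}
    lows = bytes(cp & 0xFF for cp in code_points)
    if len(rows) != 1 or 0xFF in lows:
        raise ValueError("sketch handles the single-row case only")
    payload = bytes([rows.pop()]) + lows       # row byte, then low bytes
    encoded = base64.b32encode(payload).decode("ascii")
    return prefix + encoded.rstrip("=").lower()  # drop padding, lowercase


# U+30A2 U+30CB U+30E1: three katakana, all in row 0x30
print(race_encode_single_row("\u30a2\u30cb\u30e1"))
```

The row compression is where the "Row-based" in the name comes from: a name written entirely in kana costs roughly one byte per character instead of two.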
--
EFF Member #11254
Re:Thats a lot of characters... (Score:2)
There was a report a couple of weeks ago regarding a problem with internationalised IIS's where unicode representations of directory traversal codes (.,/,\,etc) were being substituted after access checks had been applied...
Now imagine domain based trust relationships - these will be implemented in numerous sub-systems (tcp wrappers,
I imagine that this will lead to numerous security issues due to slight differences in systems support for multi-byte characters.
Another question (which I suspect will be answered in the FAQ) is do you need to register the same domain name several times to take account of the differing unicode byte widths?
Re:RFC (Score:4)
How it works is there is a special prefix "<rp>" (or maybe this just represents the prefix, I can't really tell from the PDF, but I didn't think < and > were valid domain name characters) that indicates a part of the domain is encoded, followed by the encoded name which only uses ASCII characters, and includes information about which character set was used (Unicode, SJIS, etc.). The algorithm is called RACE, Row-based ASCII Compatible Encoding.
A couple of examples were given for both a domain name and a server name:
<rp>45dfg62de34432.COM
<rp>3df45gd345.<rp>45dfg62de34432.COM
So I guess you can set your spam filters to block any domain starting with <rp>! :)
This is not meant to sound xenophobic, but (Score:1)
I don't think standards like this scale well.
What would happen if someone said
"Let's add 2 new data pins to RS232"?
I live in a country where we want 3 extra symbols to accommodate the language. They're all in Latin-1, of course. I don't even think that expanding to 8-bit Latin-1 is necessarily a good thing, let alone introducing an entirely new character encoding (16-bit) to the scheme of things.
I don't want to be
"f\0a\0t\0p\0h\0i\0l\0.\0o\0r\0g\0\0"
We don't let Russian trains into central Europe (the tracks are wider), so why should we let Kanji into our character sets? (Yes, I know Russian trains do come to Europe, I live at the end of one of the lines, just not central Europe!)
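The null-padded string quoted above is roughly what a 16-bit little-endian encoding of the same name looks like on the wire. A small sketch, using Python's built-in utf-16-le codec as a stand-in for "a 16-bit encoding" (the variable names are just for illustration):

```python
name = "fatphil.org"

narrow = name.encode("ascii")      # one byte per character
wide = name.encode("utf-16-le")    # two bytes per character, low byte first

print(narrow)
print(wide)
print(len(narrow), len(wide))
```

Worth noting that the scheme under discussion never puts these wide bytes into DNS; it re-encodes them into plain ASCII first.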
Anyway, here's to 3-bit serial lines...
(Could I patent that? I'd need to design an IC of flip-flap-flup-flipflop-catflap-flatcap-fatcat-fl
FatPhil
Not a troll, but... (Score:3)
Will moderators shoot down the fact that I mention Microsoft?
Windows has had a CJK-capable kanji input scheme for years. CJK: Chinese, Japanese, Korean. Windows also has had bidi (bidirectional) support for right-left and/or top-bottom languages, including Hebrew.
If you have the appropriate cjk-input features installed, it's just a funky keyboard shortcut to open it up to enter kanji. If not, you'll probably be limited to clicking on visible links, not entering domain names or other text by hand.
I don't know what features Linux has to handle EFIGSS (English, French, Italian, Swedish, Spanish) differences, never mind bidi or kanji input.
Re:Don't let them ruin the Internet (Score:1)
You think prostitution is funny? (Score:1)
Re:IMO about time (Score:1)
RFC (Score:2)
URLs can't hold these characters (Score:1)
Re:And these site names are entered how? (Score:1)
Yeah, I'd kind of figured that, hence the reference to the fictional "UnicodeMap". I occasionally use character map programs for accents, and even know a few keyboard shortcuts for common ones. I can't imagine doing that for a whole line, let alone in a language I don't know well enough (at all) to have a clue where to start looking for the character, which probably can't be displayed anyway because the necessary fonts are not installed. Chinese might as well be Martian in that respect.
I don't really think it's going to be an issue though; NonLatinAlphabet.com is almost certainly going to register their URL in the DNS supported languages of all the countries they wish to do business in and point them to that language version of the site. Ultimately it should make it easier for users who don't have Latin keyboards to get by on the web, and this is definitely a very good thing.
English may well be the lingua-franca of the web, but why should a Chinese speaker get to a Chinese web site, hosted in China, that is displayed in Chinese, by entering a URL in English? All web users require some support for Latin characters, and probably always will, but as a failsafe the reverse should apply too, and we can't fall back on IP numbers because the web is supposed to be using HTTP 1.1, isn't it?
Could you steal Domain Names? (Score:1)
What about sites that want their corporate name in all these new languages (would Yahoo have to register its name under all the new languages?). Is there a market for this kind of registration?
Capt. Ron
Re:Oh great, Japanese URLs, just what we need. (Score:2)
Seeing as the Internet is supposed to be the medium that allows a break-down of barriers between nations and a free flow of information, don't you think that it might be a good idea to include as many languages as possible rather than exclude anybody who doesn't use a language that conforms to your standards?
I think you need to realise now that English is not the only language in the world - in fact we're in a vast minority. It's possible that at some point enough people will undertake the task of learning enough foreign languages to free up communication between ourselves, and perhaps ultimately one language will be considered the accepted standard - however, don't expect that to be English.
Spamming floodgate (Score:3)
----
Re:IMO about time (Score:2)
No, it's not. This is one of the most brain dead decisions ever made, in the name of political correctness, with complete disregard for the practical issues. The effect of this will be to reduce the global appeal of the web, not increase it. Western surfers will now effectively be cut off from many far Eastern domains. Sure, there's a reasonable workaround for entering non-ASCII domains on an ASCII keyboard, but it's too complex for the general public, and far Eastern companies are unlikely to publish the ASCII-fied domain anyway. This is a very black day for the net...
Re:Unicode usage in Japan is close to 0% (Score:1)
No, the Unicode hiragana/katakana ranges are ordered in standard Japanese ordering, and the kanji in the CJK range is ordered in Chinese dictionary order (radical first, then stroke count). You do know that kanji means Chinese characters, right? It's not unreasonable to order them the Chinese way.
In IE or Netscape, look under the encoding menu. You will find 3 choices; Shift-JIS, JIS, and EUC.
Well, I also find Unicode (UTF-8) in IE, and both Unicode (UTF-7) and Unicode (UTF-8) in Netscape. You need to realize that Unicode is for displaying all languages, not just Japanese.
Most Japanese experts on this subject view Unicode as an unwanted Western imposition.
True... also known as "Not Invented Here".
Re:RFC (Score:1)
Learning Asian (Score:1)
Quick (maybe stupid) question... (Score:1)
Kierthos
Re:Ideogrammatic languages are a pain (Score:2)
Re:Quick (maybe stupid) question... (Score:1)
Oh great, Japanese URLs, just what we need. (Score:2)
And what will the new ones look like to us Americans? Ugh, I can't bear to think of it.
I can't wait... (Score:1)
Re:Big5 or Unicode (Score:3)
How can one company be granted the monopoly rights to something so important to the world's economy and everyone on the Internet again? Should this be assigned to a not-for-profit entity under the auspices of ICANN?
--
And we implement this how? (Score:1)
Re:Big5 or Unicode (Score:2)
Since the majority of chinese users input their chinese as big5, (eg www.ê.com) will not be the same as the unicode equivalent
I think it's probably not too difficult for the Chinese browsers to do the conversion behind the scenes. Kinda like ASCII-to-EBCDIC conversions; you don't need to change the keyboard to enter text of the other variety.
Now, which one does the registrar accept, and the DNS servers cache? Read the article? From the first couple pages, it appeared that the domain name is actually not in Unicode nor Big5; it's translated to an ugly ASCII encoding.
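To make the Big5-versus-Unicode worry concrete: the same character really does come out as different bytes depending on the input encoding, so a registry comparing raw bytes would see two different names. A minimal sketch, using an arbitrarily chosen character and Python's built-in codecs; the helper name is hypothetical:

```python
ch = "\u4e2d"  # a common Chinese character, picked only as an example

big5_bytes = ch.encode("big5")
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")

print(big5_bytes.hex(), utf8_bytes.hex(), utf16_bytes.hex())


def to_unicode(raw, source_encoding):
    """Convert whatever the user's system typed into one common form."""
    return raw.decode(source_encoding)


# Once both inputs are converted, they compare equal again.
print(to_unicode(big5_bytes, "big5") == to_unicode(utf8_bytes, "utf-8"))
```

That is the "conversion behind the scenes" idea: normalise to one character set before anything gets encoded into the ASCII form.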
Re:Unicode would be better. (Score:2)
Michael Everson of Everson Gunn Teoranta has proposed an encoding of Klingon in Plane 1 of ISO/IEC 10646-2 [dkuug.dk]; if it gets adopted, future versions of Unicode may adopt it (Everson's one of the editors and authors of Unicode 3.0).
Re:Take your medicine.. (Score:1)
Wide gauge trains physically cannot come to _CENTRAL_ Europe, where the 6" narrower gauge is used.
However, I can hop on a wide gauge train here in Helsinki which goes all the way to Moscow.
You see not all of Europe is CENTRAL Europe.
I'm sure you'd agree that not all of America is Central America. Screw it, I don't need your agreement, your opinion is less than worthless.
Now safe me the fucking effort and go kill yourself.
FatPhil
Re:Quick (maybe stupid) question... (Score:1)
I can't agree with your "basic rule of information". I can see nothing about it in my copy of Cover and Thomas. Kolmogorov or Chaitin have stuff to say about this kind of thing.
FatPhil
I'd argue the other way (Score:2)
I have occasion to buy an international airline ticket this year, and I refuse to use priceline because they have Will Shitner doing their ads. Give me Nemoy, Stewart, Dorn, Spiner, McFadden, anyone but shitner. Blow me priceline.
Man, you have got some real problems, don't you? Did Shatner beat you as a child or something? I mean, I'm not crazy about Troi, but it's not like I carry some kind of grudge. And you manually typed in a .sig as an anonymous coward? That's just weird.
Re:IMO about time (Score:1)
--
Times are changing. . . (Score:1)
I guess I better start learning the numbers. .
---
Re:Why asian character sets? (Score:2)
Presumably they're pitching it at the asian market cos that's where they expect to make money.
There are apparently good reasons for not allowing 8-bit characters not in US-ASCII in domain names - it would break too much.
Re:Asia Carrera (Score:1)
I think she runs Solaris now. *sigh* a pornstar after my own heart
Re:argggg! (Score:1)
I.e. Gießer remains Gießer, but daß becomes dass.
Re:Why asian character sets? (Score:1)
--
Re:Quick (maybe stupid) question... (Score:1)
e.g. Steamed fish vs steamed red snapper in soy sauce
As for the spoken language, Chinese is actually easier for human ears in noisy channels/environments than English because you can detect the changes in pitch, whereas in English much of the pitch component is "wasted".
Cheerio,
Link.
Asia Carrera (Score:1)
Asia Carrera,
and she runs Linux.
Re:Dangit, now how will I get hot Asian teens? (Score:1)
Re:RFC (Score:1)
The IETF draft (clearly not an RFC) on the matter, dated 28 June 2000, can be found at: http://www.i-d-n.net/draft/draft-ietf-idn-requirements-03.txt
The remaining questions are a) NSI has no control over the TLD for each respective character set, so why are they offering these? b) why are they polluting the .com, .net, and .org TLDs? c) if you already own "wine.com", does this mean they're willing to give the UTF-8 translation to Joe Blow so he can hijack all your Asian clients and ruin your otherwise good name?
Clearly this is not well thought out at all.
Please peruse this: http://www.emarketer.com/enews/reuters/11_09_2000.rwntz-story-bcnetinterlanguagedc.html?ref=dn
and come up with your own conclusion as to the real reason why. (hint: third paragraph)
Re:It breaks the dns-rfc. (Score:2)
Damn you're quick. Of course the whole point of this is to provide a work-around to that problem. All it does is make an ASCII representation of a different character set. These representations are flagged by having the hostname start with bq-. So if you run across a hostname that looks like bq-safjdlfaqwue72819.bq-hewaguifuifdajhks.co.jp you'll know that the hostname probably makes good sense to anyone who has a Japanese web browser. If you are in the habit of reading such pages you'll get the appropriate plugin. If you don't have the plugin, you probably couldn't read the content anyway, and believe you me, there is a LOT of content on the web that's written in a language you can't read. (I'm not saying that you're stupid or anything, I'm just making the bet that there isn't anyone here who knows every language in which material has been posted to the internet, this includes Klingon)
_____________
Re:We need a net Pat Buchanan (Score:1)
--
Peace,
Lord Omlette
ICQ# 77863057
Re:Quick (maybe stupid) question... (Score:3)
Re:Not a troll, but... (Score:3)
Some programs, like Emacs, communicate directly with the Japanese conversion server (canna, Wnn[4|6], ATOK, etc.), but there are very few apps which can do this.
Re:We need a net Pat Buchanan (Score:1)
It was invented at CERN in Switzerland
Re:Don't let them ruin the Internet (Score:1)
ummm... DNS is only used in name resolution, packets are routed according to the IP address once resolved which is totally unrelated to the domain name - that happens right now - nothing has changed.
If anything, extending the number of TLDs will reduce latency as it will spread the load across more servers, probably on a geographical basis!
Feel free to troll, it's your god given right, but do try to remember that acting both jingoistic and technically ignorant in the same mail is very unlikely to get you any respect.
Re:Quick (maybe stupid) question... (Score:1)
Why Verisign is behind this. (Score:1)
Verisign owns Network Solutions and Thawte.
So they own your certs (need to be renewed) and your names (refer to Network Solutions' terms and conditions).
And there's this push for DNSSEC, which isn't that great anyway. But it'll be a convenient tool to centralise even more power.
Open your eyes a bit and you'll see more scary stuff.
Soon there'll be a bigger push for certificates becoming mainstream - via smartcards and other stuff. And Windows 2000 has some nice support for that... Maybe Microsoft will buy Verisign.
What do you think?
Have fun,
Link.
ordering RACE-encoded names with joker.com? (Score:1)
But now what's to stop me from looking through the RFC, figuring out how to encode my domain name using RACE, and then registering it using joker.com as a domain name that begins with "bq-"?
Re:RFC (Score:1)
- Host parts that have no international characters are not changed.
so it should not be possible to RACE-encode a domain name in order to hijack it.
Of course it's still possible to describe slash dot in Chinese and register that name
See also
http://www.i-d-n.net/draft/draft-ietf-idn-race-
Re:Why asian character sets? (Score:1)
-B
Re:Appearance of names (Score:1)
Have to try that one when I get to work tomorrow.
(NJStar's not bad as IME's go - at least it's not a Microsoft product)
Re:ordering RACE-encoded names with joker.com? (Score:1)
This has already been tried - stories were doing the rounds last week of registrars doing this. When bq- changes, they'll have some very annoyed customers.
Re:IMO about time (Score:2)
Yes, that's *exactly* what I'm saying. I'm not saying it because I happen to use ASCII, but because ASCII is a more natural system for computers to deal with. If Western European and American languages consisted of 30000+ characters, and those in the East consisted of some 100 or so, I'd suggest using the Eastern system at the drop of a hat, even if it wasn't my native system. This has nothing to do with whether or not it's my native character set that's chosen, and everything to do with whether a good decision is made from a technical perspective.
What about Cyrillic and ISO-Latin? (Score:2)
If all those other languages are accounted for, I view this as a good thing. If this is part of an overall shift to Unicode on the web, then all these languages are automatically supported, and I would think it an even better thing.
Re:Quick (maybe stupid) question... (Score:1)
--
EFF Member #11254
Re:RFC (Score:1)
In addition, there's the inherent difficulty in the fact that a Chinese website using a Simplified Chinese set of ideographs could hijack surfers wanting to go to a site with the same name, but with Traditional Chinese ideographs.
In Japanese, there are hiragana, katakana, and kanji. The first two are phonetic alphabets, and the third is an ideographic alphabet based on Chinese characters. Generally, input methods convert from the first to the second, often selectively, so difficult ideographs are replaced with simpler phonetic symbols, though the meaning remains. One word could have lots of representations, and still mean (and read) the same!
These issues should have been thought out before NSI started this idiocy.
Re:DeCSS in a domain name? (Score:1)
the domain name wouldn't work though, they are talking alphabetic symbols rather than length of domain name, i.e. you have the existing English alphabet of 26 letters + 10 numbers & - for _each_ char of the allowed 67, now you can also use ascii encodings of asian characters as well.
Re:What a lot of whining! (Score:2)
Furthermore, much of that spam comes through the same set of systems which never seem to do anything about it.
Re:RFC (Score:3)
The rp is a variable. The first couple pages notes that the implementation-testers should assume that the "RACE Prefix," or rp, should be "bq-".
Re:Quick (maybe stupid) question... (Score:1)
Just because there exists in a language two symbols 'blah' and 'thingy' so that 'blahthingy' means something else doesn't mean that this standard will adopt it. It's much more likely to use the 'common' kanji. (Ob note: There's only about 50 different Japanese characters for dragon from a quick search on lycos... or some really poor kanji writers).
That being said, it would be impossible to set up all possible configurations where composite symbols would redirect to the 'obvious' site. (i.e. www.golddragon.com, no matter how it's spelled in kanji or whatever would not necessarily all go to the same site.) It would be a neat trick if it could, but it would require registering dozens of permutations.
Kierthos
Re:Why asian character sets? (Score:1)
argh, i want [mozilla.org] it to be easier to tell urls apart from each other, not harder.
--
Re:Actually, The Current Max Characters is 67... (Score:1)
BTW, are hyphens and tildes inter-changeable? Because I've seen a lot of web-pages with tildes, and only some of them turn into hyphens when reloading.
Kierthos
Re:Quick (maybe stupid) question... (Score:2)
to generate characters.
This is a really crap picture of one:
http://acc6.its.brooklyn.cuny.edu/~phalsall/ima
So many keys each one is barely distinguishable from the next (that's also poor photo quality though)
It fell into disuse fairly swiftly because it was slower than script.
Our typewriters were invented so that they could be faster than script.
They lose.
FatPhil
What a lot of whining! (Score:2)
Within a few minutes of this story being posted, most of the posts are along the following lines.
I dunno; maybe because the Japanese don't know enough German? Why should the Asians wait for Europe to get its act together before they solve the issues they face every day?
Well, if your only connection to the Asian population is spam email, this should make your isolationism even more simple: the standard uses a standard prefix for RACE-encoded domain names; block those and you're in arrogant English/USian bliss.
I dunno, just a guess, but maybe someone's already thought of this? Perhaps the people who work in kanji all day know something about entering kanji, and have hardware or software solutions around. If you don't normally have to type it, I'm sure your browser will let you CLICK on encoded links just fine.
Missed anything?
Re:Thats a lot of characters... (Score:2)
Spam spam spam spam... (Score:2)
1) Oh God, there's gonna be a MASSIVE amount of spam coming from domains with characters outside of the standard 37.
2) I can block anything and everything coming from domains with characters outside of the standard 37.
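One wrinkle with point 2, going by what other posters describe: the encoded names are themselves made of the standard 37 characters, so a character-set filter lets them straight through and a filter would have to look for the prefix instead. A rough sketch, with hypothetical function names and an assumed "bq--" prefix (the final prefix had not been announced):

```python
import re

# One DNS label: letters, digits, hyphen, at most 63 characters.
LDH_LABEL = re.compile(r"[a-z0-9-]{1,63}", re.IGNORECASE | re.ASCII)


def uses_only_standard_37(hostname):
    """True if every label sticks to letters, digits and hyphen."""
    return all(LDH_LABEL.fullmatch(label) for label in hostname.split("."))


def has_prefixed_label(hostname, prefix="bq--"):
    """True if any label starts with the (assumed) encoding prefix."""
    return any(label.lower().startswith(prefix)
               for label in hostname.split("."))


for host in ("slashdot.org", "sl\u00e4shdot.org", "www.bq--gcrmxyi.com"):
    print(host, uses_only_standard_37(host), has_prefixed_label(host))
```

So the encoded name passes the "standard 37" test, and only the prefix check picks it out.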
-S
Re:Big5 or Unicode (Score:2)
This would allow all transports to ignore the character encoding as long as the encoding only uses bytes with the high bit for non-ascii. It also means that case-independence of non-ascii would be illegal, thus stopping the emergence of a dangerous (for security) mess of incompatible implementations of equality tests for URLs.
This would allow us to use UTF-8 for the URL, for the page contents, for email, for everything, and we would not have this horrid mess of prefixes and mime types.
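The property being relied on here is easy to check: in UTF-8, ASCII characters keep their single-byte values, and every byte belonging to a non-ASCII character has the high bit set. A small sketch (the example name and the helper are just for illustration):

```python
def high_bit_flags(text):
    """For each character, list whether each of its UTF-8 bytes is >= 0x80."""
    return [(ch, [b >= 0x80 for b in ch.encode("utf-8")]) for ch in text]


name = "sl\u00e4shdot.org"
encoded = name.encode("utf-8")

print(encoded)
for ch, flags in high_bit_flags(name):
    print(repr(ch), flags)
```

A transport that is "8-bit clean" can therefore pass such names through without knowing anything about the encoding, while anything that strips or mangles the high bit will break them.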
Yes, some programs, routers, etc, would not pass this stuff through. Well, tough, those should be obsolete!
Other issues: ASCII fallbacks (Score:2)
Unicode is not supposed to over-unify characters, so the ASCII fallback for Japanese could be the romaji transcription - and therefore registering a Japanese domain name automatically registers the romaji equivalent, except that some kanji have more than one possible romaji transcription.
However, some kanji are unified with Chinese characters, which have a different pinyin transcription.
Chinese is another problem. The logical ASCII equivalent is pinyin stripped of its diacritical marks. But then, many different characters may have the same transcription.
All Cyrillic languages also have an ASCII transcription scheme too, but it isn't unified. One character may be transcribed one way in Russian and another way in Bulgarian. Is there a unified transcription scheme for all Cyrillic languages, and is it truly one-to-one? I don't think so. Look at the character usually transcribed as "j" in Russian, and the one usually transcribed that way in Serbian.
ISO-Latin-1 and -2 fallbacks: For ISO-Latin-1, the fallbacks are pretty obvious: "Champs-Élysée" ==> "Champs-Elysee" or in German "Düsseldorf" ==> "Duesseldorf", but in Czech it's a little less obvious. Does "C hacek" map to "Cz" or "Ch" or "Cs"?
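The Latin-1 fallbacks above can be sketched in a few lines. This is only an illustration: stripping accents via Unicode decomposition covers the "Elysee" case, while the German "ue" convention needs a language-specific table. The table and function names are hypothetical, and nothing here settles the Czech question.

```python
import unicodedata


def strip_marks(text):
    """Drop accents by decomposing characters and removing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical per-language table: German writes umlauts out as vowel + e.
GERMAN = {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", "\u00df": "ss",
          "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue"}


def german_fallback(text):
    """Apply the German table first, then strip whatever accents remain."""
    return strip_marks("".join(GERMAN.get(ch, ch) for ch in text))


print(strip_marks("Champs-\u00c9lys\u00e9e"))
print(german_fallback("D\u00fcsseldorf"))
print(strip_marks("D\u00fcsseldorf"))  # generic stripping loses the "e"
```

The last line shows why one generic rule is not enough: the same character wants a different fallback depending on the language.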
So, here is a possible solution: devise unified ASCII transcription schemes for each language, admitting whatever ambiguities exist in Japanese or similar languages. Then, when you register a non-ASCII name, you are asked on the form to fill out the transcribed ASCII name that corresponds to it and it is also automatically registered to you.
There is some potential for conflict here, if the ASCII transcription corresponds to an existing registered domain or, as in the case of Chinese more than one foreign name corresponds to the same transcription, but I think the problem is manageable.
Re:Quick (maybe stupid) question... (Score:2)
Re:English Based Systems sending E-Mail? (Score:4)
So how's this gonna work for systems not set up to handle the asian character set?
Read the links.
The proposal implements an ASCII encoding scheme, called RACE. A certain prefix (they list the debugging prefix as "bq-") indicates a RACE-encoded domain name.
The rest of the ASCII encoding either appears in ASCII for dumb browsers, or is converted to Unicode or Big5 or whatever character set it wants.
For "dumb browsers" (not a flame, just an indication of character-set-awareness), you'd see some crazy domain like http://www.bq-ag0970ag00ah07h.or.jp/; for "smart browsers," it would appear in your own kanji font.
It breaks the dns-rfc. (Score:2)
Furthermore, does this limit those domains to 32 chars of length? (Unicode, 2 bytes per char, DNS system allows a maximum of 64 chars for domain names)
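A back-of-envelope answer to the length question, under assumptions that are mine rather than the spec's: a single DNS label is limited to 63 octets (RFC 1035), the prefix is taken to be the 4-character "bq--", and the rest is base32 at 5 bits per character. Any marker bytes the real scheme adds for uncompressed names are ignored, so treat these as upper bounds.

```python
MAX_LABEL_OCTETS = 63  # per-label limit from RFC 1035


def payload_bytes(prefix="bq--", bits_per_char=5):
    """Whole bytes that fit in one label after the prefix (base32 assumed)."""
    usable_chars = MAX_LABEL_OCTETS - len(prefix)
    return usable_chars * bits_per_char // 8


budget = payload_bytes()
chars_two_bytes_each = budget // 2   # every character stored as 16 bits
chars_one_shared_row = budget - 1    # one row byte, then one byte each

print(budget, chars_two_bytes_each, chars_one_shared_row)
```

So the limit lands well below 32 characters for names that mix scripts, and somewhat above it for names that stay within one 256-character row.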
Also, doesn't it kinda suck to make large parts of the net unavailable for most?
--paddy
--
Multilingual Domain Names (Score:3)
The Internet Society [isoc.org] probably isn't too happy about this. They released a statement [isoc.org] on November 8th encouraging NSI to back off and let the IETF [ietf.org] IDN WG [i-d-n.net] do its job.
Also, there are companies that are already currently operating in this market, including WALID [walid.com], which is taking registrations for Arabic domain names (AND RESOLVING THEM), and will soon be adding Hindi, Tamil, and two Chinese scripts before moving into other markets.
Ideogrammatic languages are a pain (Score:2)
One of the major reasons this happened was that they were trading with different peoples who used ideograms instead of alphabets. Since learning one ideogrammatic written language is hard enough and learning 5 is a single lifetime's achievement, a simpler way was found.
The Chinese were homogeneous and didn't need to deal with anyone other than the Chinese and hence kept their ideogrammatic written language.
It's a simple fact that it's far easier to implement the Roman alphabet on a computer than a zillion independent symbols -- you need less RAM, simpler displays and so on.
What the Chinese need to do is settle on a single way to transliterate spoken Chinese into the Roman alphabet (or even the Cyrillic, Hebraic or Greek if that's what they want). Ideograms are neat, but they're a pain in the ass.
Sorry, it's not cultural imperialism, just pragmatism!
Re:Why asian character sets? (Score:3)
I have dibs on släshdot.org!!
Kanji (Score:2)
I believe there is also a xterm counterpart for kanji.
Mike
"I would kill everyone in this room for a drop of sweet beer."
Re:This is not meant to sound xenophobic, but (Score:2)
canonicalisation issues (Score:3)
(See the Unicode bugs recently in IIS, where a unicode representation of '../' is used to navigate upwards in the directories of the server to view files outside of the server root.)
Now, does a company have to register all possible permutations of byte sequences which all map to the same character sequence? As well as doing so in
We'll see.
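The IIS-style problem mentioned above comes down to two byte sequences decoding to the same character. A minimal sketch: the two-byte sequence C0 AF is an "overlong" form of "/", which a naive decoder accepts and a strict one rejects. The naive decoder here is written purely for illustration and is not taken from any real server.

```python
def naive_decode_2byte(pair):
    """Decode a 2-byte UTF-8 sequence without checking for overlong forms."""
    lead, cont = pair
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))


overlong_slash = b"\xc0\xaf"

naive_result = naive_decode_2byte(overlong_slash)

try:
    overlong_slash.decode("utf-8")   # Python's codec is strict
    strict_accepts = True
except UnicodeDecodeError:
    strict_accepts = False

print(repr(naive_result), strict_accepts)
```

If an access check runs on the raw bytes and the decoding happens afterwards, the naive path turns a harmless-looking sequence into a path separator. Anything that compares or trusts domain names has the same choice to make about which forms it accepts.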
English Based Systems sending E-Mail? (Score:4)
Great, if not already blocked (Score:2)
--
WolfSkunks for a better Linux Kernel
$Stalag99{"URL"}="http://stalag99.keenspace.com";
Why asian character sets? (Score:3)
Easier to test etc..
Big5 or Unicode (Score:4)
(eg www.ê.com) will not be the same as the unicode equivalent..
Appearance of names (Score:2)
Will this be a good kick in the butt for internationalization of your OS?
Re:RFC (Score:2)
True, they say that any name part consisting entirely of US-ASCII characters is not allowed to be encoded this way, but they would have to go out of their way if they wanted to ensure that double-wide SJIS romaji were not confusingly registered. Then again, we can already do "s1ashdot.org" with just plain ASCII.
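The double-wide romaji case can at least be caught mechanically. A small sketch, assuming the registry works in Unicode: compatibility normalisation (NFKC) folds fullwidth Latin letters back to their plain ASCII forms, so a lookalike of an existing ASCII name can be detected. Nothing in the source says NSI does this.

```python
import unicodedata

# "slashdot" spelled with fullwidth (double-wide) Latin letters
fullwidth = "\uff53\uff4c\uff41\uff53\uff48\uff44\uff4f\uff54"

folded = unicodedata.normalize("NFKC", fullwidth)

print(fullwidth, folded, folded == "slashdot")

# The plain-ASCII trick is untouched: digit one is not letter l.
print(unicodedata.normalize("NFKC", "s1ashdot"))
```

Normalisation only handles characters Unicode itself treats as variants of each other; lookalikes such as "1" for "l" stay a matter for human eyes.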
In addition, there's the inherent difficulty in the fact that a Chinese website using a Simplified Chinese set of ideographs could hijack surfers wanting to go to a site with the same name, but with Traditional Chinese ideographs.
IIRC, in Unicode, Chinese and Japanese ideographs all map to the same code if they're basically the same character, with the differences considered font-specific. In the extreme case, one common radical is rendered with one less stroke in Japanese, which could have created hundreds of extra codes.
Most simplified kanji/hanzi should be unique, but a few, at least in Japanese, use an already existing, more common character. Generally, though, this won't be a problem if Unicode is used.
Re:RFC (Score:3)
Re:It breaks the dns-rfc. (Score:2)
Such an authoritarian title. Are you sure? It proposes ASCII encoding, not a Unicode or other mbcs usage directly.
> Also, doesn't it kinda suck to make large parts of the net unavailable for most?
Don't you think the Chinese and Japanese people could say the same thing about English?
IMO about time (Score:2)
Re:Quick (maybe stupid) question... (Score:2)
RACE Encoding scheme is not very PC (Score:2)
TLD (Score:2)