XML 1.1 Spec Hits Some Snags 259

oever writes "News.com reports that the new XML 1.1 specification defines a new newline character, making it incompatible with the 1.0 specification. Apparently, IBM has been pushing the new character to avoid having to modify their software, thereby invalidating everybody else's XML software."
This discussion has been archived. No new comments can be posted.

  • MS Office (Score:1, Interesting)

    by Shinsei ( 120121 ) <[caledorn] [at] [fein.no]> on Friday October 18, 2002 @10:21AM (#4478223) Homepage
    I wonder if this will have any impact on MS plans for making the next generation of Office. AFAIK, they're planning to make all the applications work together through XML... Then again, it is "only" a newline character... :P
  • Simple Solution (Score:3, Interesting)

    by Anonymous Coward on Friday October 18, 2002 @10:22AM (#4478234)
    Why don't they make new-lines overridable? Then IBM can put the override at the beginning of their files.
  • by Dave21212 ( 256924 ) <dav@spamcop.net> on Friday October 18, 2002 @10:26AM (#4478273) Homepage Journal
    Considering what some other vendors have done to standards, one tiny addition (which is an improvement) proposed by IBM shouldn't be a big deal. Sure, it feeds the news hounds, but seriously, compare the scale of the impact of one desirable change to all the suffering caused by other such changes in emerging standards (Microsoft's in particular).

    IBM has contributed so much, it's only natural that some changes might be characterized in the news as benefitting them more than other parties. Is anyone that worried about adding a new EOL character in 1.1 that XML 1.0 "chokes" on ?
  • Re:So? (Score:2, Interesting)

    by Anonymous Coward on Friday October 18, 2002 @10:28AM (#4478289)
    Fair point, but how many people do you think have actually used those characters in an XML1.0 document?

    IBM would appear to be right, too, when they note that an application should look at the version identifier which is present at the top of the XML stream.
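A minimal sketch of the version check described above, assuming the document has already been decoded to a string with no byte-order mark. The helper name `declared_xml_version` is made up for illustration and is not part of any XML library.

```python
import re

# Matches the version pseudo-attribute of an XML declaration at the very
# start of a document, e.g. <?xml version="1.1" encoding="UTF-8"?>
_XML_DECL = re.compile(r"""^<\?xml\s+version\s*=\s*(["'])(1\.[0-9]+)\1""")


def declared_xml_version(text: str) -> str:
    """Return the declared XML version (hypothetical helper).

    A document without an XML declaration is treated as XML 1.0.
    """
    match = _XML_DECL.match(text)
    return match.group(2) if match else "1.0"


if __name__ == "__main__":
    print(declared_xml_version('<?xml version="1.1"?><doc/>'))
    print(declared_xml_version("<doc/>"))
```

An application could use a check like this to decide which end-of-line rules to apply before handing the text to its parser.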
  • Here's A Good Point (Score:5, Interesting)

    by LISNews ( 150412 ) on Friday October 18, 2002 @10:28AM (#4478295) Homepage
    From the article, which kind of put it into perspective for me:


    "The truth is that there are a lot of IBM mainframe systems out there, and they're very important," said Ronald Schmelzer, an analyst with ZapThink. "The truth is that this is not really for IBM's benefit, it's for IBM's customers' benefit. And I think that's fair. An international standard shouldn't change for the benefit of a company's future project, but it's clear that end-of-line characters are not a strategic business strategy for IBM."

  • CRLF in EBCDIC (Score:4, Interesting)

    by spacefight ( 577141 ) on Friday October 18, 2002 @10:44AM (#4478407)
    is 0x156C in my programming area, 'nough said. EBCDIC is still live. Did you know that about 90% of today's enterprise data is stored in EBCDIC chars? You'd better update the XML specs :)
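For context on why mainframe text ends up containing NEL at all, here is a small sketch using Python's built-in `cp037` codec. It assumes EBCDIC code page 037 only; other EBCDIC variants map their control codes differently, and this does not attempt to explain the byte values quoted in the comment above.

```python
# "Hi" encoded in EBCDIC code page 037, followed by the EBCDIC NL byte 0x15.
ebcdic_line = "Hi".encode("cp037") + b"\x15"

# Decoding to Unicode turns the EBCDIC NL byte into U+0085 (NEL),
# not into the ASCII line feed U+000A.
decoded = ebcdic_line.decode("cp037")

# Going the other way, an ASCII-style line feed becomes EBCDIC byte 0x25.
line_feed_in_ebcdic = "\n".encode("cp037")

if __name__ == "__main__":
    print(repr(decoded))
    print(line_feed_in_ebcdic.hex())
```

So text that is transcoded straight from this code page carries U+0085 wherever the mainframe wrote its newline, which is the character XML 1.1 proposes to accept as a line end.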
  • by RAMMS+EIN ( 578166 ) on Friday October 18, 2002 @10:44AM (#4478413) Homepage Journal
    Anybody care to explain to me _why_ we need so many different newline characters | sequences? I see a point in having a single \x0a character, because a newline is one character. I see a point in having \x0a\x0d and \x0d\x0a, because they represent more accurately how a typewriter does it (and conform better to the original ASCII standard, I think). However, one of these is kind of redundant, and history seems to have decided that this is \x0a\x0d. But why, for goodness's sake, do we need all those others??? Why is it that people always do things their own way instead of following standards that work fine???
  • End of Line For XML? (Score:1, Interesting)

    by Anonymous Coward on Friday October 18, 2002 @10:52AM (#4478480)
    Does this mean that XML has reached the end of the line and it is time to start working on the next big thing?
  • by FooBarWidget ( 556006 ) on Friday October 18, 2002 @11:04AM (#4478576)
    Doesn't this make XML files uneditable with most editors, like vi, pico and gedit? They all use \n (byte 10) as the newline character.
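As a rough illustration of what the end-of-line rules amount to, the sketch below applies the XML 1.1 normalization (section 2.11 of the spec) as a plain string substitution, next to the XML 1.0 rule for comparison. The function names are invented for this example; a real parser does this internally on its input.

```python
import re

# XML 1.1 maps each of these to a single line feed (#xA):
# CR LF, CR NEL, a lone NEL (U+0085), LS (U+2028), and any remaining lone CR.
_XML11_LINE_END = re.compile("\r\n|\r\u0085|\u0085|\u2028|\r")

# XML 1.0 only knows about CR LF and a lone CR.
_XML10_LINE_END = re.compile("\r\n|\r")


def normalize_xml11_line_ends(text: str) -> str:
    """Apply the XML 1.1 end-of-line rule (illustrative helper)."""
    return _XML11_LINE_END.sub("\n", text)


def normalize_xml10_line_ends(text: str) -> str:
    """Apply the XML 1.0 end-of-line rule (illustrative helper)."""
    return _XML10_LINE_END.sub("\n", text)


if __name__ == "__main__":
    sample = "a\r\nb\u0085c\u2028d"
    print(repr(normalize_xml11_line_ends(sample)))
    print(repr(normalize_xml10_line_ends(sample)))
```

Either way the application sees \n after parsing; the difference is only in which raw characters count as a line end on the way in.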
  • Re:version naming (Score:4, Interesting)

    by Fweeky ( 41046 ) on Friday October 18, 2002 @11:06AM (#4478589) Homepage
    if these two specifications are NOT compatible, then it would make sense that they would name the new one XML2.0 no?

    Not really. The change isn't exactly huge; it makes XML a bit more consistent with regard to UTF, but I don't see it breaking anything other than for those who both:
    • Failed to specify a prologue (and hence charset, meaning they accepted the default utf-8), and;
    • Actually used #x85 or #x2028 to encode anything useful other than newline.

    TBH if you were that lax in specifying your XML version and character set, and then made use of non-printable characters that actually had known uses in the default charset, you deserve everything you get.
  • by Anonymous Coward on Friday October 18, 2002 @11:06AM (#4478590)
    Now let's take IBM out of this phrase and replace it with Microsoft, wonder how fast the responses would change from "it's not that bad" to "who do they think they are?"... Mod this as a troll if you will, but it's something to ponder.

    'Here's A Good Point (Score:5)
    by LISNews on Friday October 18, @10:28AM (#4478295)
    (User #150412 Info | http://www.lisnews.com)
    From the article, which kind of put it into perspective for me:

    "The truth is that there are a lot of Microsoft systems out there, and they're very important," said Ronald Schmelzer, an analyst with ZapThink. "The truth is that this is not really for Microsoft's benefit, it's for Microsoft's customers' benefit. And I think that's fair. An international standard shouldn't change for the benefit of a company's future project, but it's clear that end-of-line characters are not a strategic business strategy for Microsoft."
  • Re:Full Details (Score:5, Interesting)

    by Anonymous Coward on Friday October 18, 2002 @11:14AM (#4478669)
    So I went off and read it (Or at least, what appears to be it. There is a rant some way down the page you link to. Is that it?)

    So anyway, I read it. Surprise, surprise, the guy doesn't actually offer any examples of where this change would cause a break in itself. All he basically does is cry that 0x85 is designated as a new line character, and how dare IBM do such a thing! Then he goes into a rant about IBM, monopolies and patents. Uh huh.

    The fact is that 0x0085 is designated as NEL (NExt Line) as part of the Unicode specification. XML 1.1 allows the use of Unicode, which XML 1.0 did not. Therefore, if you are using XML 1.1, and you are using 0x85 and expect to see a grave a, your document isn't a Unicode compliant document anyway, and you shouldn't be complaining that a non compliant document doesn't work with a compliant parser.

    If all these people want to use 0x85 in their XML 1.1 documents, then they'll have to properly convert them to Unicode as the specification allows. Surprising, that.
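To make the "0x85 as grave a" point concrete, here is a short sketch of how the single byte 0x85 reads under a few legacy code pages, and what NEL looks like once properly encoded as UTF-8. The choice of code pages is only an example; it uses nothing beyond Python's standard codecs.

```python
raw = b"\x85"

# The same byte means different things depending on the declared encoding.
as_cp437 = raw.decode("cp437")      # old DOS code page: a with grave accent
as_cp1252 = raw.decode("cp1252")    # Windows Western: horizontal ellipsis
as_latin1 = raw.decode("latin-1")   # ISO 8859-1: the C1 control U+0085 (NEL)

# In UTF-8, U+0085 is a two-byte sequence; a bare 0x85 byte is not valid.
nel_utf8 = "\u0085".encode("utf-8")

try:
    raw.decode("utf-8")
    bare_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    bare_byte_is_valid_utf8 = False

if __name__ == "__main__":
    print(as_cp437, as_cp1252, repr(as_latin1), nel_utf8, bare_byte_is_valid_utf8)
```

A document that wants the accented letter should carry U+00E0 in its declared encoding, not a raw 0x85 byte that happens to render that way on one platform.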
  • by PainKilleR-CE ( 597083 ) on Friday October 18, 2002 @11:16AM (#4478685)
    I see a point in having \x0a\x0d and \x0d\x0a, because they represent more accurately how a typewriter does it (and conform better to the original ASCII standard, I think). However, one of these is kind of redundant, and history seems to have decided that this is \x0a\x0d. But why, for goodness's sake, do we need all those others??? Why is it that people always do things their own way instead of following standards that work fine???

    Because ASCII doesn't work for the character sets that over 50% of the world's (literate) population reads and writes. Hence the Unicode standard, which of course tries to make its overlap with ASCII compatible when possible.
  • by forevermore ( 582201 ) on Friday October 18, 2002 @11:26AM (#4478771) Homepage
    I'll admit that I don't know much about the technical side of xml (and I really can't see all of the great advantages to it, either), but since when does a parser care about whitespace? Wouldn't it make more sense to let the newline character match that of the overlying OS so people can actually TYPE those newline characters? Switching to unicode is fine and dandy, but what about all of those legacy systems that don't support it?
  • Re:Full Details (Score:2, Interesting)

    by nijhof ( 44330 ) on Friday October 18, 2002 @11:27AM (#4478785) Homepage
    There is only a rant on that page, no examples.

    And you know what? I think an XML v1.1 document would be incompatible with any non-updated program, no matter what the changes in v1.1 are -- for if the program wasn't upgraded, it can't know what XML v1.1 means. And there must be some difference, otherwise it wouldn't have a different version number.

    Jeroen
  • by smallpaul ( 65919 ) <paul@@@prescod...net> on Friday October 18, 2002 @12:10PM (#4479151)

    The Slashdot commentary has been pretty one-sided so I'll try and address the other side. First, IBM has said that this fix is for their mainframe customers, not for themselves. But nobody in the XML world has heard from these customers. As far as I know, no user has submitted a request for this NEL feature. No user has sent a message to the many XML mailing lists. No user has posted to Slashdot. Updating all of the XML parsers in the world is really expensive and if the mainframers don't care enough about the problem to storm the gates then maybe it isn't hurting them that badly. So from a democratic point of view, we're going to make life harder for the people who care enough to scream out loud in order to make life easier for the small minority who perhaps are not even that badly impacted.

    Further discussion [xml.com] is on xml.com.

  • by ebresie ( 123014 ) on Friday October 18, 2002 @12:27PM (#4479345) Homepage Journal
    Okay...maybe I'm looking at this incorrectly, but...

    If IBM's problem is that they don't want to force everyone to update their mainframes and cause them a headache...won't they still have to upgrade their mainframes to support XML 1.1 with new XML 1.1-compatible parsers?

  • by smallpaul ( 65919 ) <paul@@@prescod...net> on Friday October 18, 2002 @12:28PM (#4479352)

    And what will an XML 1.0 parser ("millions served") do with an XML 1.1 document? When your IBM mainframe serves up 1.1 data with NEL to my Windows 98 with IE 5.5, IE will complain that the document is not well-formed. This means that there is a period of time where the XML world is split. It will be a LONG time before these mainframe users will be able to use NEL and confidently send the data to anyone else. It might have been cheaper to just fix the software.

  • by cornicefire ( 610241 ) on Friday October 18, 2002 @12:59PM (#4479669)
    I'm dealing with some cross-platform XML these days. It's generally pretty wonderful, but the newline character is something that drives me a bit batty. If anyone can bring some unity to this disunity, I'm sure that all of the XML world and the Java world would be better off. It's an anachronism.
