Australia Government Open Source News

Australian Stats Agency Goes Open Source

Posted by timothy
from the as-it-should-be dept.
jimboh2k writes "The Australian Bureau of Statistics will use the 2011 Census of Population and Housing as a dry run for XML-based open source standards DDI and SDMX in a bid to make for easier machine-to-machine data, allowing users to better search for and access census datasets. The census will become the first time the open standards are used by an Australian Federal Government agency."

  • by goombah99 (560566) on Friday December 17, 2010 @12:05AM (#34584190)

    I'm perplexed why people continue to use XML when there is YAML. What is it that makes XML so attractive as a durable format? It's not human readable in a practical sense, and YAML very much is. Since its delimiters are complicated and variable, it's harder to parse in ad hoc ways than YAML (line and white space), which means that for rapidly extracting things there are no shortcuts to instantiating a whole document. It's hard to grep. And both formats can fully do the other one's job, so they are interchangeable.
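
A minimal Python sketch of the "ad hoc parsing" point above. The record, its field names, and both helper functions are hypothetical and invented for illustration; they are not census data or part of any YAML library. The line-based lookup only works for a flat `key: value` mapping and is not a YAML parser, while the XML lookup goes through a real parser from the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical record in both formats (made-up values, not census data).
YAML_TEXT = """\
region: Example Region
year: 2011
population: 12345
"""

XML_TEXT = """\
<record><region>Example Region</region>
<year>2011</year><population>12345</population></record>
"""


def yaml_adhoc_lookup(text, key):
    """Grep-style lookup: scan lines for a top-level 'key: value' pair.

    This is the shortcut the comment describes. It ignores nesting,
    quoting and multi-line values, so it is only safe on flat documents.
    """
    prefix = key + ":"
    for line in text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


def xml_lookup(text, tag):
    """Parse the whole document, then find a direct child of the root by tag."""
    root = ET.fromstring(text)
    node = root.find(tag)
    return None if node is None else node.text


print(yaml_adhoc_lookup(YAML_TEXT, "population"))
print(xml_lookup(XML_TEXT, "population"))
```

Both calls return the same string here. The difference is that the XML version does not depend on where line breaks fall, whereas the line-based version stops working as soon as the document is nested or a value spans several lines.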

    • by goombah99 (560566) on Friday December 17, 2010 @12:16AM (#34584242)

      To see how clean YAML is for humans to read and for machines to parse, look at a Sample Document [wikipedia.org]. And here's something truly impressive: a YAML Quick Reference Card [yaml.org] written entirely in YAML itself. Not only is it a marvelously short card, it's human and machine readable. YAML is a superset of JSON too.

      • Interesting. How does YAML handle validation and user defined grammars?
        • Re: (Score:2, Informative)

          by Anonymous Coward

          Interesting. How does YAML handle validation and user defined grammars?

          Multiple ways, of varying stringency. For the simple case you can define types (e.g. floats, ints, or user-defined types). For the vast majority of uses that's all you need for validation. Now if you want to define a schema, there are several different ones in use. Kwalify and Rx are two. Finally, there are YAML-to-XML converters, so you can just convert the YAML to XML and use your favorite XML validator. Thus the validation itself other than the types is not baked into the definition and thus

      • by tabrisnet (722816)

        Great for human readability. Terrible (due to some python-like indent rules) for humans to add content to.

        Meanwhile, XML might not be quite as nice as YAML for reading, but it is easier to figure out where you made a mistake, assuming you're pretty printing it (but the best thing is that pretty printing it is unnecessary).

        • by goombah99 (560566) on Friday December 17, 2010 @01:08AM (#34584410)

          Great for human readability. Terrible (due to some python-like indent rules) for humans to add content to.

          Oh come on man. This is like the ancient discarded whitespace lament about python. I was once like you before I started writing python. Then I saw the huge huge light of why white space indenting is so great. I could explain but I'm not sure I could have convinced even myself before trying it.

          Bottom line: it's freakin' easy to get the white space right, and any decent editor with context-sensitive tabs does it for you. emacs, vim, bbedit, eclipse. Are there any that don't?

          This is a NON ISSUE

          Meanwhile, XML might not be quite as nice as YAML for reading, but it is easier to figure out where you made a mistake, assuming you're pretty printing it (but the best thing is that pretty printing it is unnecessary).

          Ha! You make me laugh. So now we need special editors and printers for reading XML? Were we not just complaining about white space? Now you pretty print to put perfect white space in XML?

          • by tabrisnet (722816)

            Notepad, which is so often used by the technically non-clueful. Of which, I seem to work with a few.

            Of course, you should use a real editor. Somehow that doesn't stop people from using Notepad because they don't know better, or from using vim without knowing HOW to use vim, and we still lose all the indenting.

            and I never said you needed a special editor for XML. Not even that you need one for JSON or YAML.

            Pretty printing isn't MANDATORY for XML... which is really the point. With it NOT necessary, means you can fuck up

        • by Anonymous Coward

          Great for human readability. Terrible (due to some python-like indent rules) for humans to add content to.

          Apparently you are not aware that YAML, being a superset of JSON, can be written entirely in JSON, or in a mix of the two. In JSON you don't need to use white space. So you use the white space in YAML when it makes sense (nearly always), and when you get into absurd edge cases you toss in a little JSON syntax when apropos.

          So sorry, you just don't have a case to make here unless you want to say something bad about JSON as well.

          • by tabrisnet (722816)

            I use JSON (and occasionally YAML), but only for data interchange formats where I don't expect a human to need to modify it.

            Yes, I am aware that JSON and YAML are largely related. And a few times I tried to write up files in JSON, just as a mockup of my intended data structure. Yes, I used a real editor with proper tab indenting. It still got to be pretty unreadable. I use Data::Dumper whenever I want the data format to be as explicit as possible, but only for debugging.

            But it's so much worse than that. XML

      • Does anyone else feel like they just looked at some COBOL source when looking at the YAML example?

    • I'm perplexed why people continue to use XML when there is YAML. What is it that makes XML so attractive as a durable format? It's not human readable in a practical sense, and YAML very much is. Since its delimiters are complicated and variable, it's harder to parse in ad hoc ways than YAML (line and white space), which means that for rapidly extracting things there are no shortcuts to instantiating a whole document. It's hard to grep. And both formats can fully do the other one's job, so they are interchangeable.

      I would actually dispute all of your comments, but picking up on the last point in bold, one of XML's key features is "mixed content [w3schools.com]", which is apparently (according to http://yaml.org/xml.html [yaml.org]) not possible in YAML.
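
To make "mixed content" concrete, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The XML fragment and the `mixed_content_parts` helper are hypothetical, written only for illustration. The sketch shows text runs interleaved with child elements inside one parent, which ElementTree exposes through each element's `text` and `tail` attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: character data and child elements interleaved in <p>.
doc = ET.fromstring("<p>Population rose <b>sharply</b> in <i>2011</i>.</p>")


def mixed_content_parts(elem):
    """Return text runs and (tag, text) pairs for children, in document order."""
    parts = []
    if elem.text:
        parts.append(elem.text)          # text before the first child
    for child in elem:
        parts.append((child.tag, child.text))
        if child.tail:
            parts.append(child.tail)     # text between this child and the next
    return parts


print(mixed_content_parts(doc))
# The same content with the markup stripped:
print("".join(doc.itertext()))
```

The order of text and elements is itself part of the data, which is why this shape suits marked-up prose better than plain key/value records.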

    • XML is perfectly suitable for long term data storage and exchange. You have namespaces, schemas, and millions of tools to handle it.

      YAML is OK for storing configuration data. It's not even that good for anything else.

      Also anyone who "parses in ad hoc ways" deserves to be slapped in the face.

    • by c0lo (1497653)

      I'm perplexed why people continue to use XML when there is YAML.

      Can you point me, please, to the reference on how one can define in YAML the equivalent of a schema?
      You know, to act as the "contract" for the data exchange protocol... extensions (to allow 3rd party custom data sections) and namespaces (to isolate the 3rd party extensions that I'm not interested in) would be a real bonus.
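
A rough Python sketch of the namespace half of that request, using only the standard library. The namespace URIs, element names, and the `core_values` helper are all hypothetical. It shows a consumer reading the elements it understands while skipping a third party's extension elements. It does not cover schema validation, which `xml.etree.ElementTree` does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: one for the agreed "contract", one for a vendor extension.
CORE = "http://example.org/core"
EXT = "http://example.org/vendor-ext"

XML_TEXT = f"""\
<c:dataset xmlns:c="{CORE}" xmlns:v="{EXT}">
  <c:value>42</c:value>
  <v:note>vendor-specific annotation</v:note>
  <c:value>7</c:value>
</c:dataset>
"""


def core_values(text):
    """Collect only the core-namespace <value> children, ignoring extensions."""
    root = ET.fromstring(text)
    # The prefix used in the query is local to this dict; matching is by URI.
    return [int(e.text) for e in root.findall("c:value", {"c": CORE})]


print(core_values(XML_TEXT))
```

Because matching is done on the namespace URI, the vendor can add or rename its own elements without affecting this consumer.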

    • by kwerle (39371)

      I'm perplexed why people continue to use XML when there is YAML...

      The real answer is: who cares? They're both easy [enough] to parse data formats. It's about as interesting as arguing about what your favorite editor is and why. Or your favorite database. Everyone knows the ins and outs, and nobody cares (except maybe you and the person you're arguing with). We all have libraries. We all have parsers. It really doesn't matter.

      The trivial answer to your question is: because YAML is very new in the grand scheme of things. And it's not so different that it's really in

  • "The census will become the first time the open standards are used by an Australian Federal Government agency."

    Really?
    http://xena.sourceforge.net/ [sourceforge.net]
  • by the_other_one (178565) on Friday December 17, 2010 @12:24AM (#34584270) Homepage

    Australia is openly embracing census data and enhancing its availability.
    Canada's government is going out of its way to prevent census data collection.

    • by Jamz (89107)

      Seems logical: as a taxpayer, the data should be available to me.
      Although I hope it's not leveraged too heavily by the commercial sector.

    • by bryxal (933863)

      Take action to change that!

      http://www.liberal.ca/open/ [liberal.ca]

      The Liberal Open Government Initiative will:

      * Immediately restore the long-form census;
      * Make as many government datasets as possible available to the public online free of charge at opendata.gc.ca in an open and searchable format, starting with Statistics Canada data, including data from the long-form census;

    • by Noughmad (1044096)

      Australia is openly embracing census data and extending its availability.
      Canada's government is going out of its way to extinguish census data collection.

      FTFY

  • Meanwhile, in other government agencies and private enterprise there are open file formats, such as the geophysical SEGD and SEGY formats, that have been used since at least the 1980s. That means you can read data files from 1982 on current software.
    Closed file formats are an "innovation" of Microsoft and similar companies. It's really not any different from the bastards that write unreadable code in an attempt to provide job security.
    hopefully in the future some of the practices of elements of Microsoft and man
  • We should find out what percentage of the population thinks that this is a good idea....

  • ...and here's why:

    It's official - Munich Linux migration is "dead - abandoned in all but name." - Linux

    Yes, you read right: "Dead - abandoned in all but name". [fixunix.com]

    • Munich Linux migration is "dead - abandoned in all but name."

      Last I heard it was a migration to open source and they were successfully using open source desktop applications. The operating system may be Windows rather than Linux but this still seems to be a victory for open source. On the desktop the applications are far more important than the operating system.

    • Open source standards, not open source code. Very different issue.

  • There is some difference. I'm not clear from the summary exactly what's going on.

    • by c0lo (1497653)
      TFA mentions "open standards" in the opening and only once. I reckon the reporter (or the proof-readers? or editor?) had a slip-of-fingers on the keyboard. 'Tis clear they speak of Open Standards rather.
  • How many Jedi currently live in Australia?

    • by c0lo (1497653)

      How many Jedi currently live in Australia?

      None: for the moment, Assange is retained by the dark side of the force and too dry Australia is for master Yoda.

  • by Anonymous Coward

    As the author of the Perl module YAML::Tiny, and the current maintainer of the original YAML.pm I call troll on the parent.

    YAML as a specification is way more complex than XML and it's way harder to implement.

    And who in their right mind is going to read the raw census statistical quads directly? The point is moot.

    XML is ideal for machine to machine communication. It's easily machine readable, and easily debuggable by nerds (which is the bit of "readable" that really matters here). And machine readable is wh

  • by Anonymous Coward

    The census will become the first time the open standards are used by an Australian Federal Government agency.

    What the hell are you talking about? We use a variety of open standards every minute of every day across every department with any modern IT assets. I think what you meant to say was that this is the first time open standards are being used by an Australian Federal Government agency to communicate with the general public. Even then, it's not exactly news; it was going to happen eventually.
