Wikipedia Science

The Next Big Step For Wikidata: Forming a Hub For Researchers

The ed17 writes Wikidata, Wikimedia's free linked database that supplies Wikipedia and its sister projects, is gearing up to submit a grant application to the EU that would expand Wikidata's scope by developing it as a science hub. ... This proposal is significant because no other open collaborative project ... can connect the free databases in the world across disciplinary and linguistic boundaries. ...the project will be capable of providing a unique open service: for the first time, that will allow both citizens and professional scientists from any research or language community to integrate their databases into an open global structure, to publicly annotate, verify, criticize and improve the quality of available data, to define its limits, to contribute to the evolution of its ontology, and to make all this available to everyone, without any restrictions on use and reuse.

  • Folks, Wikipedia is a starting place, but its ever-changing content contributed by whoever is not acceptable for academic references. This has been discussed before.

    No one references Encyclopedia Britannica in their Masters Thesis...

    • by Livius ( 318358 )

      But this is about Wikidata.

      • But this is about Wikidata.

        Yes, and my WordPress blog has an underlying database, too.

        • Just read the fucking article this time. I was going to select bits to quote for you, but honestly either you are deliberately misunderstanding or don't possess the reading skills to make sense of words.

          Have someone read it to you, and ask them to use small words if that helps.

          This is not related to Wikipedia, aka "The encyclopedia that like 12 friends of Jimbo can edit" except in that special way where you are related to any random-ass member of humanity.

    • Re: (Score:3, Insightful)

      by Anonymous Coward

      Put the pipe down, bro. Pointing out the obvious is cool and all, but kinda OT in this case.

      In case the room was too smokey to see your screen properly - from TFA; "...would expand Wikidata's scope by developing it as a science hub. The proposal, supported by more than 25 volunteers and half a dozen European institutions as project partners, aims to create a virtual research environment (VRE) that will enhance the project's capacity for freely sharing scientific data."

      We're not talking about wikipedia, but

      • [...] and to make all this available to everyone, without any restrictions on use and reuse.

        The fundamental problem remains, however. Even if scientists curate the data honestly and comprehensively, what's to stop people from taking the material, editing/changing it, and publishing/claiming their version is correct? The only way to protect against this is to make the data read-only downstream, eg only credentialed scientists will get to create or modify data - and that's a pretty fundamental restriction

        • Right now, except for obscure articles, you can't edit without getting reverted because of article "ownership" by some "in" editor or an Admin. Oh sure, you can correct bad spelling or grammar, but don't go beyond that without permission of the article "owner".

          If you want to go to the model of locked down and edited by professionals, there is an encyclopedia that has been around for a long time that works that way.

    • by Anonymous Coward

      Haters gonna hate. That doesn't mean they're not stupid though. Get a grip. Wikipedia is an invaluable reference.

  • I wish them good luck with the EU grant procedure. The procedures are such a maze that usually EU grant experts are required.

    • Re:EU grant (Score:4, Interesting)

      by JanneM ( 7445 ) on Saturday January 03, 2015 @10:35PM (#48728257) Homepage

      They have four partner universities and several other research institutions, most or all of whom already have one or more full-time staff dedicated to helping projects with their grant application process.

      Yes, EU grant applications are big and cumbersome - though the payoff is commensurate - but the process is not going to be the main hurdle. With all the available expertise at their disposal, if they can't navigate the application process then they're unlikely to successfully steer a major project over several years either.

  • by Anonymous Coward

    Wikipedia started out as a web site where volunteers could edit articles before their entry into the Nupedia website. Nupedia is now dead. Wikipedia has been engaging in bigger fund raising drives, and has more paid employees. Now it is trying to do more stuff to justify those extra employees, just like when Wikipedia spent a bunch of money trying to develop better Wikipedia page editing software. I bet the heads of Wikipedia now have bigger salaries.

    I would just like the number of humans maintaining wikipedia to

  • by Nutria ( 679911 ) on Saturday January 03, 2015 @09:35PM (#48728055)

    I can't be the only one who thinks that is a terribly bad idea... It would rip the guts right out of repeatability, and confidence that "this" is what $RESEARCHER found.

    • Give it a chance (Score:5, Insightful)

      by Okian Warrior ( 537106 ) on Saturday January 03, 2015 @10:25PM (#48728215) Homepage Journal

      I can't be the only one who thinks that is a terribly bad idea...

      When I first heard about Wikipedia and the theory driving it I thought it was a terribly bad idea at the time... but ya know, I find it really useful. It's got lots of problems but on balance it's a lot more useful than problematic.

      We've identified many deep problems with scientific research on this very forum, and to my knowledge little progress has been made over the last decade.

      Can't we at least *try* different solutions?

      Where is it written(*) that the old ways are the best?

      (*) The script to Skyfall of course. I got that from Wikiquote [wikiquote.org].

      • by Nutria ( 679911 )

        We've identified many deep problems with scientific research on this very forum

        Most revolving around laziness and academic corruption. Allowing data (for example: historical weather gauge readings, or IQ scores, or any other data having to do with hot-button topics) to be edited is an invitation to socio-political fraud on an unheard-of scale.

    • by ranton ( 36917 ) on Saturday January 03, 2015 @11:27PM (#48728389)

      I can't be the only one who thinks that is a terribly bad idea... It would rip the guts right out of repeatability, and confidence that "this" is what $RESEARCHER found.

      Um, have you never heard of versioning? It would be pretty trivial to add the statement "Used the XXX v3.5.1 dataset to perform these calculations" to your research paper.

      • by Nutria ( 679911 )

        (1) Wikidata would either have to keep (many) multiple copies of possibly quite large data sets, or keep diffs. How much of a strain does it put on a busy server to generate a dataset from a huge original and lots of large diffs?

        (2) Not too many people pay attention to Wikipedia changelogs. If only the current form of the data is easily visible, that's what most people -- especially amateurs and those with political motivations -- will use.

        • by ranton ( 36917 )

          Wikidata would either have to keep (many) multiple copies of possibly quite large data sets, or keep diffs. How much of a strain does it put on a busy server to generate a dataset from a huge original and lots of large diffs?

          First off none of the problems you list are unmanageable; they just make it more expensive and more difficult to design. One technique could be to only store data sets from published papers. Versions cited in published papers and the latest data will be the ones most frequently accessed, and all other versions could be handled with diffs. They may even decide to only use diffs, but keep track of which versions are most frequently downloaded and store them in full. There are many more ways they could archite
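The storage scheme described in the comment above (keep diffs for most versions, but store frequently requested versions in full) can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything Wikidata is known to implement: the class `VersionedDataset`, its methods, and the `snapshot_after` threshold are all hypothetical names, and a dataset is modelled as a plain dictionary of records.

```python
from collections import Counter


class VersionedDataset:
    """Hypothetical store: a full base snapshot plus one diff per later version."""

    def __init__(self, base, snapshot_after=3):
        self._diffs = [None]               # version 0 is the base, so it has no diff
        self._snapshots = {0: dict(base)}  # versions that are stored in full
        self._hits = Counter()             # how often each version was requested
        self._snapshot_after = snapshot_after  # assumed popularity threshold

    def commit(self, changed=None, removed=()):
        """Record a new version as a diff against the latest one; return its number."""
        self._diffs.append((dict(changed or {}), tuple(removed)))
        return len(self._diffs) - 1

    def checkout(self, version):
        """Rebuild a version from the nearest earlier snapshot plus the later diffs."""
        if not 0 <= version < len(self._diffs):
            raise KeyError(version)
        start = max(v for v in self._snapshots if v <= version)
        data = dict(self._snapshots[start])
        for v in range(start + 1, version + 1):
            changed, removed = self._diffs[v]
            for key in removed:
                data.pop(key, None)
            data.update(changed)
        # Versions that are requested often get promoted to a full snapshot.
        self._hits[version] += 1
        if self._hits[version] >= self._snapshot_after:
            self._snapshots[version] = dict(data)
        return data

    def is_snapshotted(self, version):
        """Report whether a version is currently stored in full."""
        return version in self._snapshots
```

A paper that cites a specific version number would then get the same records from `checkout` no matter how many later edits exist, and the cost of replaying diffs shrinks for the versions people actually request.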

          • by Nutria ( 679911 )

            holding back new research tools because amateurs and politically motivated groups could misuse them is very scary indeed.

            An analogy: we hold back guns from four-year-olds -- even when we show it to them and say, "Very dangerous! Never touch!" -- but not from legally competent adults; when said four-year-old gets his hands on a gun, bad things can happen.

            Likewise, we should not hold back *copies* of data from the world. However, so as to protect the "chain of provenance", edit privileges should be limited in some way, so as to prevent abuse by sock puppets and the anonymous. Maybe something as simple as requiring editors to l

      • I can't be the only one who thinks that is a terribly bad idea... It would rip the guts right out of repeatability, and confidence that "this" is what $RESEARCHER found.

        Um, have you never heard of versioning? It would be pretty trivial to add the statement "Used the XXX v3.5.1 dataset to perform these calculations" to your research paper.

        Versioning only ensures that anyone who subsequently performs the calculations will reach the same result - it does not verify the data is complete or correct. That's the

        • by khallow ( 566160 )

          Versioning only ensures that anyone who subsequently performs the calculations will reach the same result - it does not verify the data is complete or correct.

          Repeatability was exactly the concern addressed. Having said that, one key difference appears to be that the data is just dumped rather than interpreted. That particular version isn't going to become more or less correct and complete just because my sock puppet army is at work.

          And what really can or should a content management system do here to verify correctness and completeness? I think the original insistence on repeatability is precisely because completeness and correctness is a hard problem beyond t

        • by dkf ( 304284 )

          Versioning only ensures that anyone who subsequently performs the calculations will reach the same result - it does not verify the data is complete or correct.

          Nothing much ensures that the data is complete or correct now either, other than peer review over a long period of time by people who are wholly unconnected with the original work (and its funding). In fact, in some sciences you're not going to get complete data in a public venue anyway (some sciences work with data that in raw form can identify individual people; think medical research). Correctness is hard to evaluate; what does it even mean for raw data in the first place?

          But keeping versioned data does

  • by Mikkeles ( 698461 ) on Saturday January 03, 2015 @09:53PM (#48728117)

    Congratulations! You are the one-millionth user to log into our system. If there's anything special we can do for you, anything at all, don't hesitate to ask!

    I want no Beta, 15 mod points per day, and a pony!

  • I and a co-author pitched this notion in 2006. We had pitched it as a smaller element of a "research match-maker" idea. And, man, were the academics violently opposed. No one saw value in the work and most felt either directly threatened or otherwise unsure how to objectively gauge the value of the contribution with author name and affiliation removed. It was depressing.
  • Most of the commenters here did not even bother to catch that difference. RTFA, folks.
  • The main problem with scientific data is retention. Often the results are kept, but the data that led to the results is long lost. Even 5 years later, it's hard to find the data. There is a reason for this: there's a lot! Regardless of their database size, most particle physics experiments can fill it in less than a day. It's not technologically feasible to gather the information into one system, at our current level of technology.

    While wikipedia has editing and flame wars problems, this project wou

  • This is what the Texas Digital Library [tdl.org] aims to do. Though it's not quite one big wiki, it actually is a push to archive and collaborate using various data types and formats.
  • Is this kind of like the research database that Geordi can access (and update) with his tablet on Star Trek?