All Researchers To Be Allocated Unique IDs
ananyo writes with information on a new scheme to help uniquely identify authors in the face of ambiguous names. From the article: "In 2011, Y. Wang was the world's most prolific author of scientific publications, with 3,926 to their name — a rate of more than 10 per day. Never heard of them? That's because they are a mixture of many different Y. Wangs, each indistinguishable in the scholarly record. The launch later this year of the Open Researcher and Contributor ID (ORCID), an identifier system that will distinguish between authors who share the same name, could soon solve the problem, allowing research papers to be associated correctly with their true author. Instead of filling out personal details on countless electronic forms associated with submitting papers or applying for grants, a researcher could also simply type in his or her ORCID number. Various fields would be completed automatically by pulling in data from other authorized sources, such as databases of papers, citations, grants and contact details. ORCID does not intend to offer such services itself; the idea is that other organizations will use the open-access ORCID database to build their own services."
Unique IDs eh? (Score:5, Funny)
Hmm. A new program to uniquely track and identify scientists springs up in the middle of an all out war between science and the idiocracy. Totally coincidental. *adjusts tin foil hat*
Re: (Score:2)
No armbands. They'll just be identified by their pieces of flair.
Re: (Score:2)
Great News for Virginia! (Score:5, Interesting)
Hmm. A new program to uniquely track and identify scientists springs up in the middle of an all out war between science and the idiocracy. Totally coincidental. *adjusts tin foil hat*
No need to adjust your tinfoil hat. I read this article and thought "Oh, great, now Virginia's Attorney General can conduct more accurate witch hunts [slashdot.org]." (he was unable to properly identify over 30 scientists and researchers)
Re: (Score:2)
Alex Jones, is that you?
Is Y. Wang the inventor of Wang computers? I hear women in offices love their Wangs.
Why not solve this problem by just using the full name?
Oh well. (shrug)
Re: (Score:2)
Full names are not necessarily unique either.
Re: (Score:2)
Full names are not necessarily unique either.
Indeed. A few years ago, I ran across a US Census Bureau web page that gave the number of people with specific first or last names, and an estimate (likely from multiplying the fractions) of the number of people in the US with a given first+last name. It said that there are about 1800 people in the US with the same name as me, and my family name isn't even Smith or Jones or any of the other top 100.
Through the years, I've seen a number of bibliographies that list things that I've written, intermixed wi
Re: (Score:2)
There is 1 person with my name in the U.S.A.
So how many people are there with your name in China? ;-)
Actually, my wife also has a name that's unique in the US, and probably the world. But more fun is that right now it gets over a million hits on google. This is in part because her name is an English sentence. But the top hits are because her name was also a widespread news headline in 1822 in most of the English-speaking world. That should be enough of a clue to figure it out.
There are lots of fun obscure facts about names.
Re: (Score:2)
Or constant, for that matter. Even in academia, some people (mostly women) still change their names when they marry, which can add to the confusion. Imagine tracking all the papers by Mary Jane Smith nee Jones. Having a unique personal ID would solve the changing name problem as well as the non-unique name problem.
Re:Unique IDs eh? (Score:4)
. . .some people (mostly women) still change their names. . .
. . .not to mention the difficulties faced by, e.g., Lynn Conway [wikipedia.org]. Being able to generate a new identity for oneself can have advantages [umich.edu].
Lynn was fired, and forced to leave the field of computer science/engineering after telling her bosses at IBM that she was to undergo sex reassignment surgery in 1968. She could re-enter the field only because she could create a new identity (this time as a woman), starting a new career all over again at the bottom of the ladder.
Who is this person, you may ask? As a man, in the 1960s, she invented processor multiple-issue out-of-order dynamic instruction scheduling. After her transition? Oh, nothing . . . only co-authoring (with Carver Mead [wikipedia.org]) Introduction to VLSI Systems which, by promoting the use of standard cells, automated design tools, and silicon foundry services (e.g., MOSIS [wikipedia.org]), revolutionized the field of digital integrated circuits. Virtually every digital chip today is designed in this way; and there are many in the field who cannot conceive of any other way to do design ("Wasn't it always done this way?").
If Lynn could not have generated a new identity and re-entered the field as she did, these critical advances may have been delayed for years.
Re:Unique IDs eh? (Score:5, Insightful)
Why not solve this problem by just using the full name?
Because it wouldn't solve the problem at all. There are many researchers with the exact same full name. One reason we have Social Security numbers in the US is because full names have a strong tendency to be similar.
That said, I'm sure the Wangs can come up with a solution. huh-huh...
Re: (Score:2)
Completely correct, but this would do even less to solve the problem in China. China has had a problem with a lack of unique full names for quite some time. According to this [hindu.com], there's 100,000 people named Wang Tao. I imagine that at least a few of them are in similar fields. There's a pretty simple explanation. [wikipedia.org] Basically, the 100 most common surnames are used by 85% of the population. There's only between 3000-4000 surnames currently being used at all. Compare that to the United States, which has well over 100,000 surnames in common use.
True- thank you for the informative links. To complete the picture in the US, here's the US Census data about surnames from the 2000 census [census.gov].
Re: (Score:3)
They should ask Hollywood celebrities for advice on broadening their pool of names for babies. They come up with all sorts of fascinating names.
Re: (Score:3)
"They should ask Hollywood celebrities for advice ..."
Don't, they take perfectly unique names like 'Thomas Cruise Mapother IV' and change them to 'Tom Cruise'.
Re: (Score:2)
Re:Unique IDs eh? (Score:4, Interesting)
Assigning UIDs to researchers, to resolve ambiguity in publications and attribution?
This sounds like a new twist on the old "Your papers, please" .
Re:Unique IDs eh? (Score:4, Insightful)
Is it any different from an Email address or a bank account number?
If you have gone to the effort to research, write and publish a paper the last thing you want is for people not to know who or where you are.
To make it really useful, you should be able to register as an independent researcher and take it with you wherever you go.
The only downside is that it might become like the Chinese record of achievement.
Re: (Score:3)
Do I REALLY need to stick "smileys" in every damn comment? :-) :-) :-)
Get it? twist on "Papers".
Subtle wordplay dies, when subjected to explanation.
Re: (Score:2)
I share my store cards with my girlfriend. God knows what Tesco's data mining algorithms think of me.
Could be tricky collecting the Nobel prize though.
Re:Unique IDs eh? (Score:4, Funny)
"GUID, bitte."
Re: (Score:2)
Except before they didn't even need an ID.
Re: (Score:3)
This sounds like a new twist on the old "Your peer-reviewed papers, please" .
FTFY.
Re: (Score:2)
Scientists are at war with a movie?
This post is one of the most brilliant illustrations of Poe's Law that I've ever seen. What scares me is that I can't tell if the brilliance is intentional or not.
Re: (Score:2)
Weird. I did a search for Poe's law on bing and it came up with zero results. I can't remember the last time I did a search and got zero results.
Re: (Score:2)
There are many similar systems... (Score:5, Interesting)
Re: (Score:3)
Individual researchers will be able to get an ORCID number for free as of later this year, whereas universities, companies and other organizations will pay tiered-subscription charges. So far, the scheme has been sustained by members working in kind, as well as by donations of US$574,000 and loans of $1.2 million. Once membership fees begin flowing, they are expected to raise $2.5 million each year.
With $1.2 million of debt already, and with the expectation that ORCID will become a tiered paid subscription service, I don't see any reason why anyone would want to use ORCID instead of researcherID [researchid.com]. What will happen to your ID if ORCID goes bankrupt?
Also, what happens when you're a prolific researcher in two different fields of study that usually do not mix very well? Can the new system assign you two different IDs?
16-digit ID (Score:4, Insightful)
I'm so glad they made the ID a fixed length 16-digit number. Experience shows that we are very good at predicting the total number of IDs ever to be needed.
Plus 54 bits should be more than enough, so no need to make the number extensible, thus wasting one precious bit as a field extension identifier.
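The 54-bit figure can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, and it treats the ID as a plain 16-digit decimal number, as the comment above does.

```python
import math

DIGITS = 16  # ID length quoted in the comment above

def bits_needed(decimal_digits: int) -> int:
    """Smallest number of bits that can hold every value with this many decimal digits."""
    largest = 10 ** decimal_digits - 1
    return largest.bit_length()

print(bits_needed(DIGITS))      # whole bits required for a 16-digit number
print(DIGITS * math.log2(10))   # the same figure before rounding up
```

Since 2^53 is a little over 9 × 10^15, it falls just short of the largest 16-digit value, so one more bit is needed.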
Re: (Score:2)
"How big of a population explosion are you expecting?"
It depends if T Wang's Cloning project is a success.
It could be a massive explosion.
Re:16-digit ID (Score:4, Insightful)
Re: (Score:2)
The number of researchers, scientists and engineers is going down, not up.
In terms of proportion of population, that may be true; in terms of absolute numbers, I'm pretty sure it's not. The number of papers published continues to grow at a more-or-less exponential rate, and while it's true that the "publish or perish" mentality forces researchers to have their names on more papers now than ever before (which is easier than it used to be, because author lists are also getting longer; it's not unusual for papers in biology to have ten or more authors listed) I have a hard time bel
Re: (Score:2)
which is easier than it used to be, because author lists are also getting longer; it's not unusual for papers in biology to have ten or more authors listed
So true. Here's [nature.com] a paper with 27 authors listed.... here's [nature.com] another one with a whopping 80 authors!
Re: (Score:2)
A sum of numbers that are decreasing may still be infinite.
Re: (Score:2)
If this program runs long enough for this to be an issue then it would have to be very very successful for many hundreds of years.
Re: (Score:2)
With a 16 digit number, it would be successful for at least 100,000,000 years, assuming EVERYONE was a researcher....
With a more realistic estimate of the number of researchers, the 16 digit number ought to be good for well beyond the lifetime of the Universe....
Re: (Score:3)
If the Machines take over and use researchers as power cells with their unique 16-digit number to identify them, each researcher could take up 1.5 square feet and they would still run out of land area on Earth before they ran out of IDs.
Re: (Score:2)
Are you predicting that there are going to be 10^16 scientists anytime soon? If they are all on Earth, that would be just less than 20 scientists per square meter (or about 65 scientists per square meter if you just count land area).
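A rough check of those densities, as a sketch only. The two area constants are assumed round figures for Earth's total surface and land area, not numbers taken from the thread.

```python
ID_SPACE = 10 ** 16          # every possible 16-digit number
EARTH_SURFACE_M2 = 5.1e14    # assumed: roughly 510 million km^2 in total
EARTH_LAND_M2 = 1.49e14      # assumed: roughly 149 million km^2 of land

def per_square_meter(count: float, area_m2: float) -> float:
    """Average number of people per square meter if `count` people share `area_m2`."""
    return count / area_m2

print(round(per_square_meter(ID_SPACE, EARTH_SURFACE_M2), 1))
print(round(per_square_meter(ID_SPACE, EARTH_LAND_M2), 1))
```

With these assumed areas the results land close to the "just less than 20" and "about 65" quoted above.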
Re: (Score:2)
Well, scientists have a tendency to die after 100 years or less, so it is possible to cram 20 or more on a square meter when they are under ground and made of ash :)
Re: (Score:2)
9 999 999 999 999 999
I have no idea what number that is. What comes after trillions? Anyway that's 10,000 trillion people that can be identified with this system..... more than the total number of humans that have ever lived on earth. More than the 40 billion that lived on Asimov's metal world/capital planet called Trantor. (Or when Lucas ripped it off: Coruscant.)
Re: (Score:2)
9 999 999 999 999 999
I have no idea what number that is. What comes after trillions?
It's called Quintillions
actually (in short/american count) it's quadrillions. (10e15 is ten quadrillion.)
and the only book that I've read that would even approach that would be Niven's Ringworld... and I'm sure that even that would fall short.
a ringworld as wide as the earth and at our orbit would have roughly 5 trillion square miles (~8000 miles * ~100e6 miles * 2 * pi) (inside) surface area.
10e15/5e12 is 2000 people per square mile, slightly less than bangladesh, and about 24x america -- feasible, if not terribly probable.
Ringworld itself is unlikely to have anything close to this "now", given what the Puppeteers did to it, but i suppose it might have back when th
Re: (Score:2)
You know, financial payment card numbers (credit card, debit card, etc) are 16 digits in length. The first 6 are special, as is the last, which means there are 9 unique ID digits in each. Yet we don't seem to be running out of numbers, even though when a bunch get "liberated" from a payment processor, most financial institutions simply re-issue a new number to you.
Re: (Score:2)
No, we're constantly running out of account numbers. It's such a common problem that several stop-gap solutions have become commonplace, which is why you never noticed.
Re: (Score:2)
hmm, 10^16 = 10,000,000,000 * 1,000,000
So enough researcher IDs for everyone on earth to get a new one every year for the next million years.... Somehow I suspect the system will be replaced by something else for completely unrelated reasons long before they start running out of available IDs.
Re: (Score:2)
There were equally convincing arguments when we chose seven digit phone numbers, 16 digit account numbers, and 32 bit IP numbers.
Yet we ended up running out of each one of those. The reason why is that once an identifier succeeds, its use gets extended beyond its original purpose. For example, phone numbers were supposed to be one per household, yet my household with only two adults has seven phone numbers attached to it.
Slightly lesser known (Score:5, Funny)
Re: (Score:2, Funny)
He is working on the new handshaking protocol, right?
public key (Score:5, Insightful)
Re: (Score:2)
The problem isn't attribution as much as it is cross referencing. Say I want to refer to Y. Wang's paper on network theory. I wouldn't use his signature in my paper (signatures are not searchable.)
Re: (Score:2)
Anyone can generate a public/private key, so we don't need an organization to manage (collect fees) the handing out of numbers. Or deciding who is a scientist and who deserves to get a number.
Attribution would be a nice bonus.
Re: (Score:2)
You could use his key fingerprint to reference him.
The problem with doing that kind of thing though is that over time people move to new keys. Either because they want a stronger key or because the old one is lost or compromised, and of course some people may deliberately create multiple personas for themselves (whether this is a good thing or a bad thing is a matter of opinion).
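To make the key-rollover point concrete, here is a minimal sketch. It assumes a "fingerprint" is just a SHA-256 hash of the exported key bytes; real key systems such as OpenPGP define their own fingerprint formats, and the key bytes below are placeholders, not real keys.

```python
import hashlib

def fingerprint(public_key_bytes: bytes) -> str:
    """Hex SHA-256 digest of the raw key bytes, used here as a stand-in author ID."""
    return hashlib.sha256(public_key_bytes).hexdigest()

# Placeholder bytes standing in for two exported public keys of one hypothetical author.
old_key = b"example public key, first version"
new_key = b"example public key, replacement"

print(fingerprint(old_key)[:16])
print(fingerprint(new_key)[:16])
# The comparison is False: replacing the key changes the identifier.
print(fingerprint(old_key) == fingerprint(new_key))
```

Any scheme built this way would need some extra record linking an author's old fingerprints to the new one, which is the bookkeeping the comment above is worried about.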
Problem? (Score:2)
Re:Problem? (Score:5, Insightful)
Re: (Score:2)
A person's email address is fairly unlikely to be static over the span of their career. Especially in case of researchers who may very well move to other universities (in other countries).
Re: (Score:2)
Actually, there is. My wife is a PhD who studies stroke and epilepsy. There is another scientist in Germany who has the same first initial and last name who also studies stroke. He's been writing papers ~20 years longer than my wife has, but when you search for her papers, you get craploads(SAE standard measure) of his, as well.
Re: (Score:3)
Yes. And people switch institutions, and fields. At the moment, if someone has a common name, looking up their papers is an exercise in AI. With a unique identifier you'd be able to tell Google Scholar "get me all the other papers by this author in the last ten years."
Re:Problem? (Score:5, Insightful)
I am sure it happens, but (a) it seems unlikely that they would be in the same field
I have a few name collisions just in my own reference database (i.e., list of papers to which I've referred in my own work.) I can pretty much guarantee that if you look at the author lists for any major single-subject journal, you'll find a whole bunch of identical $FIRST_INITIAL $LAST_NAME entries which are not, in fact, the same people.
Hell, I have a pretty rare (in the US, at least) last name -- and occasionally I still get e-mails from people who think I'm the Daniel Dvorkin who wrote a paper on psoriasis in 1989. It's not entirely unreasonable, since my name appears on a couple of papers related to inflammatory disease, but I'm a grad student in Colorado, not a dermatologist in Pennsylvania ...
and (b) it seems even less likely that they would be at the same institution and (c) even less likely that their contact information would be the same so are there really cases where there is confusion over who wrote a paper
True enough, but people who are looking at author names are not necessarily looking at the entire paper (where contact information is usually given.) A related problem is that journal publications are increasingly subject to various kinds of text data mining, and rightly or wrongly, the format for fields like author institution and contact information isn't standardized from journal to journal -- and in academia, both institutions and e-mail addresses are subject to frequent change. If you published a paper five years ago while at the University of East Dakota and your e-mail in the corresponding author field was given as betterunixthanunix@eastdak.edu, and you're now at South Virginia State with the address butu@svs.edu, good luck getting any database to make that connection without human assistance.
Re: (Score:3)
We recently developed a web service for recommending papers, reviewers and journals out of the citations of a paper ( http://theadvisor.osu.edu/ [osu.edu] ). Conflicts in the names can be problematic. Many paper recommendation algorithms rely on the property that if two papers share the same authors, they must be somewhat related. Name conflicts lower the quality of that assumption. Though, some databases are already disambiguated. For instance DBLP adds an ID to the name in case there is more than one. (but it
Re: (Score:3)
Think of the birthday paradox [wikipedia.org]. It's quite unlikely that any given researcher has the same name as any other given researcher. But there are n!/(2(n-2)!) pairs to consider, which gets big really fast, so you have to adjust your expectations for multiple comparisons.
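The pair count and its effect can be sketched as follows. The collision estimate assumes names are drawn uniformly from a fixed pool, which real names are not (common names make collisions more likely), so treat the numbers as illustrative only.

```python
import math

def pairs(n: int) -> int:
    """Number of distinct researcher pairs: n! / (2 (n-2)!) = n(n-1)/2."""
    return math.comb(n, 2)

def p_some_collision(n: int, distinct_names: int) -> float:
    """Chance that at least two of n researchers share a name,
    assuming names are drawn uniformly from `distinct_names` options."""
    p_all_unique = 1.0
    for i in range(n):
        p_all_unique *= (distinct_names - i) / distinct_names
    return 1.0 - p_all_unique

print(pairs(1000))                        # pairs among 1,000 researchers
print(p_some_collision(1000, 100_000))    # chance of at least one shared name
```

Even with a hypothetical pool of 100,000 equally likely names, a field of 1,000 researchers is very likely to contain at least one clash.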
Re: (Score:2)
People move between institutions and their contact information changes. So if you insist on a match on all those fields you will reduce the chances of incorrectly indicating papers as being by the same author but you will increase the chance of incorrectly indicating papers as being by different authors.
Re: (Score:2)
The funny thing is, similar names seem to have similar levels of achievement. Perhaps parents were in similar social circles.
Re: (Score:2)
> Is there a serious problem with authors sharing names?
For the FIRST and LAST time, names are NOT unique; they are just a convenient label as history clearly shows: Henry I, Henry II, Henry III, ... Henry VIII, etc.
Unique numbers are the only proper way to solve this problem once and for all.
Re: (Score:2)
I have often been contacted via email by correspondents under the misguided impression that I am an Italian author and/or an Israeli public intellectual. I am not sure if these two are the same man, but I know from childhood that someone with my name once won some kind of prestigious prize for his fiction writing (was it a Pulitzer? I don't remember).
For the record, I'm the computer programmer by this name from Northeastern America.
What??!?!? (Score:2)
Re:What??!?!? (Score:4, Funny)
Oh great, another guy with the same name as me. No wonder no one knew I was the one that published on Creating Cold Fusion Using Two Matchsticks, A Fake Mustache, and a Left-Handed Monkey Wrench.
Make them all adopt unique names! (Score:5, Interesting)
The Writers Guild of America requires that all members have unique names. No two members may share a name, so as to prevent confusion. This is evident with David X. Cohen, well known as a writer for The Simpsons and Futurama. His real name is David S. Cohen but the Writers Guild of America already had a David S., so he took David X. Cohen.
France used to do that, to some extent (Score:3)
France used to require government approval for children's names when registering births. This was a francophone thing, not a uniqueness thing. But it could have been expanded to use a uniqueness check. Corporation and D/B/A names have to be unique within their jurisdiction.
Names in China used to be disambiguated by asking "What is your village?" This is no longer very helpful.
Re: (Score:2)
David H. Lawrence XVII, the guy who played the Puppetmaster on Heroes, had a similar problem. His Wikipedia entry [wikipedia.org] explains: "The 'XVII' in his name was a way for Lawrence to distinguish himself from previous David Lawrences already registered with SAG. At the time, he was the 17th David Lawrence listed on IMDB, and appended the number to his name upon his own registry."
Re: (Score:2)
Re:Make them all adopt unique names! (Score:4, Informative)
There are many, many more scientists than there are members of WGA.
Re: (Score:2)
I've always wondered why people name their kids names that already exist anyway. It's a nightmare for database administration to say the least and then on top of that I think the human race has just given up when I see this. I mean - at one point all the names that exist were new names that never existed before. People made them up. So when exactly did the creation of new names stop and why? Have we evolved imagination right out of our brains or just become lazy?
Re: (Score:2)
Well, fuck you very much!
Sincerely,
Stephen H. Foerster
Re: (Score:3)
Jesus H. Christ, do you have to be so belligerent?
Moot problem? (Score:4, Funny)
Overall, I thought having multiple researchers with the same name was a good thing.
Then we could each take credit for one another's work, and we'd all collectively be the biggest badass in science. It'd sure make research funding easier, in any case.
Google Scholar (Score:2)
Re: (Score:2)
Taking an interesting postion (Score:2)
Their position on this is interesting: we will create the method, but someone else will have to create the database and maintain it. What I see here is that they see a bag of worms when it comes to privacy issues and do not want to touch that part of it. If an issue results in some aspect of the collection of such information, ORCID’s only involvement will be the DB structure. They had better include some template or recommended best known practices on how a collection of this data should be hand
Re: (Score:2)
Their position on this is interesting: we will create the method, but someone else will have to create the database and maintain it. What I see here is that they see a bag of worms when it comes to privacy issues and do not want to touch that part of it. If an issue results in some aspect of the collection of such information, ORCID’s only involvement will be the DB structure. They had better include some template or recommended best known practices on how a collection of this data should be handled.
Creating it is one thing; operating such a creation should also be addressed before unintended consequences happen.
FWIW, I think privacy and its evil twin identity theft are probably issues that they shouldn't be solving since they are proposing some sort of cheezy author validated "password" system. They don't seem to have any way to address anything about keeping people from taking credit for publications of others with the same "name" (CV fraud), or publishing crap papers under someone's name to ruin their reputation, or other types of identity theft, so hopefully they aren't trying to do this. Although they are m
Mistaken Identity is a Common Problem in China (Score:2)
http://en.wikinews.org/wiki/Not_enough_names_to_go_around_in_China,_ministry_says [wikinews.org]
Why? (Score:2)
Why can't they just do "Researchername,DOB"?
If you have 20 researchers all named I.P. Freely and are all born on 12/13/1992 then I think there is a bigger problem here.
Re: (Score:2)
If you have 20 researchers all named I.P. Freely and are all born on 12/13/1992 then I think there is a bigger problem here.
Yeah, definitely a potential overflow problem.
Re: (Score:2)
Why can't they just do "Researchername,DOB"?
There have been numerous reports from many countries about duplicate government ID numbers [usatoday.com] due to schemes like this. There was a recent story about a similar case in Canada, with two people born the same day in the same hospital that were given identical names.
Yes, the probabilities are low, but they aren't zero. If the money has any legal or financial impact, duplicates inevitably lead to lawsuits, lost time, etc, etc.
If the ID number is important, you need to guarantee that two people don't get ass
huge for some students (Score:2)
Don't need a database (Score:2)
Authors can generate their own unique id: keyword UUID (http://en.wikipedia.org/wiki/Universally_unique_identifier)
So no central database is needed. Just some conventions.
S
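A self-generated ID along these lines might look like the sketch below. It uses a random (version 4) UUID; the function name is hypothetical, and nothing here ties the ID to a real person or stops one author from minting several.

```python
import uuid

def new_author_id() -> str:
    """Random (version 4) UUID, generated locally with no registry involved."""
    return str(uuid.uuid4())

a = new_author_id()
b = new_author_id()
print(a)
print(b)
```

Uniqueness comes from the size of the random space rather than from any central check, so the "conventions" would still have to cover how an ID gets attached to papers and what happens if an author loses track of theirs.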
Oh what a Great Idea. Exactly what we need. (Score:2)
ANOTHER Unique number!
Besides:
http://en.wikipedia.org/wiki/Virtual_International_Authority_File [wikipedia.org] , NDL Authorities, http://en.wikipedia.org/wiki/Library_of_Congress_Control_Number [wikipedia.org] , or http://en.wikipedia.org/wiki/Universal_Authority_File [wikipedia.org]
And these are only the ones I found assigned to a single author in Wikipedia.
Why not just use one of these?
Sounds familiar... (Score:2)
Re: (Score:2)
If you read a paper that you found interesting wouldn't you want a way to find more papers from that individual?
This isn't about statistics, it's about use and the confusion of names. It's about making your job easier if you are a researcher.
Re: (Score:2)
If you read a paper that you found interesting wouldn't you want a way to find more papers from that individual?
Papers generally have the authors' contact information; if you are really having problems just using a person's name and the field they are in, you can send them an email or visit their website.
Re: (Score:2)
People move institutions. Knowing that "John Smith" published with "Princeton University" as his affiliation 25 years ago will probably be enough to find out who it is, but it won't be trivial.
I suspect this is more useful for people trying to find well published people. You know a pharma company saying: "We want to find an oncologist who publishes a lot and gets cited a lot so we can convince them to do some trial work on our new drug and hence publish on it a lot. Find me the top 25 publishers on X in the last 10
Re: (Score:2)
Contact information changes, and frequently the paper only carries contact info for one author.
Re: (Score:2)
heh,
My name "Peter M Green" is on a paper that also has "Peter R Green" and "Peter N Green" in the authors list.
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5488046&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5488046 [ieee.org]