
Statistics Losing Ground To CS, Losing Image Among Students

Posted by Unknown Lamer
from the big-bad-data dept.
theodp (442580) writes: Unless some things change, UC Davis Prof. Norman Matloff worries that statisticians could be added to the endangered species list. "The American Statistical Association (ASA) leadership, and many in Statistics academia," writes Matloff, "have been undergoing a period of angst the last few years. They worry that the field of Statistics is headed for a future of reduced national influence and importance, with the feeling that: [1] The field is to a large extent being usurped by other disciplines, notably Computer Science (CS). [2] Efforts to make the field attractive to students have largely been unsuccessful."

Matloff, who has a foot in both the Statistics and CS camps, says, "The problem is not that CS people are doing Statistics, but rather that they are doing it poorly. Generally the quality of CS work in Stat is weak. It is not a problem of quality of the researchers themselves; indeed, many of them are very highly talented. Instead, there are a number of systemic reasons for this, structural problems with the CS research 'business model'." So, can Statistics be made more attractive to students? "Here is something that actually can be fixed reasonably simply," suggests no-fan-of-TI-83-pocket-calculators-as-a-computational-vehicle Matloff. "If I had my druthers, I would simply ban AP Stat, and actually, I am one of those people who would do away with the entire AP program. Obviously, there are too many deeply entrenched interests for this to happen, but one thing that can be done for AP Stat is to switch its computational vehicle to R."
  • by sinij (911942) on Wednesday August 27, 2014 @07:58AM (#47764227) Journal
    Statistical analysis is now more complex, and statistics are better understood in science than they were a decade ago. There are a number of software packages and libraries that simplify and standardize techniques.

    Correctly applying all of these requires subject matter expertise. You need to understand what you are analyzing. As a result, a pure statistician is not very useful: generic analysis can be performed by software, and in-depth analysis requires specific knowledge.

    This is not unlike complaining that assembly coding is dying. Well, yes, we now have less need to code everything that way because we have better tools.
  • by BorisSkratchunkov (642046) on Wednesday August 27, 2014 @08:02AM (#47764273) Journal
    Most notably psychology, economics, mathematics and beer brewing. In fact, most of the developments in stats have come about as a result of a need arising in a different discipline. Stats is inherently an applied discipline, so this is not unusual.

    What is concerning is how many statistical tools, each with their own set of assumptions, have blossomed up within the past few decades. There are so many stats now that stats can no longer be an ancillary to other disciplines- it needs to be given its own space and statisticians need to be given respect for their unique expertise. There is simply too much knowledge in that domain for those in more theory-driven fields to be able to claim both expertise in the conceptual models of their fields and statistics.
  • by Anonymous Coward on Wednesday August 27, 2014 @08:07AM (#47764295)

    Machine learning is one example given in the article. This is a blatant attack on all CS students, researchers, and professors.

    Let’s consider the CS issue first. Recently a number of new terms have arisen, such as data science, Big Data, and analytics, and the popularity of the term machine learning has grown rapidly.

    He seems not to really know CS. Statistics and probability have been tools of CS since its very inception. This is nothing new.

  • by Anonymous Coward on Wednesday August 27, 2014 @08:24AM (#47764405)

    Correctly applying all of these requires subject matter expertise. You need to understand what you are analyzing. As a result, a pure statistician is not very useful: generic analysis can be performed by software, and in-depth analysis requires specific knowledge.

    From my experience, statisticians tend to be far more successful at acquiring subject matter expertise than people in other fields are at using proper statistical procedures for their problems.

    It's like saying mathematicians are not useful because calculators exist. It's simply not true, and while software can perform generic analysis, that is only a tiny part of doing a statistical problem correctly. What we have now are coders who think that computers can set up and interpret their problems correctly, and thus we have an increase in bad results.

  • by wisnoskij (1206448) on Wednesday August 27, 2014 @08:48AM (#47764569) Homepage
    I completely disagree. Pretty much everyone is complete shit at statistics. It is a very advanced and unique field that is continually and horribly bungled by scientists and everyone else. We need statisticians; that said, I cannot imagine anyone wanting to go into stats.
  • by sinij (911942) on Wednesday August 27, 2014 @08:55AM (#47764613) Journal
    The following is anecdotal, but when someone I knew approached multiple statisticians with a model question (related to repeated measures), the understanding of the concept was not there. If your view is that "everyone is complete shit at statistics," that should include statisticians.
  • by DoofusOfDeath (636671) on Wednesday August 27, 2014 @09:05AM (#47764673)

    I'm not very trained in statistics, but I've read more than my fair share of academic computer science papers over the years.

    Even with my limited training in statistics, I've known enough to be appalled by the errant statistical reasoning used, or even not used. E.g., "We don't know how many times to run a program to get a 'valid' average running time, so we ran it three times. Here's the average: ..." The authors don't just seem ignorant of how to get the answer; they often seem not to have thought through what questions they're trying to answer in the first place with their measurements and resulting statistics.
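    To make the three-runs complaint concrete: with only n = 3 timings, even a textbook Student's-t confidence interval for the mean running time is enormous, which is exactly why reporting a bare average is misleading. A minimal Python sketch; the timing values and helper function are invented for illustration, not taken from any paper:

```python
import statistics

def mean_ci95(samples, t_crit):
    """Mean and 95% t-interval for it (t_crit = t_{0.975, n-1})."""
    n = len(samples)
    m = statistics.mean(samples)
    se = statistics.stdev(samples) / n ** 0.5  # standard error of the mean
    return m, m - t_crit * se, m + t_crit * se

# Hypothetical running times in seconds from just three runs.
runs = [1.02, 1.31, 0.94]
m, lo, hi = mean_ci95(runs, t_crit=4.303)  # t_{0.975, df=2} ~ 4.303
print(f"mean={m:.2f}s, 95% CI=({lo:.2f}s, {hi:.2f}s)")
```

    With these numbers the interval spans roughly a full second around a ~1.09 s mean, so the "average of three runs" tells you almost nothing on its own.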

    I think a few problems come into play here:

    • The mathematics of statistics can be hard.
    • Thinking through the meanings of statistics requires careful thought, especially for experimental design and/or system performance characterization. Many CS practitioners would prefer to not invest mental energy in this aspect of their work because they don't enjoy it; it's a distraction to what they want to do.
    • Because so many people in CS are bad at statistics, peer reviewers tend to let it slide. This helps foster a culture problem. If I'm up against a deadline to get a paper published, do I take an extra 20 hours to get the statistics right? Especially knowing that I'm judged by the number of published papers, and that the peer reviewers won't notice or care about poor statistical reasoning?
    • It's easy to make statistical reasoning errors without noticing, especially if you're not surrounded by statisticians.

    Despite CS majors thinking we're so smart about mathematical issues, I think this might be one area where that confidence is delusional. I suspect most psychology majors who paid attention in their Experimental Design courses are more capable in the appropriate mathematics than are most CS majors.

  • by u38cg (607297) <calum@callingthetune.co.uk> on Wednesday August 27, 2014 @09:50AM (#47765031) Homepage
    Get real. Anyone doing "statistics" who doesn't understand the concept of a prior is just pretending to do statistics. That is a problem.
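    For anyone unsure what "a prior" is: it's the distribution you assign to a parameter before seeing data, which Bayes' rule then updates into a posterior. The standard textbook illustration is the conjugate Beta-Binomial update, sketched here in Python with made-up counts:

```python
def update_beta(a, b, k, n):
    """Conjugate Beta-Binomial update: prior Beta(a, b),
    observe k successes in n trials, return posterior parameters."""
    return a + k, b + (n - k)

a, b = 2, 2                                  # weakly informative prior centred on 0.5
a_post, b_post = update_beta(a, b, k=7, n=10)
posterior_mean = a_post / (a_post + b_post)  # (2 + 7) / (4 + 10) = 9/14
print(posterior_mean)
```

    The prior's pseudo-counts (2, 2) shrink the raw estimate 7/10 toward 0.5, which is the whole point: with little data the prior matters a lot, and with lots of data it washes out.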
  • Re:Not surprised (Score:4, Insightful)

    by u38cg (607297) <calum@callingthetune.co.uk> on Wednesday August 27, 2014 @09:59AM (#47765121) Homepage
    >> Take one set of data and produce two diametrically opposed answers and have them both correct?

    You missed the point of the lesson. The point was that you didn't have enough data to demonstrate that your model was valid. That's all.
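    The point generalizes: with too little data, very different models can fit the observations equally well and only diverge out of sample, so the data alone cannot validate either one. A toy Python sketch with two invented data points:

```python
def linear(x):       # y = x      passes through (0, 0) and (1, 1)
    return x

def quadratic(x):    # y = x**2   also passes through (0, 0) and (1, 1)
    return x ** 2

data = [(0, 0), (1, 1)]
# Both models fit the observed data perfectly...
assert all(linear(x) == y and quadratic(x) == y for x, y in data)

# ...yet disagree badly once you extrapolate beyond it.
print(linear(4), quadratic(4))  # 4 16
```

    Two "diametrically opposed answers" from one data set is not a failure of statistics; it's the data set failing to discriminate between the models.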

  • Re:Not surprised (Score:4, Insightful)

    by ColdWetDog (752185) on Wednesday August 27, 2014 @10:05AM (#47765183) Homepage

    Take one set of data and produce two diametrically opposed answers and have them both correct? Sounds like rumor, gossip, and BS to me, not science.
    No wonder there are lies, damn lies, and statistics!

    Somebody missed the lecture on assumptions.
