AI News

Are 625 Pixels Enough To Identify Sex?

mikejuk writes "A Spanish research team has patented a video camera and algorithm that can tell the difference between males and females from just a 25x25 pixel image. This means there is enough information in such low-resolution images to do the job! They also demonstrate that an old AI method, linear discriminant analysis, is as good as, and sometimes better than, more trendy methods such as Support Vector Machines..."
  • SVMs vs. LDA (Score:5, Informative)

    by hoytak ( 1148181 ) on Saturday April 16, 2011 @10:34PM (#35844586) Homepage

    The algorithm is also interesting in that it proves that an older and fundamental pattern recognition technique, linear discriminant analysis, is just as good as the more trendy Support Vector Machines if used correctly, and much more efficient.

    A bit of clarity might be useful here. Support vector machines use linear discriminants as the central part of the algorithm. These linear discriminants (simply hyperplanes separating two regions) are defined by a subset of the data points, called the support vectors. The other key part of an SVM is that it implicitly projects the data into a high-dimensional space in which a hyperplane can appear as a curve or other shape back in the original space. That higher-dimensional space is never built explicitly; a kernel function computed between pairs of data points supplies the inner products (similarities) in it.
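
    Here's a toy sketch of that distinction with scikit-learn (nothing to do with the paper's actual code, and the data is made up): points that no hyperplane in the original plane can separate become separable once a kernel does the implicit projection.

        from sklearn.datasets import make_moons
        from sklearn.svm import SVC

        # Two interleaving half-moons: no single hyperplane in the plane separates them.
        X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

        # A linear-kernel SVM is exactly a linear discriminant: one hyperplane.
        linear = SVC(kernel="linear").fit(X, y)

        # An RBF kernel implicitly maps the points into a high-dimensional space;
        # the hyperplane found there traces a curve back in the original 2-D plane.
        rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

        print("linear kernel, training accuracy: %.3f" % linear.score(X, y))
        print("RBF kernel,    training accuracy: %.3f" % rbf.score(X, y))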

    The net result of all this is that, with a flexible enough kernel, the SVM's hypothesis space contains every linear discriminant of the original features, so an SVM is pretty much guaranteed to match or beat a simple linear discriminant in training error. But it can be slower, and it can overfit, so that advantage need not carry over to unseen data.

    So what's going on here? Linear discriminant analysis is an old statistical technique (1930s) that chooses a hyperplane from distributional assumptions about the two classes. Projecting the data onto the normal of this hyperplane collapses each class into a simple histogram, as shown in the picture in the article. It's used all over in statistics, and it works very well when the two classes really are Gaussian with a shared covariance (that's what the theory assumes).
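
    To make the projection-to-histogram picture concrete, here's a toy sketch using scikit-learn's LDA on synthetic two-Gaussian data (not the paper's 25x25 face images):

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        rng = np.random.default_rng(0)

        # Two Gaussian classes with the same covariance -- exactly what LDA assumes.
        X = np.vstack([rng.normal([0, 0], 1.0, size=(300, 2)),
                       rng.normal([3, 2], 1.0, size=(300, 2))])
        y = np.repeat([0, 1], 300)

        lda = LinearDiscriminantAnalysis().fit(X, y)

        # Project every point onto the normal of the separating hyperplane;
        # each class collapses to a 1-D cluster, like the histogram in the article.
        z = lda.transform(X).ravel()
        print("class 0 projections: mean %+.2f, std %.2f" % (z[y == 0].mean(), z[y == 0].std()))
        print("class 1 projections: mean %+.2f, std %.2f" % (z[y == 1].mean(), z[y == 1].std()))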

    Thus the reason it works well here is that they've managed to transform their data so that the two classes look like that sort of distribution. That's the insight, not the choice of classifier. When the simplest adequate model fits, more complex techniques tend to overfit, meaning they train on noise instead of the underlying structure of the data.
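
    Again purely illustrative, on synthetic data: when the classes are close to LDA's assumptions, a deliberately wild RBF SVM aces the training set and then gives it all back under cross-validation.

        from sklearn.datasets import make_classification
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        # Two overlapping, roughly Gaussian classes: the simple model's home turf.
        X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                                   n_clusters_per_class=1, class_sep=1.0,
                                   random_state=0)

        lda = LinearDiscriminantAnalysis()
        svm = SVC(kernel="rbf", gamma=10.0)   # deliberately overfit-prone gamma

        for name, clf in [("LDA", lda), ("RBF SVM", svm)]:
            train_acc = clf.fit(X, y).score(X, y)
            cv_acc = cross_val_score(clf, X, y, cv=5).mean()
            print("%-8s train=%.3f  5-fold CV=%.3f" % (name, train_acc, cv_acc))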
