StackOverflow sees quite a few threads deleted, usually for good reasons. Among the stinkers, though, lies the occasionally useful or otherwise interesting one, deleted by some pedantic nitpicker - so I resurrect them. 👻
Note: Because these threads are older, info may be outdated and links may be dead. Feel free to contact me, but I may not update them... this is an archive after all.
Overwhelmed by Machine Learning - is there an ML101 book?
Question asked by StackUnderflow on Feb 28, 2009
It seems like there are so many subfields linked to Machine Learning. Is there a book or a blog that gives an overview of those different fields and what each of them does, maybe how to get started, and what background knowledge is required?
Comments
+1 good question. I would be interested in this as well – Erik Ahlswede Feb 28 '09 at 21:38
It's laughable how many good, useful questions are closed on SO. This question has 155 upvotes and 234 stars at the time of this writing, and the accepted answer has 153 upvotes. – weberc2 Oct 9 '14 at 20:25
If you're not into math and are into programming, I suggest you look at this: karpathy.github.io/neuralnets – Karl Morrison Apr 1 '15 at 4:34
Answer by Jeff Moser (Feb 28, 2009)
Here's the best description I've ever heard of Machine Learning:
Machine learning is actually a software method. It's a way to generate software. So, it uses statistics but it's fundamentally... it's almost like a compiler. You use data to produce programs. - John Platt, Distinguished Scientist at Microsoft Research in his Future of AI series talk (2:17:53)
Some even argue that "everything that algorithms was to computer science 15 years ago, machine learning is today."
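As a rough illustration of Platt's "you use data to produce programs" remark, here is a minimal sketch in Python. It is not taken from any of the talks; the data and the names `fit_line` and `predict` are made up for this example. The "program" that comes out is an ordinary function whose behavior was determined by the data rather than written by hand.

```python
def fit_line(xs, ys):
    """Learn a predict(x) function from (x, y) pairs using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x

    def predict(x):
        # The "generated program": its constants came from the data.
        return slope * x + intercept

    return predict


# Hypothetical training data that follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

predict = fit_line(xs, ys)
print(predict(10.0))
```

Feeding in different data would produce a different `predict` without changing any of the code, which is roughly the sense in which the quote compares machine learning to a compiler.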
For more details, I'd recommend starting out with a fun intro to what's possible such as Peter Norvig's Theorizing from Data talk, a peek at what DeepMind is doing, or more recently the Future of AI series of talks (that I quoted from above).
Next get your hands dirty with Jeremy Howard's "Getting In Shape For The Sport of Data Science." It's a great pragmatic overview of actually working with data.
Once you've played around a bit, watch Ben Hamner's "Machine Learning Gremlins" for a nice pragmatic disclaimer of what can easily go wrong when doing machine learning.
I wrote a blog post "Computing Your Skill" after spending months trying to understand TrueSkill, the ML system that does matchmaking and ranking on Xbox Live. The post goes into some foundational statistics needed for further study in machine learning.
Perhaps the best way to learn is to just try it. One approach is to try a Kaggle competition that sounds interesting to you. Even though I don't do great on the leaderboards there, I always learn things when I try a competition.
After you've done the above, I'd then recommend something more formal like Andrew Ng's online class. It's at the college level, but approachable. If you've done all the above steps, you'll be more motivated to not give up when you hit some harder things.
As you continue, you'll learn about things such as R and its many packages, SciPy, Cross Validation, Bayesian thinking, Deep Learning, and much much more.
DISCLAIMER: I work at Kaggle and several of the above links mention Kaggle, but I believe they're a fantastic place to start.
Answer by Imran (Mar 01, 2009)
videolectures.net has a large collection of Machine Learning videos. One very good technical introductory lecture on the site is Machine Learning, Probability and Graphical Models by Sam Roweis.
A good overview of the field is Tom Mitchell's seminar The Discipline and Future of Machine Learning. Here is a direct link to the video [mov]. And the Syllabus page has a good list of recommended texts:
- Neural Networks for Pattern Recognition by Christopher Bishop
- Pattern Classification by Richard Duda et al.
- The Elements of Statistical Learning: Data Mining, Inference and Prediction by T.R. Hastie et al.
- Information Theory, Inference, and Learning Algorithms by David MacKay
- Machine Learning by Tom Mitchell
Answer by dmcer (Mar 19, 2010)
Ethem Alpaydin's Introduction to Machine Learning is a pretty accessible overview of the field.
If you're feeling overwhelmed by the other options you might want to start with it first.
Answer by Mr Fooz (Feb 28, 2009)
Two of the best textbooks out there are:
Pattern Classification by Duda and Hart, and
Pattern Recognition and Machine Learning by Bishop.
Another good resource is MIT's Open CourseWare site for their Machine Learning class.
Answer by Tirrell Payton (Feb 15, 2012)
I found "Programming Collective Intelligence" to be the book that really helped me, with practical examples and an "Algorithm Bestiary" at the end.
Answer by Volatil3 (Jul 06, 2012)
Dr. Yaser Abu-Mostafa's intro course is also detailed, and he explains it quite well:
http://work.caltech.edu/telecourse.html
Answer by Matias Rasmussen (Sep 28, 2012)
I really like the Machine Learning course on Coursera. I find the short lectures very easy to digest.
Answer by theycallmemorty (Apr 01, 2009)
Artificial Intelligence: A Modern Approach is the most common textbook for introductory AI courses.
Witten and Frank's book on Data Mining is a little easier to digest if that topic is what appeals to you.
Answer by Pete (Feb 28, 2009)
You are right to feel that there are lots of sub-fields to ML.
Machine Learning in general is basically just the idea of Algorithms which improve over time. If you're simply curious, some random topics that come to mind include:
Classification, Association analysis, Clustering, Decision Trees, Genetic Algorithms, Concept Learning
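To make the "improve over time" idea concrete for one of the topics above (classification), here is a minimal sketch of a 1-nearest-neighbor classifier in Python. It is only an illustration, not something from the original answer; the data, the "small"/"large" labels, and the function name `nearest_neighbor_predict` are all hypothetical. The code never changes, yet the prediction gets better once it has seen one more labeled example.

```python
def nearest_neighbor_predict(examples, x):
    """Predict the label of x as the label of the closest stored example (1-NN)."""
    closest = min(examples, key=lambda example: abs(example[0] - x))
    return closest[1]


# Hypothetical 1-D data. Assumed true rule: values below 5 are "small".
examples = [(1.0, "small"), (6.0, "large")]
query = 4.0

# With only two examples, 6.0 is the closest point, so the guess is wrong.
before = nearest_neighbor_predict(examples, query)

# After seeing one more labeled example, the same code answers correctly.
examples.append((4.5, "small"))
after = nearest_neighbor_predict(examples, query)

print(before, after)
```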
As far as books go:
I'm currently using Introduction to Data Mining for a course. It covers quite a few of the topics I've listed above and usually has examples of algorithms/uses in each section.
You don't need too much background knowledge to understand a lot of the topics. Most algorithms have some math underlying them which is used to improve the results, and you obviously need to be comfortable with general programming/data structures.
Answer by Genjuro (Dec 05, 2011)
I'd recommend you take a look at ml-class.org.
Answer by lmsasu (Feb 12, 2012)
Try A First Encounter with Machine Learning; it's a freely available course at the undergraduate level.
Answer by vikram360 (Sep 12, 2011)
I've been using 'Machine Learning: An Algorithmic Perspective' by Stephen Marsland, and I think the approach is awesome. The author has put the Python code up on his site, so you can download it and take a peek at how things work.
http://www-ist.massey.ac.nz/smarsland/MLbook.html
Answer by unj2 (Jul 29, 2009)
The Machine Learning subreddit has interesting links for all levels.
Shared with attribution, where reasonably possible, per the SO attribution policy and cc-by-something. If you were the author of something I posted here, and want that portion removed, just let me know.