What statistics texts and other analytics books should we recommend to people?
On a message board I frequent, two different guys have asked for recommendations for statistics textbooks, in a kind of general knowledge vein. One phrases it as:
I’m looking for a general purpose statistics textbook for reference purposes.
giving his background as
I took Calculus-level Statistics in college. (i.e. 2 semesters of Calc was a prerequisite; this was the stats class that stat majors took.)
He was a computer science major and is now a professional programmer. (And if somebody can use a tournament-chess-smart programmer with outstandingly clear communication skills in the Buffalo area, I’m pretty sure he’d be glad to know about the opportunity. But I digress …)
The other is a law student with a more general need, which he phrases as
I want to use them for work to help identify trends; do multiple regressions; put values on things that aren’t easy to quantify, etc.
Economics I already know most of the basics from my undergrad studies, but I need more advanced economic theory and such.
He’s interested in what I’d call “pop” analytics books as well as hardcore stuff; e.g., the one book he’s identified already is “Competing with Analytics.” I’m thinking some good vendor white papers might be just as useful for him as that class of books. But he obviously wants to learn the hardcore stuff as well.
I haven’t attended or taught a college course since 1981, and I tend to find the business books on analytics too simple for my tastes, so I’m not the right guy to answer from my own experience.
Does anybody have any helpful thoughts? Thanks!
Comments
11 Responses to “What statistics texts and other analytics books should we recommend to people?”
The request by the first person is surprisingly common and, for me, hard to parse. When most people say “reference” they mean a book that, after previously studying and understanding the material, one uses to remember some concept, technique, formula or proof. But that is not what this person is asking for, because they have not studied statistics.
My experience is that this person is asking for a book that will teach them statistics in the shortest possible time. They are probably smart and believe that they should be able to absorb this topic from a good textbook. I have pointed two or three such people to real introductory statistics textbooks only to later get some embarrassed looks and excuses about how little time they have to read.
So now what I do is to help such people manage expectations: unless they are willing to put in a *lot* of work, they really fall into the category of the latter person, who is looking for a book that they can read on the train. They want to treat statistics like magic and learn what it can do for them but not how. I try to help them see that it is not embarrassing to want to use statistics like this. It is certainly easier to recognize your goals up front than to buy $200 worth of textbooks that will serve only to sit on your bookshelf and make you feel guilty.
But unfortunately, I do not have experience with business level statistics books, so here my advice ends and I point them to someone else.
For a person who is looking for a book filled with very fun probability puzzles (but who has not studied much probability), I point them to A First Course in Probability by Sheldon Ross. This book is full of very fun and interesting examples and problems as well as a good explanation of introductory probability. A guy named John Weatherwax (Google weatherwax ross) did a free solutions manual for this book. His answers are not all correct, but that doesn’t matter for a non-student.
If one can get through this first book, Ross wrote a book on Probability Models that also has some very good examples (Ross is great for examples and fun problems). I believe that one could actually start with the probability models book, depending on their goals.
Most people will not get through either of these books, and all this means is that they need to reset their goals and look for the more business-like books.
I learned basic statistics in high school, studied math in college and took a very good econometrics course in my 4th year at the University of Chicago.
For those not familiar with the distinction between econometrics and statistics, the difference is subtle. In economics, it is rarely possible to conduct controlled experiments (hey – what do you think the relationship between inflation and unemployment is? Let’s do a controlled experiment with the UK and France!). Econometrics is an evolution of statistics to deal with observational data and a lack of controlled studies.
Our econometrics class didn’t necessarily use a book per se; it was mostly notes and problem sets written by the professor.
However, there are two good books on econometrics that I would recommend:
http://www.amazon.com/Course-Econometrics-Arthur-S-Goldberger/dp/0674175441
http://www.amazon.com/Econometric-Analysis-William-H-Greene/dp/0135132452/ref=sr_1_1?ie=UTF8&s=books&qid=1244057439&sr=1-1
Both books cover the theoretical and practical basics, including linear regressions, time series, etc.
They also address how to deal with situations when the underlying assumptions of these models do not hold true (e.g. heteroskedasticity, when the variance of your observational errors is not constant across observations).
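To make the heteroskedasticity point concrete, here is a minimal Python sketch, not taken from either book. It simulates two hypothetical datasets (all numbers are made up for illustration): one where the error spread grows with x, and one where it stays constant. It then fits an ordinary least-squares line and compares the residual variance in the upper and lower halves of x, which is roughly the idea behind a Goldfeld-Quandt style check. A ratio far from 1 hints that the constant-variance assumption is violated.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residual_variance_ratio(xs, ys):
    """Residual variance for the larger-x half divided by the smaller-x half."""
    a, b = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order observations by x
    residuals = [y - (a + b * x) for x, y in pairs]
    half = len(residuals) // 2
    low, high = residuals[:half], residuals[half:]
    return statistics.pvariance(high) / statistics.pvariance(low)


# Hypothetical simulated data; the true line is y = 1 + 2x in both cases.
random.seed(0)
xs = [float(x) for x in range(1, 201)]
ys_hetero = [1.0 + 2.0 * x + random.gauss(0, 0.1 * x) for x in xs]  # spread grows with x
ys_homo = [1.0 + 2.0 * x + random.gauss(0, 5.0) for x in xs]        # constant spread

hetero_ratio = residual_variance_ratio(xs, ys_hetero)
homo_ratio = residual_variance_ratio(xs, ys_homo)
print(f"variance ratio, growing spread:  {hetero_ratio:.2f}")
print(f"variance ratio, constant spread: {homo_ratio:.2f}")
```

The ratio for the growing-spread data should come out well above 1, while the constant-spread data should stay near 1. A real analysis would use a formal test and, if needed, robust standard errors, which is the kind of material the econometrics texts above cover.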
MIT’s OpenCourseWare also has a class on econometrics that uses the Goldberger book:
http://ocw.mit.edu/OcwWeb/Economics/14-32Spring-2007/Readings/index.htm
One of my favorite stats books is
“Statistics Hacks – Tips & Tools For Measuring the World and Beating the Odds” by Bruce Frey.
I like it as statistics can be lots of fun and Bruce does an excellent job putting statistics to fun and interesting uses. No need to get bored to death with this book. Enjoy.
Bob
Thanks. Those all sound like excellent suggestions!
As an intro to pattern analysis and learning, I like Pattern Classification by Duda, Hart, Stork. It’s the 2nd edition of a 30-year-old classic. I am not sure that a beginner would get anywhere with most of it, but chapter 1 is really great with no math, and it can probably be had for free on Google Books. I would recommend that everyone who works with data read it.
Perhaps a better approach might be to download R from http://r-project.org/ and check out the demos and available packages. Several support statistics books at various levels. Find a demo/package/book combo that suits the need and go from there.
— twitter/JAdP
I should have noted that the link to Task Views from The Comprehensive R Archive Network, http://cran.cnr.Berkeley.edu/index.html (or other local mirror), is very helpful in locating the right package among the 1828 available packages 😉
It might be helpful to indicate what level of math is needed. Calculus of a single variable? Measure theory?
If it’s the latter, we’re talking about a specialized audience. I even botched the measure theory in the first chapter of my dissertation, re-proving a version of the Kolmogorov Extension Theorem because neither I nor my advisor knew such a thing existed.
My experience is that it is incredibly rare to find someone who, years out of school and working in a non-mathematical field, will spend the time on their own to learn enough stats to be useful in a job.
I usually try to point people to books with zero math, or recommend that they take a grad class or two.
I just don’t see it being realistic to point someone who doesn’t work with math regularly (and I don’t mean “I’m a programmer, and logic comes from math” kind of thing), to a stats textbook, and have them come back in a few months with a good understanding of the material. Or at least, I have never seen it happen.
I have seen someone get a bit of good insight out of a summary of regression that used no math, but explained a few basic concepts like least squares. That’s about as far as I would expect without enrolling in a class.
Maybe I’m being too cynical…
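For readers who want to see what “least squares” means in practice, here is a minimal Python sketch with a tiny made-up dataset (the numbers are hypothetical). The fitted line is the one that makes the sum of squared vertical gaps between the points and the line as small as possible; nudging the slope or intercept away from the fitted values can only make that sum larger.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line minimizing squared vertical errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared gaps between observed ys and the line's predictions."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(xs, ys)
best = sum_squared_errors(xs, ys, intercept, slope)

# Any other line does worse on this criterion.
steeper = sum_squared_errors(xs, ys, intercept, slope + 0.1)
shifted = sum_squared_errors(xs, ys, intercept + 0.5, slope)

print(f"fitted line: y = {intercept:.1f} + {slope:.1f}x")
print(f"squared error: fitted {best:.2f}, steeper {steeper:.2f}, shifted {shifted:.2f}")
```

For this data the fitted line is y = 2.2 + 0.6x with a squared error of 2.4, and both altered lines score worse.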
I would recommend “Practical Nonparametric Statistics, Third Edition” by WJ Conover. People without training will often use parametric statistics that are not suited for the data they are analyzing. Using nonparametric tests solves that issue. Inside the front cover is a table of what tests are available for each kind of data (nominal, ordinal, interval) and where to find them in the book.
Required math level is pre-calculus.
Emphasis on “Practical”.
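As a taste of what a nonparametric test looks like, here is a minimal Python sketch of the exact two-sided sign test for paired data, one of the simplest tests of this kind. It only asks whether each pair went up or down, so it makes no assumption about the shape of the underlying distribution. The function name and the before/after scores are hypothetical, made up for illustration.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test p-value for paired differences (ties dropped)."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)  # size of the rarer sign
    # Under the null hypothesis each sign is a fair coin flip: Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical scores for ten people before and after some treatment.
before = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
after = [75, 70, 74, 85, 70, 73, 80, 72, 79, 71]

differences = [a - b for a, b in zip(after, before)]
p_value = sign_test_p_value(differences)
print(f"two-sided sign test p-value: {p_value:.4f}")
```

Nine of the ten differences are positive, giving a p-value of 22/1024, about 0.0215. The test ignores how large each difference is, which costs some power but keeps it valid for ordinal data.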
I just finished a class in quantitative analysis and the book that was used was:
http://www.amazon.com/Second-Course-Statistics-Regression-Analysis/dp/0130223239/ref=sr_1_3?ie=UTF8&s=books&qid=1244232799&sr=8-3
It’s a college textbook so it’s a little intimidating at first, but the author does a good job of making it readable.