### Data Science E-Books

Many of these books are statistically oriented, but then a big part of data science involves drawing conclusions from data. Hence, the line between the list below and the next list, on statistics resources, may be a bit blurry.

- *Analyze Survey Data for Free* edited by Anthony Joseph Damico - this edited online resource, which assumes knowledge of R, offers step-by-step instructions for exploring online survey data; entries are contributed by different users *and* some entries are still awaiting a contributor if you're so inclined!
- *Think Python* by Allen B. Downey - an introduction to Python, one of the most popular programming languages for data science
- *Think Stats: Exploratory Data Analysis in Python* by Allen B. Downey - an intro to stats and probability using Python, written by the same author as *Think Python* above; while this book is meant to introduce statistics to programmers, it could also be a good way for statisticians to get their feet wet in Python
- *Deep Learning* by Ian Goodfellow, Yoshua Bengio, & Aaron Courville - a free e-book on machine learning, specifically deep learning
- *R for Data Science* by Garrett Grolemund & Hadley Wickham - this book teaches you how to pull data into R, then clean, model, and visualize it; this book was definitely talked up at the data science conference (thanks to a reader for sharing the link to the free e-book version!)
- *Ten Signs of Data Science Maturity* by Peter Guerra & Kirk Borne - Borne's was one of my favorite presentations from the data science conference I attended; this e-book highlights the signs that an organization is ready to venture into data science
- *The Elements of Statistical Learning: Data Mining, Inference, and Prediction* by Trevor Hastie, Robert Tibshirani, & Jerome Friedman - predictive modeling and machine learning approaches
- *An Introduction to Statistical Learning with Applications in R* by Gareth James, Daniela Witten, Trevor Hastie, & Robert Tibshirani - covers many of the same topics as *Elements* above, but geared more toward beginners in statistical learning; if these are new concepts for you, read this book before *Elements of Statistical Learning*
- *Python Programming WikiBook* - another introduction to Python, which also includes extensions into other programming languages and additional resources/links
- *R Programming WikiBook* - an introduction to programming in R, another popular programming language for data science
- *School of Data Handbook* - this handbook, which goes along with the courses available through School of Data, offers recipes for scraping, cleaning, and filtering data to get you started on your data science journey

### Statistics E-Books

- *Correlation and Causation: The Trouble with Story Telling* by Lee Baker - a sort of follow-up to my previous discussion of spurious correlations, this book discusses the notion of probability and alternative explanations for correlations
- *The Probability Cheatsheet* by William Chen - technically not an e-book; it's a short PDF document that summarizes key probability concepts, like Simpson's paradox, the Law of Large Numbers, and conditional probability
- *OpenIntro Statistics* by David M. Diez, Christopher D. Barr, & Mine Çetinkaya-Rundel - a free introductory statistics textbook and additional statistical resources
- *Think Bayes: Bayesian Statistics Made Simple* by Allen B. Downey - yet another free e-book from Downey (see *Think Python* and *Think Stats* above), introducing Bayes in mathematical notation (if you prefer mathematical notation when learning stats; not everyone does); it also uses Python for computer-aided analysis, so this book also straddles the statistics-data science line
- Research and Statistical Support Services Short Courses by Richard Herrington & Jonathan Starkweather - also not exactly an e-book: this site, part of the R&SS at University of North Texas, contains multiple short documents teaching the basics of statistical software, and a few other computer tools that could aid in research
- *How to Share Data with a Statistician* by Jeff Leek - this GitHub document describes how to format data to be shared with a statistician, in order to facilitate efficient and timely analysis
- *Introduction to Applied Bayesian Statistics and Estimation for Social Scientists* by Scott M. Lynch - an introduction to Bayesian analysis and the use of MCMC (Markov chain Monte Carlo) methods; this book starts with a refresher of classical statistics before introducing the Bayesian notion of probability
- *Learning Statistics with R* by Daniel Navarro - what started off as lecture notes for an introductory statistics class taught with R became an e-book; there's even an R package (lsr) to go along with the book

Like free stuff? Here are some free meta-analysis tools.
