"Science" Part in Data Science

I want to share my list of favorite math books that, I think, every data scientist should read and understand well to deserve the "scientist" part of the data scientist title. Of course, the list is not exhaustive. If you have any comments or additions, please drop them below.

1-Book of Proof by Richard Hammack "This book is an introduction to the language and standard proof methods of mathematics. It is a bridge from the computational courses (such as calculus or differential equations) that students typically encounter in their first year of college to a more abstract outlook. It lays a foundation for more theoretical courses such as topology, analysis and abstract algebra. Although it may be more meaningful to the student who has had some calculus, there is really no prerequisite other than a measure of mathematical maturity. Topics include sets, logic, counting, methods of conditional and non-conditional proof, disproof, induction, relations, functions and infinite cardinality."

2-Introduction to Probability, 2nd Edition by Dimitri P. Bertsekas (Author), John N. Tsitsiklis (Author) An intuitive, yet precise introduction to probability theory, stochastic processes, and probabilistic models used in science, engineering, economics, and related fields. The 2nd edition is a substantial revision of the 1st edition, involving a reorganization of old material and the addition of new material. The length of the book has increased by about 25 percent. The main new feature of the 2nd edition is a thorough introduction to Bayesian and classical statistics.

The book is the currently used textbook for "Probabilistic Systems Analysis," an introductory probability course at the Massachusetts Institute of Technology, attended by a large number of undergraduate and graduate students. The book covers the fundamentals of probability theory (probabilistic models, discrete and continuous random variables, multiple random variables, and limit theorems), which are typically part of a first course on the subject, as well as the fundamental concepts and methods of statistical inference, both Bayesian and classical. It also contains a number of more advanced topics, from which an instructor can choose to match the goals of a particular course. These topics include transforms, sums of random variables, and a fairly detailed introduction to Bernoulli, Poisson, and Markov processes.

The book strikes a balance between simplicity in exposition and sophistication in analytical reasoning. Some of the more mathematically rigorous analysis has been just intuitively explained in the text, but is developed in detail (at the level of advanced calculus) in the numerous solved theoretical problems. 

Written by two professors of the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, and members of the prestigious US National Academy of Engineering, the book has been widely adopted for classroom use in introductory probability courses within the USA and abroad.

3-Introduction to Linear Algebra, Fifth Edition (2016) by Gilbert Strang Gilbert Strang's textbooks have changed the entire approach to learning linear algebra -- away from abstract vector spaces to specific examples of the four fundamental subspaces: the column space and nullspace of A and A'. This new fifth edition has become more than a textbook for the basic linear algebra course. That is its first purpose and always will be. The new chapters about applications of the SVD, probability and statistics, and Principal Component Analysis in finance and genetics make it also a textbook for a second course, plus a resource at work. Linear algebra has become central in modern applied mathematics. This book supports the value of understanding linear algebra.

Introduction to Linear Algebra, Fifth Edition includes challenge problems to complement the review problems that have been highly praised in previous editions. The basic course is followed by eight applications: differential equations in engineering, graphs and networks, statistics, Fourier methods and the FFT, linear programming, computer graphics, cryptography, Principal Component Analysis, and singular values.

4-Ordinary Differential Equations (Dover Books on Mathematics) by Morris Tenenbaum (Author), Harry Pollard (Author) This unusually well-written, skillfully organized introductory text provides an exhaustive survey of ordinary differential equations, that is, equations which express the relationship between variables and their derivatives. In a disarmingly simple, step-by-step style that never sacrifices mathematical rigor, the authors (Morris Tenenbaum of Cornell University and Harry Pollard of Purdue University) introduce and explain complex, critically important concepts to undergraduate students of mathematics, engineering and the sciences.

5-Partial Differential Equations by Lawrence C. Evans This is the second edition of the now definitive text on partial differential equations (PDE). It offers a comprehensive survey of modern techniques in the theoretical study of PDE with particular emphasis on nonlinear equations. Its wide scope and clear exposition make it a great text for a graduate course in PDE. For this edition, the author has made numerous changes, including a new chapter on nonlinear wave equations, more than 80 new exercises, several new sections, and a significantly expanded bibliography.

6-Understanding Analysis (Undergraduate Texts in Mathematics) by Stephen Abbott This lively introductory text exposes the student to the rewards of a rigorous study of functions of a real variable. In each chapter, informal discussions of questions that give analysis its inherent fascination are followed by precise, but not overly formal, developments of the techniques needed to make sense of them. By focusing on the unifying themes of approximation and the resolution of paradoxes that arise in the transition from the finite to the infinite, the text turns what could be a daunting cascade of definitions and theorems into a coherent and engaging progression of ideas. Acutely aware of the need for rigor, the student is much better prepared to understand what constitutes a proper mathematical proof and how to write one.

Fifteen years of classroom experience with the first edition of Understanding Analysis have solidified and refined the central narrative of the second edition. Roughly 150 new exercises join a selection of the best exercises from the first edition, and three more project-style sections have been added. Investigations of Euler’s computation of ζ(2), the Weierstrass Approximation Theorem, and the gamma function are now among the book’s cohort of seminal results serving as motivation and payoff for the beginning student to master the methods of analysis.

7-Mathematical Analysis, Second Edition by Tom M. Apostol  It provides a transition from elementary calculus to advanced courses in real and complex function theory and introduces the reader to some of the abstract thinking that pervades modern analysis.

8-Introduction to Linear Optimization (Athena Scientific Series in Optimization and Neural Computation, 6) by Dimitris Bertsimas (Author), John N. Tsitsiklis (Author) This book provides a unified, insightful, and modern treatment of linear optimization, that is, linear programming, network flow problems, and discrete optimization. It includes classical topics as well as the state of the art, in both theory and practice.

9-Convex Optimization by Boyd and Vandenberghe Convex optimization problems arise frequently in many different fields. This book provides a comprehensive introduction to the subject, and shows in detail how such problems can be solved numerically with great efficiency. The book begins with the basic elements of convex sets and functions, and then describes various classes of convex optimization problems. Duality and approximation techniques are then covered, as are statistical estimation techniques. Various geometrical problems are then presented, and there is detailed discussion of unconstrained and constrained minimization problems, and interior-point methods. The focus of the book is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. It contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance and economics.

10-The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics) by Trevor Hastie (Author), Robert Tibshirani (Author), Jerome Friedman (Author) This book describes the important ideas in a variety of fields such as medicine, biology, finance, and marketing in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of colour graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book.

This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorisation, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.





