Practical Data Science Cookbook (Second Edition)

About the Authors

Prabhanjan Tattar has 9 years of experience as a statistical analyst. His main thrust has been to explain statistical and machine learning techniques through elegant programming that clarifies the nuances of the underlying mathematics. Survival analysis and statistical inference are his main areas of research interest. He has published several research papers in peer-reviewed journals and has authored two books on R: R Statistical Application Development by Example, Packt Publishing, and A Course in Statistics with R, Wiley. He also maintains the R packages gpk, RSADBE, and ACSWR.

I would like to thank the readers for their encouragement and feedback, which led to the improvements in this edition, and I hope that they find the current edition useful. Thanks are due to Tushar Gupta for introducing me to this project, Cheryl Dsa for bearing with the delays, Karan Thakkar for the eagle-eyed editing, and the entire Packt team for all their support. I also thank the authors of the first edition, as their platform is largely carried forward. On the personal front, I continue to thank my family: Pranathi the kiddo, Chandrika the wifey, Lakshmi the goddess mother, and Narayanachar the beloved father.

Tony Ojeda is an accomplished data scientist and entrepreneur, with expertise in business process optimization and over a decade of experience creating and implementing innovative data products and solutions. He has a master's degree in finance from Florida International University and an MBA with a focus on strategy and entrepreneurship from DePaul University. He is the founder of District Data Labs, is a cofounder of Data Community DC, and is actively involved in promoting data science education through both organizations.

Sean Patrick Murphy spent 15 years as a senior scientist at The Johns Hopkins University Applied Physics Laboratory, where he focused on machine learning, modeling and simulation, signal processing, and high-performance computing in the cloud. He now acts as an advisor and data consultant for companies in San Francisco, New York, and Washington DC. He graduated from The Johns Hopkins University and earned his MBA from the University of Oxford. He currently co-organizes the Data Innovation DC meetup and co-founded the Data Science MD meetup. He is also a board member and co-founder of Data Community DC.

Benjamin Bengfort is an experienced data scientist and Python developer who has worked in the military, industry, and academia for the past 8 years. He is currently pursuing his PhD in Computer Science at the University of Maryland, College Park, doing research in Metacognition and Natural Language Processing. He holds a Master's degree in Computer Science from North Dakota State University, where he taught undergraduate Computer Science courses. He is also an adjunct faculty member at Georgetown University, where he teaches Data Science and Analytics. Benjamin has been involved in two data science start-ups in the DC region, leveraging large-scale machine learning and Big Data techniques across a variety of applications. He has a deep appreciation for the combination of models and data for entrepreneurial effect, and he is currently building one of these start-ups into a more mature organization.

Abhijit Dasgupta is a data consultant working in the greater DC-Maryland-Virginia area, with several years of experience in biomedical consulting, business analytics, bioinformatics, and bioengineering consulting. He has a PhD in biostatistics from the University of Washington and over 40 collaborative peer-reviewed manuscripts, with strong interests in bridging the statistics/machine-learning divide. He is always on the lookout for interesting and challenging projects, and is an enthusiastic speaker and discussant on new and better ways to look at and analyze data. He is a member of Data Community DC and a founding member and co-organizer of Statistical Programming DC (formerly R Users DC).