The job of the data scientist is to ask the right questions.

- Hilary Mason

Big data isn't about bits, it's about talent.

- Douglas Merrill

Data is a precious thing and will last longer than the systems themselves.

- Tim Berners-Lee

Data Science First

Data Science First: Using Language Models in AI-Enabled Applications explains how practicing data scientists can integrate language models into data science workflows without abandoning essential principles of reliability, accuracy, and efficacy.

This book offers crystal-clear guidance on when, where, and how data scientists can integrate language models into their existing workflows without exposing themselves or their companies to unnecessary risks. It walks you through strategic design patterns for incorporating language models into real-world data science projects.

It avoids strategies and techniques that rely heavily on proprietary tools that are likely to evolve very quickly (or could disappear entirely) in the near future. Instead, the author presents foundational methodologies that will remain valuable regardless of how individual platforms or services change.

Getting Data Science Done

Data science involves managing complicated multidisciplinary projects that are highly likely to fail. Problems can arise from unrealistic expectations of what is achievable, moving goalposts, uncertainty about how to measure impact, and concern about the risk of doing something new.

Most data science courses ignore all of these topics and focus exclusively on technical skills and mathematical fundamentals. It is only when we start working on real-world data science projects that we discover the many additional complications that need to be navigated.

John Hawkins is a data scientist with a PhD in applied machine learning from the University of Queensland. He has been building, deploying and consulting on data science solutions across a range of projects in industry and academia for the past 16 years. Getting Data Science Done is the distillation of all this experience into a sequential guide to delivering pragmatic data science results.

Topics

Problem Framing

Communication

Estimating ROI

Experimentation

Deploying Models

Solution Monitoring
