This is the online version of industRial data science, a book with tools and techniques for data analysis in Product Development and Manufacturing. It is organized around Case Studies in a “cookbook” approach, making it easier to directly adopt the tools. The examples come from varied manufacturing industries, mostly where repetitive production in massive quantities is involved, including: pharmaceuticals, food, electronics, watch making and automotive.

Product Development and Manufacturing are very important activities in society because bringing innovative products to the market has an immense potential to improve the quality of life of everyone.

Additionally Data Science brings new powerful approaches to the engineering and manufacturing of consumer goods, helping minimizing environmental impact, improving quality and keeping costs under control.

How to use this book

We assume the reader is familiar with product development and manufacturing quality methodologies such as dmaic and six sigma and the related statistical concepts. We also assume the reader is already a user of the R programming language. The Case Studies then bring all the areas together in a practical way.

This book is better used as a reference book by using the navigation bar on the left to go a specific industrial domain. To reproduce the examples all the case studies data sets, example functions and the textbook original files can be downloaded as a package called {industRial}. For guidelines on how to use them refer to the sections Datasets and Functions

Complementing the text, a series of tutorials can be accessed either online or locally to practice dynamically key statistical concepts. For the online option no specific software installation is required. A list of web links and detailed instructions on local installation can be seen in the section Tutorials.

In the appendix we provide a detailed Index and a Glossary and we refer to several good quality books on both Data Science and Product Development that have served to provide the required theoretical background. These cover key disciplines such as six sigma, statistics and computer programming. This book aims complementing them and showcase how to benefit from recent software in this area. A full list can be found in References.

Content overview

The case studies are organized according the a logical product development flow. The text starts with case studies in the domain of Design for Six Sigma. These are some practical tools in that help prioritizing problems and get focus on how to tackle them. The next group of case studies covers the domain of Measurement System Analysis, an initial important step when developing a product or manufacturing process. Here is discussed how to analyze the response of a measurement device in terms of its bias and its uncertainty. The next large group of case studies is the Design of Experiments. This corresponds to the core of the R&D activities and provides approaches to minimize the quantity of trials and time to reach to a sufficient knowledge of how the product or system works. Also on how to obtain the right balance on its features and properties to get the desired output. A final group of case studies presents ways to get the manufacturing process in control according to what was defined in the product development phase. These are the well known Statistical Process Control and Capability studies.


I would like to express my gratitude to the instructors and colleagues who have spent time sharing their knowledge, answering my questions and giving me inputs: Enrico Chavez, Iegor Rudnytskyi, Giulia Ruggeri, Harry Handerson and Bobby Stuijfzand from the EPFL ADSCV team; Jean-Vincent Le Bé, Jasmine Petry, Yvan Bouza, James Clulow and Akos Spiegel from the Nestlé STC team; Frank Paris from DOQS; Théophile Emmanouilidis and Sélim Ach from Thoth.

To report any issue or make suggestions please open an issue on the book repository: industRialds/issues

About the authors

João Ramalho is a Senior Industrial Data Scientist with more than 20 years of experience in the manufacturing industry. He’s been in varied positions in R&D, Operations and IT at Philip Morris, Rolex and Nestlé. He holds a Master in Mechanical Engineering from the IST of Lisbon, a PMP certification from the Project Management Institute and a Data Science certification from DataCamp. He’s currently specializing in Data Visualization at the Swiss technical university EPFL. See full profile at joao-ramalho.com