A data scientist is a professional who works to make sense of large sets of data, or ‘big data’, and to extract insights from them.

Data scientists include people making plots on a laptop in R as well as people building big ML pipelines on Apache Hadoop, even though those are quite different things.

As the field matures, we may fall back to more specific terms like data analyst, data engineer, and statistician rather than bundling all of them under “data scientist”.

EB: What is the difference between a data analyst/statistician and a data scientist?

SO: Supposedly, the data scientist also possesses software engineering skills.

These are the original “data scientists”, yet they find themselves at some disadvantage where there isn’t enough distinction between the role of modelers working in a lab and the role of engineers maintaining the production systems these models affect.

