Notes on Data Science, AI and Data – Introduction

Data science is a broad term, and it is often used alongside, or interchangeably with, terms such as analytics, machine learning and AI.

To simplify, we can agree that the primary purpose of data science is to generate the insights required to run a business. To generate insights in a scientific manner we need data (all of it: big data, regular data 🙂). We then apply well-defined mathematical approaches or analyses to that data to arrive at the insights.

The term AI is also rather broad, but much of today's AI is driven by machine learning. Specific machine learning methods can be applied to data to provide insights.

I have reproduced this Venn diagram from the reference articles; it shows how these fields intersect and helps place data science in context [1] [3].

With this understanding in place, it is worth looking, at a high level, at what we need to do to bring data science into our work and projects.

Types of Analytics:
Analytics usually falls into three categories: descriptive, predictive and prescriptive.

  • Descriptive analytics looks at existing data and explains and draws out insights based on what’s already happened.
  • Predictive analytics enables you to predict possible outcomes.
  • Prescriptive analytics goes one step further and advises you to take actions based on certain insights and predictions.
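To make the three categories concrete, here is a minimal sketch in plain Python. The sales figures, the stock level, the naive "last value plus average change" forecast and the function names are all my own assumptions for illustration; real projects would use far richer data and models.

```python
from statistics import mean

# Hypothetical monthly sales figures (units sold), invented for illustration.
monthly_sales = [100, 110, 125, 130, 150]

def describe(sales):
    """Descriptive: summarise what has already happened."""
    return {"average": mean(sales), "best_month": sales.index(max(sales)) + 1}

def predict_next(sales):
    """Predictive: naive forecast = last value + average month-over-month change."""
    changes = [later - earlier for earlier, later in zip(sales, sales[1:])]
    return sales[-1] + mean(changes)

def prescribe(forecast, stock_on_hand):
    """Prescriptive: turn the forecast into a recommended action."""
    shortfall = forecast - stock_on_hand
    return f"order {shortfall:g} more units" if shortfall > 0 else "no order needed"

summary = describe(monthly_sales)
forecast = predict_next(monthly_sales)
action = prescribe(forecast, stock_on_hand=140)  # assumed stock level
print(summary, forecast, action)
```

The point of the sketch is the progression: the first function only looks back, the second looks forward, and the third turns the forecast into a suggested action.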

To get these insights, the data science approaches you can employ fall under supervised learning, unsupervised learning and a miscellaneous group (as you will see in the examples, these are slightly different from the first two).

Supervised learning: A quote often attributed to Andrew Ng holds that 90% of AI success comes from applying supervised learning [2]. In supervised learning your algorithms learn from data you have already labelled in order to estimate outcomes for new data. I will highlight two common supervised learning methods:
  • Regression: This allows you to predict a numerical outcome.
  • Classification: This allows you to predict a qualitative outcome, that is, which category a record belongs to.
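The two methods above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the data is made up, regression is shown as a simple least-squares line, and classification as a one-nearest-neighbour rule. These are just two of many possible techniques, and the function names are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

def classify(labelled, point):
    """Classification: give the new point the label of its nearest labelled neighbour."""
    nearest = min(labelled, key=lambda item: abs(item[0] - point))
    return nearest[1]

# Hypothetical labelled data: hours studied -> exam score (a numerical outcome).
hours = [1, 2, 3, 4]
scores = [52, 54, 56, 58]
slope, intercept = fit_line(hours, scores)
predicted_score = slope * 5 + intercept  # estimate for 5 hours of study

# Hypothetical labelled data: (exam score, category).
graded = [(35, "fail"), (45, "fail"), (60, "pass"), (75, "pass")]
predicted_label = classify(graded, 58)

print(predicted_score, predicted_label)
```

In both cases the algorithm only works because the training records already carry the answer (a score or a category); that is what makes the learning "supervised".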

Unsupervised learning: Unsupervised learning, on the other hand, lets you carry out exploratory analysis on unlabelled data to look for patterns. There are different types of unsupervised learning methods; a good example is clustering, where the records in a data set are grouped by similarity. Other common unsupervised learning methods include anomaly detection, association rules and dimensionality reduction.
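As a rough illustration of clustering, here is a small k-means sketch on one-dimensional data in plain Python. The spend values, the choice of two clusters and the starting centroids (the smallest and largest value) are assumptions I made for the example, and `k_means_1d` is a hypothetical helper rather than a library function.

```python
from statistics import mean

def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values by alternating assignment and centroid update steps."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each value to its closest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            closest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[closest].append(value)
        # Update step: move each centroid to the mean of its cluster (keep it if empty).
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabelled data: customer spend per visit.
spend = [5, 6, 7, 48, 50, 52]
centroids, clusters = k_means_1d(spend, centroids=[min(spend), max(spend)])
print(centroids, clusters)
```

Notice that no labels are supplied anywhere: the two groups (low spenders and high spenders) emerge purely from how close the values are to each other.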

Miscellaneous: These are other approaches that do not fall neatly into the supervised vs unsupervised categorisation.
One is semi-supervised learning, where the vast majority of the training data is unlabelled and only a small number of records are labelled [4].
Another is reinforcement learning, where the model learns by trial and error.
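To give a feel for "learning by trial and error", here is a minimal sketch of an epsilon-greedy agent choosing between two options (often called a multi-armed bandit). The win rates, the number of steps, the exploration rate and the name `run_bandit` are all assumptions for illustration; this is one simple reinforcement learning setting, not a general recipe.

```python
import random

def run_bandit(win_rates, steps=2000, epsilon=0.1, seed=0):
    """Learn by trial and error which option pays off most often (epsilon-greedy)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pulls = [0] * len(win_rates)
    estimates = [0.0] * len(win_rates)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random option.
            arm = rng.randrange(len(win_rates))
        else:
            # Exploit: otherwise pick the option that currently looks best.
            arm = max(range(len(win_rates)), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < win_rates[arm] else 0.0
        pulls[arm] += 1
        # Incremental average of the rewards seen so far for this option.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls

# Hypothetical hidden win rates; the agent never sees these directly.
estimates, pulls = run_bandit([0.2, 0.8])
best_option = max(range(len(estimates)), key=lambda i: estimates[i])
print(estimates, pulls, best_option)
```

The agent is never told which option is better; it works that out from the rewards it receives, which is the key difference from learning on labelled data.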

This concludes the basic introductory notes on data science and AI.

References

1 https://www.sartorius.com/en/knowledge/science-snippets/data-science-vs-artificial-intelligence-vs-machine-learning-602514 accessed on 06.02.2021

2 https://www.cambiahealth.com/techconnect/augmented-intelligence-in-claim-reviews accessed on 13.3.2021

3 https://towardsdatascience.com/role-of-data-science-in-artificial-intelligence-950efedd2579 accessed on 13.3.2021

4 https://www.datarobot.com/wiki/semi-supervised-machine-learning accessed on 3.4.2021