Glossary

The following terminology is used in the context of this course:

Model

A machine learning model is an entity that has been trained to recognize certain types of patterns. This can be a complex convolutional neural network (CNN) with millions of parameters or a simple linear regression model with only two parameters.
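As a minimal sketch of the simple end of that range, the following fits a linear regression model with NumPy. The data are made up for illustration, and `predict` is a hypothetical helper name, not part of any library.

```python
import numpy as np

# Hypothetical training data that follow y = 2x + 1 exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# "Training" here is a least-squares fit of a degree-1 polynomial.
# The whole model consists of two parameters: slope and intercept.
slope, intercept = np.polyfit(x, y, deg=1)

def predict(x_new):
    """Apply the trained model to new inputs (hypothetical helper)."""
    return slope * x_new + intercept

print(slope, intercept, predict(10.0))
```

Once the fit is done, the two numbers `slope` and `intercept` are the entire model; the training data are no longer needed to make predictions.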

Supervised Learning

Supervised learning (SL) is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. A supervised learning algorithm analyzes the training data and produces an inferred function, which can then be used to map new, unseen inputs to outputs.
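One of the simplest ways to illustrate this is a 1-nearest-neighbor classifier, sketched here with NumPy. The data, the labels, and the function name `predict` are assumptions chosen for illustration; the technique is just one of many supervised learning algorithms.

```python
import numpy as np

# Labeled training data: each input (a row of X_train) is paired with an output label.
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
y_train = np.array([0, 0, 1, 1])

def predict(X_new):
    """1-nearest-neighbor: return the label of the closest training input."""
    # Pairwise Euclidean distances, shape (n_new, n_train).
    dists = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[np.argmin(dists, axis=1)]

# New examples that were not part of the training data.
X_new = np.array([[2.0, 1.0], [8.5, 9.0]])
labels = predict(X_new)
print(labels)
```

Here the "inferred function" is `predict`: it was derived from the input-output pairs and maps inputs it has never seen to one of the known labels.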

Unsupervised Learning

Unsupervised learning refers to the use of machine learning to identify patterns in data containing samples that are neither classified nor labeled.
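Clustering is a common example of this. The following is a bare-bones k-means sketch in NumPy, assuming two clusters and a fixed, hand-picked initialization; the data are invented, and edge cases such as empty clusters are deliberately not handled.

```python
import numpy as np

# Unlabeled data: only inputs, no class or label per sample.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])

# Assumption: start with the first and the last sample as cluster centers.
centers = X[[0, -1]].copy()

for _ in range(10):
    # Assign every sample to its nearest center.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    assignment = np.argmin(dists, axis=1)
    # Move each center to the mean of the samples assigned to it.
    centers = np.array([X[assignment == k].mean(axis=0) for k in range(2)])

print(assignment)
print(centers)
```

The algorithm is never told which group a sample belongs to; the two groups emerge from the structure of the data alone.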

Overfitting

Overfitting occurs when a model fits its training data too closely or even exactly. When the model memorizes the noise in the training set instead of the underlying pattern, it becomes “overfitted” and is unable to generalize well to new data.
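The effect can be reproduced with a small sketch that compares a simple and a very flexible polynomial fit. The data are made up: the true relationship is assumed to be y = x, and the "noise" is a fixed alternating offset so that the result is deterministic.

```python
import numpy as np

# Training data: true relationship y = x plus hand-picked alternating noise.
x_train = np.arange(6, dtype=float)
noise = np.array([0.5, -0.5, 0.5, -0.5, 0.5, -0.5])
y_train = x_train + noise

# Noise-free test data from the same relationship, between the training points.
x_test = np.array([0.5, 1.5, 2.5, 3.5, 4.5])
y_test = x_test

def mse(coeffs, x, y):
    """Mean squared error of a polynomial model on the given data."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

simple = np.polyfit(x_train, y_train, deg=1)    # 2 parameters
flexible = np.polyfit(x_train, y_train, deg=5)  # 6 parameters for 6 points

for name, coeffs in [("degree 1", simple), ("degree 5", flexible)]:
    print(name,
          "train MSE:", mse(coeffs, x_train, y_train),
          "test MSE:", mse(coeffs, x_test, y_test))
```

The degree-5 polynomial passes through every training point, so its training error is essentially zero, yet its error on the test points is far larger than that of the straight line. It has fitted the noise rather than the underlying relationship.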

Sample

An instance, data point, or observation with one or more features (e.g., one specific pixel in an RGB image). In a data matrix, samples are represented as rows.

Feature

The individual elements (measurable properties) of a sample (e.g., the red band of an RGB image). In a data matrix, features are represented as columns.
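To make the row and column convention for samples and features concrete, here is a small sketch that turns a tiny RGB image into a data matrix with NumPy. The 2x3 image and its pixel values are hypothetical, and the band order (red, green, blue) is an assumption.

```python
import numpy as np

# Hypothetical RGB image: height 2, width 3, 3 color bands (assumed order R, G, B).
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# Flatten the two spatial dimensions: one row per pixel, one column per band.
data = image.reshape(-1, 3)
n_samples, n_features = data.shape

first_pixel = data[0]   # one sample: all band values of a single pixel
red_band = data[:, 0]   # one feature: the red value of every pixel

print(n_samples, n_features)
print(first_pixel)
print(red_band)
```

The resulting matrix has 6 rows (samples, one per pixel) and 3 columns (features, one per color band).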