Key terms

Machine learning (ML) - a subset of AI that involves training algorithms with data rather than developing hand-crafted algorithms. A machine learning solution uses a data set to train an algorithm. Typically this is a classifier, which says what type of thing the data is (e.g. this picture is of a dog); a regressor, which estimates a value (e.g. the price of this house is £400,000); or an unsupervised model, such as a generative model like GPT-3, which can be used to generate novel text. Confusingly, when data scientists talk about regression they mean something completely different from what software engineers mean by the same term: in ML, regression is the estimation of a numeric value, whereas in software engineering a regression is a defect that breaks something that previously worked.
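As a rough illustration of the difference between a classifier and a regressor, here is a minimal sketch in plain Python. The data, the feature choices and the function names (`train_classifier`, `train_regressor`) are all invented for this example; a real solution would normally use an ML library and far more data.

```python
# Toy illustration only: the data and all names below are invented.

def train_classifier(examples):
    """Train a 1-nearest-neighbour classifier, which simply remembers the examples."""
    def classify(features):
        # Return the label of the closest training example (squared Euclidean distance).
        nearest = min(
            examples,
            key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], features)),
        )
        return nearest[1]
    return classify


def train_regressor(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Classifier: (weight in kg, height in cm) -> a category.
animals = [
    ((30.0, 60.0), "dog"),
    ((4.0, 25.0), "cat"),
    ((25.0, 55.0), "dog"),
    ((5.0, 23.0), "cat"),
]
classify = train_classifier(animals)
label = classify((28.0, 58.0))  # "dog"

# Regressor: floor area in square metres -> a price in pounds.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [200_000.0, 320_000.0, 400_000.0, 480_000.0]
estimate_price = train_regressor(areas, prices)
price = estimate_price(100.0)  # approximately 400000.0
```

The classifier returns one of a fixed set of labels, while the regressor returns a number on a continuous scale.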

Model - in machine learning, a model is the result of training an algorithm with data; it maps a defined set of inputs to outputs. This is different from the standard software use of the term ‘data model’, which is a definition of the data entities, fields, relationships etc. for a given domain and is used to define database structures, among other things.
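The idea that a model is "the result of training an algorithm with data" can be sketched as follows. This is a deliberately simple, made-up example: `train_threshold` and `predict` are hypothetical names, and the "model" here is just a single learned number.

```python
# Toy illustration only: the names and data below are invented.

def train_threshold(values, labels):
    """Training procedure: pick the midpoint between the two class means."""
    positives = [v for v, is_positive in zip(values, labels) if is_positive]
    negatives = [v for v, is_positive in zip(values, labels) if not is_positive]
    mean_positive = sum(positives) / len(positives)
    mean_negative = sum(negatives) / len(negatives)
    return (mean_positive + mean_negative) / 2


def predict(model, value):
    """Apply a trained model: map an input value to an output (True or False)."""
    return value > model


labels = [False, False, True, True]

# The same training procedure with two different data sets gives two different models.
model_a = train_threshold([1.0, 2.0, 8.0, 9.0], labels)      # 5.0
model_b = train_threshold([10.0, 20.0, 80.0, 90.0], labels)  # 50.0

result_a = predict(model_a, 7.0)  # True
result_b = predict(model_b, 7.0)  # False
```

The training code is identical in both cases; only the data differs, and so do the resulting models and their outputs for the same input.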

Algorithm - we use this term more or less interchangeably with model. (There are some subtle differences, but they’re not important, and using the term ‘algorithm’ prevents confusion with the other kind of data model.)

Ground-truth data - a machine learning solution usually needs a data set that contains the input data (e.g. pictures) along with the associated answers (e.g. this picture is of a dog, this one is of a cat). These answers are the ‘ground truth’.

Labelled data - means the same as ground-truth data.
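A minimal sketch of what ground-truth (labelled) data might look like in code, and how it can be used to check a model's predictions. The file paths, labels and the `accuracy` helper are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class LabelledExample(NamedTuple):
    image_path: str  # the input data (hypothetical file path)
    label: str       # the associated answer, i.e. the ground truth


# A tiny, invented labelled data set.
ground_truth = [
    LabelledExample("images/0001.jpg", "dog"),
    LabelledExample("images/0002.jpg", "cat"),
    LabelledExample("images/0003.jpg", "dog"),
    LabelledExample("images/0004.jpg", "cat"),
]


def accuracy(predictions, examples):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(p == ex.label for p, ex in zip(predictions, examples))
    return correct / len(examples)


# Pretend these came from a trained model; the third one is wrong.
predictions = ["dog", "cat", "cat", "cat"]
score = accuracy(predictions, ground_truth)  # 0.75
```

The same labelled examples serve two purposes: they are what the algorithm is trained on, and they provide the answers against which its predictions are scored.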
