Workshop 5: Introduction

Machine learning is a broad field concerned with developing and using computational algorithms to learn something from our data and/or make better use of it. The term is used in so many different contexts that defining it in any more specific terms turns out to be very tricky. It is closely related to the concept of artificial intelligence, although the definition of artificial intelligence is itself a moving target that has changed over the years. Linear regression is most certainly a method for performing a machine learning task, but is it considered a method for artificial intelligence? Certainly not on its own.

For the analysis of biological data, machine learning tasks almost always fit into one of two categories: supervised learning and unsupervised learning.

For supervised learning, we employ learning algorithms that generate estimates of an outcome variable based on a set of predictor variables. In contrast, for unsupervised learning, we do not have an outcome variable. Instead, we employ learning algorithms to identify previously unknown relationships between observations. The most common type of unsupervised learning is clustering, in which we split our data points into clusters of closely related observations based on a measure of their similarity. For biological inference, this is typically followed by additional analyses to assign meaning to the resulting cluster labels.

Both R and Python versions are available for this workshop.
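As a rough illustration of the difference between the two categories, the sketch below uses only the Python standard library. It is not taken from the workshop materials: the function names `fit_line` and `kmeans_1d`, the toy `dose`, `response`, and `expression` values, and the starting cluster centers are all made up for this example. The supervised part fits a straight line to predict a known outcome; the unsupervised part groups values with a very small one-dimensional k-means, with no outcome variable involved.

```python
from statistics import mean


def fit_line(x, y):
    """Supervised learning: ordinary least squares with one predictor.

    Returns (slope, intercept) of the line that best predicts y from x.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def kmeans_1d(values, centers, n_iter=10):
    """Unsupervised learning: k-means clustering in one dimension.

    `centers` holds the starting guesses; returns (labels, final centers).
    """
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # leave a center unchanged if its cluster is empty
                centers[k] = mean(members)
    return labels, centers


# Supervised: predictors (dose) AND a measured outcome (response). Toy data.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(dose, response)
predicted = slope * 6 + intercept  # estimate the outcome for an unseen dose

# Unsupervised: observations only, no outcome variable. Toy data.
expression = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels, centers = kmeans_1d(expression, centers=[0.0, 6.0])

print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
print("cluster labels:", labels)
```

Note that the cluster labels are just arbitrary integers. Deciding what each cluster represents biologically is the follow-up analysis described above. In practice, an established library would normally replace both hand-written functions.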