Artificial Intelligence explained – Part 1: Technology

13. June 2023

by Severin Trösch

We address the following questions in our new blog series “Understanding Artificial Intelligence”: What is artificial intelligence (AI)? What is behind ChatGPT? Where do we find AI in our everyday lives? What are the most cutting-edge applications? And what does the “AI revolution” mean for us as a society?
In three parts, we provide an overview of the technical basics (→ Part 1: Technology), highlight concrete use cases (→ Part 2: Application), and discuss societal implications of AI (→ Part 3: Society).

Begin at the beginning: What is AI?

Artificial intelligence (AI) can be understood as the ability of a machine to learn and solve problems. Originally, the field mainly included systems that worked according to fixed, human-programmed rules. In recent decades, however, algorithms that learn from data and progressively improve their performance (so-called “machine learning”) have become increasingly important. Machine learning is now the central component of AI. It includes “deep learning” (systems based on artificial neural networks), which in turn includes GPT models (networks specialized in text generation, such as the one behind ChatGPT). Figure 1 shows the relationship of these terms graphically.

*Figure 1: Venn diagram of AI*.

Although different machine learning applications can have completely different goals and characteristics, they are all based on the same basic principle.

The basic principle: Learning models and making predictions

In machine learning, computer systems find patterns in existing data, formalize them mathematically, and then apply the learned relationships to new cases. For example, Figure 2 shows the height (X-axis) and weight (Y-axis) of 30 people (blue dots). The general relationship “the taller, the heavier” can be clearly seen by eye.

*Figure 2: Linear model of height and weight*.

A suitable machine learning model, in this case a simple linear model, can capture this pattern mathematically. From the 30 data points, the computer determines the straight line that best fits the relationship between height and weight (green line). It can then “predict” a weight for any body height, that is, calculate the target variable “weight” using the fitted model (see Figure 3). In simple terms, the slope of this line describes how many kilograms each additional centimeter of body height adds on average.

*Figure 3: From the input “height” to the output “weight”.*
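The fitting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the figures: the six height/weight pairs are invented, and the function names `fit_linear_model` and `predict_weight` are hypothetical. The fit uses ordinary least squares, the standard way to find the best-fitting straight line.

```python
# Hypothetical example data (height in cm, weight in kg).
# These values are invented for illustration; they are NOT the 30 points in Figure 2.
heights = [160.0, 165.0, 170.0, 175.0, 180.0, 185.0]
weights = [55.0, 61.0, 64.0, 70.0, 76.0, 82.0]


def fit_linear_model(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(height_cm, slope, intercept):
    """Apply the fitted model to a new input (the 'prediction' step)."""
    return intercept + slope * height_cm


slope, intercept = fit_linear_model(heights, weights)
print(f"Each additional cm adds about {slope:.2f} kg")
print(f"Predicted weight at 172 cm: {predict_weight(172.0, slope, intercept):.1f} kg")
```

The two functions mirror the two phases of the principle: `fit_linear_model` learns the pattern from existing data, and `predict_weight` applies it to a new case.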

More complex AI: Image recognition and ChatGPT

This input-model-output principle can be generalized beyond the linear case to arbitrarily complex functions. The model type (a linear model in the example above) is chosen to match the task at hand, that is, which input comes in on the left and which output should come out on the right. For example, if a model is to determine what a photograph shows, the pixels of the image are given as input to a neural network specialized in image recognition (“convolutional neural network” or CNN). If a CNN is shown thousands of labeled example images, it can learn the important patterns and then correctly classify new images:

*Figure 4: Image classification with specialized neural networks*.
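The building block that gives a CNN its name is the convolution: a small filter slides across the pixels and responds strongly wherever a particular pattern appears. The sketch below is only an illustration under simplifying assumptions. The 4×4 “image” and the filter values are set by hand, and the helper `convolve2d` is a hypothetical name. In a real CNN, the filter values are learned from the training images, and many filters are stacked in layers.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Returns a map of how strongly each image patch matches the kernel.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the patch element-wise with the kernel and sum up.
            total = 0
            for a in range(kernel_h):
                for b in range(kernel_w):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# Toy grayscale image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set filter that responds to a dark-to-bright vertical edge.
vertical_edge_filter = [
    [-1, 1],
    [-1, 1],
]

response = convolve2d(image, vertical_edge_filter)
for row in response:
    print(row)
```

The response is largest in the middle column, exactly where the image changes from dark to bright. Detecting such simple patterns and combining them over several layers is how a CNN gets from raw pixels to a classification.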

Specialized models also exist for language processing: so-called “transformers” (the “T” in “ChatGPT” stands for transformer) are trained on huge text libraries to learn patterns in human language. They can then be used to predict the next word in a sequence (read more about ChatGPT in our corresponding post [LINK]). Figure 5 illustrates this using the Datahouse slogan:

*Figure 5: Prediction of the next word by ChatGPT*.
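The idea of predicting the next word can be illustrated with a much simpler stand-in than a transformer. The sketch below is not how ChatGPT works internally: it merely counts which word most often follows each word in a tiny, invented training text (a so-called bigram model). The function names and the example sentences are hypothetical. What it shares with a transformer is the principle: learn patterns from existing text, then predict the next word.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if the word is unknown."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Tiny invented training text; real language models are trained on vastly more data.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_model(corpus)

print(predict_next_word(model, "the"))
print(predict_next_word(model, "sat"))
```

A bigram model only looks at the single preceding word. Transformers differ in that they take the whole preceding context into account, which is what allows them to produce coherent longer texts.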

Although the complexity of the models applied and the variable to be predicted vary, all machine learning systems are ultimately expressions of the same, simple principle: A mathematical model learns patterns in existing data and derives predictions for new cases.

In the second part of our series, we will build on this knowledge and look at concrete applications of AI systems: Where do we find AI in our everyday lives? And what are the “cutting-edge” applications? Stay tuned to the blog series to find out.