Machine learning as a web application

11 Mar 19

Everyone is currently using buzzwords like artificial intelligence (AI) and machine learning (ML). Our data scientist explains what they mean in simple terms. Train your own models using Predictoor, a practical web application, and the Random Forest algorithm.

Figure 1: Venn diagram of artificial intelligence

What is machine learning?

Can you predict the sale price of a detached house? Which of your customers are likely to cancel their contract soon? How can you distinguish between suspicious transactions and normal bank transfers? Questions like these lead directly into the world of artificial intelligence and machine learning, in which data is turned into information.

Supervised learning, a subfield of machine learning, uses algorithms that analyse a large number of example cases to find connections between a target value and explanatory variables. After this training phase, the algorithm can apply the learned connections to new cases and thus predict the target value.
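As a rough illustration of the two phases, the sketch below fits a straight line through a handful of height and weight pairs and then applies it to a new case. The data values and the function names `fit_line` and `predict` are invented for this example; ordinary least squares is simply one of the simplest supervised methods and is not necessarily what any particular tool uses.

```python
# Toy training data (invented for illustration): height in cm -> weight in kg.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [50.0, 58.0, 66.0, 74.0, 82.0]


def fit_line(xs, ys):
    """Training phase: find slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Prediction phase: apply the learned connection to a new case."""
    slope, intercept = model
    return slope * x + intercept


# Learn from the known cases, then predict the target value for a new height.
model = fit_line(heights, weights)
predicted_weight = predict(model, 175.0)
```

Here the height is the explanatory variable, the weight is the target variable, and each pair of values is one case.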

Variables

Explanatory variables describe each case in the dataset and help the algorithm find connections.

Figure 2: Linear model of height and weight.

Target variable

What do you want to predict? For example, which employees are at risk of leaving the company?

Figure 3: From the input “height” to the output “weight”.

Cases

Every row is an observation. The algorithm learns from these observations in order to make predictions for new cases.

Figure 4: Image classification with specialised neural networks.

No longer just for experts

For a long time, only specialists could develop and use complex prediction models. They tested, supplemented and transformed input data, selected a suitable model class, optimised the model parameters and validated the accuracy of their predictions. Modern algorithms and sensible preparatory steps now enable large parts of this process to be automated. The resulting models supply robust and accurate predictions, provided the data permits this.

However, the preparatory steps in particular are unavoidable and can be rather complex. To ensure that the learning algorithm can make accurate predictions, a well-structured dataset is required. As part of this, the target variable must be defined precisely, and the explanatory variables must be prepared so that they describe it as well as possible (a process known as “feature engineering”).
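To make the idea of feature engineering a little more concrete, here is a small sketch that derives explanatory variables from raw house records. The field names, the records and the function `engineer_features` are all hypothetical; which derived variables actually help depends entirely on the data and the question.

```python
from datetime import date

# Hypothetical raw records; the field names are invented for this sketch.
raw_houses = [
    {"built": 1985, "living_area_m2": 140.0, "rooms": 5, "sold_on": date(2018, 6, 1)},
    {"built": 2010, "living_area_m2": 95.0, "rooms": 4, "sold_on": date(2019, 1, 15)},
]


def engineer_features(house):
    """Turn raw fields into explanatory variables a model can use directly."""
    return {
        # Age at sale may be more informative than the raw construction year.
        "age_at_sale": house["sold_on"].year - house["built"],
        # A ratio can express something neither raw column shows on its own.
        "area_per_room": house["living_area_m2"] / house["rooms"],
        # Keep the sale month as a number so seasonal effects can be picked up.
        "sale_month": house["sold_on"].month,
    }


features = [engineer_features(house) for house in raw_houses]
```

Each dictionary in `features` is one case with three derived variables that could be passed to a learning algorithm alongside the original columns.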

“Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine learning’ is basically feature engineering.”

Andrew Ng

The proof of the pudding is in the eating

Perhaps you already have a dataset that describes your products, customer data or your competitors’ offers, for example, and would like to predict a key variable of this dataset with the aid of the other information. In this way, you could, for instance, estimate future sales of your products, identify and contact customers with a high cancellation risk or take a closer look at your rivals’ price models.

Why not give it a try? Predictoor is your artificial intelligence, learning how to predict the desired target value from your datasets on the basis of your selected information.

Predictoor

Predictoor first checks the size and completeness of your dataset and decides which variables are suitable for modelling. Based on the target variable you select, Predictoor automatically chooses a suitable regression or classification model (predicting numerical values such as prices, or predicting categories such as “cancels” / “does not cancel”). During the learning phase, it optimises the model parameters and checks the model quality by comparing its predictions for previously withheld data points with the actual values (this important check is called cross-validation). It presents the prediction accuracy and the importance of each explanatory variable in an easy-to-understand way. Finally, you can use the trained model to generate predictions for new cases.
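The cross-validation check can be sketched in a few lines. This is a generic k-fold illustration, not Predictoor's actual implementation: the functions `k_fold_indices` and `cross_validate`, the choice of mean absolute error and the trivial "always predict the mean" model are all assumptions made for the example.

```python
def k_fold_indices(n_cases, k):
    """Split the case indices 0..n_cases-1 into k interleaved folds."""
    return [list(range(i, n_cases, k)) for i in range(k)]


def cross_validate(xs, ys, fit, predict, k=5):
    """Return the mean absolute error over all held-out predictions."""
    errors = []
    for held_out in k_fold_indices(len(xs), k):
        held_out_set = set(held_out)
        # Train only on the cases that are NOT in the held-out fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out_set]
        train_y = [y for i, y in enumerate(ys) if i not in held_out_set]
        model = fit(train_x, train_y)
        # Compare predictions for the held-out cases with the actual values.
        for i in held_out:
            errors.append(abs(predict(model, xs[i]) - ys[i]))
    return sum(errors) / len(errors)


# Deliberately simple stand-in model: always predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


xs = list(range(10))
ys = [2.0 * x for x in xs]
baseline_error = cross_validate(xs, ys, fit_mean, predict_mean, k=5)
```

Because every case is predicted by a model that never saw it during training, the resulting error is a fairer estimate of how the model behaves on new data. In practice the cases are usually shuffled before being split into folds; the interleaved split here just keeps the sketch deterministic.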

It’s based on a random forest

Predictoor currently learns using a well-known and powerful algorithm known as Random Forest (developed by Leo Breiman in 1999). It generates a large number of decision trees, each of which is deliberately influenced by chance. To make a prediction, it takes the average of the results of all the individual trees (when predicting categories, the trees vote instead). The name derives from the fact that many trees make a forest.
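The following sketch shows the core idea in a heavily simplified form, for numerical targets only. Each "tree" is just a single split (a decision stump) trained on a random resample of the cases and on one randomly chosen variable, whereas a real Random Forest grows much deeper trees and picks a random subset of variables at every split. The toy data and all function names are invented for illustration and say nothing about how Predictoor is implemented.

```python
import random


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single feature."""
    values = sorted({row[feature] for row in rows})
    overall = sum(targets) / len(targets)
    # (squared error, threshold, left prediction, right prediction)
    best = (float("inf"), None, overall, overall)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((t - left_mean) ** 2 for t in left)
        error += sum((t - right_mean) ** 2 for t in right)
        if error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return feature, threshold, left_mean, right_mean


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    # threshold is None if the resample contained only one distinct value.
    if threshold is None or row[feature] <= threshold:
        return left_mean
    return right_mean


def fit_forest(rows, targets, n_trees=50, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Chance, part 1: resample the cases with replacement (bootstrap).
        picks = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in picks]
        sample_targets = [targets[i] for i in picks]
        # Chance, part 2: each stump only sees one randomly chosen variable.
        feature = rng.randrange(n_features)
        forest.append(fit_stump(sample_rows, sample_targets, feature))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all individual trees."""
    predictions = [predict_stump(stump, row) for stump in forest]
    return sum(predictions) / len(predictions)


# Invented toy data: [living area in m2, number of rooms] -> price in thousands.
rows = [[60, 2], [75, 3], [90, 3], [110, 4], [140, 5], [180, 6]]
prices = [300.0, 360.0, 420.0, 500.0, 640.0, 800.0]
forest = fit_forest(rows, prices)
small = predict_forest(forest, [65, 2])
large = predict_forest(forest, [170, 6])
```

Each individual stump is a crude predictor, but because every one sees a slightly different version of the data, their errors partly cancel out when the predictions are averaged.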

You can also acquaint yourself with Predictoor without your own data. A number of different sample datasets are available to help you train your first models. So let Predictoor predict the survival chances of Titanic passengers, house prices or the quality of certain wines!

Additional information

Find out what modern algorithms can do with your datasets! We’ll gladly support you in this by providing additional data, preparatory steps or specifically optimised models to enable you to turn data into information.