Smart Predict

In the first part of our HowTo blog series, we already showed how you can use the Smart Predict feature to train an AI based on data. We demonstrated this using data from a fictitious store.

Once we have created our predictive model, it is useful to first look at the results of the training in order to identify the drivers and trends that led to these results. SAP Analytics Cloud (SAC) Smart Predict provides us with comprehensive tools to explore all the necessary details of the underlying model. These tools are tailored to the respective model types (classification, regression and time series) and thus differ slightly between the scenarios. In this part of the series, we will only discuss the analysis of a regression model.

Root Mean Square Error for measuring quality

The quality of a regression model can be measured by the root mean square error (RMSE). This statistical indicator is the square root of the mean squared deviation between the predicted and the actual values, so it is expressed in the same unit as the target. It is used to assess the quality of an estimate: the smaller the RMSE, the closer the predictions are to the actual values. The RMSE thus provides information about the robustness of the model and indicates how far similar statements can be made for new data sets.
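To make the definition concrete, here is a minimal sketch of how the RMSE can be computed by hand. The function name `rmse` and the profit values are made up for illustration; they are not taken from the store data set, and the code is not the implementation used by Smart Predict.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    # Square each deviation, average the squares, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical profit values (assumption, for illustration only).
actual_profit = [100.0, 200.0, 300.0, 400.0]
predicted_profit = [110.0, 190.0, 310.0, 390.0]

example_rmse = rmse(actual_profit, predicted_profit)
print(example_rmse)
```

Because every prediction in this toy example is off by exactly 10, the RMSE is 10 as well. With real data, large individual deviations weigh more heavily than small ones because the errors are squared before averaging.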

Examination of the model with SAC Smart Predict

SAC Smart Predict divides our training data set into two parts. One part is used to train the regression model. The other part is used to validate the trained model. The Root Mean Square Error given in this example is calculated from these two data sets. In Target Statistics, we can see additional information such as the mean and standard deviation for the individual partitions of the training data set.

In our case, we achieved a confidence of 95.21% with this model. This is just above the recommended confidence level of 95%. Ideally, a confidence level of over 99% should be aimed for. We have an error (RMSE) of 127.47. This means that the actual value typically deviates from our prediction by about +/- 127.47. Ideally, this value should be smaller than the standard deviation of the target, because only then is the model better than a very naive model consisting of the mean +/- standard deviation.
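The comparison with the naive model can be sketched in a few lines. The example below assumes a baseline that always predicts the mean of the training partition; its RMSE on the validation partition is then roughly the standard deviation of the target. All names and numbers are hypothetical and only illustrate the idea, not how Smart Predict computes its indicators.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def compare_with_naive_baseline(training_actual, validation_actual, validation_predicted):
    """Return (model_rmse, baseline_rmse, model_is_better).

    The baseline is a naive model that always predicts the training mean.
    """
    baseline_value = statistics.fmean(training_actual)
    baseline_predictions = [baseline_value] * len(validation_actual)
    baseline_rmse = rmse(validation_actual, baseline_predictions)
    model_rmse = rmse(validation_actual, validation_predicted)
    return model_rmse, baseline_rmse, model_rmse < baseline_rmse


# Hypothetical partitions (assumption, for illustration only).
training_actual = [120.0, 80.0, 150.0, 50.0]
validation_actual = [90.0, 130.0, 60.0, 120.0]
validation_predicted = [95.0, 120.0, 70.0, 115.0]

model_rmse, baseline_rmse, model_is_better = compare_with_naive_baseline(
    training_actual, validation_actual, validation_predicted
)
print(f"model RMSE: {model_rmse:.2f}, baseline RMSE: {baseline_rmse:.2f}")
print("model beats naive baseline:", model_is_better)
```

If the model RMSE is not clearly below the baseline RMSE, the influencers add little information beyond the mean of the target.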

Influencer Contributions

The Influencer Contributions are quite self-explanatory. However, for the sake of completeness, we would like to discuss them in more detail. Influencers are variables that have an influence on the target. By default, all columns and dimensions are considered as influencers. After training, these are reduced to the columns and dimensions that actually matter. In our case, we can see that sales and discounts have the biggest impact on profit. The Influencer Contributions view then offers a slightly more detailed breakdown of each influencer.

Predicted vs. Actual Graph

The Predicted vs. Actual Graph allows us to determine the accuracy of our model at a glance. The graph consists of three different curves.

  • Green - Perfect Model: The curve represents a hypothetical perfect model.
  • Blue - Validation Actual: This curve shows the actual target value as a function of the prediction.
  • Blue dotted - Validation Error Min/Max: These two curves represent the expected minimum and maximum deviation of the validation data set. The range between the two curves is the confidence interval.

How can these graphs now be interpreted?

Ideally, we have a model in which the green and blue curves are close together and have a similar shape. In this case, we can be confident that our model can make smart predictions about unknown values.
If this is not the case, it means that the quality and robustness of our model are not very good. Then the model should be trained with larger or new data sets. Also, new influencers should possibly be considered.

If the curves are mostly the same and differ only in certain segments, this indicates that the model itself is good, but that improvements are still possible. It is likely that there was not enough training data for the segments with large deviations. Again, one should possibly expand the training data set or add new influencers.
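One way to locate such segments outside of the tool is to sort the validation records by their predicted value, split them into segments, and compare the mean prediction with the mean actual value per segment. The following is only a sketch under that assumption; the function `segment_deviations`, the number of segments, and the data are hypothetical and do not reproduce how Smart Predict builds its graph.

```python
from statistics import fmean


def segment_deviations(predicted, actual, n_segments=4):
    """Split records (sorted by prediction) into segments and return
    a list of (mean_predicted, mean_actual, gap) tuples, one per segment."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = sorted(zip(predicted, actual))
    n = len(pairs)
    segments = []
    for i in range(n_segments):
        # Integer boundaries spread the records as evenly as possible.
        chunk = pairs[i * n // n_segments:(i + 1) * n // n_segments]
        if not chunk:
            continue  # fewer records than segments
        mean_predicted = fmean([p for p, _ in chunk])
        mean_actual = fmean([a for _, a in chunk])
        segments.append((mean_predicted, mean_actual, abs(mean_actual - mean_predicted)))
    return segments


# Hypothetical validation data (assumption, for illustration only):
# the model fits well for low predictions and poorly for high ones.
predicted = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
actual = [12.0, 18.0, 31.0, 39.0, 52.0, 58.0, 95.0, 110.0]

segments = segment_deviations(predicted, actual, n_segments=4)
worst_segment = max(segments, key=lambda segment: segment[2])
print(worst_segment)
```

In this toy data the largest gap appears in the segment with the highest predictions, which is where additional training data or new influencers would be most helpful.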

In our case, we are satisfied with our model for now. In the next part of our HowTo blog series, we will apply our predictive model to a new data set to generate profit predictions.
