Predictive Modeling and Predictive Models

A predictive model is a system built to make predictions. Predictive models can predict or forecast a variety of things and events, for example future share prices, credit defaults, insurance claims, and customer product orders. They are developed from historical data or from data collected purposely through sampling. Typical examples include:

Annual insurance policy applications and claims records can be used to develop models that predict the probability (or level of risk) of insurance claims. Such models use demographic and financial information about policyholders, along with characteristics of the insured objects, to determine risk levels. For more, please read Insurance Risk Modeling.

Credit loans
Predicting default risk for credit loan applications is another use of predictive models. Data collected from past customer loans, including demographic and financial information about borrowers, can be used to build models that predict the likelihood that a loan will default. For more, please read Credit Risk Modeling.

Predictive modeling can be used for various other tasks. For example, from past customer purchasing records, you can develop models that select customers who are likely to buy your new products. Another example is customer churn detection: using past customer information, you can build models that predict which customers are likely to churn in the near future. This can be very useful in retention marketing. For more, please read customer retention.
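As a minimal sketch of the idea of learning from past records and applying the result to current customers, the Python code below estimates a churn rate for each customer segment and flags customers in high-rate segments. It is an illustration only, not the method used by any particular software: the segment labels, customer names, function names, the 0.5 threshold, and the records themselves are all invented for this example.

```python
from collections import defaultdict

# Toy past records, invented for illustration: (contract_type, churned)
past_customers = [
    ("monthly", True), ("monthly", True), ("monthly", False), ("monthly", True),
    ("annual", False), ("annual", False), ("annual", True), ("annual", False),
]

def learn_churn_rates(records):
    """Return the observed churn rate for each segment in the past data."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for segment, did_churn in records:
        totals[segment] += 1
        if did_churn:
            churned[segment] += 1
    return {segment: churned[segment] / totals[segment] for segment in totals}

def likely_to_churn(customers, rates, threshold=0.5):
    """Select customers whose segment churn rate exceeds the threshold."""
    return [name for name, segment in customers
            if rates.get(segment, 0.0) > threshold]

rates = learn_churn_rates(past_customers)

# Hypothetical current customers: (name, contract_type)
current_customers = [("cust_a", "monthly"), ("cust_b", "annual")]
at_risk = likely_to_churn(current_customers, rates)

print(rates)    # churn rate per segment
print(at_risk)  # customers flagged for retention marketing
```

A real model would use many more attributes and a learning method such as those described in the next section, but the workflow is the same: learn patterns from past cases, then score new ones.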

* For predictive modeling software used here, please read Data Mining Software. Software download is available from the page.

Predictive Modeling Software Tools

Predictive modeling is done automatically by computer software that learns patterns from data. CMSR Data Miner supports powerful predictive modeling tools. Users can build models with the help of intuitive model visualization tools, and can apply models directly to their own data using the built-in database interface tools. CMSR comes with the following predictive modeling software programs:

  • Predictive Neural Network
    A neural network is a predictive model loosely based on the architecture of the brain. It can be used to predict both numerical values and categorical classifications. Generally speaking, of the methods listed here, neural networks offer the most accurate and versatile predictive models. For more, please read Neural Network.

  • Decision Tree
    A decision tree develops predictive models through recursive segmentation of the data, so the resulting models have tree-like structures. As a rule, a decision is made on the majority (winner-takes-all) principle: the category with the largest number of cases at a decision node becomes the predicted category. This has limitations, since the prediction ignores how large the winning margin is. To overcome this, StarProbe data miner also uses probability. For more, please read Decision Tree Classifier Software.

  • Regression
    Compared to the methods above, regression can be limiting and inflexible, since all categorical information must be encoded as numerical variables. However, regression can be very useful for developing mathematically oriented models with simple variable sets. In particular, time-series regression analytics are very useful in balanced scorecard applications.
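The winner-takes-all rule described under Decision Tree above, and the probability-based alternative, can be sketched in a few lines of Python. This is a simplified illustration, not the software's implementation: the function names, category labels, and case counts are hypothetical.

```python
def winner_takes_all(class_counts):
    """Return the category with the largest number of cases at a node."""
    return max(class_counts, key=class_counts.get)

def class_probabilities(class_counts):
    """Return the share of cases in each category at a node."""
    total = sum(class_counts.values())
    return {category: n / total for category, n in class_counts.items()}

# Two hypothetical decision nodes with invented case counts.
node_a = {"no_claim": 95, "claim": 5}
node_b = {"no_claim": 60, "claim": 40}

for name, node in (("node_a", node_a), ("node_b", node_b)):
    print(name, winner_takes_all(node), class_probabilities(node)["claim"])
```

Both nodes predict "no_claim" under winner-takes-all, even though the share of claims is 0.05 at the first node and 0.4 at the second. Reporting the probability keeps that difference visible, which matters most when one category is rare.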

Figures: decision tree classification predictive modeling; neural network predictive modeling.

How can you develop predictive models? To learn more about predictive modeling,
please read The Cookbook for Predictive Analytics.

Key requirement for predictive modeling

The most important factor in the successful implementation of predictive modeling is the availability of useful information. Predictive models are statistically developed patterns extracted from historical data or from purposely collected sample survey data. With proper data that represents the predictive patterns of the application domain, accurate predictive models can be developed quite easily. For more, please read Cookbook for Predictive Analytics.

Do regression methods work?

Generally speaking, regression methods don't work well for complex modeling. This is especially true if the modeling data are severely skewed, in which case regression tends to produce rather random predictions. The following histograms compare different modeling techniques under severe data skew:

By Neural network
Neural network is a very powerful modeling framework. As shown in the left figure, it can learn in great detail: most green areas are located below 0.4, and most red areas are located above 0.4.
By Cramer Tree Segmentation
The figure on the left is the result of probability modeling using Cramer decision tree segmentation. Although it is not as good as the neural network, it still produces useful result patterns.
By Regression: General Linear Model
This result was produced with general linear regression models. With a general linear model of RR=0.99936, it produces totally useless predictive patterns! The figure shows no pattern in the distribution of reds.
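One way to produce a comparison of this kind is to bin the predicted scores separately for cases where the event did not occur and cases where it did, then check how well the two groups separate around a cutoff such as the 0.4 mark mentioned above. The Python sketch below does this under stated assumptions: green is read as non-events and red as events, the horizontal axis is taken to be the predicted score between 0 and 1, and the function names and scores are invented for illustration. It is not output from any of the tools discussed.

```python
def score_histogram(scores, bins=10):
    """Count how many scores in [0, 1] fall into each equal-width bin."""
    counts = [0] * bins
    for score in scores:
        index = min(int(score * bins), bins - 1)  # a score of 1.0 goes in the last bin
        counts[index] += 1
    return counts

def share_below(scores, cutoff=0.4):
    """Return the fraction of scores that fall below the cutoff."""
    return sum(1 for score in scores if score < cutoff) / len(scores)

# Invented predicted scores, grouped by what actually happened.
no_event_scores = [0.05, 0.12, 0.18, 0.25, 0.33, 0.45]  # assumed "green"
event_scores = [0.38, 0.55, 0.62, 0.71, 0.85]           # assumed "red"

print("no event:", score_histogram(no_event_scores))
print("event:   ", score_histogram(event_scores))
print("share below 0.4:", share_below(no_event_scores), share_below(event_scores))
```

A model that separates the groups well puts most non-event scores below the cutoff and most event scores above it. A model with no predictive pattern gives both groups roughly the same distribution.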
