Deviation Analysis and Detection

Deviation analysis can reveal surprising facts hidden inside data. CMSR Data Miner provides tools for detecting deviations, anomalies, and outliers. Detection is useful for several reasons:

  • knowledge discovery: such information is often a vital part of important business decisions and scientific discoveries.
  • auditing: examining such information can reveal problems and malpractice.
  • fraud detection: fraudulent claims often carry inconsistent information, which can expose fraud cases.
  • data cleaning: such information may come from data entry mistakes, which should be corrected.

Cross Tables and Hidden Patterns

CMSR Data Miner supports powerful deviation detection methods for cross tables. It shows deviation in terms of over-performing and under-performing cells. The methods can reveal patterns and information hidden inside cross table numbers. Note that this tool is also available as an Excel add-in for cross tables. The following shows deviation detection:

[Figure: Cross table - Excel Addin]
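
The source does not state how CMSR Data Miner scores cross table cells, so the following is only a minimal sketch of one common approach: compare each observed count with the count expected if rows and columns were independent. The function name, the sample counts, and the 1.25 / 0.8 thresholds are all assumptions made for illustration.

```python
def cell_ratios(table):
    """Return observed / expected for every cell of a count table.

    Expected counts assume row and column independence:
    expected = row_total * column_total / grand_total.
    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [obs / (row_totals[i] * col_totals[j] / grand_total)
         for j, obs in enumerate(row)]
        for i, row in enumerate(table)
    ]


def label(ratio, high=1.25, low=0.8):
    """Classify a cell; the thresholds are illustrative, not CMSR's."""
    if ratio >= high:
        return "over"
    if ratio <= low:
        return "under"
    return "normal"


# Hypothetical counts: rows are regions, columns are product lines.
counts = [[30, 10],
          [10, 30]]
ratios = cell_ratios(counts)
labels = [[label(r) for r in row] for row in ratios]
```

A ratio well above 1 marks an over-performing cell and a ratio well below 1 an under-performing one. A standardized residual could be used in place of the ratio when cell counts vary widely in size.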

Predictive Modeling

Predictive modeling techniques, such as decision tree, rule induction, and neural network, can be used to detect deviations. To detect anomalies in categorical fields, all three techniques can be used. For numerical fields, however, only neural network can be employed, because decision tree and rule induction cannot predict values for numerical fields. With CMSR Data Miner, this works as follows:

  • Build predictive models for the targeted fields, using the other fields as induction fields.
  • Apply the models to the data in the database and save the predicted values onto the database rows.
  • Identify records whose predicted values differ from the actual values. For numerical fields, the ratio of the actual value to the predicted value can be used. You can do this easily with simple SQL statements.
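
The last step above can be sketched with plain SQL, shown here through Python's built-in sqlite3 module so that it runs on its own. The table name `claims`, its columns, the sample rows, and the ratio bounds (above 2.0 or below 0.5) are hypothetical; in practice the predicted columns would be the values saved by the models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    " id INTEGER, category TEXT, predicted_category TEXT,"
    " amount REAL, predicted_amount REAL)"
)
# Hypothetical rows: actual values next to model predictions.
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "A", 100.0, 110.0),
        (2, "B", "A", 200.0, 190.0),
        (3, "A", "A", 900.0, 300.0),
        (4, "C", "C", 50.0, 120.0),
    ],
)

# Categorical field: the prediction disagrees with the stored value.
mismatched = [row[0] for row in conn.execute(
    "SELECT id FROM claims"
    " WHERE category <> predicted_category ORDER BY id"
)]

# Numerical field: the actual / predicted ratio is far from 1.
# The 2.0 and 0.5 bounds are illustrative only.
ratio_outliers = [row[0] for row in conn.execute(
    "SELECT id FROM claims"
    " WHERE predicted_amount > 0"
    " AND (amount / predicted_amount > 2.0"
    "      OR amount / predicted_amount < 0.5)"
    " ORDER BY id"
)]
```

The flagged records are candidates for review rather than confirmed anomalies, since a mismatch can also come from a weak model.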

Hotspot Analysis

Hotspot analysis can detect outliers. More specifically, it detects patterns of outliers, defined in terms of profile conditions. Outliers can have extremely high or low averages, probabilities, etc. With CMSR Data Miner, you can do this as follows:

  • Search hotspot profiles.
  • Query database using the hotspot profiles and examine the result rows.
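
As a rough illustration of the second step, a hotspot profile can be treated as a set of field conditions and turned into a query. The sketch below uses Python's built-in sqlite3 module; the `customers` table, its columns, the sample rows, and the profile itself are hypothetical, and the profile is written by hand rather than produced by a hotspot search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER, age_group TEXT, region TEXT, spend REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "18-25", "north", 900.0),
        (2, "18-25", "north", 1100.0),
        (3, "18-25", "south", 100.0),
        (4, "40-60", "north", 120.0),
        (5, "40-60", "south", 80.0),
        (6, "26-39", "north", 100.0),
    ],
)

# Hypothetical hotspot profile: field name -> required value.
profile = {"age_group": "18-25", "region": "north"}

# Column names come from our own dict, never from user input;
# the values are bound as parameters.
where = " AND ".join(f"{column} = ?" for column in profile)
rows = conn.execute(
    f"SELECT id, spend FROM customers WHERE {where} ORDER BY id",
    tuple(profile.values()),
).fetchall()

overall_avg = conn.execute(
    "SELECT AVG(spend) FROM customers"
).fetchone()[0]
profile_avg = sum(spend for _, spend in rows) / len(rows)
```

Comparing `profile_avg` with `overall_avg` shows how far the profile stands out, and `rows` holds the individual records to examine.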

Clustering

Clustering objects based on similarity and then analyzing the clusters may reveal outliers. With CMSR Data Miner, you can do this as follows:

  • Cluster objects based on similarity.
  • Examine clusters using cluster visualization tools.
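
The source does not say which clustering algorithm CMSR Data Miner uses. As a minimal sketch of the general idea, the code below applies a simple leader-style clustering (each point joins the first cluster whose leader lies within a fixed radius) and treats clusters with a single member as outlier candidates. The function name, the sample points, the radius, and the minimum cluster size are assumptions for illustration.

```python
import math


def leader_cluster(points, radius):
    """Group points by distance to the first point of each cluster.

    A point joins the first cluster whose leader is within `radius`;
    otherwise it starts a new cluster and becomes its leader.
    """
    clusters = []  # list of (leader, members) pairs
    for point in points:
        for leader, members in clusters:
            if math.dist(point, leader) <= radius:
                members.append(point)
                break
        else:
            clusters.append((point, [point]))
    return clusters


# Hypothetical two-dimensional objects.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.8),
    (9.0, 0.5),
]
clusters = leader_cluster(points, radius=1.0)

# Members of very small clusters are outlier candidates.
outliers = [
    member
    for _, members in clusters
    if len(members) < 2
    for member in members
]
```

The result depends on the order of the points and on the chosen radius, so the candidates should still be checked by eye, for example with cluster visualization tools.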