Rosella Machine Intelligence & Data Mining
Big Data Analytics and Tools for Big Databases: Hive/Hadoop, MySQL, PostgreSQL, ...
CMSR Data Miner / Machine Learning / Rule Engine Studio is an advanced analytics system that combines machine learning, a rule engine, and big data analytics. It provides easy-to-use analytic tools for users of big data databases. It includes dimensional data analysis reporting with time-series regression techniques such as moving average and exponential smoothing, seasonal adjustment, and chi-square statistical analysis. Its visualization tools in particular present this information intuitively.
Supported and fully tested database systems include Apache Hive+Hadoop, MySQL, MariaDB, PostgreSQL, MS SQL Server, and MS Office Access. Other systems such as Oracle and DB2 should work as well. CSV/TSV text files are also supported.
CMSR Studio runs on Windows, Linux, and MacOS.
The main features are described in the sections below.
For more information and downloads, please visit CMSR Data Miner / Machine Learning / Rule Engine Studio.
Big Data Time-series Trend Analysis
The following figures show group-by-group time-series trend analysis tables. The analysis incorporates moving average and exponential smoothing, seasonal adjustment, and chi-square statistical analysis. Green columns are series data. Orange columns are values projected by regression. CMSR uses function-fitting techniques to determine the best regression functions.
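To make the two smoothing techniques named above concrete, here is a minimal Python sketch of a trailing simple moving average and simple exponential smoothing. It illustrates the textbook definitions only. The function names and the `alpha` parameter are assumptions made for this example, and nothing here is taken from CMSR's own implementation.

```python
def moving_average(series, window):
    """Trailing simple moving average.

    Returns len(series) - window + 1 values, one per full window.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t - 1].
    A larger alpha follows recent values more closely.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5], 3))        # [2.0, 3.0, 4.0]
    print(exponential_smoothing([10, 20, 30], 0.5))  # [10, 15.0, 22.5]
```

Either smoothed series can then be passed to a regression step to project future values, which is the role the orange columns play in the tables.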
Big Data Bar Charts
Categorical bar charts can also reveal trends and patterns, as in the following figure:
Big Data 3D Bar Charts
3D bar charts provide a three-dimensional view of the information:
Big Data Cross Table Reports
CMSR cross table reports incorporate chi-square analysis and hotspot visualization, as in the following figure:
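For readers unfamiliar with chi-square analysis of a cross table, the following Python sketch computes the standard Pearson chi-square statistic from a table of observed counts. It is an illustration of the general technique, not CMSR's code. The function name is hypothetical, and the sketch assumes every row and every column has a positive total.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts (a cross table).
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    stat, dof = chi_square_statistic([[10, 20], [20, 10]])
    print(round(stat, 4), dof)
```

Cells whose observed count is far from the expected count contribute most to the statistic, which is one plausible basis for highlighting hotspots in a cross table.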
Big Data Group-by Report Table
Group-by tables are very commonly used in report generation. CMSR adds hotspot visualization and time-series regression with smoothing and seasonal adjustment. The time-series features can be used when database columns represent time-series data.
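As an illustration of what seasonal adjustment does to a row of time-series columns, here is a minimal Python sketch using a multiplicative ratio-to-mean approach. This is one common textbook method chosen for the example, and CMSR may use a different one. The function names and the `period` parameter are assumptions, and the sketch expects positive values.

```python
def seasonal_indices(series, period):
    """Seasonal index for each position in the cycle.

    Index k is the mean of the values at position k divided by the
    overall mean, so an index above 1 marks a seasonally high position.
    """
    if period < 1 or period > len(series):
        raise ValueError("period must be between 1 and len(series)")
    overall_mean = sum(series) / len(series)
    indices = []
    for k in range(period):
        values = series[k::period]  # every value at cycle position k
        indices.append((sum(values) / len(values)) / overall_mean)
    return indices


def seasonally_adjust(series, period):
    """Divide each value by the seasonal index of its cycle position."""
    indices = seasonal_indices(series, period)
    return [x / indices[t % period] for t, x in enumerate(series)]


if __name__ == "__main__":
    quarterly = [80, 120, 100, 140, 88, 132, 110, 154]
    print(seasonal_indices(quarterly, 4))
    print(seasonally_adjust(quarterly, 4))
```

After adjustment, the remaining movement in the series reflects trend rather than the repeating seasonal pattern, which makes a regression fit easier to interpret.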
Big Data SQL Tools
CMSR provides metadata browsing and data transfer between databases and CSV/TSV files. In addition, the SQL tool can be used to prepare and execute SQL statements. The following figure shows a Hive SQL DDL example for Hadoop CSV/TSV files.
How to turn CSV/TSV files into Hive Database Tables
To turn CSV/TSV files into Hive database tables, perform the following steps:
1. Create a Hadoop directory for your CSV/TSV file as follows. Change the path names to match your data.
hadoop fs -mkdir hive/csvfiles
hadoop fs -mkdir hive/csvfiles/yourcsvfile
2. Load the CSV/TSV file into the Hadoop server as follows:
hadoop fs -put /root//yourcsvfile.csv hive/csvfiles/yourcsvfile/
3. Define a Hive EXTERNAL table as follows, setting LOCATION to the HDFS directory that holds the file loaded in step 2. You can do this from the CMSR SQL tool.
CREATE EXTERNAL TABLE yourcsvfile (
  CID int,
  GENDER string,
  RACE string,
  AGE int,
  SALARY int,
  BUYFLAG int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','  -- for TSV, use \t
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/hive/csvfiles/yourcsvfile'  -- hadoop csv file path
TBLPROPERTIES ("skip.header.line.count"="1");  -- when the header line contains column names
4. Now you are ready to use CMSR Big Data Analytic Tools.
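If you have many CSV/TSV files, writing the table definition from step 3 by hand becomes repetitive. The following Python sketch builds the same kind of CREATE EXTERNAL TABLE statement from a list of column names and Hive types. The helper name `hive_external_table_ddl` and its parameters are hypothetical, introduced only for this example. The sketch does no identifier validation or escaping, so it assumes trusted input.

```python
def hive_external_table_ddl(table, columns, location,
                            delimiter=",", skip_header=True):
    """Build a Hive CREATE EXTERNAL TABLE statement for a delimited text file.

    columns  : list of (column_name, hive_type) pairs, in file order
    location : HDFS directory that contains the data file
    delimiter: "," for CSV or "\t" for TSV
    """
    column_lines = ",\n".join(
        f"  {name} {hive_type}" for name, hive_type in columns
    )
    # Hive expects a tab delimiter written as the escape sequence \t.
    delim_literal = "\\t" if delimiter == "\t" else delimiter
    lines = [
        f"CREATE EXTERNAL TABLE {table} (",
        column_lines,
        ")",
        "ROW FORMAT DELIMITED",
        f"FIELDS TERMINATED BY '{delim_literal}'",
        "LINES TERMINATED BY '\\n'",
        "STORED AS TEXTFILE",
        f"LOCATION '{location}'",
    ]
    if skip_header:
        # Skip the first line when it holds column names instead of data.
        lines.append('TBLPROPERTIES ("skip.header.line.count"="1")')
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(hive_external_table_ddl(
        "yourcsvfile",
        [("CID", "int"), ("GENDER", "string"), ("RACE", "string"),
         ("AGE", "int"), ("SALARY", "int"), ("BUYFLAG", "int")],
        "/user/hive/csvfiles/yourcsvfile",
    ))
```

The printed statement can be pasted into the CMSR SQL tool or any other Hive client. Pass `delimiter="\t"` for TSV files and `skip_header=False` when the file has no header line.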
For more information and downloads, please visit CMSR Data Miner / Machine Learning / Rule Engine Studio.