Rosella Machine Intelligence & Data Mining
Big Data Analytics and Tools for Big Databases: Hive/Hadoop, MySQL, PostgreSQL, ...

Big data simply means a large amount of data; size alone doesn't make it special. Prolonged business activity can accumulate big data, and analyzing the time-series patterns in such data can reveal much useful information. CMSR Data Miner / Machine Learning / Rule Engine Studio is an advanced analytics system that combines machine learning, a rule engine, and big data analytics. It provides easy-to-use analytic tools for users of big data databases. It combines dimensional data analysis reporting with time-series regression techniques such as moving average/exponential smoothing, seasonal adjustment, and chi-square statistical analysis. The visualization tools in particular present this information intuitively.

Supported and fully tested database systems include Apache Hive+Hadoop, MySQL, MariaDB, PostgreSQL, MS SQL Server, and MS Office Access. Other systems such as Oracle and DB2 should work as well. CSV/TSV text files are also supported. CMSR Studio runs on Windows, Linux, and macOS. The main features are described in the sections below.
For more information and downloads, please visit CMSR Data Miner / Machine Learning / Rule Engine Studio.

Big Data Time-series Trend Analysis

The following figures show group-by-group time-series trend analysis tables. The analysis incorporates moving average/exponential smoothing, seasonal adjustment, and chi-square statistical analysis. Green columns are series data. Orange columns are values projected by regression. Function-fitting techniques are used to determine the best regression functions.
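The two smoothing techniques named above can be illustrated with a minimal Python sketch. It is not CMSR's implementation: the function names, the sample series, the window size, and the smoothing factor `alpha` are all assumptions chosen for illustration.

```python
def moving_average(series, window):
    """Trailing simple moving average.

    The first window-1 positions have too few points and are returned as None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(len(series)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = series[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t-1].
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in series:
        if not smoothed:
            smoothed.append(float(x))  # seed with the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly values, made up for this example.
monthly_sales = [10, 12, 14, 16, 18, 20]
print(moving_average(monthly_sales, 3))
print(exponential_smoothing(monthly_sales, 0.5))
```

A larger `window` or a smaller `alpha` smooths more strongly but reacts more slowly to recent changes in the series.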
Forecasting with Seasonal Adjustment Using Neural Network

As an alternative to regression, a neural network can be used to capture time-series trends and seasonal patterns. Regression is limited in the information it can use, whereas a neural network can also take various related indicators as inputs, which makes it a robust modeling tool. Details are discussed in the following link, which also describes how to import neural network models into Excel sheets. The following YouTube video shows how to develop time-series neural network models:
Big Data Bar Charts

Categorical bar charts can also reveal trends and patterns, as in the following figure:
Big Data 3D Bar Charts

3D bar charts provide a three-dimensional view of the information:
Big Data Cross Table Reports

CMSR cross table reports incorporate chi-square analysis and hotspot visualization, as in the following figure:
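For readers unfamiliar with chi-square analysis of a cross table, the following Python sketch computes the standard Pearson chi-square statistic from observed counts. It shows the textbook calculation, not CMSR's own code; the function name and the sample counts are assumptions.

```python
def chi_square(table):
    """Pearson chi-square statistic for a cross table of observed counts.

    table is a list of rows, each a list of non-negative counts.
    Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("empty row or column in cross table")
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are two customer groups, columns are buy / no-buy.
stat, dof = chi_square([[30, 10], [20, 40]])
print(stat, dof)
```

A large statistic relative to the degrees of freedom suggests that the row and column variables are not independent.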
Big Data Group-by Report Table

Group-by tables are very commonly used in report generation. CMSR's group-by reports incorporate hotspot visualization and time-series regression with smoothing and seasonal adjustment. The time-series features can be used when the database columns represent time-series data.
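To make the idea of seasonal adjustment concrete, here is a minimal Python sketch of one simple approach: multiplicative seasonal indices computed as the mean of each position in the cycle divided by the overall mean. The source does not say which method CMSR uses, so the method, the function names, and the sample data are all assumptions, and the sketch ignores any underlying trend.

```python
def seasonal_indices(series, period):
    """Multiplicative seasonal index for each position in the cycle.

    Assumes strictly positive values and at least one full cycle of data.
    """
    if period < 1 or len(series) < period:
        raise ValueError("need at least one full period of data")
    if any(x <= 0 for x in series):
        raise ValueError("values must be strictly positive")
    overall_mean = sum(series) / len(series)
    indices = []
    for k in range(period):
        values = series[k::period]  # every observation at cycle position k
        indices.append((sum(values) / len(values)) / overall_mean)
    return indices


def seasonally_adjust(series, period):
    """Divide each value by the seasonal index of its cycle position."""
    indices = seasonal_indices(series, period)
    return [x / indices[i % period] for i, x in enumerate(series)]


# Hypothetical half-yearly values with a strong low/high pattern.
sales = [80, 120, 80, 120]
print(seasonal_indices(sales, 2))
print(seasonally_adjust(sales, 2))
```

After adjustment, the remaining variation reflects trend and noise rather than the repeating seasonal pattern, which is what makes regression on the adjusted values easier to interpret.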
Big Data SQL Tools

CMSR provides metadata browsing and data transfers between databases and CSV/TSV files. In addition, the SQL tool can be used to prepare and execute SQL statements. The following figure shows a Hive SQL DDL example for Hadoop CSV/TSV files.
How to turn CSV/TSV files into Hive Database Tables

To turn CSV/TSV files into Hive database tables, perform the following steps:

1. Create a Hadoop directory for your CSV/TSV file as follows, changing the path names to suit your data. Relative HDFS paths resolve under the home directory of the current user (/user/<username>), so the LOCATION in step 3 assumes the commands are run as user hive.

    hadoop fs -mkdir hive/csvfiles
    hadoop fs -mkdir hive/csvfiles/yourcsvfile

2. Load the CSV/TSV file into the Hadoop server as follows:

    hadoop fs -put /root//yourcsvfile.csv hive/csvfiles/yourcsvfile/

3. Define a Hive EXTERNAL table as follows. You can do this from the CMSR SQL tool.

    CREATE EXTERNAL TABLE yourcsvfile (
      CID int,
      GENDER string,
      RACE string,
      AGE int,
      SALARY int,
      BUYFLAG int
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','  -- for TSV, use \t
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE
    LOCATION '/user/hive/csvfiles/yourcsvfile'  -- hadoop csv file path
    tblproperties ("skip.header.line.count"="1");  -- if the first line holds column names

4. Now you are ready to use CMSR Big Data Analytic Tools.

For more information and downloads, please visit CMSR Data Miner / Machine Learning / Rule Engine Studio.
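When many CSV/TSV files need tables, the external-table statement from step 3 of the procedure above can be generated rather than typed by hand. The Python sketch below only builds the statement text in the same shape as the example; the function name and its parameters are hypothetical, it does not connect to Hive, and it does not validate or escape the names passed in.

```python
def hive_external_table_ddl(table, columns, location, tsv=False, skip_header=True):
    """Build a CREATE EXTERNAL TABLE statement for a delimited text file in HDFS.

    columns is a list of (name, hive_type) pairs in file order.
    location is the HDFS directory that holds the file.
    """
    delimiter = r"\t" if tsv else ","  # literal backslash-t, as Hive expects
    column_lines = ",\n".join(f"  {name} {hive_type}" for name, hive_type in columns)
    lines = [
        f"CREATE EXTERNAL TABLE {table} (",
        column_lines,
        ")",
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'",
        r"LINES TERMINATED BY '\n'",
        "STORED AS TEXTFILE",
        f"LOCATION '{location}'",
    ]
    if skip_header:
        # Skip the first line when it holds column names.
        lines.append('TBLPROPERTIES ("skip.header.line.count"="1")')
    return "\n".join(lines) + ";"


# Column names and path taken from the example table in this section.
columns = [
    ("CID", "int"), ("GENDER", "string"), ("RACE", "string"),
    ("AGE", "int"), ("SALARY", "int"), ("BUYFLAG", "int"),
]
print(hive_external_table_ddl("yourcsvfile", columns, "/user/hive/csvfiles/yourcsvfile"))
```

The printed statement can then be pasted into any SQL tool that talks to Hive.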