Data Science Introduction
- Data Science motivating examples: Nate Silver, Netflix, Moneyball, OkCupid, LinkedIn
- Introduction to Analytics, Types of Analytics
- Introduction to Analytics Methodology
- Analytics Terminology, Analytics Tools
- Introduction to Big Data
- Introduction to Machine Learning
R software:
Introduction and Overview of R Language :
- Origin of R, Interface of R, R Coding Practices
- Downloading and Installing R
- Getting Help on a function
- Viewing Documentation
Data Inputting in R
- Data Types, Data Objects, Data Structures
- Creating a vector and vector operations
- Subsetting
- Writing data
- Reading tabular data files
- Reading from csv files
- Initializing a data frame
- Selecting data frame columns by position and name
- Changing directories
- Re-directing R output
Data Manipulation in R
- Appending data to a vector
- Combining multiple vectors
- Merging data frames
- Data transformation
- Control structures
- Nested Loops
- Splitting
- Strings and dates
- Handling NAs and Missing Values
- Matrices and Arrays
- The str Function
- Logical operations
- Relational operators
- Generating Random Variables
- Accessing Variables
- Matrix Multiplication and Inversion
- Managing Subsets of Data
- Character manipulation
- Data aggregation
- Subscripting
Functions and Programming in R
- Flow Control: For loop
- If condition
- While conditions and repeat loop
- Debugging tools
- Concatenation of Data
- Combining Vars, cbind, rbind
- sapply, lapply, tapply functions
Basic Statistics in R :
Part-I Session 1: Descriptive Statistics
- Introduction to Advanced Data Analytics
- Statistical inferences for various Business problems
- Types of Variables, measures of central tendency and dispersion
- Variable Distributions and Probability Distributions
- Normal Distribution and Properties
- Computing basic statistics
- Comparing means of two samples
- Testing a correlation for significance
- Testing a proportion
- Classical tests (t,z,F)
- ANOVA
- Summarizing Data
- Data Munging Basics
Part-I Session 2: Test of Hypothesis
- Null/Alternative Hypothesis formulation
- One-sample and two-sample (paired and independent) t/z tests
- P-value interpretation
- Analysis of Variance (ANOVA)
- Non-Parametric Tests (Chi-Square, Kruskal-Wallis, Mann-Whitney)
Part-I Session 3
- Introduction to Correlation – Karl Pearson
- Spearman Rank Correlation
Advanced Analytics :
Advanced Analytics with real-world examples (Mini Projects)
Part-II Session 1
- Regression Theory
- Linear regression
- Logistic Regression
- Non-Linear Regressions using Link functions
- Logit Link Function
- Binomial Propensity Modeling
- Training-Validation approach
Part-II Session 2: Factor Analysis
- Introduction to Factor Analysis – PCA
- Reliability Test
- KMO MSA tests, Eigenvalue Interpretation
- Factor Rotation and Extraction
Part-II Session 3: Cluster Analysis
- Introduction to Cluster Techniques
- Distance Methodologies
- Hierarchical and Non-Hierarchical Procedures
- K-Means clustering
- Ward's Method
Time Series Analysis :
Part-III Session 1: Introduction and Exponential Smoothing
- Introduction to Time Series Data and Analysis
- Decomposition of Time Series
- Trend and Seasonality detection and forecasting
- Exponential Smoothing (Single, double and triple)
Part-III Session 2: ARIMA Modeling
- Box-Jenkins Methodology
- Introduction to Auto Regression and Moving Averages, ACF, PACF
Data Mining :
Machine Learning with R:
Part IV Session 1
- Introduction to Machine learning and various machine learning techniques
- Introduction to Data Mining
- Introduction to Text Mining
- Text analytic Process
- Sentiment Analysis
Part IV Session 2
- Statistical Analysis & Data Mining/Machine Learning
- Cluster Analysis using R-Rattle
- Association Rule Mining
- Predictive Modeling using Decision Trees
- Supervised learning
- Unsupervised learning
- Reinforcement learning
- Neural Network
- Support Vector Machine
Part IV Session 3: Evaluating & Deploying Models
- Evaluating model performance on Training and Validation data
- ROC, Sensitivity, Specificity, Lift charts, Error Matrix
- Deploying models using Score options
- Opening and Saving models using Rattle
Analytics in Excel – 3 days
- Data Preparation and Data Exploration in Excel
- Network Analysis using NodeXL
Data Visualization in R
- Creating a bar chart, dot plot
- Creating a scatter plot, pie chart
- Creating a histogram and box plot
- Other plotting functions
- Plotting with base graphics
- Plotting with Lattice graphics
- Plotting and coloring in R
Tableau with Case Studies
SAS E Miner with Use Cases
Project: Financial Project, Healthcare Project, Retail Project