AD3491 FUNDAMENTALS OF DATA SCIENCE AND ANALYTICS
UNIT I INTRODUCTION TO DATA SCIENCE
Need for data science – benefits and uses – facets of data – data science process – setting the research goal – retrieving data – cleansing, integrating, and transforming data – exploratory data analysis – build the models – presenting and building applications.
UNIT II DESCRIPTIVE ANALYTICS
Frequency distributions – Outliers – interpreting distributions – graphs – averages – describing variability – interquartile range – variability for qualitative and ranked data – Normal distributions – z scores – correlation – scatter plots – regression – regression line – least squares regression line – standard error of estimate – interpretation of r² – multiple regression equations – regression toward the mean.
UNIT III INFERENTIAL STATISTICS
Populations – samples – random sampling – Sampling distribution – standard error of the mean – Hypothesis testing – z-test – z-test procedure – decision rule – calculations – decisions – interpretations – one-tailed and two-tailed tests – Estimation – point estimate – confidence interval – level of confidence – effect of sample size.
UNIT IV ANALYSIS OF VARIANCE
t-test for one sample – sampling distribution of t – t-test procedure – t-test for two independent samples – p-value – statistical significance – t-test for two related samples. F-test – ANOVA – Two-factor experiments – three F-tests – two-factor ANOVA – Introduction to chi-square tests.
UNIT V PREDICTIVE ANALYTICS
Linear least squares – implementation – goodness of fit – testing a linear model – weighted resampling. Regression using StatsModels – multiple regression – nonlinear relationships – logistic regression – estimating parameters – Time series analysis – moving averages – missing values – serial correlation – autocorrelation. Introduction to survival analysis.