Python: Advanced Predictive Analytics

Chapter 4. Statistical Concepts for Predictive Modelling

A few statistical concepts, such as hypothesis testing, p-values, the normal distribution, and correlation, are prerequisites for predictive modelling: without them, it is very difficult to grasp how predictive models work or to interpret their results. It is therefore critical to understand these concepts before we delve into the realm of predictive modelling.

In this chapter, we will work through these statistical concepts so that we can use them in the upcoming chapters. This chapter will cover the following topics:

  • Random sampling and the central limit theorem: Understanding the concept of random sampling and illustrating the application of the central limit theorem, each through an example. These two concepts form the backbone of hypothesis testing.
  • Hypothesis testing: Understanding the meaning of terms such as null hypothesis, alternative hypothesis, confidence interval, p-value, and significance level, followed by a step-by-step guide to implementing a hypothesis test and a worked example.
  • Chi-square testing: Calculating the chi-square statistic and using chi-square tests, illustrated with a couple of examples.
  • Correlation: The meaning and significance of correlation between two variables and of correlation coefficients, and how to calculate and visualize the correlations between the variables of a dataset.
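
As a quick preview of the first topic, the following is a minimal sketch of random sampling and the central limit theorem, using only the Python standard library. The population (an exponential distribution), the sample size of 50, the number of samples, and the helper name `sample_means` are all assumptions chosen for illustration; they are not taken from the examples used later in the chapter. The idea is that the means of repeated random samples cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size, even when the population itself is skewed.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical, deliberately skewed population (illustrative assumption)
rng = random.Random(42)
population = [rng.expovariate(1.0) for _ in range(100_000)]

sample_size = 50
means = sample_means(population, sample_size=sample_size, n_samples=2_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# The sample means should centre on the population mean ...
print("population mean:      ", round(pop_mean, 3))
print("mean of sample means: ", round(statistics.fmean(means), 3))

# ... and their spread should be close to sigma / sqrt(n)
print("sd of sample means:   ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):      ", round(pop_sd / sample_size ** 0.5, 3))
```

Sampling with replacement is used here only to keep the sketch short; with a population this large relative to the sample size, sampling without replacement would give practically the same picture.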