Introduction
In the previous chapter, we saw how to build plots using the built-in functions of pandas, and learned how to estimate the mean, median, and other descriptive statistics for specific consumer or product groups.
In this chapter, we will learn about clustering, a type of unsupervised learning, starting with how to calculate the similarity between two data points. Next, we will discuss how to standardize data so that multiple features can be used together without one overwhelming the others. We will also see how similarity can be measured by computing the distance between data points. Finally, we will discuss k-means clustering: how to perform it and how to explore the resulting groups.
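As a preview of how these pieces fit together, the following is a minimal sketch using only NumPy. The toy data and the helper names `standardize`, `euclidean`, and `kmeans` are hypothetical and chosen for illustration; they are not taken from this chapter, which may use library routines instead.

```python
import numpy as np

# Hypothetical toy data: each row is a customer,
# columns are (age in years, income in dollars).
data = np.array([
    [25.0, 30000.0],
    [27.0, 32000.0],
    [26.0, 31000.0],
    [55.0, 90000.0],
    [58.0, 95000.0],
    [60.0, 92000.0],
])


def standardize(x):
    """Rescale each column to mean 0 and standard deviation 1."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def euclidean(a, b):
    """Straight-line distance between two points."""
    return float(np.sqrt(np.sum((a - b) ** 2)))


def kmeans(x, k, n_iter=100, seed=0):
    """A basic k-means loop: assign points, then move centroids."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Each centroid moves to the mean of its assigned points;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


scaled = standardize(data)
labels, centroids = kmeans(scaled, k=2)

# Without standardization, income dominates the distance because
# its values are roughly a thousand times larger than age.
print(euclidean(data[0], data[3]), euclidean(scaled[0], scaled[3]))
print(labels)
```

On the raw data, the distance between two customers is driven almost entirely by income; after standardization, both features contribute on a comparable scale, which is the motivation for standardizing before clustering.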