Hands-On Big Data Analytics with PySpark

Parallelization with Spark RDDs

Now that we know how to create RDDs from the text file that we retrieved from the internet, we can look at a different way to create an RDD. Let's discuss parallelization with Spark RDDs.

In this section, we will cover the following topics:

  • What is parallelization?
  • How do we parallelize Spark RDDs?

Let's start with parallelization.