Looking into the basics of the Apriori algorithm
The Apriori algorithm is part of our affinity analysis methodology and deals specifically with finding frequent itemsets within the data, that is, itemsets whose support meets a minimum threshold. The basic procedure of Apriori builds up new candidate itemsets from previously discovered frequent itemsets. These candidates are tested to see whether they are frequent, and then the algorithm iterates as follows:
1. Create the initial frequent itemsets by placing each item in its own itemset. Only items with at least the minimum support are kept in this step.
2. Create new candidate itemsets from the most recently discovered frequent itemsets by finding supersets of those itemsets.
3. Test every candidate itemset to see whether it is frequent, and discard any candidate that is not. If this step produces no new frequent itemsets, go to the last step.
4. Store the newly discovered frequent itemsets and go back to the second step.
5. Return all of the discovered frequent itemsets.
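The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `apriori`, the helper `keep_frequent`, and the choice to treat `min_support` as an absolute count of transactions are all assumptions made for this example. Candidates are generated in the simplest possible way, by adding one frequent item to each itemset found in the previous round.

```python
def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Assumption for this sketch: min_support is an absolute number
    of transactions, not a fraction of the dataset.
    """
    transactions = [frozenset(t) for t in transactions]

    def keep_frequent(candidates):
        # Count how many transactions contain each candidate, then
        # discard the candidates that fall below the minimum support.
        counts = {c: 0 for c in candidates}
        for transaction in transactions:
            for candidate in candidates:
                if candidate <= transaction:  # subset test
                    counts[candidate] += 1
        return {c: n for c, n in counts.items() if n >= min_support}

    # First step: each item in its own itemset, keeping the frequent ones.
    all_items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([item]) for item in all_items})
    frequent_items = {item for itemset in current for item in itemset}
    all_frequent = dict(current)

    while current:
        # Second step: build supersets of the most recent frequent
        # itemsets by adding one frequent item to each of them.
        candidates = {
            itemset | frozenset([item])
            for itemset in current
            for item in frequent_items
            if item not in itemset
        }
        # Third step: test the candidates; an empty result ends the loop.
        current = keep_frequent(candidates)
        # Fourth step: store the newly discovered frequent itemsets.
        all_frequent.update(current)

    # Last step: return everything that was discovered.
    return all_frequent


# Hypothetical toy data: four shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Each pass only extends itemsets that were frequent in the previous pass, which is what keeps the number of candidates manageable: a superset of an infrequent itemset can never be frequent, so it is never generated.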
This process is outlined in the following workflow: