Deep Learning for Beginners

Diving into the ML ecosystem

From the typical ML application process depicted in Figure 1.1, you can see that ML has a broad range of applications. ML algorithms are only a small part of a bigger ecosystem with a lot of moving parts, and yet ML is transforming lives around the world today:

Figure 1.1 - ML ecosystem. ML interacts with the world through several stages of data manipulation and interpretation to achieve an overall system integration

Deployed ML applications usually start with a process of data collection that uses sensors of different types, such as cameras, lasers, spectroscopes, or other types of direct access to data, including local and remote databases, big or small. In the simplest of cases, input can be gathered through a computer keyboard or smartphone screen taps. At this stage, the data collected or sensed is considered to be raw data.

Raw data is usually preprocessed before it is presented to an ML model. It is rarely the actual input to an ML algorithm, unless the model is meant to find a rich representation of the raw data that will later be used as input to another ML algorithm. In other words, some ML algorithms are used specifically as preprocessing agents, and they are separate from the main ML model that will classify or regress on the preprocessed data. In general, the data preprocessing stage aims to convert raw data into arrays or matrices with specific data types. Some popular preprocessing strategies include the following:

  • Word-to-vector conversions, for example, using GloVe or Word2Vec
  • Sequence-to-vector or sequence-to-matrix strategies
  • Value range normalization, for example, [0, 255] to [0.0, 1.0]
  • Statistical value normalization, for example, to have zero mean and unit variance
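
The last two strategies in the list can be sketched in a few lines of NumPy. This is a minimal illustration, not a prescribed implementation: the function names, the sample values, and the small eps term that guards against division by zero are all assumptions made for this example.

```python
import numpy as np

def scale_to_unit_range(raw, low=0.0, high=255.0):
    """Value range normalization: map values from [low, high] to [0.0, 1.0]."""
    raw = np.asarray(raw, dtype=np.float32)
    return (raw - low) / (high - low)

def standardize(data, eps=1e-8):
    """Statistical normalization: give each column zero mean and unit variance."""
    data = np.asarray(data, dtype=np.float64)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    # eps is an assumed safeguard for columns whose values are all identical
    return (data - mean) / (std + eps)

# Hypothetical 8-bit pixel intensities
pixels = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.uint8)
scaled = scale_to_unit_range(pixels)

# Hypothetical measurements, one row per data point
measurements = np.array([[170.0, 65.0], [180.0, 80.0], [160.0, 55.0]])
standardized = standardize(measurements)
```

Both functions return floating-point arrays, which matches the goal of converting raw data into arrays with specific data types.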

Once these preprocessing steps have been applied, most ML algorithms can use the data. Note, however, that the preprocessing stage is not trivial: it requires advanced knowledge and skills with respect to operating systems and sometimes even electronics. In general, a real ML application has a long pipeline that touches different aspects of computer science and engineering.

The processed data is what you will usually see in books like the one you are reading right now, because we need to focus on deep learning instead of data processing. If you wish to be more knowledgeable in this area, you could read data science materials such as Ojeda, T. et al. 2014 or Kane, F. 2017.

Mathematically speaking, the processed data as a whole is denoted by the uppercase, bold letter X, which has N rows (or data points). To refer to the specific i-th element (or row) of the dataset, we write X_i. The dataset has d columns, which are usually called features. One way to think about the features is as dimensions. For example, if the dataset has two features, height and weight, then you could represent the entire dataset using a two-dimensional plot. The first dimension, x_1 (height), can be the horizontal axis, while the second dimension, x_2 (weight), can be the vertical axis, as depicted in Figure 1.2:

Figure 1.2 - Sample two-dimensional data
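
As a rough sketch of this notation in code, the dataset can be stored as a NumPy array with N rows and d columns. The height and weight values below are made up for illustration, and the variable names are assumptions. Note that Python indexes rows from zero, so the first data point is X[0].

```python
import numpy as np

# Hypothetical dataset: each row is one data point, columns are (height in cm, weight in kg)
X = np.array([
    [170.0, 65.0],
    [180.0, 80.0],
    [160.0, 55.0],
    [175.0, 72.0],
])

N, d = X.shape      # N data points (rows), d features (columns)
x_first = X[0]      # a single data point (one row)
heights = X[:, 0]   # first feature: candidate for the horizontal axis
weights = X[:, 1]   # second feature: candidate for the vertical axis
```

Plotting heights against weights would give a two-dimensional scatter plot of the kind the figure describes.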

During production, when the data is presented to an ML algorithm, a series of tensor products and additions is executed. The results of these vector operations are usually transformed or normalized using non-linear functions. This is followed by more products and additions, more non-linear transformations, temporary storage of intermediate values, and, finally, the production of the desired output that corresponds to the input. For now, you can think of this process as an ML black box that will be revealed as you continue reading.
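
To make the black box slightly less opaque, here is a minimal sketch of that sequence of products, additions, and non-linear transformations. It assumes a sigmoid as the non-linear function, two small layers, and randomly drawn weights; none of these choices come from the text, and a trained model would use learned weights instead.

```python
import numpy as np

def sigmoid(z):
    """A common non-linear function that squashes values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Apply product, addition, and non-linearity once per layer."""
    activation = x  # intermediate values are held here between layers
    for weights, bias in layers:
        activation = sigmoid(activation @ weights + bias)
    return activation

# Assumed toy setup: random weights, 2 input features -> 3 hidden units -> 1 output
rng = np.random.default_rng(seed=0)
layers = [
    (rng.normal(size=(2, 3)), np.zeros(3)),
    (rng.normal(size=(3, 1)), np.zeros(1)),
]

x = np.array([[0.5, -1.2]])  # one preprocessed data point with two features
output = forward(x, layers)
```

The output here has shape (1, 1) and lies strictly between 0 and 1 because of the final sigmoid.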

The output that the ML model produces in response to the input usually requires some type of interpretation. For example, if the output is a vector of probabilities that an object belongs to one group or another, then those probabilities may need to be interpreted. You may need to know how low the probabilities are in order to account for uncertainty, or you may need to know how different the probabilities are from one another to account for even more uncertainty. The output processing stage connects ML to the decision-making world through the use of business rules. These can be, for example, if-then rules such as, "If the highest predicted probability is twice as large as the second highest, then issue a prediction; otherwise, do not proceed to make a decision." They can also be formula-based rules or more complex systems of equations.
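
The if-then rule quoted above could be expressed roughly as follows. This is only a sketch: the function name decide and the ratio parameter are hypothetical, and the rule is read here as "at least twice as large", which is an assumption about the intended meaning.

```python
import numpy as np

def decide(probabilities, ratio=2.0):
    """Return the index of the predicted class, or None to withhold a decision.

    Assumes at least two classes. A prediction is issued only when the largest
    probability is at least `ratio` times the second largest.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    order = np.argsort(probabilities)[::-1]  # indices from largest to smallest
    top = probabilities[order[0]]
    second = probabilities[order[1]]
    if top >= ratio * second:
        return int(order[0])
    return None

confident = decide([0.7, 0.2, 0.1])      # 0.7 >= 2 * 0.2, so class 0 is predicted
uncertain = decide([0.45, 0.40, 0.15])   # 0.45 < 2 * 0.40, so no decision is made
```

Returning None rather than a class index leaves it to the decision-making stage to handle the uncertain case explicitly.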

Finally, in the decision-making stage, the ML algorithm is ready to interact with the world: by turning on a light bulb through an actuator, by buying stock if the prediction is sufficiently certain, by alerting a manager that the company will run out of inventory in three days and needs to buy more items, or by sending an audio message to a smartphone speaker saying, "Here is the route to the movie theater" and opening a maps application through an application programming interface (API) call or operating system (OS) commands.

This is a broad overview of the world of ML systems when they are in production. However, this assumes that the ML algorithm is properly trained and tested, which is the easy part, trust me. By the end of the book, you will be skilled in training highly complex deep learning algorithms but, for now, let us introduce the generic training process.