Deep Learning with PyTorch

Tensors on GPU

We have learned how to represent different forms of data as tensors. Common operations once data is in tensor form include addition, subtraction, multiplication, dot product, and matrix multiplication. All of these operations can be performed on either the CPU or the GPU. PyTorch provides a simple function called cuda() to copy a tensor from the CPU to the GPU. We will take a look at some of these operations and compare the performance of matrix multiplication on the CPU and the GPU.

Tensor addition can be performed using the following code:

#Various ways you can perform tensor addition
import torch

a = torch.rand(2,2)
b = torch.rand(2,2)
c = a + b
d = torch.add(a,b)
#For in-place addition; adds 5 to every element of a
a.add_(5)

#Element-wise multiplication of tensors
a*b
a.mul(b)
#For in-place multiplication
a.mul_(b)
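
Subtraction and the dot product mentioned earlier follow the same pattern. A minimal sketch (the tensors x and y here are illustrative; torch.dot operates on 1-D tensors):

#Subtraction of tensors
c = a - b
d = torch.sub(a,b)
#For in-place subtraction
a.sub_(b)

#Dot product of two 1-D tensors
x = torch.rand(10)
y = torch.rand(10)
torch.dot(x,y)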

For tensor matrix multiplication, let's compare the code performance on the CPU and the GPU. Any tensor can be moved to the GPU by calling the .cuda() function.
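
Calling cuda() on a machine without a CUDA-capable GPU raises an error, so in practice it helps to guard the call; a minimal sketch:

#Move tensors to the GPU only when one is available
if torch.cuda.is_available():
    a = a.cuda()
    b = b.cuda()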

Matrix multiplication, first on the CPU and then on the GPU, runs as follows:

a = torch.rand(10000,10000)
b = torch.rand(10000,10000)

a.matmul(b)

Time taken: 3.23 s

#Move the tensors to GPU
a = a.cuda()
b = b.cuda()

a.matmul(b)

Time taken: 11.2 µs
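
Note that CUDA operations execute asynchronously, so a naive wall-clock measurement like the one above mostly captures the time to launch the kernel, not the multiplication itself. A minimal timing sketch that waits for the GPU to finish (assuming a CUDA-capable machine):

import time

a = torch.rand(10000,10000).cuda()
b = torch.rand(10000,10000).cuda()

start = time.time()
a.matmul(b)
#Block until all queued GPU work has finished
torch.cuda.synchronize()
print('Time taken:', time.time() - start)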

These fundamental operations of addition, subtraction, and matrix multiplication can be used to build complex architectures, such as a convolutional neural network (CNN) and a recurrent neural network (RNN), which we will learn about in the later chapters of the book.