TensorFlow移动端机器学习实战 (TensorFlow Mobile Machine Learning in Practice)

2.5 Installing TensorFlow

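Installing TensorFlow is fairly straightforward. The official website provides detailed instructions; refer to the TensorFlow installation guide online for the specifics.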

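One reminder: we recommend installing TensorFlow inside a Virtualenv environment. After installing TensorFlow and GPU (graphics accelerator) support, verify that the installation works.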
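Before installing anything with pip, it can help to confirm that the interpreter really is running inside the isolated environment. The following is a minimal sketch, not part of the original text; the helper name in_virtual_environment is hypothetical, and the check relies only on the standard sys module.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from
    # sys.base_prefix; Python 2 without virtualenv has no base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```

If the first line prints False, packages installed with pip would go into the system-wide Python rather than the isolated environment.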

Because some users run into problems when installing GPU support for TensorFlow, the rest of this section walks through that installation.

First, register on the NVIDIA Developer website.

Then follow the linked instructions to install the GPU driver.

Next, install CUDA Toolkit 9.0. The link on tensorflow.org always points to the latest CUDA release, which is currently 9.2. Do not use 9.2 unless TensorFlow supports it; use the CUDA 9.0 release linked above.

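Likewise, download and install cuDNN v7.1.4 for CUDA 9.0. The latest cuDNN that the tensorflow.org link points to is v7.1.4 for CUDA 9.2, which is not the build you want. After installation, run the following command to check the CUDA compiler version: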

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
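Since the installed CUDA release has to match what TensorFlow expects, the check above can also be scripted. The following is a minimal sketch, assuming Python 3.7 or later; the function names parse_nvcc_release and cuda_matches are hypothetical, and the expected value "9.0" simply mirrors the version recommended in this section.

```python
import re
import shutil
import subprocess

# Sample text copied from the `nvcc -V` output shown in this section.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
"""


def parse_nvcc_release(nvcc_output):
    """Extract the CUDA release string (for example '9.0'), or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_matches(nvcc_output, expected="9.0"):
    """Return True when the reported release equals the expected one."""
    return parse_nvcc_release(nvcc_output) == expected


if __name__ == "__main__":
    text = SAMPLE
    if shutil.which("nvcc"):  # use the real compiler when it is on PATH
        text = subprocess.run(["nvcc", "-V"], capture_output=True,
                              text=True, check=False).stdout
    print("CUDA release:", parse_nvcc_release(text))
    print("matches 9.0:", cuda_matches(text))
```

For the sample text, parse_nvcc_release returns '9.0', so cuda_matches reports True; a 9.2 toolkit would report False.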

Next, run the command "$ nvidia-smi", which produces output like the following:

Fri Jun 15 22:21:08 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K600         Off  | 00000000:05:00.0 Off |                  N/A |
| 25%   48C    P0    N/A /  N/A |      0MiB /   979MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
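The table above is meant for people; for a scripted check, nvidia-smi can also emit comma-separated values through its --query-gpu and --format=csv,noheader options. The following is a minimal sketch, assuming Python 3.7 or later; the helper names parse_gpu_csv and query_gpus are hypothetical, and the function returns an empty list when the tool is missing or fails.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each row.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(text):
    """Turn 'name, driver, memory' rows into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        if len(values) == len(QUERY_FIELDS):  # skip malformed rows
            rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus():
    """Ask nvidia-smi for GPU facts; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("no NVIDIA GPU reported (driver missing or nvidia-smi not found)")
    for gpu in gpus:
        print(gpu["name"], "| driver", gpu["driver_version"],
              "| memory", gpu["memory.total"])
```

An empty result is a quick hint that the driver installation step did not succeed.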

In addition, run deviceQuery from the CUDA sample code to make sure the GPU is working properly.

Device 0: "Quadro 600"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 962 MBytes (1009254400 bytes)

The results are as follows:

Quadro M6000

If you see similar results, your graphics card can support TensorFlow.

Next, run the following command:

$ ./bin/x86_64/linux/release/deviceQuery

The output shows the graphics card's version information and a range of performance figures:

./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Quadro M6000 24GB"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 24467 MBytes (25655836672 bytes)
  (24) Multiprocessors, (128) CUDA Cores/MP:     3072 CUDA Cores
  GPU Max Clock rate:                            1114 MHz (1.11 GHz)
  Memory Clock rate:                             3305 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 3145728 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z):  (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 4 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1
Result = PASS
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro M6000 24GB   Off  | 00000000:04:00.0  On |                  Off |
| 25%   41C    P8    20W / 250W |    488MiB / 24467MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2183      G   /usr/lib/xorg/Xorg                           319MiB |
|    0      3796      G   compiz                                        92MiB |
|    0      6095      G   ...-token=32ADD0D4261B4355966B2810A61BBF37    72MiB |
+-----------------------------------------------------------------------------+
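The parts of the deviceQuery report that matter most for TensorFlow are the device name, the compute capability, and the final "Result = PASS" line. The following is a minimal sketch of pulling those three out of the text; the function names are hypothetical, and the minimum capability passed to meets_capability is an assumption that should be checked against the TensorFlow installation guide for the release being installed.

```python
import re

# Excerpt of the deviceQuery output shown in this section.
SAMPLE = """Device 0: "Quadro M6000 24GB"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    5.2
Result = PASS
"""


def parse_device_query(output):
    """Return (device_name, (major, minor), passed) from deviceQuery text."""
    name = re.search(r'Device \d+: "([^"]+)"', output)
    cap = re.search(
        r"CUDA Capability Major/Minor version number:\s+(\d+)\.(\d+)", output)
    capability = (int(cap.group(1)), int(cap.group(2))) if cap else None
    passed = "Result = PASS" in output
    return (name.group(1) if name else None, capability, passed)


def meets_capability(capability, minimum):
    """Compare (major, minor) tuples; a missing capability never qualifies."""
    return capability is not None and capability >= minimum


if __name__ == "__main__":
    name, capability, passed = parse_device_query(SAMPLE)
    # (3, 0) is an assumed threshold for illustration only.
    print(name, capability, passed, meets_capability(capability, (3, 0)))
```

Tuples compare element by element in Python, so (5, 2) >= (3, 0) holds while (2, 1) >= (3, 0) does not.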

Finally, install TensorFlow itself, choosing the tensorflow-gpu package if you want GPU support:

(tensorflow)$ pip install --upgrade tensorflow       # for Python 2.7
(tensorflow)$ pip3 install --upgrade tensorflow      # for Python 3.n
(tensorflow)$ pip install --upgrade tensorflow-gpu   # for Python 2.7 and GPU
(tensorflow)$ pip3 install --upgrade tensorflow-gpu  # for Python 3.n and GPU

Once the installation succeeds, confirm it with the following commands:

(tensorflow) $ python
Python 2.7.12 (default, Dec  4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant("hello")
>>> sess = tf.Session()
2018-06-20 06:54:34.284161: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-20 06:54:34.460555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: Quadro M6000 24GB major: 5 minor: 2 memoryClockRate(GHz): 1.114
pciBusID: 0000:04:00.0
totalMemory: 23.89GiB freeMemory: 23.29GiB
2017-05-20 06:54:34.460600: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2017-05-20 06:54:34.708584: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2017-05-20 06:54:34.708635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]       0
2017-05-20 06:54:34.708644: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:    N
2017-05-20 06:54:34.709069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22598 MB memory) -> physical GPU (device: 0, name: Quadro M6000 24GB, pci bus id: 0000:04:00.0, compute capability: 5.2)
>>> print(sess.run(hello))
hello

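The log output above clearly shows that the GPU is in use. If these log lines do not appear, GPU support was not installed successfully and TensorFlow is running on the CPU instead.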
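Rather than reading the startup log by eye, the same check can be made programmatically by asking TensorFlow which devices it sees. The following is a minimal sketch, not taken from the original text; it uses device_lib.list_local_devices from tensorflow.python.client, and the helper names gpu_device_names and list_tensorflow_gpus are hypothetical.

```python
def gpu_device_names(devices):
    """Keep only the names of devices whose device_type is 'GPU'."""
    return [device.name for device in devices if device.device_type == "GPU"]


def list_tensorflow_gpus():
    """Return the GPU device names that TensorFlow reports locally."""
    # Imported lazily so this file still loads where TensorFlow is absent.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())


if __name__ == "__main__":
    try:
        names = list_tensorflow_gpus()
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print("GPU devices:", names or "none (TensorFlow is using the CPU)")
```

An empty list corresponds to the CPU-only case described above, where the GPU log lines are missing.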