Hands-On Deep Learning for Games

Generating textures with a GAN 

One thing rarely covered in advanced deep learning books is how to shape data for input into a network. Shaping the data usually also means altering the internals of the network to accommodate it. The final version of this example is Chapter_3_3.py, but for this exercise, start with the Chapter_3_wgan.py file and follow these steps:

  1. We will start by changing the training dataset from MNIST to CIFAR100 by swapping out the imports like so:
from keras.datasets import mnist  # remove or leave
from keras.datasets import cifar100  # add

  2. At the start of the class, we will change the image size parameters from 28 x 28 grayscale to 32 x 32 color like so:
class WGAN():
    def __init__(self):
        self.img_rows = 32
        self.img_cols = 32
        self.channels = 3
  3. Now, move down to the train function and alter the code as follows:
# (X_train, _), (_, _) = mnist.load_data()  # comment out or delete
(X_train, y), (_, _) = cifar100.load_data(label_mode='fine')
Z_train = []
for i in range(len(y)):
    if y[i] == 33:  # fine label 33 = forest images
        Z_train.append(X_train[i])
# X_train = (X_train.astype(np.float32) - 127.5) / 127.5  # comment out or delete
# X_train = np.expand_dims(X_train, axis=3)  # comment out or delete
Z_train = np.reshape(Z_train, [500, 32, 32, 3])
Z_train = (Z_train.astype(np.float32) - 127.5) / 127.5

  4. This code loads the training images from the CIFAR100 dataset and filters them by label. The labels are stored in the y variable, and the code loops through all the downloaded images and keeps only those that belong to one class. In this case, we are using the label 33, which corresponds to forest images. CIFAR100 has 100 categories, and the one we select holds 500 training images. Feel free to try to generate other textures from other categories.
    The rest of the code is fairly straightforward, except for the np.reshape call, where we reshape the data into an array of 500 images of 32x32 pixels by three channels. Note also that we no longer need the np.expand_dims call we used before. This is because the CIFAR100 images already have a channel axis with three channels, whereas the grayscale MNIST images had none.
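The filtering loop described above can also be written with a NumPy boolean mask. The following is a minimal sketch rather than the book's code: the select_class helper is a hypothetical name, and it runs on synthetic stand-in data so that it does not need to download CIFAR100.

```python
import numpy as np

def select_class(images, labels, class_id):
    """Return the images whose label equals class_id, scaled to [-1, 1]."""
    # CIFAR-style labels have shape (N, 1); flatten so the mask is 1-D.
    mask = labels.reshape(-1) == class_id
    selected = images[mask]
    # Map uint8 pixel values in [0, 255] to float32 values in [-1, 1].
    return (selected.astype(np.float32) - 127.5) / 127.5

# Synthetic stand-in data (assumption): 1,000 random 32x32 RGB images with
# 100 labels assigned in rotation, so each class holds exactly 10 images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1000, 32, 32, 3), dtype=np.uint8)
fake_labels = (np.arange(1000) % 100).reshape(-1, 1)

forest_like = select_class(fake_images, fake_labels, 33)
print(forest_like.shape)  # (10, 32, 32, 3)
```

With the real dataset, passing X_train, y, and 33 to the same helper should yield the 500 forest images without the explicit loop or the hard-coded reshape, since the mask keeps the image dimensions intact.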
  5. We now need to go back to the generator and critic models and alter that code slightly. First, we will change the generator like so:
def build_generator(self):
    model = Sequential()
    model.add(Dense(128 * 8 * 8, activation="relu", input_dim=self.latent_dim))
    model.add(Reshape((8, 8, 128)))
    model.add(UpSampling2D())
    model.add(Conv2D(128, kernel_size=4, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    model.add(UpSampling2D())
    model.add(Conv2D(64, kernel_size=4, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    model.add(Conv2D(self.channels, kernel_size=4, padding="same"))
    model.add(Activation("tanh"))
    model.summary()
    noise = Input(shape=(self.latent_dim,))
    img = model(noise)
    return Model(noise, img)
  6. The changes are in the Dense and Reshape layers at the top of the model. All we are doing is converting the original 7x7 feature map to 8x8. Recall that the original full image size is 28x28. The original network starts with a 7x7 feature map, which is doubled twice by the UpSampling2D layers to give 28x28. Since our new image size is 32x32, the network needs to start with 8x8 feature maps, which doubled twice gives 32x32, the same size as the CIFAR100 images. Fortunately, we can leave the critic model as it is.
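The sizing arithmetic above generalizes to other image sizes. As a small illustration, the hypothetical helper below (not part of the book's code) computes the feature map size the generator has to start from, assuming each UpSampling2D layer doubles the width and height, which is the layer's default.

```python
def starting_map_size(image_size, num_upsamples, factor=2):
    """Return the side length of the feature map the generator must start from."""
    scale = factor ** num_upsamples
    if image_size % scale != 0:
        # The target size cannot be reached by whole-number upsampling alone.
        raise ValueError(
            "image size %d is not divisible by %d" % (image_size, scale))
    return image_size // scale

print(starting_map_size(28, 2))  # 7, the original MNIST generator
print(starting_map_size(32, 2))  # 8, the CIFAR100 generator
```

The returned value is what goes into both the Dense layer size (128 * n * n) and the Reshape target (n, n, 128).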
  7. Next, we add a new function to save samples of the original CIFAR100 images, as shown here:
def save_images(self, imgs, epoch):
    r, c = 5, 5
    # Rescale from [-1, 1] back to [0, 1] for display
    gen_imgs = 0.5 * imgs + 0.5
    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            # Shows only the first channel in grayscale; pass gen_imgs[cnt]
            # without cmap to display the full color image instead
            axs[i,j].imshow(gen_imgs[cnt, :,:,0], cmap='gray')
            axs[i,j].axis('off')
            cnt += 1

    fig.savefig("images/cifar_%d.png" % epoch)
    plt.close()
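Because the training images are normalized to the range [-1, 1], they have to be mapped back to [0, 1] before they are displayed as color images. The sketch below shows that mapping on its own; to_display_range is a hypothetical helper name, and the clipping step is an added safeguard rather than something the book's code does.

```python
import numpy as np

def to_display_range(imgs):
    """Map images from the tanh range [-1, 1] to [0, 1] for display."""
    # 0.5 * x + 0.5 sends -1 to 0 and 1 to 1; clip guards against overshoot.
    return np.clip(0.5 * imgs + 0.5, 0.0, 1.0)

sample = np.array([-1.0, 0.0, 1.0], dtype=np.float32)
print(to_display_range(sample).tolist())  # [0.0, 0.5, 1.0]
```

An offset of 1 instead of 0.5 would give values in [0.5, 1.5], which falls outside the range expected for floating-point color data.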
  8. The save_images function outputs a sampling of the original images and is called by the following code in the train function:
idx = np.random.randint(0, Z_train.shape[0], batch_size)
imgs = Z_train[idx]
if epoch % sample_interval == 0:
    self.save_images(imgs, epoch)
  9. This new code just outputs what a sampling of the originals looks like, as follows:
Example of the original images 
  10. Run the sample and observe the output in the images folder, again in files labeled cifar, showing the result of training. Again, this sample can take some time to run, so read on to the next section.

As the sample runs, you can observe how the GAN is training to match the images. The benefit here is that you can generate various textures easily using a variety of techniques. You can use these as textures or height maps in Unity or another game engine. Before we finish up this section, let's jump into some normalization and other parameters.