Deep Learning: Solving Problems With TensorFlow



Learn how to Solve Optimization Problems and Train your First Neural Network with the MNIST Dataset!


Jan 24 · 10 min read

Header image source: www.forbes.com

Introduction

The goal of this article is to define and solve practical use cases with TensorFlow. To do so, we will solve:

  • An optimization problem
  • A linear regression problem, where we will adjust a regression line to a dataset
  • And we will finish by solving the “Hello World” of Deep Learning classification projects with the MNIST dataset.

Optimization Problem

Netflix has decided to place one of their famous posters on a building. The marketing team has decided that the advertising poster has to cover an area of 600 square meters, with a margin of 2 meters above and below and 4 meters on the left and right.

However, they have not been informed of the dimensions of the building’s facade. We could send an email to the owner and ask, but since we know mathematics we can solve it easily. How can we find out the dimensions of the building?


Calling x the width and y the height of the poster, the total dimensions and area of the building are:

Width = 4 + x + 4 = x + 8

Height = 2 + y + 2 = y + 4

Area = Width × Height = (x + 8)·(y + 4)

And there is the constraint: x·y = 600

This allows us to express the building area as a function of a single variable:

xy = 600 → x = 600/y

S(y) = (600/y + 8)(y + 4) = 600 + 8y + 4·600/y + 32 = 632 + 8y + 2400/y

In an optimization problem, the slope of the function (its derivative) is used to find the minimum: we set the first derivative equal to 0 and then check that the second derivative is positive at that point. So, in this case:

S’(y) = 8 − 2400/y²

S’’(y) = 4800/y³

S’(y) = 0 → 8 = 2400/y² → y² = 2400/8 = 300 → y = sqrt(300) = sqrt(100·3) = sqrt(100)·sqrt(3) = 10·sqrt(3) ≈ 17.32 (we discard the negative root because it has no physical meaning)

Substituting in x:

x = 600/(10·sqrt(3)) = 60/sqrt(3) = 60·sqrt(3)/3 = 20·sqrt(3) ≈ 34.64

Since for y = 17.32 we get S’’(y) = 4800/17.32³ ≈ 0.92 > 0, we have indeed found a minimum.

Therefore, the dimensions of the building are:

Width: x + 8 = 42.64 m

Height: y + 4 = 21.32 m
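If you want to double-check the algebra, here is a quick SymPy sketch (not part of the original derivation) that reproduces the same numbers:

import sympy as sp

y = sp.symbols('y', positive=True)
S = 632 + 8*y + 2400/y                     # building area as a function of y
y_star = sp.solve(sp.diff(S, y), y)[0]     # solve S'(y) = 0
second_deriv = sp.diff(S, y, 2).subs(y, y_star)

print(y_star)                              # 10*sqrt(3)
print(float(y_star), float(600 / y_star))  # y ≈ 17.32, x ≈ 34.64
print(second_deriv > 0)                    # True -> it is a minimum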

Have you seen how useful derivatives are? We just solved this problem analytically. We were able to do so because it was a simple problem, but there are many problems for which the analytical solution is very computationally expensive, so we use numerical methods instead. One of these methods is Gradient Descent.
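Before handing the problem to TensorFlow, here is a minimal hand-rolled sketch of gradient descent on S(y), using the derivative S’(y) = 8 − 2400/y² computed above (the starting point and learning rate below are arbitrary choices):

# Minimal gradient descent on S(y) = 632 + 8y + 2400/y
y = 5.0                # arbitrary positive starting point
learning_rate = 0.05
for step in range(500):
    grad = 8 - 2400 / y**2       # S'(y)
    y -= learning_rate * grad    # move against the gradient
print(y)                         # approaches 10*sqrt(3) ≈ 17.32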

What do you say if we now solve this same problem numerically with TensorFlow? Let’s go!

import numpy as np
import tensorflow as tf

x = tf.Variable(initial_value=tf.random_uniform([1], 34, 35), name='x')  # only y appears in the loss below
y = tf.Variable(initial_value=tf.random_uniform([1], 0., 50.), name='y')

# Loss function: S(y) = 632 + 8y + 2400/y
s = tf.add(tf.add(632.0, tf.multiply(8.0, y)), tf.divide(2400.0, y), 's')

opt = tf.train.GradientDescentOptimizer(0.05)
train = opt.minimize(s)

sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)

old_solution = 0
tolerance = 1e-4
for step in range(500):
    sess.run(train)
    solution = sess.run(y)
    if np.abs(solution - old_solution) < tolerance:
        print("The solution is y = {}".format(old_solution))
        break

    old_solution = solution
    if step % 10 == 0:
        print(step, "y = " + str(old_solution), "s = " + str(sess.run(s)))


We have managed to calculate y using the gradient descent algorithm. Of course, we now need to calculate x by substituting into x = 600/y.

x = 600/old_solution[0]
print(x)

Which matches our results, so it seems to work! Let’s plot the results:

import matplotlib.pyplot as plt

y = np.linspace(1., 400., 500)  # start above 0 to avoid dividing by zero
s = 632.0 + 8*y + 2400/y
plt.plot(y, s)


print("The function minimum is in {}".format(np.min(s)))
min_s = np.min(s)
s_min_idx = np.nonzero(s==min_s)
y_min = y[s_min_idx]
print("The y value that reaches the minimum is {}".format(y_min[0]))

Let’s See Another Example

In this case, we want to find the minimum of the function y = log(x)².

x = tf.Variable(15, name='x', dtype=tf.float32)
log_x = tf.log(x)
log_x_squared = tf.square(log_x)

optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(log_x_squared)

init = tf.initialize_all_variables()

def optimize():
    with tf.Session() as session:
        session.run(init)
        print("starting at", "x:", session.run(x), "log(x)^2:", session.run(log_x_squared))
        for step in range(100):
            session.run(train)
            print("step", step, "x:", session.run(x), "log(x)^2:", session.run(log_x_squared))

optimize()
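Since log(1) = 0, the minimum of log(x)² is 0 and it is reached at x = 1, so the printed iterates should converge towards x = 1.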


Let’s plot it!

x_values = np.linspace(0.1, 10, 100)  # start above 0 to avoid log(0)
fx = np.log(x_values)**2
plt.plot(x_values, fx)

min_fx = np.min(fx)
print("The function minimum is in {}".format(min_fx))
fx_min_idx = np.nonzero(fx == min_fx)
x_min_value = x_values[fx_min_idx]
print("The x value that reaches the minimum is {}".format(x_min_value[0]))


Let’s Solve a Linear Regression Problem

Let’s see how to fit a straight line to a dataset that represents the intelligence of every character in the Simpsons, from Ralph Wiggum to Professor Frink.


Let’s plot the distribution of intelligence against age, normalized from 0 to 1, where Maggie is the youngest and Montgomery Burns is the oldest:

n_observations = 50
_, ax = plt.subplots(1, 1)
xs = np.linspace(0., 1., n_observations)
ys = 100 * np.sin(xs) + np.random.uniform(0., 50., n_observations)
ax.scatter(xs, ys)
plt.draw()


Now, we need two tf.placeholders: one for the input and another for the output of our regression algorithm. Placeholders are variables that do not need to be assigned a value until the graph is executed.

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

Let’s try to optimize a straight regression line. We need two variables: the weights (W) and the bias (b). Elements of type tf.Variable need an initialization, and their type cannot be changed after being declared. What we can change is their value, via the assign method (see the small example after the next snippet).

W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
Y_pred = tf.add(tf.multiply(X, W), b)
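As a small aside, here is a toy snippet (separate from the regression code) showing how assign changes a variable’s value:

counter = tf.Variable(0.0, name='counter')
increment = counter.assign(counter + 1.0)  # an op that writes a new value into the variable
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(increment))  # 1.0
    print(sess.run(increment))  # 2.0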

Let’s now define the cost function as the mean squared difference between our predictions and the real values.

loss = tf.reduce_mean(tf.pow(Y_pred - Y, 2))

We’ll now define the optimization method; we will use gradient descent. Basically, it computes how much each weight contributes to the total error, and updates each weight so that the total error decreases in subsequent iterations: W ← W − learning_rate · ∂loss/∂W, and likewise for b. The learning rate controls how abruptly the weights are updated.

learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

# Definition of the number of iterations; the session runs on the GPU
n_epochs = 1000
with tf.Session() as sess:
    with tf.device("/GPU:0"):
        # We initialize all the defined variables
        sess.run(tf.global_variables_initializer())
        # Start the fit
        prev_training_loss = 0.0
        for epoch_i in range(n_epochs):
            for (x, y) in zip(xs, ys):
                sess.run(optimizer, feed_dict={X: x, Y: y})
            W_, b_, training_loss = sess.run([W, b, loss], feed_dict={X: xs, Y: ys})
            # We print the loss every 20 epochs
            if epoch_i % 20 == 0:
                print(training_loss)
            # Stopping condition
            if np.abs(prev_training_loss - training_loss) < 0.000001:
                print(W_, b_)
                break
            prev_training_loss = training_loss
        # Plot of the result
        plt.scatter(xs, ys)
        plt.plot(xs, Y_pred.eval(feed_dict={X: xs}, session=sess))


And there we have it! With this regression line we will be able to predict the intelligence of any Simpsons character given their age.
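For instance, once training has finished, the last fitted values W_ and b_ from the loop above can be reused as plain NumPy numbers to score a new normalized age; the 0.5 below is just a made-up example value:

new_age = 0.5                                     # hypothetical normalized age
predicted_intelligence = W_[0] * new_age + b_[0]  # y = W*x + b with the fitted coefficients
print(predicted_intelligence)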

MNIST Dataset

Let’s now see how to classify digit images with a logistic regression. We will use the “Hello World” of Deep Learning datasets.


Let’s import the relevant libraries and the MNIST dataset:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

We load the dataset encoding the labels with one-hot encoding (it converts each label into a vector of length = N_CLASSES, with all 0s except for the index that indicates the class to which the image belongs, which contains a 1). For example, if we have 10 classes (digits 0 to 9) and the label is the digit 5: label = [0 0 0 0 0 1 0 0 0 0].
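As a quick illustration of what one-hot encoding does, here is a toy NumPy snippet (independent of the loader below, which handles this for us via one_hot=True):

import numpy as np

label = 5
one_hot = np.eye(10)[label]  # row 5 of the 10x10 identity matrix
print(one_hot)               # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]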

mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

print("Train examples: {}".format(mnist.train.num_examples))
print("Test examples: {}".format(mnist.test.num_examples))
print("Validation examples: {}".format(mnist.validation.num_examples))
# Images are stored in a 2D tensor: images_number x image_pixels_vector
# Labels are stored in a 2D tensor: images_number x classes_number (one-hot)

print("Images Size train: {}".format(mnist.train.images.shape))
print("Images Size train: {}".format(mnist.train.labels.shape))
# To see the range of the images values
print("Min value: {}".format(np.min(mnist.train.images)))
print("Max value: {}".format(np.max(mnist.train.images)))
# To see some images we will access a vector of the dataset and reshape it to 28x28
plt.subplot(131)
plt.imshow(np.reshape(mnist.train.images[0, :], (28, 28)), cmap='gray')
plt.subplot(132)
plt.imshow(np.reshape(mnist.train.images[27500, :], (28, 28)), cmap='gray')
plt.subplot(133)
plt.imshow(np.reshape(mnist.train.images[54999, :], (28, 28)), cmap='gray')

We have already seen a little of what the MNIST dataset consists of. Now, let’s create our regressor:

First, we create the placeholder for our input data. In this case, the input is going to be a set of vectors of size 784 (we are going to pass several images at once to our regressor; this way, the gradient is computed over several images at a time, so the estimate is more precise than if we used only one).

n_input = 784  # Number of data features: number of pixels of the image
n_output = 10 # Number of classes: from 0 to 9
net_input = tf.placeholder(tf.float32, [None, n_input]) # We create the placeholder

Let’s define now the regression equation: y = W*x + b

W = tf.Variable(tf.zeros([n_input, n_output]))
b = tf.Variable(tf.zeros([n_output]))

As the output is multiclass, we need a function that returns the probabilities of an image belonging to each of the possible classes. For example, if we feed in an image of a 5, a possible output would be: [0.05 0.05 0.05 0.05 0.05 0.55 0.05 0.05 0.05 0.05], whose probabilities sum to 1, and the class with the highest probability is 5.

We apply the softmax function to normalize the output probabilities:

net_output = tf.nn.softmax(tf.matmul(net_input, W) + b)

SoftMax Function

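For a vector of scores z, softmax(z)_i = exp(z_i) / Σ_j exp(z_j), so every output lies between 0 and 1 and the outputs sum to 1.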
# We also need a placeholder for the image label, with which we will compare our prediction
y_true = tf.placeholder(tf.float32, [None, n_output])
# And finally, we define our loss function: the cross entropy
cross_entropy = -tf.reduce_sum(y_true * tf.log(net_output))
# We check if our prediction matches the label
idx_prediction = tf.argmax(net_output, 1)
idx_label = tf.argmax(y_true, 1)
correct_prediction = tf.equal(idx_prediction, idx_label)
# We define our measure of accuracy as the number of hits relative to the number of predicted samples
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# We want to minimize our loss function (the cross entropy) using gradient descent with a learning rate of 0.01
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
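In case the cross entropy line looks cryptic: for a one-hot label y and predicted probabilities ŷ, the cross entropy is H(y, ŷ) = −Σ_i y_i · log(ŷ_i), so only the probability assigned to the true class is penalized; tf.reduce_sum then adds this up over the whole batch.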

Everything is now set up! Let’s execute the graph:

from IPython.display import clear_output

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Let's train the regressor
    batch_size = 10
    for sample_i in range(mnist.train.num_examples):
        sample_x, sample_y = mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={net_input: sample_x,
                                       y_true: sample_y})
        # Let's check how the regressor is performing
        if sample_i < 50 or sample_i % 200 == 0:
            val_acc = sess.run(accuracy, feed_dict={net_input: mnist.validation.images,
                                                    y_true: mnist.validation.labels})
            print("({}/{}) Acc: {}".format(sample_i, mnist.train.num_examples, val_acc))
    # Let's show the final accuracy
    print('Test accuracy: ', sess.run(accuracy, feed_dict={net_input: mnist.test.images,
                                                           y_true: mnist.test.labels}))


We have just trained our first NEURAL NETWORK with TensorFlow!

Think a little bit about what we just did.

We have implemented a logistic regression, with this formula: y = G(Wx + b), where G = softmax() instead of the typical G = sigmoid().

If you look at the following image, which defines the perceptron (a single-layer neural network), you can see that output = Activation_function(W·x). You see? Only the bias seems to be missing! But notice that one of the inputs is a constant 1, so its weight w0 is not multiplied by any real feature. Exactly: the weight w0 is the bias, which appears in this notation simply so that everything can be implemented as a single matrix multiplication.

[Figure: diagram of the perceptron, a single-layer neural network]

So, what we have just implemented is a perceptron, with

  • batch_size = 10
  • 1 epoch
  • gradient descent as the optimizer
  • and softmax as activation function.
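To make the bias-as-a-weight trick concrete, here is a tiny NumPy sketch with made-up numbers (not from the article):

import numpy as np

x = np.array([0.2, 0.7])            # input features
W = np.array([0.5, -0.3])           # feature weights
b = 0.1                             # bias

out_standard = W @ x + b            # standard form: Wx + b

x_aug = np.concatenate(([1.0], x))  # prepend the constant input 1
W_aug = np.concatenate(([b], W))    # w0 = b becomes just another weight
out_folded = W_aug @ x_aug          # a single matrix/vector multiplication

print(np.isclose(out_standard, out_folded))  # True: both forms are identical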

Final Words

As always, I hope you enjoyed the post, that you have learned how to use TensorFlow to solve linear problems, and that you have successfully trained your first Neural Network!

If you liked this post then you can take a look at my other posts on Data Science and Machine Learning here.

If you want to learn more about Machine Learning and Artificial Intelligence follow me on Medium, and stay tuned for my next posts!

