Machine Learning with Python: Building and Training a Simple Linear Model Using TensorFlow

  • Pawit Kochakarn
  • Aug 29, 2017
  • 5 min read

In this post, I will be demonstrating how to build and train a simple linear model with Python using the popular machine learning library TensorFlow. The reason I am opting to use TensorFlow is that it's really concise and easy to use, especially for beginners wanting to learn how to do machine learning. It takes the hard mathematical functions that machine learning requires and breaks them down into simple-to-use commands.

What we are attempting to do in this tutorial is to take a random line in an x,y plot (W*x + b) and have it gradually adjust its gradient (W) and intercept (b) in order to fit itself through a chosen set of data points of our liking. For this, I am choosing for it to eventually satisfy the equation y = -x + 1.

The four points that we want our linear model to cross in this case are (1,0), (2,-1), (3,-2), and (4,-3). We will use the sum of squared residuals to measure the error between the current line and the line we want, and correct for it using something called a Gradient Descent optimiser, which trains the model to have the least error in it. Conceptually, Gradient Descent treats the error as a function of W and b and repeatedly steps both parameters downhill towards the minimum of that error surface; the downhill direction is found through differentiation. Luckily, TensorFlow has a built-in optimiser function so we don't have to do all the maths ourselves!

After we have plotted our line of best fit, we will then use Matplotlib to plot our error/loss history over every iteration of the training process. This should visualise the error-reducing process that the Gradient Descent optimiser performs. OK, let's now go through the code chunk by chunk to get a firm understanding.

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# Model parameters (initial guesses) and the input placeholder
W = tf.Variable([2], dtype=tf.float32)
b = tf.Variable([0], dtype=tf.float32)
x = tf.placeholder(tf.float32)
linear_model = W*x + b

# Initialise the variables and start a session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

First, we shall import all the libraries that we will need for our program. We obviously need TensorFlow as our main machine learning library. There are numerous ways to install this library into our Python environment, such as pip/pip3 install, but there can be extra requirements that you might need, especially if you are using a Mac like I was. A link for all the different install methods can be found here on the official TensorFlow website: https://www.tensorflow.org/install/ . Matplotlib and NumPy are then needed for visualising our linear model.
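For reference, the simplest route on most platforms is a pip install along these lines (hedged: exact package requirements can vary by system, as the install guide above explains):

pip3 install tensorflow matplotlib numpy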

Let's now initialise all of the variables for our linear model of the form W*x + b. We define both W & b as scalar variables and x as a placeholder. A placeholder is essentially a variable that we can assign a data value to at a later time, when the graph is run. From there, we define our linear model (W*x + b) and initialise our session using the last three lines of code.
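To see the placeholder in action before we build anything else, we can feed values into x at run time and evaluate the untrained model. This quick check is my own addition, not part of the original listing:

# Feed concrete values into the x placeholder at run time.
# With the initial W = 2 and b = 0, this should print [2. 4. 6. 8.]
print(sess.run(linear_model, {x: [1, 2, 3, 4]}))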

y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)

optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
lossHistory = []

Now we shall declare our y variable as a placeholder too (same as x). Let's now get into the maths part of this program. We first have to set up the error function we will use to train our linear model. As mentioned above, we will use the sum of squared residuals as our error, which takes the form:

loss = Σᵢ (W*xᵢ + b − yᵢ)²

What this function basically means is that we are summing up the squares of the residuals between the actual y values and the corresponding y values on our current regression line. As we can see, squared_deltas produces the square of the error between our linear model and the actual y value. This is then passed on to loss, which calculates the sum of squared_deltas using TensorFlow's handy command tf.reduce_sum. Finally we can set our optimiser of choice, which in this case is a Gradient Descent Optimiser, and pass in our learning rate. The learning rate controls how gradually the model improves its W & b values. Too high, and the model might overshoot the minimum. Too low, and the model might take forever to reach the optimal values. We have therefore chosen 0.01 as our learning rate.
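As a quick sanity check on the maths (my addition, not part of the original code): with the initial line y = 2x (W = 2, b = 0), the residuals at our four training points are 2, 5, 8 and 11, so the starting loss should be 4 + 25 + 64 + 121 = 214. We can confirm this directly:

# Evaluate the loss for the untrained model on our four points.
# Expected: 4 + 25 + 64 + 121 = 214.0
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))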

We then pass the loss into the optimiser's minimize function, and declare an empty list called lossHistory, which we will use to plot our error over time.
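If you are curious what the optimiser is doing under the hood, here is a minimal hand-rolled sketch of the same Gradient Descent update in plain NumPy, using hand-derived gradients of the sum of squared residuals. The variable names and the learning rate lr are illustrative choices of mine, not TensorFlow API:

import numpy as np

x = np.array([1, 2, 3, 4], dtype=float)
y = np.array([0, -1, -2, -3], dtype=float)
W, b, lr = 2.0, 0.0, 0.01

for _ in range(1000):
    residual = W * x + b - y             # error of the current line at each point
    W -= lr * np.sum(2 * residual * x)   # d(loss)/dW = sum of 2 * residual * x
    b -= lr * np.sum(2 * residual)       # d(loss)/db = sum of 2 * residual

print(W, b)  # W and b should approach -1 and 1 respectively

TensorFlow's optimiser performs exactly this kind of step for us, except it works out the derivatives automatically.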

x_train = np.array([1, 2, 3, 4])
y_train = np.array([0, -1, -2, -3])

init = tf.global_variables_initializer()
sess.run(init)

# Run 1000 training steps, recording the loss at each iteration
for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})
    acc_W, acc_b, acc_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
    lossHistory.append(acc_loss)
    print("W: %s b: %s loss: %s" % (acc_W, acc_b, acc_loss))

# Plot the training points and the final fitted line
plt.figure()
plt.axis([-1, 5, -4, 1])
plt.plot(x_train, y_train, "ro")
plt.plot(x_train, sess.run(W) * x_train + sess.run(b), "b-")

# Plot the loss recorded at each training iteration
fig = plt.figure()
plt.plot(np.arange(0, 1000), lossHistory)
fig.suptitle("Training Loss")
plt.xlabel("Iterations")
plt.ylabel("Loss")

plt.show()

After we have built all of our functions, let's get into the training process of the model. We first set the x,y training data that we will feed into the model so that our line of best fit can converge to the form y = -x + 1; we use np.array for that. Next, we need to initialise our training session using the init and sess.run(init) lines. From there, we can get into the training loop. In the for loop, we have set our iterations to 1000, which means the program will run through 1000 iterations to find the optimal W & b values. The sess.run(train, ...) line inside the loop performs one training step, feeding the training data into the x & y placeholders.

For each iteration, the model then produces its current W, b and loss values (acc_W, acc_b, acc_loss). These values change every iteration, so we append the loss to lossHistory to build up the full set of losses over time. We also print the values to get a firm understanding of how the model evolves.

Finally, we can do our Matplotlib plotting. This last chunk of code plots our final line of best fit over the training points to visualise how close we got to the desired position, and also plots our loss over time. This should show you the behaviour of Gradient Descent.
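Once the loop has finished, you can also pull the final trained values straight out of the session. This is a small addition of my own; final_W and final_b are just illustrative names:

# Fetch the trained parameters after the loop (each is a length-1 array)
final_W, final_b = sess.run([W, b])
print("Final W: %.4f, final b: %.4f" % (final_W[0], final_b[0]))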

If you run the whole program through the terminal, it prints the W, b and loss values for every iteration and then shows the two plots.

As we can see, our loss gets very small quite quickly thanks to the Gradient Descent optimiser. We can also see that our final W & b values get close to -1 and 1 respectively, which is what we originally wanted.

I hope this post has given you a brief introduction to Gradient Descent in machine learning using Python and TensorFlow. We will definitely be covering more advanced topics in machine learning, but this should give you a firm understanding of the basic framework of a simple algorithm.
