A straight line won’t always be the best fit for our data. In this article, you’ll learn how to generate predictive polynomial functions that leverage the machinery of our linear function algorithms. This technique will allow you to generate sophisticated predictive functions that bend and curve to fit your data.
There are many types of polynomial functions at our disposal that can be used to better fit our data and produce more accurate predictions as a result. The following graph shows a few different kinds of polynomial functions and the shapes they can take on:
You might be surprised to learn that we can produce these more sophisticated polynomial functions without changing the linear function algorithms we’ve already developed. I’ll go through one example to show you how it works.
Mapping Polynomial Functions
Let’s start by looking at our prototypical example, where the price of a house is only based on a single feature, the size of the house.
| House Size | House Price |
|------------|-------------|
To produce a straight line to fit our data, we use our univariate linear hypothesis function:
h(x) = Θ₀ + Θ₁x
When we have more than one feature, we use our multivariate linear hypothesis function:
h(x) = Θ₀x₀ + Θ₁x₁ + Θ₂x₂ + … + Θₙxₙ
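In code, the multivariate hypothesis is just a dot product between the parameter vector and the feature vector, with x₀ = 1 prepended by convention. Here is a minimal sketch; the parameter and feature values are invented purely for illustration:

```python
import numpy as np

def hypothesis(theta, features):
    """Multivariate linear hypothesis h(x) = Θ₀x₀ + Θ₁x₁ + … + Θₙxₙ,
    with x₀ = 1 prepended by convention."""
    x = np.concatenate(([1.0], features))
    return theta @ x

# Hypothetical parameters [Θ₀, Θ₁, Θ₂] and one example's features [x₁, x₂].
theta = np.array([50.0, 0.2, 3.0])
print(hypothesis(theta, np.array([1000.0, 2.0])))  # 50 + 0.2·1000 + 3·2 ≈ 256
```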
Let’s say that we have a few more data points than the table above suggests. When we plot the data we realise that a straight line isn’t going to give us the best fit, but a nice curve will do the trick. Instead of using a linear function, we want to use a quadratic function, which looks like this:
h(x) = Θ₀ + Θ₁x + Θ₂x²
How do we use this quadratic function within our existing machinery? The trick is to map the quadratic function onto our multivariate linear function. To do this, we simply add a new ‘feature’ to our model, the house size squared.
| House Size | House Size² | House Price |
|------------|-------------|-------------|
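Building that extra column is a one-liner in practice. A minimal sketch with invented house sizes:

```python
import numpy as np

# Hypothetical house sizes; the values are made up for illustration.
size = np.array([1000.0, 1500.0, 2000.0])

# Map the quadratic model onto the linear machinery by adding the
# squared size as a second feature column: [size, size²].
X = np.column_stack([size, size ** 2])
print(X)  # each row is [size, size²]
```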
We use the standard multivariate linear function for these two features, which is:
h(x) = Θ₀x₀ + Θ₁x₁ + Θ₂x₂
where x₁ is our first feature, the house size
where x₂ is our second feature, the house size squared
Concretely, if we substitute our features for the values of x (remembering that x₀ = 1 by convention), we get the following quadratic function as an outcome:
h(x) = Θ₀ + Θ₁(size) + Θ₂(size)²
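To see the whole mapping end to end, here is a sketch that fits a quadratic using only linear least squares. The training data is fabricated from a known quadratic, price = 10 + 2·size + 0.5·size², so we can check that the linear solver recovers those coefficients:

```python
import numpy as np

# Invented data generated from a known quadratic so the answer is checkable.
size = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
price = 10 + 2 * size + 0.5 * size ** 2

# Design matrix with the mapped features: x₀ = 1, x₁ = size, x₂ = size².
X = np.column_stack([np.ones_like(size), size, size ** 2])

# The solver only knows about linear models, yet it recovers the
# quadratic's coefficients because we modified the data, not the model.
theta, *_ = np.linalg.lstsq(X, price, rcond=None)
print(theta)  # ≈ [10.0, 2.0, 0.5]
```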
That is the general approach you can take to map any kind of polynomial function onto a multivariate linear function: instead of modifying the linear function, we modify the data set. This simple technique lets you create highly sophisticated nonlinear functions just by adding new features based on ones you already have. You could also create new composite features, by multiplying the values of two features and storing the product as a new standalone feature.
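A composite feature is built the same way as a squared one. A minimal sketch, with invented sizes and bedroom counts:

```python
import numpy as np

# Hypothetical features: house size and number of bedrooms (invented values).
size = np.array([1000.0, 1500.0, 2000.0])
bedrooms = np.array([2.0, 3.0, 4.0])

# A composite feature is the element-wise product of two existing
# features, stored as a new standalone column.
size_x_bedrooms = size * bedrooms

X = np.column_stack([size, bedrooms, size_x_bedrooms])
print(X[0])  # [1000., 2., 2000.]
```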
When you use this technique, be sure to apply feature scaling, especially in cases where you square or cube an existing feature: the derived features can take on a vastly larger range of values than the originals, which can slow down gradient descent.
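One common form of scaling is mean normalisation: subtract each column's mean and divide by its standard deviation, so the size and size² columns end up on comparable scales. A minimal sketch:

```python
import numpy as np

# Invented sizes; size² spans a far larger range than size itself.
size = np.array([1000.0, 1500.0, 2000.0])
X = np.column_stack([size, size ** 2])

# Mean-normalise each column: zero mean, unit standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_scaled.std(axis=0))  # both columns now have unit variance
```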
Going forward, you’ll have many ways to model your data in order to get the best fit. This might feel a bit overwhelming: what’s the best approach to modelling data, given there are so many options? Don’t worry too much about this, because later on we’ll get our machine learning algorithms to seek out and select the best models for us.