Cost Function
We can measure the accuracy of our hypothesis function by using a cost function.
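This takes an average difference (actually a fancier version of an average) between the results of the hypothesis on the inputs x and the actual outputs y. For m training examples it is written $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2$.
To break it apart, it is $\frac{1}{2}\bar{x}$, where $\bar{x}$ is the mean of the squares of $h_\theta(x_i) - y_i$, that is, of the difference between the predicted value and the actual value.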
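As a rough illustration, the cost can be computed in a few lines of Python. This is only a sketch: the straight-line hypothesis h(x) = theta0 + theta1 * x, the function names and the toy data are all assumptions made for the example, not something fixed by the text above.

```python
def hypothesis(theta0, theta1, x):
    # Assumed form of the hypothesis: a straight line (illustrative only).
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    # Half the mean of the squared differences between prediction and actual value.
    m = len(xs)
    squared_errors = [(hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)


# Made-up toy data lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect = cost(0.0, 1.0, xs, ys)  # hypothesis matches the data exactly
flat = cost(0.0, 0.0, xs, ys)     # hypothesis always predicts 0

print(perfect)
print(flat)
```

With this toy data, the hypothesis that passes through every point has a cost of zero, while the flat hypothesis has a cost of (1 + 4 + 9) / (2 * 3), so a smaller cost corresponds to a better fit.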
This function is otherwise called the “Squared error function”, or “Mean squared error”. The mean is halved (1/2) as a convenience for computing gradient descent: differentiating the squared term produces a factor of 2, which cancels the (1/2) term.
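To see the cancellation, here is a short derivation sketch. It assumes the cost is written as J(θ) for m training examples and that θ_j stands for any one parameter of the hypothesis; these symbols are introduced here for illustration.

```latex
\begin{aligned}
\frac{\partial}{\partial \theta_j} J(\theta)
  &= \frac{\partial}{\partial \theta_j}\,
     \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2 \\
  &= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x_i) - y_i \right)
     \frac{\partial h_\theta(x_i)}{\partial \theta_j} \\
  &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)
     \frac{\partial h_\theta(x_i)}{\partial \theta_j}
\end{aligned}
```

The 2 from the power rule and the 1/2 in front of the sum cancel, leaving a plain average. Halving does not change which θ gives the smallest cost, since it only rescales the function.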
For a more accurate hypothesis, we need to make the cost function as small as possible. So the question is: how do we choose θ to minimize the cost function?
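One naive way to get a feel for this question is to try a handful of candidate values and keep the one with the lowest cost. The sketch below does exactly that. It is a brute-force scan for illustration only, not gradient descent, and it assumes a one-parameter hypothesis h(x) = theta1 * x with made-up data.

```python
def cost(theta1, xs, ys):
    # Half the mean squared error for the assumed hypothesis h(x) = theta1 * x.
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Made-up toy data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Candidate values 0.0, 0.5, ..., 4.0 (an arbitrary grid chosen for the example).
candidates = [i / 2 for i in range(9)]
costs = {theta1: cost(theta1, xs, ys) for theta1 in candidates}

# Keep the candidate whose cost is smallest.
best_theta = min(costs, key=costs.get)
print(best_theta, costs[best_theta])
```

Here the scan picks theta1 = 2.0, the slope the toy data was built from, with a cost of zero. Trying every candidate quickly becomes impractical with more parameters or a finer grid, which is why a systematic method is needed.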
The following image summarizes what the cost function does:
Read Next: Cost Function Intuition 1
Disclaimer: This series is based on notes I created for myself from various books and videos I’ve read or seen, so some of the text could be an exact quote from a book or video out there.