...
During training we optimize the loss function, i.e. reduce the error between actual and predicted values. Since we deal with a binary classification problem, the = 1 (for hypothesis ).
can take on just two values, (for hypothesis ) andA popular algorithm to optimize the weights consists of iteratively modifying the weights after each training observation or after a bunch of training observations by doing a minimization of the loss function.
...