Returns a function that computes the gradient of the negative conditional log-likelihood with respect to the feature weights.
make_cll_gradient(class_var, dataset)
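
The source gives only the signature, so the following is a minimal sketch of what such a gradient factory might look like, not the actual implementation. It assumes a log-linear (softmax) model, that `class_var` is a list of class labels, that `dataset` is a list of `(features, label)` pairs, and that the weights are a list with one row of feature weights per class. All of these shapes are hypothetical. Under those assumptions, the gradient of the negative conditional log-likelihood for class c and feature j is the sum over examples of (p(c | x) - 1[y = c]) * x_j.

```python
import math


def make_cll_gradient(class_var, dataset):
    """Hypothetical sketch of a gradient factory for a softmax model.

    class_var: list of class labels (assumed shape)
    dataset:   list of (features, label) pairs (assumed shape)
    """
    classes = list(class_var)

    def gradient(weights):
        # weights[k][j] is the weight of feature j for class classes[k]
        grad = [[0.0] * len(row) for row in weights]
        for features, label in dataset:
            # Linear score of each class for this example
            scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
            # Subtract the maximum score for a numerically stable softmax
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            for k, cls in enumerate(classes):
                prob = exps[k] / total
                # Predicted probability minus the one-hot indicator of the true class
                err = prob - (1.0 if cls == label else 0.0)
                for j, x in enumerate(features):
                    grad[k][j] += err * x
        return grad

    return gradient


# Usage: one example of class "a", all weights zero, so each class has probability 0.5
grad_fn = make_cll_gradient(["a", "b"], [([1.0, 2.0], "a")])
print(grad_fn([[0.0, 0.0], [0.0, 0.0]]))  # [[-0.5, -1.0], [0.5, 1.0]]
```

Returning a closure keeps the dataset bound once, so an optimizer only needs to pass the current weights on each call.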