Thanks, that was exactly my question. Also, why do we need a learning rate?
I also found another answer:
“the gradient ∇f(a) points in the direction of the greatest increase of f, that is, the direction of steepest ascent. Of course, the opposite direction, −∇f(a), is the direction of steepest descent”
So we step in the direction of −∇f, and we use a small learning rate so that we don't jump off the slope. We need to keep going down the slope until we reach a point where the slope is ~0.
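To see what "jump off the slope" means in practice, here is a minimal sketch I put together. It assumes a toy function f(x) = x^2 (gradient 2x, minimum at x = 0); the function, the starting point, and both learning rates are my own choices for illustration, not something from the answers above.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Repeatedly step against the gradient: x <- x - learning_rate * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    # Gradient of the assumed toy function f(x) = x**2.
    return 2 * x


# Small learning rate: each step multiplies x by (1 - 2 * 0.1) = 0.8,
# so x shrinks toward 0, where the slope is ~0.
small = gradient_descent(grad_f, x0=5.0, learning_rate=0.1, steps=100)

# Large learning rate: each step multiplies x by (1 - 2 * 1.1) = -1.2,
# so x overshoots the minimum and lands farther away every time.
large = gradient_descent(grad_f, x0=5.0, learning_rate=1.1, steps=100)

print("small learning rate ends at:", small)
print("large learning rate ends at:", large)
```

With the small rate the steps keep shrinking as the slope flattens, while with the large rate every step jumps past the minimum to a steeper point on the other side. For this particular function anything below 1.0 still converges, but in general the safe range depends on how steep the function is.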