What is line search?


Basically, a properly designed line search guarantees the convergence of an algorithm to a locally optimal point in a finite number of steps. This process is called the globalization of an algorithm, which, to be fair, is kind of a bad name since it has nothing to do with finding the globally optimal point. In any case, what we normally require is that the step length give a certain amount of decrease in the objective, which we call the sufficient decrease condition, and that the curvature at the new point be sufficient, which we call the curvature condition. One example of such a pair of requirements is the strong Wolfe conditions. As it turns out, there are other ways to guarantee convergence in a finite number of steps, such as trust-region algorithms. If you're curious how convergence is proved for both line-search and trust-region methods, Nocedal and Wright's book Numerical Optimization has the basic proofs.
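
For reference, the strong Wolfe conditions are usually written as follows, where f is the objective, x_k the current iterate, p_k the chosen descent direction, alpha_k the step length, and 0 < c_1 < c_2 < 1 are fixed constants:

```latex
% Sufficient decrease (Armijo) condition:
f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f(x_k)^T p_k
% Strong curvature condition (absolute value on the directional derivative):
\lvert \nabla f(x_k + \alpha_k p_k)^T p_k \rvert \le c_2 \lvert \nabla f(x_k)^T p_k \rvert
```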

In machine learning, the learning rate basically does what the line search does by adjusting how far we step in our chosen direction. However, not all machine learning algorithms check the strong Wolfe conditions, so they are not guaranteed to converge in a finite number of steps. Furthermore, from a practical point of view, a decent line search will improve the performance of the algorithm beyond just the theoretical guarantees. A fixed learning rate, or one that is decreased monotonically according to a specified schedule, frankly performs poorly. Even something as simple as a golden-section search (sketched below) will improve things dramatically. Part of the reason for this is that computing a gradient tends to be expensive whereas evaluating the function tends to be cheap. As such, if we use a line search that depends purely on objective evaluations or directional derivatives, which are cheaper to compute than full gradients, we can better utilize our expensive gradient computations by augmenting them with cheaper ones. That, and the guarantee of convergence.
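
To make the golden-section remark concrete, here is a minimal sketch in Python. The toy objective, starting point, and bracket [0, 10] are made up for illustration; in practice the bracket would come from the algorithm's trial steps:

```python
import math

def golden_section(phi, a, b, tol=1e-6):
    """Minimize a unimodal 1-D function phi on [a, b] by golden-section search.

    Each iteration shrinks the bracket by the golden ratio and costs one
    new objective evaluation -- no gradients are needed, which is the
    point made above: objective evaluations are comparatively cheap.
    """
    invphi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/golden ratio, about 0.618
    c = b - invphi * (b - a)               # left interior point
    d = a + invphi * (b - a)               # right interior point
    fc, fd = phi(c), phi(d)
    while (b - a) > tol:
        if fc < fd:                        # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = phi(c)
        else:                              # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)

# Hypothetical usage: pick the step length t along a fixed descent
# direction p from the current iterate x by minimizing phi(t) = f(x + t*p).
f = lambda x: (x - 3.0) ** 2               # toy objective
x, p = 0.0, 1.0                            # current point and descent direction
phi = lambda t: f(x + t * p)
t_star = golden_section(phi, 0.0, 10.0)
print(t_star)                              # roughly 3.0 for this toy problem
```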



Comments

  • Sie Tw
    Sie Tw over 1 year

    I hear a lot about line search in optimization.

    I am fine with methods like gradient descent, but what is line search and what are its uses?

    Intuitively, from what I understand right now, line search is basically trying a set of values of a parameter linearly, i.e., along a line, to see which one gives the best value of the function, right?

    • littleO
      littleO about 7 years
      In gradient descent, at each iteration you move in the direction of steepest descent. But how far should you move? Line search is a strategy for deciding how far to move in that direction.
    • Sie Tw
      Sie Tw about 7 years
      @littleO That is just the learning rate, right? Why call it line search then...
    • littleO
      littleO about 7 years
      I think "learning rate" is synonymous with "step size" (but the term "learning rate" was coined by machine learning researchers, long after gradient descent with line search was originally developed). "Line search" is a method for selecting the step size or "learning rate".
    • Sie Tw
      Sie Tw about 7 years
      @littleO So is a decaying learning rate a line search algo? What are the different types of line search algos?
    • littleO
      littleO about 7 years
      The simplest thing is to have no line search at all and just use a fixed step size. But using a line search procedure to select step sizes adaptively can improve convergence a lot. The conceptually simplest strategy would be to choose, at each iteration, the step size that minimizes the objective function along the search direction. However, it would usually be too expensive to find this optimal step size at every iteration. So instead you can try an ambitious step size, and then reduce it repeatedly until a certain inequality is satisfied (this inequality comes from the convergence proof); a sketch of this backtracking procedure appears after these comments.
    • littleO
      littleO about 7 years
      Line search is explained pretty well in Boyd and Vandenberghe, which is free online.
    • Axel Kemper
      Axel Kemper about 7 years
      @littleO: thanks for the hint! The book is at https://web.stanford.edu/~boyd/cvxbook/
  • GeorgeOfTheRF
    GeorgeOfTheRF about 5 years
    Is gradient descent a type of line search algorithm?
  • wyer33
    wyer33 about 5 years
    Gradient descent can be used in a line search algorithm, in a trust-region algorithm, or in an algorithm with no globalization at all. Basically, gradient descent just specifies how we find the direction, not what we do with it. And, look, most people will just say that yes it is, but what I'm strongly contending is that this is not precise: gradient descent can be used without a line search, and repeatedly taking the Cauchy point in a trust-region algorithm is also gradient descent.
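
To make the backtracking procedure described in the comments concrete, here is a minimal Python sketch pairing it with a steepest-descent direction. The Armijo constant c1 = 1e-4 and shrink factor 0.5 are common textbook defaults, and the quadratic objective is made up for illustration:

```python
import numpy as np

def backtracking(f, grad_f, x, p, alpha0=1.0, c1=1e-4, shrink=0.5):
    """Try an ambitious step alpha0, then shrink it repeatedly until the
    sufficient-decrease (Armijo) inequality holds -- the inequality that
    comes from the convergence proof."""
    fx = f(x)
    slope = grad_f(x) @ p    # directional derivative; negative for a descent direction
    alpha = alpha0
    while f(x + alpha * p) > fx + c1 * alpha * slope:
        alpha *= shrink
    return alpha

# Gradient descent with backtracking on a toy quadratic.
A = np.diag([1.0, 10.0])     # made-up ill-conditioned quadratic
f = lambda x: 0.5 * float(x @ (A @ x))
grad_f = lambda x: A @ x
x = np.array([4.0, -2.0])
for _ in range(50):
    p = -grad_f(x)           # steepest-descent direction
    alpha = backtracking(f, grad_f, x, p)
    x = x + alpha * p
print(x)                     # close to the minimizer [0, 0]
```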