Ceres 求解 Powell’s function 的最小化

\(\quad\)现在考虑一个稍微复杂一点的例子—鲍威尔函数的最小化。

\(\quad{}\) \(x=[x_1,x_2,x_3,x_4]\) 并且

\[\begin{array}{l}
f_{1}(x)=x_{1}+10 x_{2} \\
f_{2}(x)=\sqrt{5}\left(x_{3}-x_{4}\right) \\
f_{3}(x)=\left(x_{2}-2 x_{3}\right)^{2} \\
f_{4}(x)=\sqrt{10}\left(x_{1}-x_{4}\right)^{2} \\ \\
F(x)=\left[f_{1}(x), f_{2}(x), f_{3}(x), f_{4}(x)\right]
\end{array}
\]

\(\quad{}\) \(F(x)\) 是一个拥有四个参数的函数,并且拥有四个残差块,我们希望找到一个 \(x\) 使得下面的式子得到最小化:

\[\frac{1}{2}||F(x)||^2
\]

\(\quad{}\)同样,第一步是定义对目标函子中的项进行评估的函子。 这是评估的代码,翻译过来不太准确,或者说这其实是在定义残差块。对 \(f_4(x_1,x_4)\) 的定义如下:

struct F4 {
  template <typename T>
  bool operator()(const T* const x1, const T* const x4, T* residual) const {
    residual[0] = sqrt(10.0) * (x1[0] - x4[0]) * (x1[0] - x4[0]);
    return true;
  }
};

\(\quad{}\) 同样,我们也可以类似的定义 \(F_1\), \(F_2\)\(F_3\) 去评估计算 \(f_1(x_1,x_2)\)\(f_2(x_3,x_4)\)\(f_3(x_2,x_3)\)。这样的话,问题将会被定义为如下的格式:

// 赋予初值
double x1 =  3.0; double x2 = -1.0; double x3 =  0.0; double x4 = 1.0;

Problem problem;

// Add residual terms to the problem using the autodiff
// wrapper to get the derivatives automatically.
problem.AddResidualBlock(
  new AutoDiffCostFunction<F1, 1, 1, 1>(new F1), nullptr, &x1, &x2);
problem.AddResidualBlock(
  new AutoDiffCostFunction<F2, 1, 1, 1>(new F2), nullptr, &x3, &x4);
problem.AddResidualBlock(
  new AutoDiffCostFunction<F3, 1, 1, 1>(new F3), nullptr, &x2, &x3);
problem.AddResidualBlock(
  new AutoDiffCostFunction<F4, 1, 1, 1>(new F4), nullptr, &x1, &x4);

\(\quad{}\) 请注意,每个 ResidualBlock 仅依赖于对应残差对象所依赖的两个参数,而不依赖于所有四个参数。

  • 所有的程序如下所示:
#include <vector>

#include "ceres/ceres.h"
#include "gflags/gflags.h"
#include "glog/logging.h"
#include "ceres/internal/port.h"
using ceres::AutoDiffCostFunction;
using ceres::CostFunction;
using ceres::Problem;
using ceres::Solve;
using ceres::Solver;

struct F1
{
  template <typename T>
  bool operator()(const T *const x1, const T *const x2, T *residual) const
  {
    // f1 = x1 + 10 * x2;
    residual[0] = x1[0] + 10.0 * x2[0];
    return true;
  }
};

struct F2
{
  template <typename T>
  bool operator()(const T *const x3, const T *const x4, T *residual) const
  {
    // f2 = sqrt(5) (x3 - x4)
    residual[0] = sqrt(5.0) * (x3[0] - x4[0]);
    return true;
  }
};

struct F3
{
  template <typename T>
  bool operator()(const T *const x2, const T *const x3, T *residual) const
  {
    // f3 = (x2 - 2 x3)^2
    residual[0] = (x2[0] - 2.0 * x3[0]) * (x2[0] - 2.0 * x3[0]);
    return true;
  }
};

struct F4
{
  template <typename T>
  bool operator()(const T *const x1, const T *const x4, T *residual) const
  {
    // f4 = sqrt(10) (x1 - x4)^2
    residual[0] = sqrt(10.0) * (x1[0] - x4[0]) * (x1[0] - x4[0]);
    return true;
  }
};

DEFINE_string(minimizer,
              "trust_region",
              "Minimizer type to use, choices are: line_search & trust_region");

int main(int argc, char **argv)
{
  GFLAGS_NAMESPACE::ParseCommandLineFlags(&argc, &argv, true);
  google::InitGoogleLogging(argv[0]);

  double x1 = 3.0;
  double x2 = -1.0;
  double x3 = 0.0;
  double x4 = 1.0;

  Problem problem;
  // Add residual terms to the problem using the autodiff
  // wrapper to get the derivatives automatically. The parameters, x1 through
  // x4, are modified in place.
  problem.AddResidualBlock(
      new AutoDiffCostFunction<F1, 1, 1, 1>(new F1), nullptr, &x1, &x2);
  problem.AddResidualBlock(
      new AutoDiffCostFunction<F2, 1, 1, 1>(new F2), nullptr, &x3, &x4);
  problem.AddResidualBlock(
      new AutoDiffCostFunction<F3, 1, 1, 1>(new F3), nullptr, &x2, &x3);
  problem.AddResidualBlock(
      new AutoDiffCostFunction<F4, 1, 1, 1>(new F4), nullptr, &x1, &x4);

  Solver::Options options;
  // LOG_IF(FATAL,
  //        !ceres::StringToMinimizerType(CERES_GET_FLAG(FLAGS_minimizer),
  //                                      &options.minimizer_type))
  //     << "Invalid minimizer: " << CERES_GET_FLAG(FLAGS_minimizer)
  //     << ", valid options are: trust_region and line_search.";

  options.max_num_iterations = 100;
  options.linear_solver_type = ceres::DENSE_QR;
  options.minimizer_progress_to_stdout = true;

  // clang-format off
  std::cout << "Initial x1 = " << x1
            << ", x2 = " << x2
            << ", x3 = " << x3
            << ", x4 = " << x4
            << "\n";
  // clang-format on

  // Run the solver!
  Solver::Summary summary;
  Solve(options, &problem, &summary);

  std::cout << summary.FullReport() << "\n";
  // clang-format off
  std::cout << "Final x1 = " << x1
            << ", x2 = " << x2
            << ", x3 = " << x3
            << ", x4 = " << x4
            << "\n";
  // clang-format on
  return 0;
}

\(\quad{}\) 编译并运行可得到如下输出:

Initial x1 = 3, x2 = -1, x3 = 0, x4 = 1
iter      cost      cost_change  |gradient|   |step|    tr_ratio  tr_radius  ls_iter  iter_time  total_time
   0  1.075000e+02    0.00e+00    1.55e+02   0.00e+00   0.00e+00  1.00e+04        0    5.89e-05    5.71e-04
   1  5.036190e+00    1.02e+02    2.00e+01   2.16e+00   9.53e-01  3.00e+04        1    1.80e-04    7.80e-04
   2  3.148168e-01    4.72e+00    2.50e+00   6.23e-01   9.37e-01  9.00e+04        1    3.90e-05    8.32e-04
   3  1.967760e-02    2.95e-01    3.13e-01   3.08e-01   9.37e-01  2.70e+05        1    3.65e-05    8.79e-04
   4  1.229900e-03    1.84e-02    3.91e-02   1.54e-01   9.37e-01  8.10e+05        1    3.57e-05    9.23e-04
   5  7.687123e-05    1.15e-03    4.89e-03   7.69e-02   9.37e-01  2.43e+06        1    3.54e-05    9.67e-04
   6  4.804625e-06    7.21e-05    6.11e-04   3.85e-02   9.37e-01  7.29e+06        1    3.52e-05    1.01e-03
   7  3.003028e-07    4.50e-06    7.64e-05   1.92e-02   9.37e-01  2.19e+07        1    3.53e-05    1.05e-03
   8  1.877006e-08    2.82e-07    9.54e-06   9.62e-03   9.37e-01  6.56e+07        1    3.53e-05    1.10e-03
   9  1.173223e-09    1.76e-08    1.19e-06   4.81e-03   9.37e-01  1.97e+08        1    3.58e-05    1.15e-03
  10  7.333425e-11    1.10e-09    1.49e-07   2.40e-03   9.37e-01  5.90e+08        1    3.54e-05    1.19e-03
  11  4.584044e-12    6.88e-11    1.86e-08   1.20e-03   9.37e-01  1.77e+09        1    3.51e-05    1.24e-03
  12  2.865573e-13    4.30e-12    2.33e-09   6.02e-04   9.37e-01  5.31e+09        1    3.55e-05    1.28e-03
  13  1.791438e-14    2.69e-13    2.91e-10   3.01e-04   9.37e-01  1.59e+10        1    3.55e-05    1.32e-03
  14  1.120029e-15    1.68e-14    3.64e-11   1.51e-04   9.37e-01  4.78e+10        1    3.54e-05    1.37e-03

Solver Summary (v 1.13.0-eigen-(3.3.4)-lapack-suitesparse-(5.1.2)-cxsparse-(3.1.9)-openmp)

                                     Original                  Reduced
Parameter blocks                            4                        4
Parameters                                  4                        4
Residual blocks                             4                        4
Residual                                    4                        4

Minimizer                        TRUST_REGION

Dense linear algebra library            EIGEN
Trust region strategy     LEVENBERG_MARQUARDT

                                        Given                     Used
Linear solver                        DENSE_QR                 DENSE_QR
Threads                                     1                        1
Linear solver threads                       1                        1
Linear solver ordering              AUTOMATIC                        4

Cost:
Initial                          1.075000e+02
Final                            1.120029e-15
Change                           1.075000e+02

Minimizer iterations                       15
Successful steps                           15
Unsuccessful steps                          0

Time (in seconds):
Preprocessor                           0.0005

  Residual evaluation                  0.0000
  Jacobian evaluation                  0.0005
  Linear solver                        0.0001
Minimizer                              0.0009

Postprocessor                          0.0000
Total                                  0.0014

Termination:                      CONVERGENCE (Gradient tolerance reached. Gradient max norm: 3.642190e-11 <= 1.000000e-10)

Final x1 = 0.000146222, x2 = -1.46222e-05, x3 = 2.40957e-05, x4 = 2.40957e-05

\(\quad{}\) 其实我们很容易看出函数的最优解在 \(x_1=0, x_2=0,x_3=0,x_4=0\) ,而最终的计算结果虽然并不是完全为\(0\) 但是也基本接近。