[SciPy] L-BFGS optimization with scipy.optimize.fmin_l_bfgs_b

LibraryPKU, published 2022-08-10 in Beijing


[Date] 2020.01.07


For detailed usage, see the official documentation: scipy.optimize.fmin_l_bfgs_b

x,min_val,info=scipy.optimize.fmin_l_bfgs_b(func, x0, fprime=None, args=(), approx_grad=0, bounds=None, m=10, factr=10000000.0, pgtol=1e-05, epsilon=1e-08, iprint=-1, maxfun=15000, disp=None)
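As a minimal sketch of the call above, the following minimizes a small quadratic loss with an explicit gradient function. The names loss, grad and target are hypothetical and chosen only for illustration; they are not part of SciPy.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def loss(x, target):
    # Hypothetical loss: squared distance between x and target.
    return float(np.sum((x - target) ** 2))

def grad(x, target):
    # Analytic gradient of the loss above, same shape as x (1-D).
    return 2.0 * (x - target)

target = np.array([1.0, -2.0, 3.0])
x0 = np.zeros(3)  # initial guess

# args is forwarded to both func and fprime.
x, min_val, info = fmin_l_bfgs_b(loss, x0, fprime=grad, args=(target,))
print(x, min_val, info['warnflag'])
```

For this convex problem x should end up very close to target and min_val close to zero.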

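1. Parameters: the main ones are the loss function func, the initial value x0 of the parameters being optimized, the gradient function fprime, and maxfun (the maximum number of function evaluations, which is not the same as the number of iterations)

Note that the gradient must be a flattened 1-D vector. If x is a multi-dimensional array (for example 3-D), flatten it first and reshape it inside func.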
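One way to handle a multi-dimensional parameter array is sketched below: the optimizer only ever sees a 1-D vector, and the loss reshapes it internally. The shape, the target array and the function name loss_and_grad are assumptions made for this example. Because fprime is left as None and approx_grad is not set, func returns both the value and the gradient.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

shape = (2, 3, 4)  # hypothetical 3-D parameter shape
target = np.arange(24, dtype=np.float64).reshape(shape)

def loss_and_grad(x_flat):
    # Restore the 3-D view, compute the loss and gradient there,
    # then flatten the gradient back to 1-D for the optimizer.
    x = x_flat.reshape(shape)
    diff = x - target
    f = 0.5 * float(np.sum(diff ** 2))
    g = diff.ravel().astype(np.float64)
    return f, g

x0 = np.zeros(shape)
x_flat, min_val, info = fmin_l_bfgs_b(loss_and_grad, x0.ravel())
x_opt = x_flat.reshape(shape)  # back to the original 3-D layout
print(x_opt.shape, min_val)
```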

func : callable f(x,*args)
Function to minimise, typically the loss function.
x0 : ndarray
Initial guess, i.e. the initial value of the parameters being optimized.
fprime : callable fprime(x,*args)
The gradient of func.
If None, then func returns the function value and the gradient (f, g = func(x, *args)), unless approx_grad is True in which case func returns only f.
args : sequence
Arguments to pass to func and fprime.
approx_grad : bool
Whether to approximate the gradient numerically (in which case func returns only the function value).
bounds : list
(min, max) pairs for each element in x, defining the bounds on that parameter. Use None for one of min or max when there is no bound in that direction.
m : int
The maximum number of variable metric corrections used to define the limited memory matrix. (The limited memory BFGS method does not store the full hessian but uses this many terms in an approximation to it.)
factr : float
The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps, where eps is the machine precision, which is automatically generated by the code. Typical values for factr are: 1e12 for low accuracy; 1e7 for moderate accuracy; 10.0 for extremely high accuracy.
pgtol : float
The iteration will stop when max{|proj g_i| : i = 1, ..., n} <= pgtol, where proj g_i is the i-th component of the projected gradient.
epsilon : float
Step size used when approx_grad is True, for numerically calculating the gradient
iprint : int
Controls the frequency of output. iprint < 0 means no output.
disp : int, optional
If zero, then no output. If positive number, then this over-rides iprint.
maxfun : int
Maximum number of function evaluations.
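The bounds and approx_grad parameters can be combined as in the following sketch. The loss function and TARGET are hypothetical. With approx_grad=True, func returns only the function value and the gradient is estimated numerically using the step size epsilon.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Hypothetical target; the unconstrained minimum would be at (2, -3).
TARGET = np.array([2.0, -3.0])

def loss(x):
    # With approx_grad=True, func returns only the function value.
    return float(np.sum((x - TARGET) ** 2))

# One (min, max) pair per element of x; None means no bound on that side.
bounds = [(0.0, 1.0), (None, 0.0)]

x0 = np.array([0.5, -0.5])
x, min_val, info = fmin_l_bfgs_b(loss, x0, approx_grad=True, bounds=bounds)
print(x, min_val)
```

Here the first coordinate is limited by its upper bound of 1, so the result should be close to (1, -3) with a loss close to 1, rather than the unconstrained minimum.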

2. Return values

x : array_like
Estimated position of the minimum, i.e. the x at which the loss is smallest.
f : float
Value of func at the minimum, i.e. the loss value.
d : dict
Information dictionary.
d['warnflag'] is
0 if converged,
1 if too many function evaluations,
2 if stopped for another reason, given in d['task']
d['grad'] is the gradient at the minimum (should be 0 ish)
d['funcalls'] is the number of function calls made. This counts function evaluations, not iterations; the iteration count is reported separately as d['nit'].

Example of info:

{'grad': array([-7.65604162, -2.14013386,  3.16267967, ..., -1.03821039,
       -4.23868084, -3.17428398]),
 'task': b'STOP: TOTAL NO. of f AND g EVALUATIONS EXCEEDS LIMIT',
 'funcalls': 51,
 'nit': 47,
 'warnflag': 1}
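A result like the one above, with warnflag equal to 1, can be reproduced by setting maxfun very low. The sketch below uses SciPy's built-in Rosenbrock helpers rosen and rosen_der as a stand-in loss. The starting point and the maxfun value of 5 are arbitrary choices for illustration.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen, rosen_der

x0 = np.array([-1.2, 1.0])

# Deliberately tiny evaluation budget: the run stops before converging.
x_short, f_short, info_short = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der, maxfun=5)
print(info_short['warnflag'], info_short['funcalls'], info_short['nit'])

# Default budget (maxfun=15000): the run is expected to converge.
x_full, f_full, info_full = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der)
print(info_full['warnflag'], info_full['task'])
```

Checking info['warnflag'] after every call is a cheap way to notice that the returned x is not yet a minimum and that maxfun may need to be raised.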

Supplement: more on the SciPy library:

Python machine learning and analysis tools: SciPy

Scipy lecture notes (Chinese translation)