source: orxonox.OLD/orxonox/trunk/src/util/newmat/newmatnl.h @ 4567

Last change on this file since 4567 was 4565, checked in by patrick, 20 years ago
orxonox/trunk: added the newmat library to the project. needs some translation in directory, temp under util/newmat. is needed by the collision detection engine to perform lin alg operations such as eigenvector decomposition. perhaps we will make our own library to do that later.
File size: 10.8 KB

//$$ newmatnl.h           definition file for non-linear optimisation

// Copyright (C) 1993,4,5: R B Davies

#ifndef NEWMATNL_LIB
#define NEWMATNL_LIB 0

#include "newmat.h"

#ifdef use_namespace
namespace NEWMAT {
#endif



/*

This is the beginning of a series of classes for non-linear optimisation.

At present there are two classes. FindMaximum2 is the basic optimisation
strategy when one is doing an optimisation where one has first
derivatives and estimates of the second derivatives. Class
NonLinearLeastSquares is derived from FindMaximum2. This provides the
functions that calculate function values and derivatives.

A third class is now added. This is for doing maximum-likelihood when
you have first derivatives and something like the Fisher Information
matrix (eg the variance covariance matrix of the first derivatives or
minus the second derivatives - this matrix is assumed to be positive
definite).



   class FindMaximum2

Suppose T is the ColumnVector of parameters, F(T) the function we want
to maximise, D(T) the ColumnVector of derivatives of F with respect to
T, and S(T) the matrix of second derivatives.

Then the basic iteration is: given a value of T, update it to

   T - S.i() * D

where .i() denotes inverse.

If F were quadratic this would give exactly the right answer (except it
might get a minimum rather than a maximum). Since F is not usually
quadratic, the simple procedure would be to recalculate S and D with the
new value of T and keep iterating until the process converges. This is
known as the Newton-Raphson method.

In practice, this method may not converge. FindMaximum2 considers an
iteration of the form

   T - x * S.i() * D

where x is a number. It tries x = 1 and uses the values of F and its
slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
chooses x to maximise the resulting function. This gives our new value of
T. The program checks that the value of F is getting better and carries
out a variety of strategies if it is not.
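
The cubic fit can be sketched as follows. This is an illustration under
stated assumptions, not the code in newmatnl.cpp: the function name
cubic_maximiser and its argument names are hypothetical. Given the
values g0, g1 and slopes s0, s1 of F along the search direction at
x = 0 and x = 1, it builds the cubic a*x^3 + b*x^2 + c*x + g0 that
matches all four numbers and returns the stationary point at which the
second derivative is negative.

```cpp
#include <cmath>

// Fit p(x) = a*x^3 + b*x^2 + c*x + g0 to p(0)=g0, p'(0)=s0, p(1)=g1,
// p'(1)=s1 and find its local maximum. Returns false if the cubic has
// no local maximum; otherwise stores the maximiser in x.
bool cubic_maximiser(double g0, double s0, double g1, double s1, double& x) {
    const double delta = g1 - g0;
    const double a = s0 + s1 - 2.0 * delta;
    const double b = 3.0 * delta - 2.0 * s0 - s1;
    const double c = s0;

    // p'(x) = 3a x^2 + 2b x + c has real roots only if b^2 - 3ac >= 0.
    const double disc = b * b - 3.0 * a * c;
    if (disc < 0.0) return false;

    // The root with p''(x) < 0 is (-b - sqrt(disc)) / (3a); rewritten
    // as c / (sqrt(disc) - b) so that the case a == 0 needs no branch.
    const double denom = std::sqrt(disc) - b;
    if (denom == 0.0) return false;
    x = c / denom;
    return true;
}
```

For example, the values g0 = 0, s0 = 2, g1 = 0, s1 = -1 come from
x^3 - 3x^2 + 2x, whose local maximum is at x = 1 - 1/sqrt(3).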

The program also has a second strategy. If the successive values of T
seem to be lying along a curve - eg we are following along a curved
ridge, the program will try to fit this ridge and project along it. This
does not work at present and is commented out.

FindMaximum2 has three virtual functions which need to be overridden by
a derived class.

   void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);

T is the column vector of parameters. The function returns the value of
the function in f, but may instead set oorg to true if the parameter
values are not valid. If wg is true it may also calculate and store the
second derivative information.

   bool NextPoint(ColumnVector& H, Real& d);

Using the value of T provided in the previous call of Value, find the
Newton-Raphson adjustment to T, that is - S.i() * D. Also return

   d = D.t() * S.i() * D.

NextPoint should return true if it considers that the process has
converged (d very small) and false otherwise. The previous call of Value
will have set wg to true, so that S will be available.

   Real LastDerivative(const ColumnVector& H);

Return the scalar product of H and the vector of derivatives at the last
value of T.

The function Fit is the function that runs the iteration.

   void Fit(ColumnVector&, int);

The arguments are the trial parameter values as a ColumnVector and the
maximum number of iterations. The program throws a DataException if the
initial parameters are not valid and a ConvergenceException if the
process fails to converge.


   class NonLinearLeastSquares

This class is derived from FindMaximum2 and carries out a non-linear
least squares fit. It uses a QR decomposition to carry out the
operations required by FindMaximum2.

A prototype class R1_Col_I_D is provided. The user needs to derive a
class from this which includes functions giving the predicted value of
each observation and its derivatives. An object from this class has to
be provided to class NonLinearLeastSquares.

Suppose we observe n normal random variables with the same unknown
variance and such that the i-th one has expected value given by f(i,P)
where P is a column vector of unknown parameters and f is a known
function. We wish to estimate P.

First derive a class from R1_Col_I_D and override Real operator()(int i)
to give the value of the function f in terms of i and the ColumnVector
para defined in class R1_Col_I_D. Also override ReturnMatrix
Derivatives() to give the derivatives of f at para and the value of i
used in the preceding call to operator(). Return the result as a
RowVector. Construct an object from this class. Suppose in what follows
it is called pred.

Now construct a NonLinearLeastSquares object accessing pred and
optionally an iteration limit and an accuracy criterion.

   NonLinearLeastSquares NLLS(pred, 1000, 0.0001);

The accuracy criterion should be somewhat less than one and 0.0001 is
about the smallest sensible value.

Define a ColumnVector P containing a guess at the values of the unknown
parameters, and a ColumnVector Y containing the observed data. Call

   NLLS.Fit(Y,P);

If the process converges, P will contain the estimates of the unknown
parameters. If it does not converge an exception will be generated.
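
To make the least squares problem concrete without depending on the
newmat classes, here is a stand-alone sketch for a hypothetical
one-parameter model in which the i-th observation has expected value
exp(-c * x[i]). It uses plain Gauss-Newton steps on std::vector data;
it does not use the QR decomposition, the cubic line search or the
exceptions described above, and the name fit_decay_rate is an
assumption made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Estimate c in the model E[y[i]] = exp(-c * x[i]) by Gauss-Newton,
// starting from the guess c and doing at most max_iter steps.
double fit_decay_rate(const std::vector<double>& x,
                      const std::vector<double>& y,
                      double c, int max_iter) {
    assert(x.size() == y.size());
    for (int iter = 0; iter < max_iter; ++iter) {
        double jr = 0.0;   // sum of derivative * residual
        double jj = 0.0;   // sum of derivative squared
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double pred  = std::exp(-c * x[i]);   // predicted value
            const double deriv = -x[i] * pred;          // d pred / d c
            const double resid = y[i] - pred;
            jr += deriv * resid;
            jj += deriv * deriv;
        }
        if (jj == 0.0) break;            // no information about c
        const double step = jr / jj;     // least squares solution of the
        c += step;                       // linearised problem
        if (std::fabs(step) < 1e-12) break;
    }
    return c;
}
```

With noise-free data generated from c = 0.5 at x = 0, 1, 2, 3, 4 and a
starting guess of 0.3, the iteration settles on 0.5 within a few steps.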

The following member functions can be called after you have done a fit.

   Real ResidualVariance() const;

The estimate of the variance of the observations.

   void GetResiduals(ColumnVector& Z) const;

The residuals of the individual observations.

   void GetStandardErrors(ColumnVector&);

The standard errors of the parameter estimates.

   void GetCorrelations(SymmetricMatrix&);

The correlations of the parameter estimates.

   void GetHatDiagonal(DiagonalMatrix&) const;

Forms a diagonal matrix of values between 0 and 1. If the i-th value is
larger than, say 0.2, then the i-th data value could have an undue
influence on your estimates.


*/

class FindMaximum2
{
   virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
   virtual bool NextPoint(ColumnVector&, Real&) = 0;
   virtual Real LastDerivative(const ColumnVector&) = 0;
public:
   void Fit(ColumnVector&, int);
   virtual ~FindMaximum2() {}            // to keep gnu happy
};

class R1_Col_I_D
{
   // The prototype for a Real function of a ColumnVector and an
   // integer.
   // You need to derive your function from this one and put in your
   // function for operator() and Derivatives() at least.
   // You may also want to set up a constructor to enter in additional
   // parameter values (that will not vary during the solve).

protected:
   ColumnVector para;                    // Current x value

public:
   virtual bool IsValid() { return true; }
                                         // is the current x value OK
   virtual Real operator()(int i) = 0;   // i-th function value at current para
   virtual void Set(const ColumnVector& X) { para = X; }
                                         // set current para
   bool IsValid(const ColumnVector& X)
      { Set(X); return IsValid(); }
                                         // set para, check OK
   Real operator()(int i, const ColumnVector& X)
      { Set(X); return operator()(i); }
                                         // set para, return value
   virtual ReturnMatrix Derivatives() = 0;
                                         // return derivatives as RowVector
   virtual ~R1_Col_I_D() {}              // to keep gnu happy
};


class NonLinearLeastSquares : public FindMaximum2
{
   // these replace the corresponding functions in FindMaximum2
   void Value(const ColumnVector&, bool, Real&, bool&);
   bool NextPoint(ColumnVector&, Real&);
   Real LastDerivative(const ColumnVector&);

   Matrix X;                             // the things we need to do the
   ColumnVector Y;                       // QR triangularisation
   UpperTriangularMatrix U;              // see the write-up in newmata.txt
   ColumnVector M;
   Real errorvar, criterion;
   int n_obs, n_param;
   const ColumnVector* DataPointer;
   RowVector Derivs;
   SymmetricMatrix Covariance;
   DiagonalMatrix SE;
   R1_Col_I_D& Pred;                     // Reference to predictor object
   int Lim;                              // maximum number of iterations

public:
   NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
      : criterion(crit), Pred(pred), Lim(lim) {}
   void Fit(const ColumnVector&, ColumnVector&);
   Real ResidualVariance() const { return errorvar; }
   void GetResiduals(ColumnVector& Z) const { Z = Y; }
   void GetStandardErrors(ColumnVector&);
   void GetCorrelations(SymmetricMatrix&);
   void GetHatDiagonal(DiagonalMatrix&) const;

private:
   void MakeCovariance();
};


// The next class is the prototype class for calculating the
// log-likelihood.
// I assume first derivatives are available and something like the
// Fisher Information or variance/covariance matrix of the first
// derivatives or minus the matrix of second derivatives is
// available. This matrix must be positive definite.

class LL_D_FI
{
protected:
   ColumnVector para;                    // current parameter values
   bool wg;                              // true if FI matrix wanted

public:
   virtual void Set(const ColumnVector& X) { para = X; }
                                         // set parameter values
   virtual void WG(bool wgx) { wg = wgx; }
                                         // set wg

   virtual bool IsValid() { return true; }
                                         // return true if para is OK
   bool IsValid(const ColumnVector& X, bool wgx=true)
      { Set(X); WG(wgx); return IsValid(); }

   virtual Real LogLikelihood() = 0;     // return the log-likelihood
   Real LogLikelihood(const ColumnVector& X, bool wgx=true)
      { Set(X); WG(wgx); return LogLikelihood(); }

   virtual ReturnMatrix Derivatives() = 0;
                                         // column vector of derivatives
   virtual ReturnMatrix FI() = 0;        // Fisher Information matrix
   virtual ~LL_D_FI() {}                 // to keep gnu happy
};

// This is the class for doing the maximum likelihood estimation

class MLE_D_FI : public FindMaximum2
{
   // these replace the corresponding functions in FindMaximum2
   void Value(const ColumnVector&, bool, Real&, bool&);
   bool NextPoint(ColumnVector&, Real&);
   Real LastDerivative(const ColumnVector&);

   // the things we need for the analysis
   LL_D_FI& LL;                          // reference to log-likelihood
   int Lim;                              // maximum number of iterations
   Real Criterion;                       // convergence criterion
   ColumnVector Derivs;                  // for the derivatives
   LowerTriangularMatrix LT;             // Cholesky decomposition of FI
   SymmetricMatrix Covariance;
   DiagonalMatrix SE;

public:
   MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
      : LL(ll), Lim(lim), Criterion(criterion) {}
   void Fit(ColumnVector& Parameters);
   void GetStandardErrors(ColumnVector&);
   void GetCorrelations(SymmetricMatrix&);

private:
   void MakeCovariance();
};


#ifdef use_namespace
}
#endif



#endif

// body file: newmatnl.cpp
