Planet
navi homePPSaboutscreenshotsdownloaddevelopmentforum

source: orxonox.OLD/orxonox/trunk/src/lib/newmat/newmatnl.h @ 4585

Last change on this file since 4585 was 4565, checked in by patrick, 19 years ago

orxonox/trunk: added the newmat library to the project. needs some translation in directory, temp under util/newmat. is needed by the collision detection engine to perform lin alg operations such as eigenvector decomposition. perhaps we will make our own library to do that later.

File size: 10.8 KB
Line 
1//$$ newmatnl.h           definition file for non-linear optimisation
2
3// Copyright (C) 1993,4,5: R B Davies
4
5#ifndef NEWMATNL_LIB
6#define NEWMATNL_LIB 0
7
8#include "newmat.h"
9
10#ifdef use_namespace
11namespace NEWMAT {
12#endif
13
14
15
16/*
17
18This is a beginning of a series of classes for non-linear optimisation.
19
20At present there are two classes. FindMaximum2 is the basic optimisation
21strategy when one is doing an optimisation where one has first
22derivatives and estimates of the second derivatives. Class
23NonLinearLeastSquares is derived from FindMaximum2. This provides the
24functions that calculate function values and derivatives.
25
26A third class is now added. This is for doing maximum-likelihood when
27you have first derviatives and something like the Fisher Information
28matrix (eg the variance covariance matrix of the first derivatives or
29minus the second derivatives - this matrix is assumed to be positive
30definite).
31
32
33
34   class FindMaximum2
35
36Suppose T is the ColumnVector of parameters, F(T) the function we want
37to maximise, D(T) the ColumnVector of derivatives of F with respect to
38T, and S(T) the matrix of second derivatives.
39
40Then the basic iteration is given a value of T, update it to
41
42           T - S.i() * D
43
44where .i() denotes inverse.
45
46If F was quadratic this would give exactly the right answer (except it
47might get a minimum rather than a maximum). Since F is not usually
48quadratic, the simple procedure would be to recalculate S and D with the
49new value of T and keep iterating until the process converges. This is
50known as the method of conjugate gradients.
51
52In practice, this method may not converge. FindMaximum2 considers an
53iteration of the form
54
55           T - x * S.i() * D
56
57where x is a number. It tries x = 1 and uses the values of F and its
58slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
59choses x to maximise the resulting function. This gives our new value of
60T. The program checks that the value of F is getting better and carries
61out a variety of strategies if it is not.
62
63The program also has a second strategy. If the successive values of T
64seem to be lying along a curve - eg we are following along a curved
65ridge, the program will try to fit this ridge and project along it. This
66does not work at present and is commented out.
67
68FindMaximum2 has three virtual functions which need to be over-ridden by
69a derived class.
70
71   void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
72
73T is the column vector of parameters. The function returns the value of
74the function to f, but may instead set oorg to true if the parameter
75values are not valid. If wg is true it may also calculate and store the
76second derivative information.
77
78   bool NextPoint(ColumnVector& H, Real& d);
79
80Using the value of T provided in the previous call of Value, find the
81conjugate gradients adjustment to T, that is - S.i() * D. Also return
82
83           d = D.t() * S.i() * D.
84
85NextPoint should return true if it considers that the process has
86converged (d very small) and false otherwise. The previous call of Value
87will have set wg to true, so that S will be available.
88
89   Real LastDerivative(const ColumnVector& H);
90
91Return the scalar product of H and the vector of derivatives at the last
92value of T.
93
94The function Fit is the function that calls the iteration.
95
96   void Fit(ColumnVector&, int);
97
98The arguments are the trial parameter values as a ColumnVector and the
99maximum number of iterations. The program calls a DataException if the
100initial parameters are not valid and a ConvergenceException if the
101process fails to converge.
102
103
104   class NonLinearLeastSquares
105
106This class is derived from FindMaximum2 and carries out a non-linear
107least squares fit. It uses a QR decomposition to carry out the
108operations required by FindMaximum2.
109
110A prototype class R1_Col_I_D is provided. The user needs to derive a
111class from this which includes functions the predicted value of each
112observation its derivatives. An object from this class has to be
113provided to class NonLinearLeastSquares.
114
115Suppose we observe n normal random variables with the same unknown
116variance and such the i-th one has expected value given by f(i,P)
117where P is a column vector of unknown parameters and f is a known
118function. We wish to estimate P.
119
120First derive a class from R1_Col_I_D and override Real operator()(int i)
121to give the value of the function f in terms of i and the ColumnVector
122para defined in class R1_CoL_I_D. Also override ReturnMatrix
123Derivatives() to give the derivates of f at para and the value of i
124used in the preceeding call to operator(). Return the result as a
125RowVector. Construct an object from this class. Suppose in what follows
126it is called pred.
127
128Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
129an iteration limit and an accuracy critierion.
130
131   NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
132
133The accuracy critierion should be somewhat less than one and 0.0001 is
134about the smallest sensible value.
135
136Define a ColumnVector P containing a guess at the value of the unknown
137parameter, and a ColumnVector Y containing the unknown data. Call
138
139   NLLS.Fit(Y,P);
140
141If the process converges, P will contain the estimates of the unknown
142parameters. If it does not converge an exception will be generated.
143
144The following member functions can be called after you have done a fit.
145
146Real ResidualVariance() const;
147
148The estimate of the variance of the observations.
149
150void GetResiduals(ColumnVector& Z) const;
151
152The residuals of the individual observations.
153
154void GetStandardErrors(ColumnVector&);
155
156The standard errors of the observations.
157
158void GetCorrelations(SymmetricMatrix&);
159
160The correlations of the observations.
161
162void GetHatDiagonal(DiagonalMatrix&) const;
163
164Forms a diagonal matrix of values between 0 and 1. If the i-th value is
165larger than, say 0.2, then the i-th data value could have an undue
166influence on your estimates.
167
168
169*/
170
171class FindMaximum2
172{
173   virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
174   virtual bool NextPoint(ColumnVector&, Real&) = 0;
175   virtual Real LastDerivative(const ColumnVector&) = 0;
176public:
177   void Fit(ColumnVector&, int);
178   virtual ~FindMaximum2() {}            // to keep gnu happy
179};
180
181class R1_Col_I_D
182{
183   // The prototype for a Real function of a ColumnVector and an
184   // integer.
185   // You need to derive your function from this one and put in your
186   // function for operator() and Derivatives() at least.
187   // You may also want to set up a constructor to enter in additional
188   // parameter values (that will not vary during the solve).
189
190protected:
191   ColumnVector para;                 // Current x value
192
193public:
194   virtual bool IsValid() { return true; }
195                                       // is the current x value OK
196   virtual Real operator()(int i) = 0; // i-th function value at current para
197   virtual void Set(const ColumnVector& X) { para = X; }
198                                       // set current para
199   bool IsValid(const ColumnVector& X)
200      { Set(X); return IsValid(); }
201                                       // set para, check OK
202   Real operator()(int i, const ColumnVector& X)
203      { Set(X); return operator()(i); }
204                                       // set para, return value
205   virtual ReturnMatrix Derivatives() = 0;
206                                       // return derivatives as RowVector
207   virtual ~R1_Col_I_D() {}            // to keep gnu happy
208};
209
210
211class NonLinearLeastSquares : public FindMaximum2
212{
213   // these replace the corresponding functions in FindMaximum2
214   void Value(const ColumnVector&, bool, Real&, bool&);
215   bool NextPoint(ColumnVector&, Real&);
216   Real LastDerivative(const ColumnVector&);
217
218   Matrix X;                         // the things we need to do the
219   ColumnVector Y;                   // QR triangularisation
220   UpperTriangularMatrix U;          // see the write-up in newmata.txt
221   ColumnVector M;
222   Real errorvar, criterion;
223   int n_obs, n_param;
224   const ColumnVector* DataPointer;
225   RowVector Derivs;
226   SymmetricMatrix Covariance;
227   DiagonalMatrix SE;
228   R1_Col_I_D& Pred;                 // Reference to predictor object
229   int Lim;                          // maximum number of iterations
230
231public:
232   NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
233      : criterion(crit), Pred(pred), Lim(lim) {}
234   void Fit(const ColumnVector&, ColumnVector&);
235   Real ResidualVariance() const { return errorvar; }
236   void GetResiduals(ColumnVector& Z) const { Z = Y; }
237   void GetStandardErrors(ColumnVector&);
238   void GetCorrelations(SymmetricMatrix&);
239   void GetHatDiagonal(DiagonalMatrix&) const;
240
241private:
242   void MakeCovariance();
243};
244
245
246// The next class is the prototype class for calculating the
247// log-likelihood.
248// I assume first derivatives are available and something like the
249// Fisher Information or variance/covariance matrix of the first
250// derivatives or minus the matrix of second derivatives is
251// available. This matrix must be positive definite.
252
253class LL_D_FI
254{
255protected:
256   ColumnVector para;                  // current parameter values
257   bool wg;                         // true if FI matrix wanted
258
259public:
260   virtual void Set(const ColumnVector& X) { para = X; }
261                                       // set parameter values
262   virtual void WG(bool wgx) { wg = wgx; }
263                                       // set wg
264
265   virtual bool IsValid() { return true; }
266                                       // return true is para is OK
267   bool IsValid(const ColumnVector& X, bool wgx=true)
268      { Set(X); WG(wgx); return IsValid(); }
269
270   virtual Real LogLikelihood() = 0;   // return the loglikelihhod
271   Real LogLikelihood(const ColumnVector& X, bool wgx=true)
272      { Set(X); WG(wgx); return LogLikelihood(); }
273
274   virtual ReturnMatrix Derivatives() = 0;
275                                       // column vector of derivatives
276   virtual ReturnMatrix FI() = 0;      // Fisher Information matrix
277   virtual ~LL_D_FI() {}               // to keep gnu happy
278};
279
280// This is the class for doing the maximum likelihood estimation
281
282class MLE_D_FI : public FindMaximum2
283{
284   // these replace the corresponding functions in FindMaximum2
285   void Value(const ColumnVector&, bool, Real&, bool&);
286   bool NextPoint(ColumnVector&, Real&);
287   Real LastDerivative(const ColumnVector&);
288
289   // the things we need for the analysis
290   LL_D_FI& LL;                        // reference to log-likelihood
291   int Lim;                            // maximum number of iterations
292   Real Criterion;                     // convergence criterion
293   ColumnVector Derivs;                // for the derivatives
294   LowerTriangularMatrix LT;           // Cholesky decomposition of FI
295   SymmetricMatrix Covariance;
296   DiagonalMatrix SE;
297
298public:
299   MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
300      : LL(ll), Lim(lim), Criterion(criterion) {}
301   void Fit(ColumnVector& Parameters);
302   void GetStandardErrors(ColumnVector&);
303   void GetCorrelations(SymmetricMatrix&);
304
305private:
306   void MakeCovariance();
307};
308
309
310#ifdef use_namespace
311}
312#endif
313
314
315
316#endif
317
318// body file: newmatnl.cpp
319
320
321
322
Note: See TracBrowser for help on using the repository browser.