function [zgrid,xgrid,ygrid] = gridfit(x,y,z,xnodes,ynodes,varargin)
% gridfit: estimates a surface on a 2d grid, based on scattered data
%          Replicates are allowed. All methods extrapolate to the grid
%          boundaries. Gridfit uses a modified ridge estimator to
%          generate the surface, where the bias is toward smoothness.
%
%          Gridfit is not an interpolant. Its goal is a smooth surface
%          that approximates your data, but allows you to control the
%          amount of smoothing.
%
% usage #1: zgrid = gridfit(x,y,z,xnodes,ynodes);
% usage #2: [zgrid,xgrid,ygrid] = gridfit(x,y,z,xnodes,ynodes);
% usage #3: zgrid = gridfit(x,y,z,xnodes,ynodes,prop,val,prop,val,...);
%
% Arguments: (input)
%  x,y,z - vectors of equal lengths, containing arbitrary scattered data.
%          The only constraint on x and y is that they cannot ALL fall on
%          a single line in the x-y plane. Replicate points will be
%          treated in a least squares sense.
%
%          ANY points containing a NaN are ignored in the estimation.
%
%  xnodes - vector defining the nodes in the grid in the independent
%          variable (x). xnodes need not be equally spaced. xnodes
%          must completely span the data. If they do not, then the
%          'extend' property is applied, adjusting the first and last
%          nodes to be extended as necessary. See below for a complete
%          description of the 'extend' property.
%
%          If xnodes is a scalar integer, then it specifies the number
%          of equally spaced nodes between the min and max of the data.
%
%  ynodes - vector defining the nodes in the grid in the independent
%          variable (y). ynodes need not be equally spaced.
%
%          If ynodes is a scalar integer, then it specifies the number
%          of equally spaced nodes between the min and max of the data.
%
%          Also see the extend property.
%
%  Additional arguments follow in the form of property/value pairs.
%  Valid properties are:
%    'smoothness', 'interp', 'regularizer', 'solver', 'maxiter'
%    'extend', 'tilesize', 'overlap'
%
%  Any UNAMBIGUOUS shortening (even down to a single letter) is
%  valid for property names. All properties have default values,
%  chosen (I hope) to give a reasonable result out of the box.
%
%   'smoothness' - scalar - determines the eventual smoothness of the
%          estimated surface. A larger value here means the surface
%          will be smoother. Smoothness must be a positive real
%          number.
%
%          Note: the problem is normalized in advance so that a
%          smoothness of 1 MAY generate reasonable results. If you
%          find the result is too smooth, then use a smaller value
%          for this parameter. Likewise, bumpy surfaces suggest use
%          of a larger value. (Sometimes, use of an iterative solver
%          with too small a limit on the maximum number of iterations
%          will result in non-convergence.)
%
%          DEFAULT: 1
%
%
%   'interp' - character, denotes the interpolation scheme used
%          to interpolate the data.
%
%          DEFAULT: 'triangle'
%
%          'bilinear' - use bilinear interpolation within the grid
%                     (also known as tensor product linear interpolation)
%
%          'triangle' - split each cell in the grid into a triangle,
%                     then linear interpolation inside each triangle
%
%          'nearest' - nearest neighbor interpolation. This will
%                     rarely be a good choice, but I included it
%                     as an option for completeness.
%
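%          For example, a call asking for a somewhat less smooth fit and
%          bilinear interpolation might look like this (the value 0.5 is
%          only illustrative):
%
%            zgrid = gridfit(x,y,z,xnodes,ynodes,'smoothness',0.5,'interp','bilinear');
%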
%
%   'regularizer' - character flag, denotes the regularization
%          paradigm to be used. There are currently three options.
%
%          DEFAULT: 'gradient'
%
%          'diffusion' or 'laplacian' - uses a finite difference
%              approximation to the Laplacian operator (i.e., del^2).
%
%              We can think of the surface as a plate, wherein the
%              bending rigidity of the plate is specified by the user
%              as a number relative to the importance of fidelity to
%              the data. A stiffer plate will result in a smoother
%              surface overall, but fit the data less well. I've
%              modeled a simple plate using the Laplacian, del^2. (A
%              projected enhancement is to do a better job with the
%              plate equations.)
%
%              We can also view the regularizer as a diffusion problem,
%              where the relative thermal conductivity is supplied.
%              Here interpolation is seen as a problem of finding the
%              steady temperature profile in an object, given a set of
%              points held at a fixed temperature. Extrapolation will
%              be linear. Both paradigms are appropriate for a Laplacian
%              regularizer.
%
%          'gradient' - attempts to ensure the gradient is as smooth
%              as possible everywhere. It's subtly different from the
%              'diffusion' option, in that here the directional
%              derivatives are biased to be smooth across cell
%              boundaries in the grid.
%
%              The gradient option uncouples the terms in the Laplacian.
%              Think of it as two decoupled PDEs instead of one coupled
%              PDE. Why are they different at all? The terms in the
%              Laplacian can balance each other.
%
%          'springs' - uses a spring model connecting nodes to each
%              other, as well as connecting data points to the nodes
%              in the grid. This choice will cause any extrapolation
%              to be as constant as possible.
%
%              Here the smoothing parameter is the relative stiffness
%              of the springs connecting the nodes to each other compared
%              to the stiffness of a spring connecting the lattice to
%              each data point. Since all springs have a rest length
%              (length at which the spring has zero potential energy)
%              of zero, any extrapolation will be minimized.
%
%              Note: I don't terribly like the 'springs' strategy.
%              It tends to drag the surface towards the mean of all
%              the data. It's been left in only because the paradigm
%              interests me.
%
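%          For example, a fit that prefers nearly constant extrapolation
%          might select the spring model explicitly:
%
%            zgrid = gridfit(x,y,z,xnodes,ynodes,'regularizer','springs');
%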
%
%   'solver' - character flag - denotes the solver used for the
%          resulting linear system. Different solvers will have
%          different solution times depending upon the specific
%          problem to be solved. Up to a certain size grid, the
%          direct \ solver will often be speedy, until memory
%          swapping causes problems.
%
%          What solver should you use? Problems with a significant
%          amount of extrapolation should avoid lsqr. \ may be
%          best numerically for small smoothness parameters and
%          high extents of extrapolation.
%
%          Large numbers of points will slow down the direct
%          \, but when applied to the normal equations, \ can be
%          quite fast. Since the equations generated by these
%          methods will tend to be well conditioned, the normal
%          equations are not a bad choice of method to use. Beware
%          when a small smoothing parameter is used, since this will
%          make the equations less well conditioned.
%
%          DEFAULT: 'normal'
%
%          '\' - uses matlab's backslash operator to solve the sparse
%                     system. 'backslash' is an alternate name.
%
%          'symmlq' - uses matlab's iterative symmlq solver
%
%          'lsqr' - uses matlab's iterative lsqr solver
%
%          'normal' - uses \ to solve the normal equations.
%
%
%   'maxiter' - only applies to iterative solvers - defines the
%          maximum number of iterations for an iterative solver
%
%          DEFAULT: min(10000,length(xnodes)*length(ynodes))
%
%
%   'extend' - character flag - controls whether the first and last
%          nodes in each dimension are allowed to be adjusted to
%          bound the data, and whether the user will be warned if
%          such an adjustment was needed.
%
%          DEFAULT: 'warning'
%
%          'warning' - Adjust the first and/or last node in
%                     x or y if the nodes do not FULLY contain
%                     the data. Issue a warning message to this
%                     effect, telling the amount of adjustment
%                     applied.
%
%          'never'  - Issue an error message when the nodes do
%                     not absolutely contain the data.
%
%          'always' - automatically adjust the first and last
%                     nodes in each dimension if necessary.
%                     No warning is given when this option is set.
%
%
%   'tilesize' - grids which are simply too large to solve for
%          in one single estimation step can be built as a set
%          of tiles. For example, a 1000x1000 grid will require
%          the estimation of 1e6 unknowns. This is likely to
%          require more memory (and time) than you have available.
%          But if your data is dense enough, then you can model
%          it locally using smaller tiles of the grid.
%
%          My recommendation for a reasonable tilesize is
%          roughly 100 to 200. Tiles of this size take only
%          a few seconds to solve normally, so the entire grid
%          can be modeled in a finite amount of time. The minimum
%          tilesize can never be less than 3, although even this
%          size tile is so small as to be ridiculous.
%
%          If your data is so sparse that some tiles contain
%          insufficient data to model, then those tiles will
%          be left as NaNs.
%
%          DEFAULT: inf
%
%
%   'overlap' - Tiles in a grid have some overlap, so they
%          can minimize any problems along the edge of a tile.
%          In this overlapped region, the grid is built using a
%          bi-linear combination of the overlapping tiles.
%
%          The overlap is specified as a fraction of the tile
%          size, so an overlap of 0.20 means there will be a 20%
%          overlap of successive tiles. A zero overlap is allowed,
%          but the overlap can be no more than 1/2.
%
%          0 <= overlap <= 0.5
%
%          Overlap is ignored if the tilesize is greater than the
%          number of nodes in both directions.
%
%          DEFAULT: 0.20
%
%
%   'autoscale' - Some data may have widely different scales on
%          the respective x and y axes. If this happens, then
%          the regularization may experience difficulties.
%
%          autoscale = 'on' will cause gridfit to scale the x
%          and y node intervals to a unit length. This should
%          improve the regularization procedure. The scaling is
%          purely internal.
%
%          autoscale = 'off' will disable automatic scaling
%
%          DEFAULT: 'on'
%
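%          For example, a very large grid might be handled by combining
%          tiling with an iterative solver (the specific values below
%          are only illustrative, within the recommended ranges above):
%
%            zgrid = gridfit(x,y,z,xnodes,ynodes, ...
%                   'solver','lsqr','maxiter',5000, ...
%                   'tilesize',150,'overlap',0.25);
%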
%
% Arguments: (output)
%  zgrid   - (ny,nx) array containing the fitted surface
%
%  xgrid, ygrid - as returned by meshgrid(xnodes,ynodes)
%
%
% Speed considerations:
%  Remember that gridfit must solve a LARGE system of linear
%  equations. There will be as many unknowns as the total
%  number of nodes in the final lattice. While these equations
%  may be sparse, solving a system of 10000 equations may take
%  a second or so. Very large problems may benefit from the
%  iterative solvers or from tiling.
%
%
% Example usage:
%
%  x = rand(100,1);
%  y = rand(100,1);
%  z = exp(x+2*y);
%  xnodes = 0:.1:1;
%  ynodes = 0:.1:1;
%
%  g = gridfit(x,y,z,xnodes,ynodes);
%
% Note: this is equivalent to the following call:
%
%  g = gridfit(x,y,z,xnodes,ynodes, ...
%              'smooth',1, ...
%              'interp','triangle', ...
%              'solver','normal', ...
%              'regularizer','gradient', ...
%              'extend','warning', ...
%              'tilesize',inf);
%
%
% Author: John D'Errico
% e-mail address: woodchips@rochester.rr.com
% Release: 2.0
% Release date: 5/23/06

% set defaults
params.smoothness = 1;
params.interp = 'triangle';
params.regularizer = 'gradient';
params.solver = 'normal';
params.maxiter = [];
params.extend = 'warning';
params.tilesize = inf;
params.overlap = 0.20;
params.mask = [];
params.autoscale = 'on';
params.xscale = 1;
params.yscale = 1;

% was the params struct supplied?
if ~isempty(varargin)
  if isstruct(varargin{1})
    % params is only supplied if it's a call from tiled_gridfit
    params = varargin{1};
    if length(varargin)>1
      % check for any overrides
      params = parse_pv_pairs(params,varargin{2:end});
    end
  else
    % check for any overrides of the defaults
    params = parse_pv_pairs(params,varargin);
  end
end

% check the parameters for acceptability
params = check_params(params);

% ensure all of x,y,z,xnodes,ynodes are column vectors,
% also drop any NaN data
x=x(:);
y=y(:);
z=z(:);
k = isnan(x) | isnan(y) | isnan(z);
if any(k)
  x(k)=[];
  y(k)=[];
  z(k)=[];
end
xmin = min(x);
xmax = max(x);
ymin = min(y);
ymax = max(y);

% did they supply a scalar for the nodes?
if length(xnodes)==1
  xnodes = linspace(xmin,xmax,xnodes)';
  xnodes(end) = xmax; % make sure it hits the max
end
if length(ynodes)==1
  ynodes = linspace(ymin,ymax,ynodes)';
  ynodes(end) = ymax; % make sure it hits the max
end

xnodes=xnodes(:);
ynodes=ynodes(:);
dx = diff(xnodes);
dy = diff(ynodes);
nx = length(xnodes);
ny = length(ynodes);
ngrid = nx*ny;

% set the scaling if autoscale was on
if strcmpi(params.autoscale,'on')
  params.xscale = mean(dx);
  params.yscale = mean(dy);
  params.autoscale = 'off';
end
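% (The scale factors are computed once here from the mean node spacing,
% and autoscale is switched 'off' in params, so any call made through
% tiled_gridfit below reuses the same scaling rather than rescaling
% each tile independently.)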
% check to see if any tiling is necessary
if (params.tilesize < max(nx,ny))
  % split it into smaller tiles. compute xgrid and ygrid
  % at the very end if requested
  zgrid = tiled_gridfit(x,y,z,xnodes,ynodes,params);
else
  % it's a single tile.

  % mask must be either an empty array, or a boolean
  % array of the same size as the final grid.
  nmask = size(params.mask);
  if ~isempty(params.mask) && ((nmask(2)~=nx) || (nmask(1)~=ny))
    if ((nmask(2)==ny) || (nmask(1)==nx))
      error 'Mask array is probably transposed from proper orientation.'
    else
      error 'Mask array must be the same size as the final grid.'
    end
  end
  if ~isempty(params.mask)
    params.maskflag = 1;
  else
    params.maskflag = 0;
  end

  % default for maxiter?
  if isempty(params.maxiter)
    params.maxiter = min(10000,nx*ny);
  end

  % check lengths of the data
  n = length(x);
  if (length(y)~=n) || (length(z)~=n)
    error 'Data vectors are incompatible in size.'
  end
  if n<3
    error 'Insufficient data for surface estimation.'
  end

  % verify the nodes are distinct
  if any(diff(xnodes)<=0) || any(diff(ynodes)<=0)
    error 'xnodes and ynodes must be monotone increasing'
  end

  % do we need to tweak the first or last node in x or y?
  if xmin<xnodes(1)
    switch params.extend
      case 'always'
        xnodes(1) = xmin;
      case 'warning'
        warning(['xnodes(1) was decreased by: ',num2str(xnodes(1)-xmin),', new node = ',num2str(xmin)])
        xnodes(1) = xmin;
      case 'never'
        error(['Some x (',num2str(xmin),') falls below xnodes(1) by: ',num2str(xnodes(1)-xmin)])
    end
  end
  if xmax>xnodes(end)
    switch params.extend
      case 'always'
        xnodes(end) = xmax;
      case 'warning'
        warning(['xnodes(end) was increased by: ',num2str(xmax-xnodes(end)),', new node = ',num2str(xmax)])
        xnodes(end) = xmax;
      case 'never'
        error(['Some x (',num2str(xmax),') falls above xnodes(end) by: ',num2str(xmax-xnodes(end))])
    end
  end
  if ymin<ynodes(1)
    switch params.extend
      case 'always'
        ynodes(1) = ymin;
      case 'warning'
        warning(['ynodes(1) was decreased by: ',num2str(ynodes(1)-ymin),', new node = ',num2str(ymin)])
        ynodes(1) = ymin;
      case 'never'
        error(['Some y (',num2str(ymin),') falls below ynodes(1) by: ',num2str(ynodes(1)-ymin)])
    end
  end
  if ymax>ynodes(end)
    switch params.extend
      case 'always'
        ynodes(end) = ymax;
      case 'warning'
        warning(['ynodes(end) was increased by: ',num2str(ymax-ynodes(end)),', new node = ',num2str(ymax)])
        ynodes(end) = ymax;
      case 'never'
        error(['Some y (',num2str(ymax),') falls above ynodes(end) by: ',num2str(ymax-ynodes(end))])
    end
  end

  % determine which cell in the array each point lies in
  [junk,indx] = histc(x,xnodes); %#ok
  [junk,indy] = histc(y,ynodes); %#ok
  % any point falling at the last node is taken to be
  % inside the last cell in x or y.
  k=(indx==nx);
  indx(k)=indx(k)-1;
  k=(indy==ny);
  indy(k)=indy(k)-1;
  ind = indy + ny*(indx-1);
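  % (ind is the column-major linear index, over the ny-by-nx grid, of
  % the node at the lower-left corner of the cell containing each data
  % point; the unknowns are ordered with y varying fastest.)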

  % Do we have a mask to apply?
  if params.maskflag
    % if we do, then we need to ensure that every
    % cell with at least one data point also has at
    % least all of its corners unmasked.
    params.mask(ind) = 1;
    params.mask(ind+1) = 1;
    params.mask(ind+ny) = 1;
    params.mask(ind+ny+1) = 1;
  end

  % interpolation equations for each point
  tx = min(1,max(0,(x - xnodes(indx))./dx(indx)));
  ty = min(1,max(0,(y - ynodes(indy))./dy(indy)));
  % Future enhancement: add cubic interpolant
  switch params.interp
    case 'triangle'
      % linear interpolation inside each triangle
      k = (tx > ty);
      L = ones(n,1);
      L(k) = ny;

      t1 = min(tx,ty);
      t2 = max(tx,ty);
      A = sparse(repmat((1:n)',1,3),[ind,ind+ny+1,ind+L], ...
        [1-t2,t1,t2-t1],n,ngrid);

    case 'nearest'
      % nearest neighbor interpolation in a cell
      k = round(1-ty) + round(1-tx)*ny;
      A = sparse((1:n)',ind+k,ones(n,1),n,ngrid);

    case 'bilinear'
      % bilinear interpolation in a cell
      A = sparse(repmat((1:n)',1,4),[ind,ind+1,ind+ny,ind+ny+1], ...
        [(1-tx).*(1-ty), (1-tx).*ty, tx.*(1-ty), tx.*ty], ...
        n,ngrid);

  end
  rhs = z;

  % Build regularizer. Add del^4 regularizer one day.
  switch params.regularizer
    case 'springs'
      % zero "rest length" springs
      [i,j] = meshgrid(1:nx,1:(ny-1));
      ind = j(:) + ny*(i(:)-1);
      m = nx*(ny-1);
      stiffness = 1./(dy/params.yscale);
      Areg = sparse(repmat((1:m)',1,2),[ind,ind+1], ...
        stiffness(j(:))*[-1 1],m,ngrid);

      [i,j] = meshgrid(1:(nx-1),1:ny);
      ind = j(:) + ny*(i(:)-1);
      m = (nx-1)*ny;
      stiffness = 1./(dx/params.xscale);
      Areg = [Areg;sparse(repmat((1:m)',1,2),[ind,ind+ny], ...
        stiffness(i(:))*[-1 1],m,ngrid)];

      [i,j] = meshgrid(1:(nx-1),1:(ny-1));
      ind = j(:) + ny*(i(:)-1);
      m = (nx-1)*(ny-1);
      stiffness = 1./sqrt((dx(i(:))/params.xscale).^2 + ...
        (dy(j(:))/params.yscale).^2);

      Areg = [Areg;sparse(repmat((1:m)',1,2),[ind,ind+ny+1], ...
        stiffness*[-1 1],m,ngrid)];

      Areg = [Areg;sparse(repmat((1:m)',1,2),[ind+1,ind+ny], ...
        stiffness*[-1 1],m,ngrid)];

    case {'diffusion' 'laplacian'}
      % thermal diffusion using Laplacian (del^2)
      % (each of the two sparse blocks below is, up to sign, the standard
      % three-point finite difference approximation of a second derivative
      % on unequally spaced nodes; at interior nodes the two contributions
      % sum to del^2)
      [i,j] = meshgrid(1:nx,2:(ny-1));
      ind = j(:) + ny*(i(:)-1);
      dy1 = dy(j(:)-1)/params.yscale;
      dy2 = dy(j(:))/params.yscale;

      Areg = sparse(repmat(ind,1,3),[ind-1,ind,ind+1], ...
        [-2./(dy1.*(dy1+dy2)), 2./(dy1.*dy2), ...
        -2./(dy2.*(dy1+dy2))],ngrid,ngrid);

      [i,j] = meshgrid(2:(nx-1),1:ny);
      ind = j(:) + ny*(i(:)-1);
      dx1 = dx(i(:)-1)/params.xscale;
      dx2 = dx(i(:))/params.xscale;

      Areg = Areg + sparse(repmat(ind,1,3),[ind-ny,ind,ind+ny], ...
        [-2./(dx1.*(dx1+dx2)), 2./(dx1.*dx2), ...
        -2./(dx2.*(dx1+dx2))],ngrid,ngrid);

    case 'gradient'
      % Subtly different from the Laplacian. A point for future
      % enhancement is to do it better for the triangle interpolation
      % case.
      [i,j] = meshgrid(1:nx,2:(ny-1));
      ind = j(:) + ny*(i(:)-1);
      dy1 = dy(j(:)-1)/params.yscale;
      dy2 = dy(j(:))/params.yscale;

      Areg = sparse(repmat(ind,1,3),[ind-1,ind,ind+1], ...
        [-2./(dy1.*(dy1+dy2)), 2./(dy1.*dy2), ...
        -2./(dy2.*(dy1+dy2))],ngrid,ngrid);

      [i,j] = meshgrid(2:(nx-1),1:ny);
      ind = j(:) + ny*(i(:)-1);
      dx1 = dx(i(:)-1)/params.xscale;
      dx2 = dx(i(:))/params.xscale;

      Areg = [Areg;sparse(repmat(ind,1,3),[ind-ny,ind,ind+ny], ...
        [-2./(dx1.*(dx1+dx2)), 2./(dx1.*dx2), ...
        -2./(dx2.*(dx1+dx2))],ngrid,ngrid)];

  end
  nreg = size(Areg,1);

  % Append the regularizer to the interpolation equations,
  % scaling the problem first. Use the 1-norm for speed.
  NA = norm(A,1);
  NR = norm(Areg,1);
  A = [A;Areg*(params.smoothness*NA/NR)];
  rhs = [rhs;zeros(nreg,1)];
  % do we have a mask to apply?
  if params.maskflag
    unmasked = find(params.mask);
  end
  % solve the full system, with regularizer attached
  switch params.solver
    case {'\' 'backslash'}
      if params.maskflag
        % there is a mask to use
        % permute for minimum fill in for R (in the QR)
        p = colamd(A(:,unmasked));
        zgrid=nan(ny,nx);
        zgrid(unmasked(p)) = A(:,unmasked(p))\rhs;
      else
        % permute for minimum fill in for R (in the QR)
        p = colamd(A);
        zgrid=zeros(ny,nx);
        zgrid(p) = A(:,p)\rhs;
      end

    case 'normal'
      % The normal equations, solved with \. Can be fast
      % for huge numbers of data points.
      if params.maskflag
        % there is a mask to use
        % Permute for minimum fill-in for \ (in chol)
        APA = A(:,unmasked)'*A(:,unmasked);
        p = symamd(APA);
        zgrid=nan(ny,nx);
        zgrid(unmasked(p)) = APA(p,p)\(A(:,unmasked(p))'*rhs);
      else
        % Permute for minimum fill-in for \ (in chol)
        APA = A'*A;
        p = symamd(APA);
        zgrid=zeros(ny,nx);
        zgrid(p) = APA(p,p)\(A(:,p)'*rhs);
      end

    case 'symmlq'
      % iterative solver - symmlq - requires a symmetric matrix,
      % so use it to solve the normal equations. No preconditioner.
      tol = abs(max(z)-min(z))*1.e-13;
      if params.maskflag
        % there is a mask to use
        zgrid=nan(ny,nx);
        [zgrid(unmasked),flag] = symmlq(A(:,unmasked)'*A(:,unmasked), ...
          A(:,unmasked)'*rhs,tol,params.maxiter);
      else
        [zgrid,flag] = symmlq(A'*A,A'*rhs,tol,params.maxiter);
        zgrid = reshape(zgrid,ny,nx);
      end
      % display a warning if convergence problems
      switch flag
        case 0
          % no problems with convergence
        case 1
          % SYMMLQ iterated MAXIT times but did not converge.
          warning(['Symmlq performed ',num2str(params.maxiter), ...
            ' iterations but did not converge.'])
        case 3
          % SYMMLQ stagnated, successive iterates were the same
          warning 'Symmlq stagnated without apparent convergence.'
        otherwise
          warning(['One of the scalar quantities calculated in',...
            ' symmlq was too small or too large to continue computing.'])
      end

    case 'lsqr'
      % iterative solver - lsqr. No preconditioner here.
      tol = abs(max(z)-min(z))*1.e-13;
      if params.maskflag
        % there is a mask to use
        zgrid=nan(ny,nx);
        [zgrid(unmasked),flag] = lsqr(A(:,unmasked),rhs,tol,params.maxiter);
      else
        [zgrid,flag] = lsqr(A,rhs,tol,params.maxiter);
        zgrid = reshape(zgrid,ny,nx);
      end

      % display a warning if convergence problems
      switch flag
        case 0
          % no problems with convergence
        case 1
          % lsqr iterated MAXIT times but did not converge.
          warning(['Lsqr performed ',num2str(params.maxiter), ...
            ' iterations but did not converge.'])
        case 3
          % lsqr stagnated, successive iterates were the same
          warning 'Lsqr stagnated without apparent convergence.'
        case 4
          warning(['One of the scalar quantities calculated in',...
            ' LSQR was too small or too large to continue computing.'])
      end

  end

end % if params.tilesize...

% only generate xgrid and ygrid if requested.
if nargout>1
  [xgrid,ygrid]=meshgrid(xnodes,ynodes);
end

% ============================================
% End of main function - gridfit
% ============================================

% ============================================
% subfunction - parse_pv_pairs
% ============================================
function params=parse_pv_pairs(params,pv_pairs)
% parse_pv_pairs: parses sets of property value pairs, allows defaults
% usage: params=parse_pv_pairs(default_params,pv_pairs)
%
% arguments: (input)
%  default_params - structure, with one field for every potential
%             property/value pair. Each field will contain the default
%             value for that property. If no default is supplied for a
%             given property, then that field must be empty.
%
%  pv_array - cell array of property/value pairs.
%             Case is ignored when comparing properties to the list
%             of field names. Also, any unambiguous shortening of a
%             field/property name is allowed.
%
% arguments: (output)
%  params - parameter struct that reflects any updated property/value
%             pairs in the pv_array.
%
% Example usage:
% First, set default values for the parameters. Assume we
% have four parameters that we wish to use optionally in
% the function examplefun.
%
%  - 'viscosity', which will have a default value of 1
%  - 'volume', which will default to 1
%  - 'pie' - which will have default value 3.141592653589793
%  - 'description' - a text field, left empty by default
%
% The first argument to examplefun is one which will always be
% supplied.
%
%   function examplefun(dummyarg1,varargin)
%   params.Viscosity = 1;
%   params.Volume = 1;
%   params.Pie = 3.141592653589793
%
%   params.Description = '';
%   params=parse_pv_pairs(params,varargin);
%   params
%
% Use examplefun, overriding the defaults for 'pie', 'viscosity'
% and 'description'. The 'volume' parameter is left at its default.
%
%   examplefun(rand(10),'vis',10,'pie',3,'Description','Hello world')
%
% params =
%     Viscosity: 10
%        Volume: 1
%           Pie: 3
%   Description: 'Hello world'
%
% Note that capitalization was ignored, and the property 'viscosity'
% was truncated as supplied. Also note that the order the pairs were
% supplied was arbitrary.

npv = length(pv_pairs);
n = npv/2;

if n~=floor(n)
  error 'Property/value pairs must come in PAIRS.'
end
if n<=0
  % just return the defaults
  return
end

if ~isstruct(params)
  error 'No structure for defaults was supplied'
end

% there was at least one pv pair. process any supplied
propnames = fieldnames(params);
lpropnames = lower(propnames);
for i=1:n
  p_i = lower(pv_pairs{2*i-1});
  v_i = pv_pairs{2*i};

  ind = strmatch(p_i,lpropnames,'exact');
  if isempty(ind)
    ind = find(strncmp(p_i,lpropnames,length(p_i)));
    if isempty(ind)
      error(['No matching property found for: ',pv_pairs{2*i-1}])
    elseif length(ind)>1
      error(['Ambiguous property name: ',pv_pairs{2*i-1}])
    end
  end
  p_i = propnames{ind};

  % override the corresponding default in params
  params = setfield(params,p_i,v_i); %#ok

end


% ============================================
% subfunction - check_params
% ============================================
function params = check_params(params)

% check the parameters for acceptability
% smoothness == 1 by default
if isempty(params.smoothness)
  params.smoothness = 1;
else
  if (length(params.smoothness)>1) || (params.smoothness<=0)
    error 'Smoothness must be scalar, real, finite, and positive.'
  end
end

% regularizer - must be one of 4 options - the second and
% third are actually synonyms.
valid = {'springs', 'diffusion', 'laplacian', 'gradient'};
if isempty(params.regularizer)
  params.regularizer = 'diffusion';
end
ind = find(strncmpi(params.regularizer,valid,length(params.regularizer)));
if (length(ind)==1)
  params.regularizer = valid{ind};
else
  error(['Invalid regularization method: ',params.regularizer])
end

% interp must be one of:
%   'bilinear', 'nearest', or 'triangle'
% but accept any shortening thereof.
valid = {'bilinear', 'nearest', 'triangle'};
if isempty(params.interp)
  params.interp = 'triangle';
end
ind = find(strncmpi(params.interp,valid,length(params.interp)));
if (length(ind)==1)
  params.interp = valid{ind};
else
  error(['Invalid interpolation method: ',params.interp])
end

% solver must be one of:
%   'backslash', '\', 'symmlq', 'lsqr', or 'normal'
% but accept any shortening thereof.
valid = {'backslash', '\', 'symmlq', 'lsqr', 'normal'};
if isempty(params.solver)
  params.solver = '\';
end
ind = find(strncmpi(params.solver,valid,length(params.solver)));
if (length(ind)==1)
  params.solver = valid{ind};
else
  error(['Invalid solver option: ',params.solver])
end

% extend must be one of:
%   'never', 'warning', 'always'
% but accept any shortening thereof.
valid = {'never', 'warning', 'always'};
if isempty(params.extend)
  params.extend = 'warning';
end
ind = find(strncmpi(params.extend,valid,length(params.extend)));
if (length(ind)==1)
  params.extend = valid{ind};
else
  error(['Invalid extend option: ',params.extend])
end

% tilesize == inf by default
if isempty(params.tilesize)
  params.tilesize = inf;
elseif (length(params.tilesize)>1) || (params.tilesize<3)
  error 'Tilesize must be scalar and no smaller than 3.'
end

% overlap == 0.20 by default
if isempty(params.overlap)
  params.overlap = 0.20;
elseif (length(params.overlap)>1) || (params.overlap<0) || (params.overlap>0.5)
  error 'Overlap must be scalar, with 0 <= overlap <= 0.5.'
end

% ============================================
% subfunction - tiled_gridfit
% ============================================
function zgrid=tiled_gridfit(x,y,z,xnodes,ynodes,params)
% tiled_gridfit: a tiled version of gridfit, continuous across tile boundaries
% usage: [zgrid,xgrid,ygrid]=tiled_gridfit(x,y,z,xnodes,ynodes,params)
%
% Tiled_gridfit is used when the total grid is far too large
% to model using a single call to gridfit. While gridfit may take
% only a second or so to build a 100x100 grid, a 2000x2000 grid
% will probably not run at all due to memory problems.
%
% Tiles in the grid with insufficient data (<4 points) will be
% filled with NaNs. Avoid use of too small tiles, especially
% if your data has holes in it that may encompass an entire tile.
%
% A mask may also be applied, in which case tiled_gridfit will
% subdivide the mask into tiles. Note that any boolean mask
% provided is assumed to be the size of the complete grid.
%
% Tiled_gridfit may not be fast on huge grids, but it should run
% as long as you use a reasonable tilesize. 8-)

% Note that we have already verified all parameters in check_params

% Matrix elements in a square tile
tilesize = params.tilesize;
% Size of overlap in terms of matrix elements. Overlaps
% of purely zero cause problems, so force at least two
% elements to overlap.
overlap = max(2,floor(tilesize*params.overlap));

% reset the tilesize for each particular tile to be inf, so
% we will never see a recursive call to tiled_gridfit
Tparams = params;
Tparams.tilesize = inf;

nx = length(xnodes);
ny = length(ynodes);
zgrid = zeros(ny,nx);

% linear ramp for the bilinear interpolation
rampfun = inline('(t-t(1))/(t(end)-t(1))','t');

% loop over each tile in the grid
h = waitbar(0,'Relax and have a cup of JAVA. Its my treat.');
warncount = 0;
xtind = 1:min(nx,tilesize);
while ~isempty(xtind) && (xtind(1)<=nx)

  xinterp = ones(1,length(xtind));
  if (xtind(1) ~= 1)
    xinterp(1:overlap) = rampfun(xnodes(xtind(1:overlap)));
  end
  if (xtind(end) ~= nx)
    xinterp((end-overlap+1):end) = 1-rampfun(xnodes(xtind((end-overlap+1):end)));
  end

  ytind = 1:min(ny,tilesize);
  while ~isempty(ytind) && (ytind(1)<=ny)
    % update the waitbar
    waitbar((xtind(end)-tilesize)/nx + tilesize*ytind(end)/ny/nx)

    yinterp = ones(length(ytind),1);
    if (ytind(1) ~= 1)
      yinterp(1:overlap) = rampfun(ynodes(ytind(1:overlap)));
    end
    if (ytind(end) ~= ny)
      yinterp((end-overlap+1):end) = 1-rampfun(ynodes(ytind((end-overlap+1):end)));
    end

    % was a mask supplied?
    if ~isempty(params.mask)
      submask = params.mask(ytind,xtind);
      Tparams.mask = submask;
    end

    % extract data that lies in this grid tile
    k = (x>=xnodes(xtind(1))) & (x<=xnodes(xtind(end))) & ...
        (y>=ynodes(ytind(1))) & (y<=ynodes(ytind(end)));
    k = find(k);

    if length(k)<4
      if warncount == 0
        warning 'A tile was too underpopulated to model. Filled with NaNs.'
      end
      warncount = warncount + 1;

      % fill this part of the grid with NaNs
      zgrid(ytind,xtind) = NaN;

    else
      % build this tile
      zgtile = gridfit(x(k),y(k),z(k),xnodes(xtind),ynodes(ytind),Tparams);

      % bilinear interpolation (using an outer product)
      interp_coef = yinterp*xinterp;

      % accumulate the tile into the complete grid
      zgrid(ytind,xtind) = zgrid(ytind,xtind) + zgtile.*interp_coef;

    end

    % step to the next tile in y
    if ytind(end)<ny
      ytind = ytind + tilesize - overlap;
      % are we within overlap elements of the edge of the grid?
      if (ytind(end)+max(3,overlap))>=ny
        % extend this tile to the edge
        ytind = ytind(1):ny;
      end
    else
      ytind = ny+1;
    end

  end % while loop over y

  % step to the next tile in x
  if xtind(end)<nx
    xtind = xtind + tilesize - overlap;
    % are we within overlap elements of the edge of the grid?
    if (xtind(end)+max(3,overlap))>=nx
      % extend this tile to the edge
      xtind = xtind(1):nx;
    end
  else
    xtind = nx+1;
  end

end % while loop over x

% close down the waitbar
close(h)

if warncount>0
  warning([num2str(warncount),' tiles were underpopulated & filled with NaNs'])
end