This is the command gmtregressgmt that can be run in the OnWorks free hosting provider using one of our multiple free online workstations such as Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator
PROGRAM:
NAME
gmtregress - Linear regression of 1-D data sets
SYNOPSIS
gmtregress [ table ] [ min/max/inc ] [ level ] [ x|y|o|r ] [ flags ] [ 1|2|r|w ] [ [r] ] [
min/max/inc | n ] [ [w][x][y][r] ] [ [level] ] [ -a<flags> ] [ -b<binary> ] [ -g<gaps> ] [
-h<headers> ] [ -i<flags> ] [ -o<flags> ]
Note: No space is allowed between the option flag and the associated arguments.
DESCRIPTION
gmtregress reads one or more data tables [or stdin] and determines the best linear
regression model y = a + b* x for each segment using the chosen parameters. The user may
specify which data and model components should be reported. By default, the model will be
evaluated at the input points, but alternatively you can specify an equidistant range over
which to evaluate the model, or turn off evaluation completely. Instead of determining
the best fit we can perform a scan of all possible regression lines (for a range of slope
angles) and examine how the chosen misfit measure varies with slope. This is particularly
useful when analyzing data with many outliers. Note: If you actually need to work with
log10 of x or y you can accomplish that transformation during read by using the -i option.
REQUIRED ARGUMENTS
None
OPTIONAL ARGUMENTS
table One or more ASCII (or binary, see -bi[ncols][type]) data table file(s) holding a
number of data columns. If no tables are given then we read from standard input.
The first two columns are expected to contain the required x and y data. Depending
on your -W and -E settings we may expect an additional 1-3 columns with error
estimates of one of both of the data coordinates, and even their correlation.
-Amin/max/inc
Instead of determining a best-fit regression we explore the full range of
regressions. Examine all possible regression lines with slope angles between min
and max, using steps of inc degrees [-90/+90/1]. For each slope the optimum
intercept is determined based on your regression type (-E) and misfit norm (-N)
settings. For each segment we report the four columns angle, E, slope, intercept,
for the range of specified angles. The best model parameters within this range are
written into the segment header and reported in verbose mode (-V).
-Clevel
Set the confidence level (in %) to use for the optional calculation of confidence
bands on the regression [95]. This is only used if -F includes the output column
c.
-Ex|y|o|r
Type of linear regression, i.e., select the type of misfit we should calculate.
Choose from x (regress x on y; i.e., the misfit is measured horizontally from data
point to regression line), y (regress y on x; i.e., the misfit is measured
vertically [Default]), o (orthogonal regression; i.e., the misfit is measured from
data point orthogonally to nearest point on the line), or r (Reduced Major Axis
regression; i.e., the misfit is the product of both vertical and horizontal
misfits) [y].
-Fflags
Append a combination of the columns you wish returned; the output order will match
the order specified. Choose from x (observed x), y (observed y), m (model
prediction), r (residual = data minus model), c (symmetrical confidence interval on
the regression; see -C for specifying the level), z (standardized residuals or
so-called z-scores) and w (outlier weights 0 or 1; for -Nw these are the Reweighted
Least Squares weights) [xymrczw]. As an alternative to evaluating the model, just
give -Fp and we instead write a single record with the model parameters npoints
xmean ymean angle misfit slope intercept sigma_slope sigma_intercept.
-N1|2|r|w
Selects the norm to use for the misfit calculation. Choose among 1 (L-1 measure;
the mean of the absolute residuals), 2 (Least-squares; the mean of the squared
residuals), r (LMS; The least median of the squared residuals), or w (RLS;
Reweighted Least Squares: the mean of the squared residuals after outliers
identified via LMS have been removed) [Default is 2]. Traditional regression uses
L-2 while L-1 and in particular LMS are more robust in how they handle outliers.
As alluded to, RLS implies an initial LMS regression which is then used to identify
outliers in the data, assign these a zero weight, and then redo the regression
using a L-2 norm.
-S[r] Restricts which records will be output. By default all data records will be output
in the format specified by -F. Use -S to exclude data points identified as
outliers by the regression. Alternatively, use -Sr to reverse this and only output
the outlier records.
-Tmin/max/inc | -Tn
Evaluate the best-fit regression model at the equidistant points implied by the
arguments. If -Tn is given instead we will reset min and max to the extreme
x-values for each segment and determine inc so that there are exactly n output
values for each segment. To skip the model evaluation entirely, simply provide
-T0.
-W[w][x][y][r]
Specifies weighted regression and which weights will be provided. Append x if
giving 1-sigma uncertainties in the x-observations, y if giving 1-sigma
uncertainties in y, and r if giving correlations between x and y observations, in
the order these columns appear in the input (after the two required and leading x,
y columns). Giving both x and y (and optionally r) implies an orthogonal
regression, otherwise giving x requires -Ex and y requires -Ey. We convert
uncertainties in x and y to regression weights via the relationship weight =
1/sigma. Use -Ww if the we should interpret the input columns to have precomputed
weights instead. Note: residuals with respect to the regression line will be
scaled by the given weights. Most norms will then square this weighted residual
(-N1 is the only exception).
-V[level] (more ...)
Select verbosity level [c].
-acol=name[...] (more ...)
Set aspatial column associations col=name.
-bi[ncols][t] (more ...)
Select native binary input.
-bo[ncols][type] (more ...)
Select native binary output. [Default is same as input].
-g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u] (more ...)
Determine data gaps and line breaks.
-h[i|o][n][+c][+d][+rremark][+rtitle] (more ...)
Skip or produce header record(s).
-icols[l][sscale][ooffset][,...] (more ...)
Select input columns (0 is first column).
-ocols[,...] (more ...)
Select output columns (0 is first column).
-^ or just -
Print a short message about the syntax of the command, then exits (NOTE: on Windows
use just -).
-+ or just +
Print an extensive usage (help) message, including the explanation of any
module-specific option (but not the GMT common options), then exits.
-? or no arguments
Print a complete usage (help) message, including the explanation of options, then
exits.
--version
Print GMT version and exit.
--show-datadir
Print full path to GMT share directory and exit.
ASCII FORMAT PRECISION
The ASCII output formats of numerical data are controlled by parameters in your gmt.conf
file. Longitude and latitude are formatted according to FORMAT_GEO_OUT, whereas other
values are formatted according to FORMAT_FLOAT_OUT. Be aware that the format in effect can
lead to loss of precision in the output, which can lead to various problems downstream. If
you find the output is not written with enough precision, consider switching to binary
output (-bo if available) or specify more decimals using the FORMAT_FLOAT_OUT setting.
EXAMPLES
To do a standard least-squares regression on the x-y data in points.txt and return x, y,
and model prediction with 99% confidence intervals, try
gmt regress points.txt -Fxymc -C99 > points_regressed.txt
To just get the slope for the above regression, try
slope=`gmt regress points.txt -Fp -o5`
To do a reweighted least-squares regression on the data rough.txt and return x, y, model
prediction and the RLS weights, try
gmt regress rough.txt -Fxymw > points_regressed.txt
To do an orthogonal least-squares regression on the data crazy.txt but first take the
logarithm of both x and y, then return x, y, model prediction and the normalized residuals
(z-scores), try
gmt regress crazy.txt -Eo -Fxymz -i0-1l > points_regressed.txt
To examine how the orthogonal LMS misfits vary with angle between 0 and 90 in steps of 0.2
degrees for the same file, try
gmt regress points.txt -A0/90/0.2 -Eo -Nr > points_analysis.txt
REFERENCES
Draper, N. R., and H. Smith, 1998, Applied regression analysis, 3rd ed., 736 pp., John
Wiley and Sons, New York.
Rousseeuw, P. J., and A. M. Leroy, 1987, Robust regression and outlier detection, 329 pp.,
John Wiley and Sons, New York.
York, D., N. M. Evensen, M. L. Martinez, and J. De Basebe Delgado, 2004, Unified equations
for the slope, intercept, and standard errors of the best straight line, Am. J. Phys.,
72(3), 367-375.
Use gmtregressgmt online using onworks.net services