public class SimpleRegression
extends java.lang.Object
implements java.io.Serializable
y = intercept + slope * x
Standard errors for intercept and slope are available as well as ANOVA, r-square and Pearson's r statistics.
Observations (x,y pairs) can be added to the model one at a time or they can be provided in a 2-dimensional array. The observations are not stored in memory, so there is no limit to the number of observations that can be added to the model.
Usage Notes:
When a model cannot be estimated, statistics return NaN. At least two observations with different x coordinates are required to estimate a bivariate regression model.
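The NaN rule in the usage note can be illustrated with a small, self-contained sketch. It does not call this class: BatchFitSketch and fit are hypothetical names, and the two-pass computation below keeps the whole array in memory, unlike SimpleRegression, which does not store observations.

```java
// Hypothetical helper, not part of the library: a two-pass least squares fit
// of y = intercept + slope * x over an array of (x, y) pairs.
final class BatchFitSketch {

    /** Returns {intercept, slope}, or {NaN, NaN} when no line can be estimated. */
    static double[] fit(double[][] data) {
        int n = data.length;
        if (n < 2) {
            // Fewer than two observations: nothing to estimate.
            return new double[] {Double.NaN, Double.NaN};
        }
        double xBar = 0;
        double yBar = 0;
        for (double[] p : data) {
            xBar += p[0];
            yBar += p[1];
        }
        xBar /= n;
        yBar /= n;

        double sxx = 0; // sum of squared deviations of x about its mean
        double sxy = 0; // sum of products of x and y mean deviations
        for (double[] p : data) {
            sxx += (p[0] - xBar) * (p[0] - xBar);
            sxy += (p[0] - xBar) * (p[1] - yBar);
        }
        if (sxx == 0) {
            // No variation in x: the slope is undefined.
            return new double[] {Double.NaN, Double.NaN};
        }
        double slope = sxy / sxx;
        return new double[] {yBar - slope * xBar, slope};
    }
}
```

For example, the pairs (1,3), (2,5), (3,7) lie on y = 1 + 2x, while two pairs sharing the same x value yield NaN for both estimates.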
Constructor and Description 

SimpleRegression()
Create an empty SimpleRegression instance

SimpleRegression(int degrees)
Create an empty SimpleRegression.

SimpleRegression(TDistribution t)
Deprecated. As of 2.2 (to be removed in 3.0). Please use the other constructor instead.
Modifier and Type  Method and Description 

void 
addData(double[][] data)
Adds the observations represented by the elements in data.

void 
addData(double x, double y)
Adds the observation (x,y) to the regression data set.

void 
clear()
Clears all data from the model.

double 
getIntercept()
Returns the intercept of the estimated regression line.

double 
getInterceptStdErr()
Returns the
standard error of the intercept estimate,
usually denoted s(b0).

double 
getMeanSquareError()
Returns the sum of squared errors divided by the degrees of freedom,
usually abbreviated MSE.

long 
getN()
Returns the number of observations that have been added to the model.

double 
getR()
Returns
Pearson's product moment correlation coefficient,
usually denoted r.

double 
getRegressionSumSquares()
Returns the sum of squared deviations of the predicted y values about
their mean (which equals the mean of y).

double 
getRSquare()
Returns the coefficient of determination, usually denoted r-square.

double 
getSignificance()
Returns the significance level of the slope (equiv) correlation.

double 
getSlope()
Returns the slope of the estimated regression line.

double 
getSlopeConfidenceInterval()
Returns the half-width of a 95% confidence interval for the slope estimate.

double 
getSlopeConfidenceInterval(double alpha)
Returns the half-width of a (100-100*alpha)% confidence interval for the slope estimate.

double 
getSlopeStdErr()
Returns the standard
error of the slope estimate,
usually denoted s(b1).

double 
getSumOfCrossProducts()
Returns the sum of crossproducts, x_{i}*y_{i}.

double 
getSumSquaredErrors()
Returns the
sum of squared errors (SSE) associated with the regression
model.

double 
getTotalSumSquares()
Returns the sum of squared deviations of the y values about their mean.

double 
getXSumSquares()
Returns the sum of squared deviations of the x values about their mean.

double 
predict(double x)
Returns the "predicted" y value associated with the supplied x value, based on the data that has been added to the model when this method is activated.

void 
removeData(double[][] data)
Removes observations represented by the elements in data.

void 
removeData(double x, double y)
Removes the observation (x,y) from the regression data set.

void 
setDistribution(TDistribution value)
Deprecated. As of 2.2 (to be removed in 3.0).

public SimpleRegression()
@Deprecated public SimpleRegression(TDistribution t)
Deprecated. As of 2.2 (to be removed in 3.0). Please use the other constructor instead.
t - the distribution used to compute inference statistics.
public SimpleRegression(int degrees)
degrees - Number of degrees of freedom of the distribution used to compute inference statistics.
public void addData(double x, double y)
Uses updating formulas for means and sums of squares defined in "Algorithms for Computing the Sample Variance: Analysis and Recommendations", Chan, T.F., Golub, G.H., and LeVeque, R.J. 1983, American Statistician, vol. 37, pp. 242-247, referenced in Weisberg, S. "Applied Linear Regression". 2nd Ed. 1985.
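As a rough illustration of how such updating formulas work, the sketch below maintains running means and centered sums without storing observations. RunningSums and its members are hypothetical names, and this is one common form of the update, not necessarily the exact code used by this class.

```java
// Hypothetical accumulator, not the library's implementation: one-pass
// updates of the means and of the centered sums SXX, SYY and SXY.
final class RunningSums {
    long n;       // number of observations seen so far
    double xBar;  // running mean of x
    double yBar;  // running mean of y
    double sumXX; // sum of squared deviations of x about its mean
    double sumYY; // sum of squared deviations of y about its mean
    double sumXY; // sum of products of x and y mean deviations

    void add(double x, double y) {
        if (n == 0) {
            // The first observation defines the means; the sums stay at zero.
            xBar = x;
            yBar = y;
        } else {
            double dx = x - xBar;
            double dy = y - yBar;
            double w = n / (n + 1.0);
            // Update the centered sums using the deviations from the old means.
            sumXX += dx * dx * w;
            sumYY += dy * dy * w;
            sumXY += dx * dy * w;
            // Then move the means toward the new observation.
            xBar += dx / (n + 1.0);
            yBar += dy / (n + 1.0);
        }
        n++;
    }

    /** Slope estimate, or NaN when fewer than two points or no variation in x. */
    double slope() {
        return (n < 2 || sumXX == 0) ? Double.NaN : sumXY / sumXX;
    }

    /** Intercept estimate; NaN whenever the slope is NaN. */
    double intercept() {
        return yBar - slope() * xBar;
    }
}
```

Because each update touches only six numbers, memory use stays constant regardless of how many observations are added.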
x - independent variable value
y - dependent variable value
public void removeData(double x, double y)
Mirrors the addData method. This method permits the use of SimpleRegression instances in streaming mode, where the regression is applied to a sliding "window" of observations; however, the caller is responsible for maintaining the set of observations in the window.
The method has no effect if there are no points of data (i.e. n=0).
x - independent variable value
y - dependent variable value
public void addData(double[][] data)
Adds the observations represented by the elements in data.
(data[0][0],data[0][1]) will be the first observation, then (data[1][0],data[1][1]), etc.
This method does not replace data that has already been added. The observations represented by data are added to the existing dataset.
To replace all data, use clear() before adding the new data.
data - array of observations to be added
public void removeData(double[][] data)
Removes observations represented by the elements in data.
If the array is larger than the current n, only the first n elements are processed. This method permits the use of SimpleRegression instances in streaming mode, where the regression is applied to a sliding "window" of observations; however, the caller is responsible for maintaining the set of observations in the window.
To remove all data, use clear().
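The sliding-window use described for removeData can be sketched as follows. The example is self-contained and does not call this class: SlidingWindowSketch, push and slope are hypothetical names, the caller-side window is a plain ArrayDeque, and the removal step simply reverses a running-sum update, which is an assumption about how a mirrored update could look rather than the library's code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window regression: the caller keeps the window of
// observations, while the accumulator only keeps running sums.
final class SlidingWindowSketch {
    private final int capacity;
    private final Deque<double[]> window = new ArrayDeque<>();
    private long n;
    private double xBar, yBar, sxx, sxy;

    SlidingWindowSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Adds (x, y), first evicting the oldest observation if the window is full. */
    void push(double x, double y) {
        if (window.size() == capacity) {
            double[] oldest = window.removeFirst();
            remove(oldest[0], oldest[1]);
        }
        window.addLast(new double[] {x, y});
        add(x, y);
    }

    private void add(double x, double y) {
        if (n > 0) {
            double dx = x - xBar;
            double dy = y - yBar;
            double w = n / (n + 1.0);
            sxx += dx * dx * w;
            sxy += dx * dy * w;
            xBar += dx / (n + 1.0);
            yBar += dy / (n + 1.0);
        } else {
            xBar = x;
            yBar = y;
        }
        n++;
    }

    private void remove(double x, double y) {
        if (n == 0) {
            return; // nothing to remove
        }
        if (n == 1) {
            // Removing the last point resets the accumulator.
            n = 0;
            xBar = yBar = sxx = sxy = 0;
            return;
        }
        double dx = x - xBar;
        double dy = y - yBar;
        double w = n / (n - 1.0);
        sxx -= dx * dx * w;
        sxy -= dx * dy * w;
        xBar -= dx / (n - 1.0);
        yBar -= dy / (n - 1.0);
        n--;
    }

    /** Slope over the current window, or NaN if it cannot be estimated. */
    double slope() {
        return (n < 2 || sxx == 0) ? Double.NaN : sxy / sxx;
    }
}
```

Repeated additions and removals accumulate rounding error in the running sums, so a long-running window may need to be rebuilt from the stored observations from time to time.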
data - array of observations to be removed
public void clear()
public long getN()
public double predict(double x)
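Returns the "predicted" y value associated with the supplied x value, based on the data that has been added to the model when this method is activated.
predict(x) = intercept + slope * x
Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.
x - input x value
Returns the predicted y value.
public double getIntercept()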
The least squares estimate of the intercept is computed using the normal equations. The intercept is sometimes denoted b0.
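Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.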
public double getSlope()
The least squares estimate of the slope is computed using the normal equations. The slope is sometimes denoted b1.
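Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.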
public double getSumSquaredErrors()
The sum is computed using the computational formula
SSE = SYY - (SXY * SXY / SXX)
where SYY is the sum of the squared deviations of the y values about their mean, SXX is similarly defined and SXY is the sum of the products of x and y mean deviations.
The sums are accumulated using the updating algorithm referenced in addData(double, double).
The return value is constrained to be non-negative; i.e., if due to rounding errors the computational formula returns a negative result, 0 is returned.
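Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.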
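The computational formula and its non-negativity constraint can be written in a few lines. SseSketch and sse are hypothetical names used only for this illustration; the inputs are the centered sums SYY, SXY and SXX defined above.

```java
// Hypothetical helper, not part of the library: SSE from the centered sums,
// clamped at zero so that rounding error cannot produce a negative result.
final class SseSketch {

    static double sse(double syy, double sxy, double sxx) {
        // SSE = SYY - (SXY * SXY / SXX), never below 0.
        return Math.max(0d, syy - sxy * sxy / sxx);
    }
}
```

For the pairs (1,3), (2,5), (3,7), SYY = 8, SXY = 4 and SXX = 2, so SSE = 8 - 16/2 = 0, as expected for points that lie exactly on a line.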
public double getTotalSumSquares()
This is defined as SSTO here.
If n < 2, this returns Double.NaN.
public double getXSumSquares()
If n < 2, this returns Double.NaN.
public double getSumOfCrossProducts()
public double getRegressionSumSquares()
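This is usually abbreviated SSR or SSM. It is defined as SSM here.
Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.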
public double getMeanSquareError()
If there are fewer than three data pairs in the model, or if there is no variation in x, this returns Double.NaN.
public double getR()
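Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.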
public double getRSquare()
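Preconditions:
At least two observations (with at least two different x values) must have been added before invoking this method. If this method is invoked before a model can be estimated, Double.NaN is returned.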
public double getInterceptStdErr()
If there are fewer than three observations in the model, or if there is no variation in x, this returns Double.NaN.
public double getSlopeStdErr()
If there are fewer than three data pairs in the model, or if there is no variation in x, this returns Double.NaN.
public double getSlopeConfidenceInterval() throws MathException
The 95% confidence interval is
(getSlope() - getSlopeConfidenceInterval(), getSlope() + getSlopeConfidenceInterval())
If there are fewer than three observations in the model, or if there is no variation in x, this returns Double.NaN.
Usage Note:
The validity of this statistic depends on the assumption that the observations included in the model are drawn from a Bivariate Normal Distribution.
MathException - if the confidence interval cannot be computed.
public double getSlopeConfidenceInterval(double alpha) throws MathException
The (100-100*alpha)% confidence interval is
(getSlope() - getSlopeConfidenceInterval(alpha), getSlope() + getSlopeConfidenceInterval(alpha))
To request, for example, a 99% confidence interval, use alpha = .01
Usage Note:
The validity of this statistic depends on the assumption that the observations included in the model are drawn from a Bivariate Normal Distribution.
Preconditions:
If there are fewer than three observations in the model, or if there is no variation in x, this returns Double.NaN.
(0 < alpha < 1); otherwise an IllegalArgumentException is thrown.
alpha - the desired significance level
MathException - if the confidence interval cannot be computed.
public double getSignificance() throws MathException
Specifically, the returned value is the smallest alpha such that the slope confidence interval with significance level equal to alpha does not include 0.
On regression output, this is often denoted Prob(t > 0)
Usage Note:
The validity of this statistic depends on the assumption that the observations included in the model are drawn from a Bivariate Normal Distribution.
If there are fewer than three observations in the model, or if there is no variation in x, this returns Double.NaN.
MathException - if the significance level cannot be computed.
@Deprecated public void setDistribution(TDistribution value)
Deprecated. As of 2.2 (to be removed in 3.0).
value - the new distribution
Copyright © 2010-2018 Adobe Systems Incorporated. All Rights Reserved