Linear Regression Analysis: NFL Team Current Value vs. Operating Income

Population vs. sample comparison with correlation and simple linear regression modeling

Difficulty 6/10 · ~45 min to solve · Published Jun 2026 · Donated by students
Linear regression · Correlation · Descriptive statistics · Confidence intervals · Hypothesis testing
The problem

PROJECT #2 — Regression Analysis

Using a dataset of NFL team valuations and operating income, complete a multi-part regression analysis project covering descriptive statistics, sampling, confidence intervals, correlation, and simple linear regression.

Part 1. Find descriptive statistics for the population (N=32) for Current Value and Operating Income, using appropriate graphs (histogram, stem-and-leaf, box plot, Q-Q plot) to assess normality.

Part 2. Select a random sample of 11 teams and repeat the descriptive analysis on the sample.

Part 3. Compare the population and sample findings, including differences in means, standard deviations, and medians.

Part 4. Develop 95% confidence interval estimates for the sample means of Current Value and Operating Income, and determine if they contain the population means.

Part 5. Find Pearson's correlation for both the population and sample, and explain the findings.

Part 6. Develop a simple linear regression equation for the sample (n=11) using SPSS and manual calculations, including: specifying the model, identifying variables, formulating the equation, drawing the regression line, creating a scatter plot, finding residuals, explaining coefficients, and forecasting Y for three X values.

Part 7. Write a conclusion highlighting the major regression findings.

Introduction

This assignment evaluates descriptive statistics, confidence interval estimates, correlation analysis, and simple linear regression using SPSS. The data is analyzed at two levels: the full population (N = 32 NFL teams) and a random sample (n = 11 teams). The two variables examined are Current Value (in millions of dollars) and Operating Income (in millions of dollars). Both variables are numerical with a scale level of measurement.

Part 1 — Descriptive Statistics of the Population (N = 32)

Descriptive Statistics Table

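| Variable | N | Sum | Mean | Std. Deviation | Median | Skewness | Kurtosis |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Current Value (in Millions $) | 32 | 62910 | 1965.94 | 629.418 | 1760 | 1.505 | 2.164 |
| Operating Income (in Millions $) | 32 | 2438.8 | 76.212 | 50.4451 | 61.550 | 2.326 | 6.676 |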

Discussion: The population (N = 32) shows a mean current value of $1965.94 million with a standard deviation of $629.418 million, and a median of $1760 million. Operating income has a mean of $76.212 million, standard deviation of $50.4451 million, and median of $61.550 million. Both variables show positive skewness (1.505 and 2.326 respectively), indicating right-skewed distributions with a long tail of high-value teams.

Histograms

Histograms of Current Value and Operating Income for the population (N=32) of NFL teams, both showing right-skewed distributions with most teams clustered at lower values and a tail extending toward higher values

Discussion: The population (N = 32) histograms for Current Value and Operating Income show that the scores do not have a normal distribution (no perfect bell-shaped curve) — the graphs are skewed to the right. This indicates that most teams have moderate valuations and incomes, with a few high-revenue franchises pulling the mean above the median.

Part 2 — Random Sample Selection and Descriptive Statistics (n = 11)

Simple Random Sample (n = 11)

| 2015 Rank | Team | Current Value (in Millions $) | Operating Income (in Millions $) |
| --- | --- | --- | --- |
| #28 | St Louis Rams | 1450 | 34.00 |
| #23 | New Orleans Saints | 1520 | 70.00 |
| #12 | Baltimore Ravens | 1930 | 59.80 |
| #26 | Tennessee Titans | 1490 | 50.50 |
| #19 | Carolina Panthers | 1560 | 77.80 |
| #22 | San Diego Chargers | 1530 | 64.80 |
| #14 | Indianapolis Colts | 1880 | 90.10 |
| #30 | Detroit Lions | 1440 | 36.10 |
| #21 | Kansas City Chiefs | 1530 | 48.60 |
| #15 | Seattle Seahawks | 1870 | 43.60 |
| #3 | Washington Redskins | 2850 | 124.90 |

Descriptive Statistics Table (Sample)

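| Variable | N | Sum | Mean | Std. Deviation | Median | Skewness | Kurtosis |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Current Value (in Millions $) | 11 | 19050 | 1731.82 | 413.469 | 1530 | 2.281 | 5.800 |
| Operating Income (in Millions $) | 11 | 700.2 | 63.655 | 26.7346 | 59.800 | 1.219 | 1.645 |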

Discussion: The sample (n = 11) shows a mean current value of $1731.82 million with a standard deviation of $413.469 million and a median of $1530 million. Operating income has a mean of $63.655 million, standard deviation of $26.7346 million, and median of $59.800 million. The standard deviations are notably smaller than the population values, reflecting reduced variability in this particular sample.
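The sample statistics above can be reproduced outside SPSS. The following is a minimal sketch using only the Python standard library; the helper names `sample_skewness` and `describe` are illustrative, and the skewness formula (the adjusted Fisher-Pearson coefficient) is an assumption about what SPSS reports rather than something stated in the write-up.

```python
import statistics

# Sample values copied from the sample table (millions of dollars).
current_value = [1450, 1520, 1930, 1490, 1560, 1530, 1880, 1440, 1530, 1870, 2850]
operating_income = [34.00, 70.00, 59.80, 50.50, 77.80, 64.80,
                    90.10, 36.10, 48.60, 43.60, 124.90]


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to match the SPSS output)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation, n - 1 denominator
    cubed = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed


def describe(values):
    """Collect the descriptive statistics reported in the table."""
    return {
        "n": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),
        "median": statistics.median(values),
        "skewness": sample_skewness(values),
    }


cv_stats = describe(current_value)
oi_stats = describe(operating_income)
for label, stats in (("Current Value", cv_stats), ("Operating Income", oi_stats)):
    print(label, {key: round(value, 3) for key, value in stats.items()})
```

Under these assumptions the printed values should agree with the table to the reported precision; kurtosis is omitted to keep the sketch short.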

Histograms (Sample)

Histograms of Current Value and Operating Income for the sample (n=11) of NFL teams, both showing approximately normal distributions with bell-shaped curves

Discussion: The sample (n = 11) histograms for Current Value and Operating Income look closer to a bell shape than the population histograms, but with only 11 observations a histogram is a weak guide to normality. The sample skewness values (2.281 for Current Value and 1.219 for Operating Income) show that the right skew persists, driven largely by the Washington Redskins, which have the highest value in the sample on both variables ($2850 million and $124.90 million).

Part 3 — Comparison of Population and Sample

Comparison Table

| Statistic | Current Value ($M) | Operating Income ($M) |
| --- | --- | --- |
| Population Mean | 1965.94 | 76.212 |
| Population Std. Deviation | 629.418 | 50.4451 |
| Population Median | 1760 | 61.550 |
| Sample Mean | 1731.82 | 63.655 |
| Sample Std. Deviation | 413.469 | 26.7346 |
| Sample Median | 1530 | 59.800 |

Discussion: The mean, standard deviation, and median of the population (N = 32) are observed to be higher than the corresponding sample (n = 11) statistics for both Current Value and Operating Income. The differences are substantial: the population mean for Current Value is $234.12 million higher than the sample mean, and the population standard deviation is $215.95 million higher. This reflects sampling variability — this particular random sample happened to exclude some of the highest-valued teams (e.g., Dallas Cowboys, New England Patriots) that pull the population mean upward.

Part 4 — Confidence Interval Estimates (95% Level)

Current Value ($M)

Sample mean: $\bar{x} = 1731.82$

Sample standard deviation: $s = 413.469$

Sample size: $n = 11$

Standard error: $SE = \frac{s}{\sqrt{n}} = \frac{413.469}{\sqrt{11}} = 124.665$

For a 95% confidence interval, using $z = 1.96$:

$CI = \bar{x} \pm z \cdot SE = 1731.82 \pm 1.96 \times 124.665 = 1731.82 \pm 244.34 = [1487.48, 1976.16]$

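Discussion: The 95% confidence interval for Current Value is [1487.48, 1976.16] million dollars. The population mean of $1965.94 million lies within this interval, though only about $10 million below the upper limit. Because n = 11 is small and the standard deviation is estimated from the sample, a t critical value (t ≈ 2.228 for 10 degrees of freedom) is more appropriate than z = 1.96; it widens the interval to approximately [1454.05, 2009.59], which also contains the population mean.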

Operating Income ($M)

Sample mean: $\bar{x} = 63.655$

Sample standard deviation: $s = 26.7346$

Sample size: $n = 11$

Standard error: $SE = \frac{s}{\sqrt{n}} = \frac{26.7346}{\sqrt{11}} = 8.061$

For a 95% confidence interval, using $z = 1.96$:

$CI = 63.655 \pm 1.96 \times 8.061 = 63.655 \pm 15.80 = [47.86, 79.45]$

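Discussion: The 95% confidence interval for Operating Income is [47.86, 79.45] million dollars. The population mean of $76.212 million lies within this interval, so the sample estimate is consistent with the population parameter. With a t critical value (t ≈ 2.228 for 10 degrees of freedom) in place of z = 1.96, the interval widens to approximately [45.69, 81.62] and still contains the population mean.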
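The interval arithmetic for both variables can be checked with a short script. This is a sketch under stated assumptions: the function name `confidence_interval` is illustrative, the inputs are the rounded sample statistics reported above, and the t critical value 2.228 (two-sided 95%, 10 degrees of freedom) is taken from a standard t table rather than from the source, which uses z = 1.96.

```python
import math

Z_95 = 1.96        # normal critical value used in the write-up
T_95_DF10 = 2.228  # assumption: t-table value for df = 10, not from the source
N_SAMPLE = 11


def confidence_interval(mean, std_dev, n, critical):
    """Return (lower, upper) for mean +/- critical * std_dev / sqrt(n)."""
    margin = critical * std_dev / math.sqrt(n)
    return mean - margin, mean + margin


# Rounded sample statistics and population means from the tables above ($M).
summary = {
    "Current Value": {"mean": 1731.82, "std_dev": 413.469, "population_mean": 1965.94},
    "Operating Income": {"mean": 63.655, "std_dev": 26.7346, "population_mean": 76.212},
}

results = {}
for name, stats in summary.items():
    z_ci = confidence_interval(stats["mean"], stats["std_dev"], N_SAMPLE, Z_95)
    t_ci = confidence_interval(stats["mean"], stats["std_dev"], N_SAMPLE, T_95_DF10)
    mu = stats["population_mean"]
    results[name] = {
        "z": z_ci,
        "t": t_ci,
        "z_covers": z_ci[0] <= mu <= z_ci[1],
        "t_covers": t_ci[0] <= mu <= t_ci[1],
    }
    print(name, "z:", [round(v, 2) for v in z_ci], "t:", [round(v, 2) for v in t_ci])
```

Both critical values lead to the same coverage conclusion here; the t-based interval is simply wider, which is the more cautious choice for a sample of 11.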

Part 5 — Correlation Analysis

Population (N = 32)

Pearson Correlation: $r = 0.918$

Significance: $p < 0.001$ (two-tailed)

Discussion: A very strong and statistically significant positive correlation ($r = 0.918$) is observed between Current Value and Operating Income across the population of N = 32 teams. The p-value is less than 0.001, indicating that this relationship is highly unlikely to have occurred by chance. Teams with higher operating incomes tend to have substantially higher current valuations.

Sample (n = 11)

Pearson Correlation: $r = 0.791$

Significance: $p = 0.004$ (two-tailed)

Discussion: A strong and statistically significant positive correlation ($r = 0.791$) is observed between Current Value and Operating Income in the sample of n = 11 teams. The p-value of 0.004 is well below the conventional 0.05 threshold, confirming that the relationship remains significant even in the smaller sample. The sample correlation is somewhat lower than the population correlation (0.791 vs. 0.918), which is typical of sampling variability.
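The sample correlation can be recomputed by hand from the 11 pairs in the sample table. The sketch below uses the textbook Pearson formula and the usual t statistic for testing a zero correlation with n - 2 degrees of freedom; the function name `pearson_r` is illustrative, and no p-value is computed because that would need a t distribution that the standard library does not provide.

```python
import math

# Sample pairs from the sample table (millions of dollars).
operating_income = [34.00, 70.00, 59.80, 50.50, 77.80, 64.80,
                    90.10, 36.10, 48.60, 43.60, 124.90]
current_value = [1450, 1520, 1930, 1490, 1560, 1530, 1880, 1440, 1530, 1870, 2850]


def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


n = len(operating_income)
r = pearson_r(operating_income, current_value)

# t statistic for H0: rho = 0, with n - 2 = 9 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r = {r:.3f}, t = {t_stat:.3f}, df = {n - 2}")
```

In simple linear regression this t statistic equals the t statistic of the slope, so it offers a quick cross-check against the regression coefficient table.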

Part 6 — Simple Linear Regression Analysis (Sample, n = 11)

Model Specification and Features

Model Summary:

| R | R² | Adjusted R² | Std. Error of the Estimate |
| --- | --- | --- | --- |
| 0.791 | 0.626 | 0.585 | 266.435 |

Coefficients:

| Term | B | Std. Error | t | Sig. |
| --- | --- | --- | --- | --- |
| (Constant) | 952.734 | 216.094 | 4.409 | 0.002 |
| Operating Income | 12.239 | 3.151 | 3.884 | 0.004 |

Features: The coefficient of determination ($R^2$) is 0.626, meaning that 62.6% of the variability in Current Value is explained by Operating Income in this sample. Both the constant (intercept) and the slope are statistically significant (p = 0.002 and p = 0.004 respectively), so the linear relationship is unlikely to be due to chance alone, although estimates based on n = 11 observations remain imprecise.

Variables

Independent variable (X): Operating Income (in millions of dollars)

Dependent variable (Y): Current Value (in millions of dollars)

Regression Equation

$Y = 952.734 + 12.239 \times \text{Operating Income}$

where $Y$ is the predicted Current Value and Operating Income is measured in millions of dollars.
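For the manual calculation, the coefficients follow from the least-squares formulas: slope = Sxy / Sxx and intercept = mean(Y) - slope × mean(X). The sketch below applies them to the sample data; the function name `fit_simple_ols` is illustrative and is not part of the original SPSS workflow.

```python
# Sample pairs from the sample table (millions of dollars).
operating_income = [34.00, 70.00, 59.80, 50.50, 77.80, 64.80,
                    90.10, 36.10, 48.60, 43.60, 124.90]
current_value = [1450, 1520, 1930, 1490, 1560, 1530, 1880, 1440, 1530, 1870, 2850]


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns R^2 as well."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # explained share of variance
    return intercept, slope, r_squared


intercept, slope, r_squared = fit_simple_ols(operating_income, current_value)
n = len(operating_income)
# Adjusted R^2 for a single predictor: 1 - (1 - R^2)(n - 1)/(n - 2).
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(f"Y = {intercept:.3f} + {slope:.3f} * X")
print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted_r_squared:.3f}")
```

The results should match the SPSS coefficient and model summary tables to the reported precision.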

Regression Line and Scatter Plot

Scatter plot of Current Value vs. Operating Income for the sample (n=11) with fitted regression line Y = 952.734 + 12.239X, showing positive linear relationship with one high-leverage point (Washington Redskins) in the upper right

Discussion: The scatter plot shows a clear positive linear relationship between Operating Income and Current Value. The regression line fits the data reasonably well, with most points clustered around the line. The Washington Redskins (Operating Income = $124.9M, Current Value = $2850M) appears as a high-leverage point in the upper right.

Residuals

Residual plot showing the vertical distances between observed Current Value points and the fitted regression line for each of the 11 sample teams

Discussion: Residuals represent the vertical distances between the observed Current Value and the predicted value from the regression line. Positive residuals indicate teams valued higher than the model predicts; negative residuals indicate teams valued lower than predicted. The residual plot helps diagnose whether the linear model is appropriate and whether assumptions like constant variance are met.
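Since the residual plot itself is not reproduced here, the residuals can be tabulated directly. The following sketch uses the published (rounded) coefficients, so the residuals sum to a value close to zero rather than exactly zero; the variable names are illustrative.

```python
import math

INTERCEPT = 952.734  # from the coefficients table
SLOPE = 12.239       # from the coefficients table

# (team, operating income, current value), all in millions of dollars.
sample = [
    ("St Louis Rams", 34.00, 1450),
    ("New Orleans Saints", 70.00, 1520),
    ("Baltimore Ravens", 59.80, 1930),
    ("Tennessee Titans", 50.50, 1490),
    ("Carolina Panthers", 77.80, 1560),
    ("San Diego Chargers", 64.80, 1530),
    ("Indianapolis Colts", 90.10, 1880),
    ("Detroit Lions", 36.10, 1440),
    ("Kansas City Chiefs", 48.60, 1530),
    ("Seattle Seahawks", 43.60, 1870),
    ("Washington Redskins", 124.90, 2850),
]

# Residual = observed value minus the value predicted by the regression line.
residuals = {
    team: value - (INTERCEPT + SLOPE * income)
    for team, income, value in sample
}

for team, residual in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{team:22s} {residual:9.2f}")

# Standard error of the estimate: sqrt(SSE / (n - 2)).
sse = sum(e ** 2 for e in residuals.values())
std_error_estimate = math.sqrt(sse / (len(sample) - 2))
print(f"Sum of residuals: {sum(residuals.values()):.3f}")
print(f"Std. error of the estimate: {std_error_estimate:.3f}")
```

On this data the largest positive residual belongs to the Seattle Seahawks and the largest negative residual to the Carolina Panthers, while the Washington Redskins sit far to the right but fairly close to the line's trend, which is what makes them a leverage point rather than a simple outlier.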

Interpretation of Coefficients

Intercept ($a_0 = 952.734$): When Operating Income is zero, the model predicts a Current Value of 952.734 million dollars. This represents the baseline value of an NFL franchise independent of operating income, though extrapolating to zero income is beyond the range of the data and may not be meaningful.

Slope ($b_1 = 12.239$): For each additional 1 million dollars in Operating Income, the model predicts an increase of 12.239 million dollars in Current Value. This is a large multiplier, though with n = 11 and a single predictor it describes an association in this sample rather than a demonstrated causal effect.

Forecasting Y for Three X Values

Using the regression equation $Y = 952.734 + 12.239 \times X$:

| Operating Income (X, $M) | Calculation | Predicted Current Value (Y, $M) |
| --- | --- | --- |
| 54 | 952.734 + 12.239(54) = 952.734 + 660.906 | 1613.64 |
| 67.70 | 952.734 + 12.239(67.70) = 952.734 + 828.580 | 1781.31 |
| 93 | 952.734 + 12.239(93) = 952.734 + 1138.227 | 2090.96 |

Forecasted values:

  • $Y_{54} = 1613.64$ million
  • $Y_{67.70} = 1781.31$ million
  • $Y_{93} = 2090.96$ million

All three X values fall within the range of the sample data (min = $34.00M, max = $124.90M), so these predictions are reasonable interpolations rather than risky extrapolations.
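The forecasts and the interpolation check can be expressed as a small helper. This is a sketch: the function name `predict_current_value` and the choice to return a flag instead of raising an error for out-of-range inputs are illustrative, not part of the original analysis.

```python
INTERCEPT = 952.734          # from the coefficients table
SLOPE = 12.239               # from the coefficients table
X_MIN, X_MAX = 34.00, 124.90  # observed range of Operating Income in the sample


def predict_current_value(operating_income):
    """Return (predicted value in $M, True if X lies inside the sample range)."""
    prediction = INTERCEPT + SLOPE * operating_income
    is_interpolation = X_MIN <= operating_income <= X_MAX
    return prediction, is_interpolation


forecasts = {x: predict_current_value(x) for x in (54, 67.70, 93)}
for x, (y_hat, inside) in forecasts.items():
    note = "interpolation" if inside else "extrapolation"
    print(f"X = {x:6.2f} -> Y = {y_hat:8.2f} ({note})")
```

Returning the range flag alongside the prediction keeps the caveat about extrapolation attached to every forecast, for example when trying an operating income of 150 or 0.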

Part 7 — Conclusion

In the descriptive comparison, the mean, standard deviation, and median values for both Current Value and Operating Income were higher in the population (N = 32) than in the sample (n = 11), reflecting sampling variability and the exclusion of some high-valued franchises from this particular sample.

Both 95% confidence intervals constructed from the sample statistics successfully contained the true population means, demonstrating that the sample (despite being smaller and having lower means) provided reliable estimates of the population parameters.

The correlation analysis revealed a very strong positive relationship between Current Value and Operating Income in the population ($r = 0.918$) and a strong relationship in the sample ($r = 0.791$). Both correlations were statistically significant, which is consistent with profitability being closely tied to franchise value, although correlation alone does not establish causation.

The simple linear regression model based on the sample ($Y = 952.734 + 12.239 \times \text{Operating Income}$) explained 62.6% of the variance in Current Value. The slope coefficient of 12.239 indicates that each additional million dollars in operating income predicts an increase of 12.239 million dollars in team valuation. Both the intercept and slope were statistically significant (p = 0.002 and p = 0.004 respectively), supporting the use of a linear model for this sample.

Forecasts for three operating income values ($54M, $67.70M, and $93M) produced current value predictions of $1613.64M, $1781.31M, and $2090.96M respectively, all within plausible ranges given the sample data. The regression model provides a useful tool for estimating NFL franchise valuations based on profitability, though the unexplained 37.4% of variance suggests other factors (market size, stadium revenue, brand strength) also play important roles.

Read with the expert

Difficulty

6/10

Multi-part project combining descriptive statistics, confidence intervals, correlation, and regression. The challenge is integrating multiple techniques coherently rather than mastering any single one.

Techniques used

  1. Descriptive statistics

    Characterize the distribution of Current Value and Operating Income for both population and sample before modeling.

  2. Correlation analysis

    Quantify the linear relationship strength between the two variables before building the regression model.

  3. Confidence intervals

    Estimate the range within which the true population means likely fall, using the sample statistics.

  4. Simple linear regression

    Model the relationship between operating income (predictor) and current value (response) to make forecasts and understand the slope.

Watch for

  • Confusing population (N=32) and sample (n=11) statistics. The confidence intervals are built from the sample but judged against the population parameters.

  • Forgetting to check the confidence interval coverage. The question explicitly asks whether the CIs contain the true population means — you must compare numerically.

  • Misinterpreting R² as correlation. R² = 0.626 means 62.6% of variance explained; the correlation is r = 0.791 (the square root, with sign matching the slope).

Expert's notes

Editor's analysis

What this solution demonstrates, and what to watch out for.

Stating the regression slope without context. '12.239' alone is meaningless; say 'a $1 million increase in operating income predicts a $12.239 million increase in team value.'
