• The project aimed to build a model to help Chris Paul, a professional basketball player, identify the statistics that most influence his offensive ability, particularly points scored per game.
  • Over 1,000 rows of game statistics were scraped from Chris Paul’s page on Basketball Reference to build a linear regression model.
  • The model was designed to provide interpretable and actionable insights that Chris Paul could incorporate into his playing strategy.
  • The data collection involved scraping per-game statistics for Paul’s entire career, totaling 1,155 regular-season games.
  • A pair plot was used to visualize the data and spot strongly collinear features; redundant features were then removed.
  • Three different regression models were built: Linear Regression, Polynomial Regression, and Lasso Regression. The Linear Regression model was chosen for its simplicity and interpretability.
  • The data consisted of Paul’s in-game stats (points, assists, steals, etc.) along with some categorical data such as home vs away, opponent, and date.
  • The Linear Regression model showed that steals and defensive rebounds were not statistically significant predictors of points scored: both had p-values greater than 0.6.
  • Field goal attempts and free-throw attempts were the strongest predictors of points scored, with coefficients of 1.11 and 0.8 respectively, followed by three-point attempts with a coefficient of 0.39.
  • Interestingly, offensive rebounds had the most negative association with points scored, with a coefficient of -0.914.
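
The coefficient and p-value readings summarized above can be illustrated with a minimal sketch of an ordinary least squares fit. Everything in it is an assumption made for illustration: the data is synthetic rather than Paul’s actual game log, the column names (FGA, FTA, STL) and the helper `fit_ols` are hypothetical, the generating coefficients only loosely echo the values reported above, and the p-values use a normal approximation to the t distribution (reasonable for roughly a thousand rows). The original project may well have used a different library or workflow.

```python
import math
import numpy as np


def fit_ols(X, y, names):
    """Fit OLS with an intercept; return {name: (coefficient, p_value)}."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
    resid = y - A @ beta
    dof = n - A.shape[1]                          # residual degrees of freedom
    sigma2 = (resid @ resid) / dof                # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance
    se = np.sqrt(np.diag(cov))                    # standard errors
    t_stats = beta / se
    # Two-sided p-value via the normal approximation (assumption: large n).
    p_vals = [math.erfc(abs(t) / math.sqrt(2)) for t in t_stats]
    labels = ["intercept"] + list(names)
    return {k: (float(b), float(p)) for k, b, p in zip(labels, beta, p_vals)}


# Synthetic stand-in for a per-game stat table (NOT real data).
rng = np.random.default_rng(0)
n = 1000
fga = rng.poisson(13, n).astype(float)   # field goal attempts
fta = rng.poisson(4, n).astype(float)    # free-throw attempts
stl = rng.poisson(2, n).astype(float)    # steals: no true effect on points
pts = 1.1 * fga + 0.8 * fta + rng.normal(0, 3, n)

result = fit_ols(np.column_stack([fga, fta, stl]), pts, ["FGA", "FTA", "STL"])
for name, (coef, p) in result.items():
    print(f"{name:>9}: coef={coef:+.3f}  p={p:.3f}")
```

In this setup the attempt columns should recover coefficients close to the ones used to generate the data, with p-values near zero, while a feature with no true effect (steals here) ends up with a coefficient near zero. Each coefficient is read as the change in predicted points for one additional unit of that statistic, holding the other features fixed.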

Linked to GitHub Repo