Alright, let’s dive into this classic 1991 draft picks thing. I was messing around with some old sports data, trying to see if I could predict player performance based on pre-draft info. You know, just a little side project to keep my coding skills sharp and relive some sports nostalgia.

First off, I grabbed the data. Found a CSV file online with all the 1991 NBA draft picks – name, college, position, all that jazz. It wasn’t the cleanest data, I tell ya. Missing values, weird formatting… typical stuff. So, I spent a good chunk of time cleaning it up. Used Python with Pandas, of course. Gotta love Pandas for wrangling data.
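The cleanup step looked roughly like this. A minimal sketch with Pandas, assuming hypothetical column names and a made-up messy sample standing in for the real CSV:

```python
import pandas as pd

# Hypothetical sample mimicking the messy draft CSV (names/columns assumed)
raw = pd.DataFrame({
    "Player": ["Larry Johnson ", "kenny anderson", None, "Dikembe Mutombo"],
    "College": ["UNLV", "Georgia Tech", "Syracuse", "Georgetown"],
    "Pick": ["1", "2", "3", "n/a"],
})

def clean_draft_data(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Drop rows that are missing the player name entirely
    df = df.dropna(subset=["Player"])
    # Strip stray whitespace and normalize capitalization
    df["Player"] = df["Player"].str.strip().str.title()
    # Coerce draft pick to a number; junk entries like "n/a" become NaN
    df["Pick"] = pd.to_numeric(df["Pick"], errors="coerce")
    return df

cleaned = clean_draft_data(raw)
```

Nothing fancy: `dropna`, string normalization, and `pd.to_numeric(errors="coerce")` cover most of the weird formatting.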
Then, I started thinking about what I wanted to predict. I figured career points scored would be a good measure of overall success. So, I needed to find that data too. Beautiful Soup is a lifesaver. Scraped the career stats for each player. That took a while; you gotta be respectful with the scraping so you don’t get blocked.
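The scraping boiled down to something like this. A hedged sketch: the `data-stat="pts"` markup and the user-agent string are assumptions, not the real site's structure, but the shape of it (fetch, throttle, parse) is the point:

```python
import time
from typing import Optional

import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "draft-class-side-project (personal research)"}

def parse_career_points(html: str) -> Optional[float]:
    """Pull total career points out of a stats table.
    Assumes a cell tagged data-stat="pts" holds the total --
    a real stats site's markup may well differ."""
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.find("td", attrs={"data-stat": "pts"})
    return float(cell.get_text()) if cell else None

def fetch_career_points(url: str, delay: float = 2.0) -> Optional[float]:
    """Fetch a player page and extract career points."""
    time.sleep(delay)  # be polite: throttle between requests
    resp = requests.get(url, headers=HEADERS, timeout=10)
    resp.raise_for_status()
    return parse_career_points(resp.text)
```

Keeping the parsing in its own function means you can test it against saved HTML without hitting the network every run.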
Next up: Feature Engineering. This is where the fun begins! I wanted to see if things like college performance (points per game, rebounds, etc.) had any correlation with career points. Also looked at draft position – you’d think a higher pick would translate to a better career, right? I calculated some basic stats, normalized the data, and created a few interaction terms (like draft position × college points). Just playing around to see what sticks.
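The feature engineering was along these lines. A minimal sketch, assuming hypothetical column names and made-up numbers:

```python
import pandas as pd

# Hypothetical per-player table (column names assumed, numbers made up)
df = pd.DataFrame({
    "pick": [1, 2, 3, 4],
    "college_ppg": [22.7, 25.9, 21.0, 15.2],
    "college_rpg": [10.5, 5.7, 7.6, 12.1],
})

# Min-max normalize the college stats so they share a 0-1 scale
for col in ["college_ppg", "college_rpg"]:
    lo, hi = df[col].min(), df[col].max()
    df[col + "_norm"] = (df[col] - lo) / (hi - lo)

# Invert pick so a higher value means an earlier (better) selection
df["pick_inv"] = 1.0 / df["pick"]

# Interaction term: draft position x college scoring
df["pick_x_ppg"] = df["pick_inv"] * df["college_ppg_norm"]
```

Inverting the pick number before multiplying keeps the interaction intuitive: early picks who also scored a lot in college get the biggest values.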
After that, it was modeling time. I kept it simple. Started with a Linear Regression model in scikit-learn. Split the data into training and testing sets. Trained the model on the training data and then evaluated it on the testing data. The R-squared value wasn’t great, but it was a start. Then I tried a few other models: Ridge Regression, Lasso Regression, and even a simple Random Forest.
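The modeling loop was basically this. Sketch only: synthetic data stands in for the real features and target, since those aren't reproduced here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Synthetic stand-in for the real features/target
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))  # e.g. pick, ppg, rpg, interaction
y = X @ np.array([3.0, 2.0, 0.5, 1.0]) + rng.normal(scale=0.5, size=200)

# Hold out a test set, fit each model, compare R-squared on the holdout
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
```

Same split for every model, so the R-squared numbers are directly comparable.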
The results? Well, it wasn’t exactly groundbreaking stuff. Draft position had some predictive power, as expected. College performance also played a role, but not as strong as I thought it would. Some players drafted low ended up having great careers, and vice versa. It just goes to show you, the draft is a crapshoot!
To recap, the pipeline was:

- Data Cleaning and Preparation using Pandas
- Web Scraping using Beautiful Soup and Requests
- Feature Engineering (college stats, draft position, interactions)
- Model Training and Evaluation using Scikit-learn (Linear Regression, Ridge, Lasso, Random Forest)
Finally, I visualized the results. Made some scatter plots showing predicted vs. actual career points for each model. Also created a bar chart showing the feature importance for the Random Forest model. It’s always good to see what the model thinks is important.
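The plots came out of matplotlib, something like the sketch below. The feature names and data are made up here; only the chart types match what I described:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (the real predictions aren't reproduced here)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] + X[:, 1] + rng.normal(scale=0.3, size=100)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
y_pred = forest.predict(X)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Predicted vs. actual scatter, with a y = x reference line
ax1.scatter(y, y_pred, alpha=0.6)
lims = [y.min(), y.max()]
ax1.plot(lims, lims, linestyle="--")
ax1.set_xlabel("Actual career points")
ax1.set_ylabel("Predicted career points")
ax1.set_title("Predicted vs. actual")

# Feature-importance bar chart from the trained forest
features = ["pick", "college_ppg", "college_rpg"]  # hypothetical names
ax2.bar(features, forest.feature_importances_)
ax2.set_title("Random Forest feature importance")

fig.tight_layout()
fig.savefig("draft_model_results.png")
```

The y = x reference line on the scatter makes it obvious at a glance where the model over- or under-shoots.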
What did I learn? Well, predicting NBA player success is hard! There are so many factors that go into it – injuries, coaching, team fit, luck… But it was a fun exercise in data analysis and machine learning. And it reminded me of some great players from that 1991 draft class. Maybe I’ll try a different draft class next time, or add some more advanced features. Who knows? It’s all about tinkering and learning.