Okay, so I dove into this Tsitsipas vs. Jarry prediction thing. Thought it’d be a fun little project. Here’s how it all went down.

First things first: Data, Data, Data! I started by scraping a bunch of tennis data. Match results, player stats, the whole shebang. I focused on Tsitsipas and Jarry, naturally, but also grabbed data on their opponents and playing conditions.
Next up, cleaning the mess. Let’s be honest, raw data is never clean. Missing values, inconsistencies… ugh. I spent a good chunk of time filling in the blanks, standardizing formats, and making sure everything was actually usable.
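For the curious, here is roughly what a cleanup pass like that can look like. This is a minimal sketch, not the actual code from this project: the field names (`date`, `surface`, `aces`), the sample rows, and the choice to fill blanks with the median are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw rows; field names and values are made up for illustration.
raw_matches = [
    {"date": "2023-05-14", "surface": "Clay ", "aces": "7"},
    {"date": "14/06/2023", "surface": "GRASS", "aces": ""},
    {"date": "2023-08-30", "surface": "hard", "aces": "12"},
]

def parse_date(text):
    """Try a couple of common date formats and return a date object."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")

def clean(rows):
    """Standardize dates and surface names, and fill missing ace counts."""
    known_aces = [int(r["aces"]) for r in rows if r["aces"].strip()]
    fallback = median(known_aces)  # assumption: fill blanks with the median
    cleaned = []
    for r in rows:
        aces = r["aces"].strip()
        cleaned.append({
            "date": parse_date(r["date"]),
            "surface": r["surface"].strip().lower(),
            "aces": int(aces) if aces else fallback,
        })
    return cleaned

matches = clean(raw_matches)
```

Median filling is just one option. Dropping incomplete rows or filling per player are equally reasonable, depending on how much data is missing.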
Feature Engineering Time. This is where it got interesting. I started calculating some potentially useful stats. Win percentages on different surfaces, head-to-head records, recent form (wins/losses in the last X matches), average number of aces, break point conversion rates… You name it, I tried it. I wanted to capture every possible advantage either player might have.
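A few of those stats are easy to sketch in plain Python. The match history below is invented, the players are placeholders ("A", "B", and so on), and the tuple layout is an assumption for illustration rather than the format of the real dataset.

```python
# Hypothetical match history for player "A", oldest first:
# (player, opponent, surface, won)
history = [
    ("A", "B", "clay", True),
    ("A", "C", "hard", False),
    ("A", "B", "clay", False),
    ("A", "D", "clay", True),
    ("A", "B", "hard", True),
]

def surface_win_pct(rows, surface):
    """Share of matches won on the given surface, or None if none played."""
    results = [won for _, _, s, won in rows if s == surface]
    return sum(results) / len(results) if results else None

def recent_form(rows, n):
    """Share of matches won over the last n matches."""
    recent = rows[-n:]
    return sum(won for *_, won in recent) / len(recent) if recent else None

def head_to_head(rows, opponent):
    """Return (wins, losses) against one opponent."""
    results = [won for _, opp, _, won in rows if opp == opponent]
    wins = sum(results)
    return wins, len(results) - wins

clay_pct = surface_win_pct(history, "clay")
form = recent_form(history, 3)
h2h = head_to_head(history, "B")
```

One thing to watch with features like these: compute them only from matches played before the one being predicted, otherwise the model gets a peek at the answer.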
Then I loaded all the features into a model. Started with a simple logistic regression to get a baseline. Figured, hey, it’s easy to understand and quick to train. It spit out a prediction, but honestly, it wasn’t that great. Accuracy was hovering around 60%, only about ten points better than the 50% you’d get from a coin flip. Back to the drawing board!
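A quick way to put an accuracy number in context is to compare it against a naive baseline, such as always predicting the more common outcome. The sketch below uses made-up labels purely for illustration; `accuracy` and `majority_baseline` are hypothetical helper names, not part of any library.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def majority_baseline(actual):
    """Accuracy of always predicting the most common label."""
    positives = sum(actual)
    return max(positives, len(actual) - positives) / len(actual)

# Made-up labels: 1 means the higher-ranked player won.
actual = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
predicted = [1, 0, 0, 1, 1, 1, 1, 1, 1, 0]

model_acc = accuracy(predicted, actual)
baseline_acc = majority_baseline(actual)
```

In this toy example the model scores 0.6 while the naive baseline scores 0.7, which is the kind of result that sends you back to the drawing board.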
Tried Other Models. Okay, so logistic regression wasn’t cutting it. I moved on to more complex stuff: Random Forests, Support Vector Machines, even dabbled with a basic neural network. Each model required some tuning, fiddling with hyperparameters to optimize performance.
Ran into overfitting at one point, where the model was performing great on the training data but terribly on the test data. Classic. Added regularization, simplified the model, and got things back on track.
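To show what regularization actually does, here is a tiny logistic regression with an optional L2 penalty, written from scratch in plain Python. It is a sketch under stated assumptions, not the model from this project: the data is invented, and `train_logreg`, `l2`, `lr`, and `epochs` are illustrative names and values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, l2=0.0, lr=0.1, epochs=500):
    """Batch gradient descent for logistic regression with an L2 penalty."""
    n = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # The l2 * wj term pulls each weight back toward zero on every step.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def norm(w):
    return math.sqrt(sum(wj * wj for wj in w))

# Made-up, perfectly separable data: two features, binary outcome.
X = [[0.2, 1.0], [0.4, 0.8], [0.6, 0.3], [0.9, 0.1]]
y = [0, 0, 1, 1]

w_plain, _ = train_logreg(X, y, l2=0.0)
w_reg, _ = train_logreg(X, y, l2=1.0)
```

Comparing `norm(w_plain)` with `norm(w_reg)` shows the penalized weights staying smaller, which is what keeps a model from leaning too hard on quirks of the training set.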
The Random Forest seemed to be giving me the best results overall. The accuracy on my test set crept up to around 70-72%. Still not perfect, but definitely an improvement.
- Tsitsipas’ Strengths: Solid baseline game, good serve.
- Jarry’s Strengths: Powerful serve, aggressive style.
The Final Prediction? After all that, my model gave Tsitsipas a 65% chance of winning. Jarry’s got a puncher’s chance with his serve, but Tsitsipas is generally more consistent.

Of course, tennis is unpredictable. Anything can happen on the day. Players have off days, injuries pop up, the crowd can be a factor… So take it with a grain of salt. But hey, that’s the fun of it, right?
I did learn a lot doing this, though. Data cleaning is a pain, feature engineering is where the magic happens, and model selection is crucial. And even with all that, you’re still just making an educated guess. Good times!