Thank you for visiting TenniStats. We are a blog that focuses on the analysis and visualization of tennis data. In this first post, we start by exploring our dataset and sharing some general insights to lay the groundwork for future discussion. Please subscribe to stay up-to-date on our progress!
We first examined data for the 2015 season consisting of every men’s singles match played at the ATP 250 level or higher. It contains data points on the tournament, players, and statistics of each match, including points won, aces, and double faults. We examined how court surface impacts these match statistics, as well as how height may give an advantage on different surfaces.
During the season, players move from hard courts at the Australian Open, to clay courts in the run-up to Roland Garros, to grass courts a week later for Wimbledon, before returning to hard courts for the U.S. Open. As the breakdown indicates, most matches are played on hard courts, followed by clay, and only 11% are played on grass. Because different players are better suited to different surfaces, this variety creates plenty of drama for fans.
Does playing surface make a difference in aces and double faults? Yes and no.
Because grass is a faster playing surface, it is typical to see shorter points, more serve-and-volley play, and more winners, including aces. Our data indicate that players hit 2.8 times as many aces as double faults on grass, but only 1.7 times as many on clay. This makes sense, as clay courts are slower and give the returner more time to reach the ball. However, the comparison also shows that players are just as likely to hit a double fault on clay as on any other surface. This is what we would expect: the surface affects the return of serve far more than it affects the service motion itself.
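For readers who want to try this on their own data, here is a minimal sketch of one way to compute an ace-to-double-fault ratio per surface. It assumes each match is a record with hypothetical `surface`, `aces`, and `double_faults` fields, and it divides total aces by total double faults on each surface; the field names and the sample rows are made up for illustration and are not taken from our dataset.

```python
from collections import defaultdict


def ace_to_double_fault_ratio(matches):
    """Return {surface: total aces / total double faults} for the given matches."""
    aces = defaultdict(int)
    double_faults = defaultdict(int)
    for match in matches:
        surface = match["surface"]
        aces[surface] += match["aces"]
        double_faults[surface] += match["double_faults"]
    # Skip surfaces with no double faults to avoid dividing by zero.
    return {s: aces[s] / double_faults[s] for s in aces if double_faults[s] > 0}


# Made-up rows for illustration only, not values from the real dataset.
sample_matches = [
    {"surface": "Grass", "aces": 14, "double_faults": 5},
    {"surface": "Grass", "aces": 14, "double_faults": 5},
    {"surface": "Clay", "aces": 9, "double_faults": 5},
    {"surface": "Clay", "aces": 8, "double_faults": 5},
]

ratios = ace_to_double_fault_ratio(sample_matches)
for surface, ratio in sorted(ratios.items()):
    print(f"{surface}: {ratio:.1f} aces per double fault")
```

Summing across matches before dividing keeps a single match with zero double faults from breaking the calculation; averaging per-match ratios is another reasonable choice and can give a different number.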
Does height make a difference in winning matches on different surfaces? Yes, on grass.
Our data indicate that on grass, being taller improves a player's odds of winning. This may be because taller players have an easier time hitting aces, winners, and volleys. The opposite seems true on clay, where the average winner is shorter than the average loser. A shorter player has a lower center of gravity and may be more mobile on clay, while a taller player's aces are less effective, so the shorter player may have an advantage.
Are the results significant?
Grass: The average height difference per match (winner minus loser) is 1.7 cm, with a 95% confidence interval of 0.5 to 2.9 cm, leading us to reject the null hypothesis that there is no difference. The test does not prove causation, nor does it quantify the size of any effect of height on winning; it tells us only that, in this data, height and winning are correlated on grass. Under the null hypothesis, there is a less than 5% chance that a difference this large would arise from random variation alone.
Clay: The average height difference per match is -0.59 cm, with a 95% confidence interval of -1.26 to 0.07 cm, which leaves us without enough evidence to reject the null hypothesis. However, the 90% confidence interval narrows to -1.15 to -0.04 cm, which would lead us to reject the null hypothesis of no height difference on clay at that weaker level. This is a very weak result, and it should be treated with caution. Additional analysis is needed to determine whether there is in fact a relationship between winner and loser height on clay.
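To make the intervals above more concrete, here is a rough sketch of how a confidence interval for the average winner-minus-loser height difference could be computed. It uses a normal approximation, which is an assumption on our part for illustration; with few matches, a t-distribution would be the safer choice. The function name and the sample heights are hypothetical and are not drawn from our dataset.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_diff_confidence_interval(winner_heights, loser_heights, confidence=0.95):
    """Return (mean difference, lower bound, upper bound) in the input units.

    Differences are winner minus loser, one per match. The interval uses a
    normal approximation (an assumption made for this sketch).
    """
    diffs = [w - l for w, l in zip(winner_heights, loser_heights)]
    avg = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))
    # Two-sided critical value, about 1.96 for 95% and 1.645 for 90%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return avg, avg - z * std_err, avg + z * std_err


# Made-up heights in cm for illustration only.
winners = [188, 185, 190, 180, 193, 183]
losers = [185, 188, 183, 185, 188, 180]

avg95, low95, high95 = mean_diff_confidence_interval(winners, losers, 0.95)
avg90, low90, high90 = mean_diff_confidence_interval(winners, losers, 0.90)
print(f"95% CI: {low95:.2f} to {high95:.2f} cm (mean {avg95:.2f})")
print(f"90% CI: {low90:.2f} to {high90:.2f} cm (mean {avg90:.2f})")
```

If an interval excludes zero, we reject the null hypothesis of no height difference at that confidence level. The 90% interval is always narrower than the 95% interval, which is why a result can clear the first bar and not the second.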