Cameras, Computation, and Big Data Offer New Insight in Sports

By Tim Chartier and Miles Abbett

For generations, fans have tracked statistics on their favorite sports, from baseball to cricket to soccer. While many college basketball teams still assign staff members to such record-keeping, cameras and computers in the NBA are now logging a previously unimaginable amount of data.

In the fall of 2013, the NBA signed a contract with STATS LLC to place SportVU cameras in every arena.  The cameras are generally perched in a stadium’s rafters, snapping 25 frames per second. The images feed into a computer that translates the pictures into a treasure trove of data.

A SportVU camera perched high in the United Center arena in Chicago, Illinois.

SportVU began in 2005 when Israeli missile defense specialist Miky Tamir decided to explore the use of optical tracking technology from defense systems to create sports stats. Before long, such computer vision methods were tracking soccer players, the ball, and referees in arenas. In 2008, STATS purchased the SportVU technology and was soon adapting it to basketball. Among the differences between the sports, the smaller space on a basketball court leads to more congestion, so multiple angles are needed to read jersey numbers and identify players. As a result, STATS uses six cameras to track basketball and three for soccer.

Big Data and New Insight

What exactly do the SportVU cameras produce? For each game, the NBA receives an XML file containing raw coordinate data for each player and referee, along with the UNIX time code and game clock. Coordinate data is also supplied for the ball, in this case in three dimensions. The data is supplied for every 1/25 of a second of the game, producing a 40–45 MB file per game. Teams receive the data for every game, with the exception of the referee data, which is analyzed only by the NBA.

Take a moment and consider the amount of data and possible insight in such a file.  One can easily track how far any given player (or the ball for that matter) traveled in the duration of a game.  For example, Nicolas Batum of the Portland Trail Blazers leads the league in distance traveled this season at a total of 139.6 miles, an average of 3.5 miles per game. Simple difference formulas approximate the velocities or acceleration of players, referees, or the ball.  Where were players when a shot was missed?  Who rebounded the ball and where were other players?
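As a rough illustration of the "simple difference formulas" mentioned above, here is a minimal Python sketch. It assumes a player's track is available as a list of (x, y) positions in feet, one per frame at 25 frames per second. The function names and the sample track are hypothetical, not part of the SportVU data format.

```python
import math

FRAME_DT = 1.0 / 25.0  # seconds between samples at 25 frames per second


def total_distance(points):
    """Sum the straight-line distances between consecutive (x, y) positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def speeds(points, dt=FRAME_DT):
    """Forward-difference speed estimate for each pair of consecutive frames."""
    return [math.dist(p, q) / dt for p, q in zip(points, points[1:])]


# Hypothetical track: positions in feet for four consecutive frames.
track = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8), (0.6, 0.8)]

print(total_distance(track))  # distance covered, in feet
print(speeds(track))          # speed between frames, in feet per second
```

Applying the same difference to the list of speeds would give a crude acceleration estimate. Real tracking data is noisy, so in practice some smoothing would likely be needed before differencing.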

To help answer these types of questions, STATS offers another set of data, produced by algorithms the company developed to identify events in a game: for example, who touches the ball, who shoots, who rebounds, and so forth. This information is entered into a database that is easily accessible to NBA teams.

It’s important to note that the automated play-by-play data available online from the NBA may record the same time for a shot as for the rebound. However, one to two seconds can elapse between these events. For a system as precise as SportVU, this is significant. The arrangement of players can look different the moment a shot is launched versus when it bounces off the rim. For teams, both arrangements are important and potentially insightful. STATS can use the NBA time feed and compare it with its raw XML data to determine exactly when the shot occurred and where every player was on the court at that moment.
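One simple piece of such a comparison is looking up the tracking frame closest to a given timestamp. The sketch below is only an assumption about how that lookup might be done, not a description of STATS's method. The function name, the millisecond timestamps, and the sample values are all hypothetical.

```python
from bisect import bisect_left


def nearest_frame_index(frame_times, event_time):
    """Return the index of the frame whose timestamp is closest to event_time.

    frame_times must be sorted in ascending order.
    """
    i = bisect_left(frame_times, event_time)
    if i == 0:
        return 0
    if i == len(frame_times):
        return len(frame_times) - 1
    before, after = frame_times[i - 1], frame_times[i]
    # Prefer the earlier frame when the event is exactly halfway between two frames.
    return i if (after - event_time) < (event_time - before) else i - 1


# Hypothetical timestamps in milliseconds: one frame every 40 ms (25 per second).
frame_times = [1000 + 40 * k for k in range(100)]
shot_time = 2330  # hypothetical event time, also in milliseconds

print(nearest_frame_index(frame_times, shot_time))
```

With the frame index in hand, the positions of all ten players at that instant can be read directly from the coordinate data.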

Jason Rosenfeld, Director of Basketball Analytics for the Charlotte Bobcats, stated, “The data provides an unprecedented level of detail and provides an opportunity for analyses that could only have been dreamt of previously. As such, it enables analysts to answer questions and learn things about the game that were previously virtually unanswerable.”

STATS is a 24/7 organization tracking sports data 365 days of the year around the globe.  As such, they have support staff available during every game.  The support is needed in part because human verification ensures accurate recording of the data.

Computationally Tracking Players

During the game, the computer searches for the numbers on each jersey. It then calculates the probability that a given jersey belongs to, for instance, Stephen Curry, who wears number 30 for the Golden State Warriors. If the probability falls below a threshold, a message pops up showing several cutouts of the jersey in question for a person to verify. Once the identity is verified, the player can be traced back and the coordinate data updated for the period during which the person was not fully identified.
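The thresholding step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the threshold value, the function names, and the example probabilities are hypothetical, since the actual cutoff and data structures used by SportVU are not given here.

```python
REVIEW_THRESHOLD = 0.90  # hypothetical cutoff; the real value is not stated


def needs_review(candidates, threshold=REVIEW_THRESHOLD):
    """Pick the most likely jersey number and flag it if confidence is low.

    candidates maps jersey number -> estimated probability.
    Returns (best_number, flag), where flag is True if a human should verify.
    """
    best = max(candidates, key=candidates.get)
    return best, candidates[best] < threshold


def backfill(track_labels, verified_number):
    """Fill unidentified (None) labels in one continuous track once verified."""
    return [verified_number if label is None else label for label in track_labels]


# Hypothetical probabilities for one detected jersey.
print(needs_review({30: 0.62, 38: 0.35, 80: 0.03}))
print(backfill([None, None, 30, 30], 30))
```

The backfill step mirrors the tracing back described above: once a person confirms the number, earlier frames of the same track inherit the identity.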

To aid in this verification, samples of jerseys are supplied. Keep in mind that arenas differ in lighting. For example, the Lakers play on a brightly lit court with very darkly lit stands. This is quite different from the lighting for the Clippers, despite the fact that both teams play in the Staples Center in Los Angeles.

Who’s entering the data? In Charlotte, a team of interns was trained, several of them undergraduates at my very own Davidson College. Jason Rosenfeld of the Bobcats noted, “The Davidson interns, who work every game for us, have done a great job ensuring the data gets collected correctly.”

Patrick Bolton, an optical analyst for STATS LLC, guides Ross Kruse, Davidson College Class of ’17, in operating the SportVU system.

Ross Kruse, a Davidson freshman who is a Bobcats intern, commented, “Working with SportVU … allows me to see real world applications of the skills I am learning in the classroom … it gives me the opportunity to work in the industry I hope to pursue. Very few freshmen get the chance to have an internship in their hopeful field much less in an industry as small as the sports analytics industry. Going to the games never feels like a chore but instead like an honor.” Kruse is pictured above working with Patrick Bolton, an optical analyst for STATS, on the SportVU system.

The goal isn’t necessarily data collection but data that gives insight. To this end, as the data is collected during the game, STATS creates reports for the coaching staffs, either in PDF form or for its iPad application. As a result, teams have detailed game data less than two minutes after anything happens on the court.

This new level of insight isn’t confined to NBA teams. Fans can see the fruits of this new wave of data analysis and gain new insights into the game. The NBA’s official website now displays a “Player Tracking Data” section, giving fans access to statistics that had never before been made publicly available. The information includes basics, such as distance traveled and number of touches, and more advanced metrics, such as opponent field goal percentage allowed at the rim and number of rebounds per game where an opponent is within 3.5 feet of the rebounder. With this cutting-edge data, fans gain a whole new perspective on the true tendencies, strengths, and weaknesses of their favorite players across the league.
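A metric like the rebound statistic above reduces to a distance check on the coordinate data. The following Python sketch assumes player positions are (x, y) pairs in feet at the moment of the rebound. The function name and sample coordinates are hypothetical, and the NBA's exact definition may differ in details.

```python
import math

CONTEST_RADIUS_FT = 3.5  # distance cited for this rebound metric


def is_contested(rebounder_xy, opponent_xys, radius=CONTEST_RADIUS_FT):
    """Return True if any opponent is within radius feet of the rebounder."""
    return any(math.dist(rebounder_xy, o) <= radius for o in opponent_xys)


# Hypothetical positions in feet at the moment of a rebound.
print(is_contested((10.0, 10.0), [(12.0, 11.0), (20.0, 20.0)]))
print(is_contested((10.0, 10.0), [(14.0, 10.0)]))
```

Counting the rebounds for which this check is true, and dividing by games played, would give a per-game figure of the kind described.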

Data and Algorithms Now and in the Future

For the SIAM community, a natural question may arise: “Can I see the data?” Due to the contracts, the data is simply not publicly or academically available. However, Ryan Warkins, Director of Basketball Products for STATS, noted that as the system becomes more popular for college sports, it may be available for those sports in a year or so.  Time will tell.

So, what lies ahead for STATS, and what hurdles must be crossed?  Warkins noted that the company continues to explore efficient algorithms that expand the scope of real-time data.  Some data, while available, can only be offered after the game.  The methods vary depending on the event of interest.

Trevor Booker (35) of the Washington Wizards sets a pick for his teammate Kirk Hinrich as part of a pick and roll offensive play. This picture was taken from http://en.wikipedia.org/wiki/File:Trevor_Booker,_Tony_Parker_and_Kirk_Hinrich.jpg.

Some events, like total distance traveled by a player, are easily tracked. Other events, like a pick and roll, are harder to identify. Take the pick and roll pictured above. Here, Trevor Booker is setting what is called a pick or screen for his teammate Kirk Hinrich. When Booker blocks the defender, Tony Parker, Hinrich is freed of Parker. Following this screen, Booker would “roll” by slipping behind Parker to accept a pass from Hinrich. Such a play is called a pick and roll, and identifying it requires machine learning techniques. As Warkins stated, “There is a different solution to each problem.” And, as the SIAM community knows, there is a large collection of methods to try, and always an interest in developing more when current techniques fail.

In time, you may be watching a sporting event when new stats pop up on the screen, detailing the game at a level never seen before. This could very well be the result of computers “seeing” the game through computational methods. And then, during and after the game, computational methods will analyze the data, searching for ways to improve performance. In the future, computational methods may offer a breakthrough in a sport, leading to a new level of performance and competition in the game.

Tim Chartier is an associate professor in the Department of Mathematics and Computer Science at Davidson College. To learn more about Tim’s work and how he teaches it to students, view a profile of his work by the LA Times.

Miles Abbett, a senior at Davidson College, is a Manager of Basketball Analytics for the Davidson College men’s basketball team.