Betting Spreadsheets 102: Breaking ESPN’s gameID Code

This Betting Spreadsheets post assumes a basic level of familiarity with Microsoft Excel. I would suggest opening Excel and trying to replicate what I explain below. If you run into issues, feel free to contact me.
I just started using ESPN to bring in box scores to my database. I never wanted to use them in the past because it required a lot of manual work to get each individual gameid. If you pay attention to website addresses, you can usually start to figure out patterns that you can predict what a given address will show. Teams may have unique team names or numbers a site uses and a specific format in the address such as team vs team and date, etc. ESPN has the same address for box scores except the number at the end is different for each game. It follows this scructure:
https://scores.espn.go.com/ncf/boxscore?gameId= [and then a number for that game].
To my knowledge, there is no pattern or way to predict with 100% certainty what the gameid for a specific game is. Until sportingnews.com revamped their website, I knew what the pattern was and could create the addresses in excel and cycle through each one as I needed it. With ESPN, I am still stumped at the pattern, but I came up with a solution to easily import a given week’s games.
I use Chrome as my browser. If I navigate to the scoreboard for a given week (for example https://scores.espn.go.com/college-football/scoreboard?confId=80&seasonYear=2014&seasonType=2&weekNumber=6), I can right click in the browser and select “View Page Source.”
Again, I’m not a website expert, but to me all of the code in here can be very helpful in determining how a website is working behind the scenes. However, looking at the page source for this address makes me nauseous. The gameids for the week are all in here which is all I’m concerned with. If you copy from row 1 all the way to the end and paste into excel (match destination formatting if prompted), you can find them. If you pasted into cell A1, sort by column A. Scroll down to row 43. Rows 43 to 99 in column A have the gameids for each game. You can write a simple function in Excel to grab only the gameid and combine it with the web address from before that does not change.
What can then be done is running a macro with a loop that imports each box score. If you just wanted one specific box score, you could use my previous post to help do that (https://dailysportsedge.com/17395/betting-spreadsheets-101-basics/). Even though I can’t completely automate my whole process, navigating to the scoreboard and doing a quick copy/paste/sort and grabbing the gameids I need is a relatively painless process to get all the box scores from the week to analyze as needed.
The same method could also be applied to play-by-plays, but for the ambitious, I’ll warn you that ESPN’s play-by-play will disappoint you as it’s loaded with errors. How the worldwide leader cannot have perfect data or at least very accurate data boggles my mind, but some may find a less accurate ESPN play-by-play is worth it compared to shelling out thousands of dollars to sites that have it done for you. Sportingnews.com used to have a pretty accurate play-by-play that could be easily scraped, but they revamped their website and no longer is a source I can use. This is not for the faint of heart and can be very tedious to setup, but once complete can provide some great detail that isn’t readily available without paying a pretty hefty price tag.
I used ESPN as an example, but you can apply the thought process to other sites and test them out to see if they work well for you. If you have problems replicating, comment below and I’ll try to help you troubleshoot.
Tags Betting Resources
0 thoughts on “Betting Spreadsheets 102: Breaking ESPN’s gameID Code”