Ranking teams using Discrete Time Markov Chains
Ranking methods are an essential tool in decision making, with applications ranging from sports to web search to recommender systems. One of the most popular ranking algorithms is Google's PageRank, which also relies on Markov chains. In this post we use Discrete Time Markov Chains (DTMCs) to rank the 32 NFL teams after the regular season.
Introduction
The National Football League (NFL) is a professional American football league consisting of 32 teams, divided equally between the National Football Conference (NFC) and the American Football Conference (AFC). Each conference consists of four four-team divisions. Each team plays 16 regular-season games; thus, a team does not play every other team during a single regular season. We will be using scores from the 2007 regular season, which can be downloaded from this link.
Modelling
Naturally, each team corresponds to a state in the Markov chain.
Therefore, our state space is the set of all 32 NFL teams, \(S = \{0, 1, 2, \dots, 31\}\), and \(X_n \in S\) denotes the state of the chain at step \(n\).
We introduce a new parameter \(F\), i.e., the football fans, where \(f\) is an individual fan.
Initially, we assume that football fans are equally distributed among all the teams, i.e., \(F_0 = F_1 = F_2 = \dots = F_{31}\).
Then,
\(P_{i,j} > 0\) for all pairs of teams \(i, j\) that have played each other, i.e., a transition exists between any two teams that have met;
\(P_{i,j} = 0\) for all pairs of teams \(i, j\) that have not played each other, i.e., no transition exists between teams that have not met.
Based on the scores (how exactly depends on the approach used), we assume that after each match a fan decides whether to move, i.e., switch allegiance from one team to another. Eventually, the team with the highest number of fans is ranked first.
Note that the movement of fans here is an arbitrary construct: it is equivalent to tracking the chances of winning the Vince Lombardi Trophy, \(t\), where after each match \(t\) moves from team \(i \longrightarrow j\) or \(j \longrightarrow i\). In other words, we are simply equating \(F\) (the number of fans) with the chances of winning the trophy.
With this construct our problem is a first-order Markov chain: the probability of a fan's next move depends only on the team they currently support,
\[P(X_{n+1} = j \mid X_n = i, X_{n-1}, \dots, X_0) = P(X_{n+1} = j \mid X_n = i).\]
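To make this concrete, here is a minimal sketch of the fan-movement idea for three hypothetical teams, with made-up switching probabilities (not taken from the NFL data):
import numpy as np

#Minimal sketch: three hypothetical teams A, B, C with made-up switching probabilities.
#Entry (i, j) is the probability that a fan of team i switches to team j after a game.
P = np.array([[0.0, 0.7, 0.3],   #fans of A move to B or C
              [0.6, 0.0, 0.4],   #fans of B move to A or C
              [0.2, 0.8, 0.0]])  #fans of C move to A or B

#Fans start out split equally; after each game the new split depends only on the
#current split (the first-order Markov property).
fans = np.ones(3) / 3
for _ in range(50):
    fans = fans @ P
print(fans) #the team holding the largest share of fans is ranked first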
Approach 1: Using both PtsW and PtsL
In the first approach we use the points scored by the winning team (PtsW) and by the losing team (PtsL) in each game.
import numpy as np
import pandas as pd
from numpy.linalg import matrix_power

data = pd.read_csv('data_2007.csv') #reading in the file
data = data.rename(index = str, columns = {"Winner/tie": "Winner", "Loser/tie": "Loser"})
data.Winner.isnull() #rows 256 to 267 are null, i.e. the playoff rows
data = data[0:256] #keep only the regular-season rows
list(data.columns) #names of all the columns
In this approach, at each iteration,
\[f_{i,k} \longrightarrow f_{j,k+1}\]
- A fan moves from the losing team to the winning team based on PtsW, i.e., the points scored by the winning team
- A fan moves from the winning team to the losing team based on PtsL, i.e., the points scored by the losing team
Note that in this case a fan does not move from a team to itself, i.e., \(f_{i,k} \not\to f_{i,k+1}\), as we are explicitly using scores and there is naturally no PtsW or PtsL data for a team against itself. Therefore, by extension, \(P_{i,i} = 0\) for all \(i\).
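As a small worked example (with made-up teams and scores): if team \(i\) beat team \(j\) 35 to 17, the unnormalized weight for a fan moving from the loser \(j\) to the winner \(i\) is 35 (PtsW), and the weight for moving from \(i\) to \(j\) is 17 (PtsL):
import numpy as np
import pandas as pd

#Hypothetical single game: 'Team i' beats 'Team j' 35-17 (made-up numbers).
teams = ['Team i', 'Team j']
M = pd.DataFrame(np.zeros((2, 2)), index=teams, columns=teams)
M.loc['Team j', 'Team i'] = 35 #loser -> winner, weighted by the winner's points (PtsW)
M.loc['Team i', 'Team j'] = 17 #winner -> loser, weighted by the loser's points (PtsL)
print(M)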
Now we will show that our Markov chain is irreducible and aperiodic. This lets us utilize an essential property of Markov chains: for an irreducible, aperiodic, positive recurrent DTMC there exist \(\pi_j > 0,\ j \in S\), such that
\[\lim_{n \to \infty} p^n_{i,j} = \pi_j, \quad \forall i,j \in S\]
where the \(\pi_j\) are the unique solution to
\[\pi_j = \sum_{i \in S} \pi_i p_{i,j}, \ \forall j \in S, \qquad \sum_{j \in S} \pi_j = 1.\]
Note that for a finite state space, an irreducible and aperiodic DTMC is positive recurrent.
Intuitively: if such a DTMC's transition matrix is raised to higher and higher powers, the transition probabilities converge, and every row approaches the same steady-state distribution.
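As a quick illustration of this property on a toy chain (not the NFL data), raising a small stochastic matrix to a high power makes every row converge to the same vector \(\pi\):
import numpy as np
from numpy.linalg import matrix_power

#Toy irreducible, aperiodic chain with made-up probabilities.
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])

limit = matrix_power(P, 100)
print(limit)                   #all three rows are (numerically) identical
pi = limit[0]
print(np.allclose(pi, pi @ P)) #True: pi satisfies pi = pi P
print(np.isclose(pi.sum(), 1)) #True: pi is a probability distribution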
Proving irreducibility.
Our Markov chain is irreducible simply because of how the league is scheduled. To elaborate, each NFL team plays 16 games each season:
- two against each team in its division (6)
- four against the teams of one division in the other conference (4)
- four against the teams of another division in its own conference (4)
- one each against teams from the two remaining divisions in its own conference (2)
This schedule links every division to every other division, so every team can reach every other team through a chain of games, as the sketch below confirms.
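Here is a small sketch that checks this on the data itself, using the data frame loaded above: build the bidirected "played each other" graph and verify that every team can reach every other team.
import numpy as np

#Build the bidirected "played each other" graph from the regular-season results.
teams_list = sorted(set(data['Winner']) | set(data['Loser']))
idx = {t: k for k, t in enumerate(teams_list)}
n = len(teams_list)

A = np.zeros((n, n))
for _, row in data.iterrows():
    A[idx[row['Winner']], idx[row['Loser']]] = 1
    A[idx[row['Loser']], idx[row['Winner']]] = 1

#(I + A)^(n-1) has a zero entry iff some team cannot reach some other team.
reach = np.linalg.matrix_power(np.eye(n) + A, n - 1)
print((reach > 0).all()) #expected: True, i.e. the chain is irreducible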
Proving aperiodicity.
For aperiodicity, first consider one division, where all 4 teams play each other twice. This gives 4 states connected to each other by bidirected edges. Any state that lies on both a cycle of length 2 and a cycle of length 3 has period dividing \(\gcd(2, 3) = 1\); with 3 or more fully connected, bidirected states both cycle lengths exist, so the division's states are aperiodic. Since our chain is irreducible and aperiodicity is a class property, the entire chain is aperiodic.
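The gcd argument can be checked numerically on the division's adjacency matrix (connections only, not probabilities): a positive diagonal entry of \(A^k\) means a state can return to itself in \(k\) steps.
import numpy as np

#One division: 4 teams, every pair connected in both directions (no self-loops).
A = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)

print(np.diag(np.linalg.matrix_power(A, 2))) #all > 0: a return in 2 steps is possible
print(np.diag(np.linalg.matrix_power(A, 3))) #all > 0: a return in 3 steps is possible
#Cycles of length 2 and 3 exist, so each state's period divides gcd(2, 3) = 1.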
Now that we have shown that our Markov chain is irreducible and aperiodic, we can use the above property, \(\pi = \pi P\), to obtain the steady-state vector \(\pi\), which can then be used to rank the teams.
Below is the code to generate a transition matrix from the downloaded CSV file.
#32x32 matrix of zeros (one row and column per team) to be filled with scores
df2 = pd.DataFrame(np.zeros((32, 32)), columns=['Arizona Cardinals', 'Atlanta Falcons', 'Baltimore Ravens',
'Buffalo Bills', 'Carolina Panthers', 'Chicago Bears',
'Cincinnati Bengals', 'Cleveland Browns', 'Dallas Cowboys',
'Denver Broncos', 'Detroit Lions', 'Green Bay Packers',
'Houston Texans', 'Indianapolis Colts', 'Jacksonville Jaguars',
'Kansas City Chiefs', 'Miami Dolphins', 'Minnesota Vikings',
'New England Patriots', 'New Orleans Saints', 'New York Giants',
'New York Jets', 'Oakland Raiders', 'Philadelphia Eagles',
'Pittsburgh Steelers', 'San Diego Chargers', 'San Francisco 49ers',
'Seattle Seahawks', 'St. Louis Rams', 'Tampa Bay Buccaneers',
'Tennessee Titans', 'Washington Redskins'])
my_dict = {'Arizona Cardinals' : 0, 'Atlanta Falcons' : 1, 'Baltimore Ravens' : 2,
'Buffalo Bills' : 3, 'Carolina Panthers' : 4, 'Chicago Bears' : 5,
'Cincinnati Bengals' : 6, 'Cleveland Browns' : 7, 'Dallas Cowboys' : 8,
'Denver Broncos' : 9, 'Detroit Lions' : 10, 'Green Bay Packers' : 11,
'Houston Texans' : 12, 'Indianapolis Colts' : 13, 'Jacksonville Jaguars' : 14,
'Kansas City Chiefs' : 15, 'Miami Dolphins' : 16, 'Minnesota Vikings' : 17,
'New England Patriots' : 18, 'New Orleans Saints' : 19, 'New York Giants' : 20,
'New York Jets' : 21, 'Oakland Raiders' : 22, 'Philadelphia Eagles' : 23,
'Pittsburgh Steelers' : 24, 'San Diego Chargers' : 25, 'San Francisco 49ers' : 26,
'Seattle Seahawks' : 27, 'St. Louis Rams' : 28, 'Tampa Bay Buccaneers' : 29,
'Tennessee Titans' : 30, 'Washington Redskins' : 31}
data2 = data[['Winner', 'Loser', 'PtsW', 'PtsL']]
############
#Function to assign Loser -> Winner scores (L -> W, PtsW)
def L_W(data, my_dict, winner, data2):
    for loser in data.loc[data.Winner == winner, 'Loser'].unique():
        i = my_dict[loser]  #row index of the losing team
        j = winner          #column label of the winning team
        #total points scored by the winner over all games between this pair
        l = data.loc[(data.Loser == loser) & (data.Winner == winner), 'PtsW'].sum()
        data2.loc[i, j] = l
    return data2
t1 = data[['Winner', 'Loser', 'PtsW']]
win_team = t1.Winner.unique()
for winner in win_team:
    L_W(t1, my_dict, winner, df2)
#Function to assign Winner -> Loser score (W -> L, PtsL)
def W_L(data, my_dict, winner, data2):
    for loser in data.loc[data.Winner == winner, 'Loser'].unique():
        i = my_dict[winner]  #row index of the winning team
        j = loser            #column label of the losing team
        #total points scored by the loser over all games between this pair
        l = data.loc[(data.Loser == loser) & (data.Winner == winner), 'PtsL'].sum()
        data2.loc[i, j] = l
    return data2
t2 = data[['Winner', 'Loser', 'PtsL']]
for winner in win_team:
    W_L(t2, my_dict, winner, df2)
This is what our transition matrix looks like:
Note that we need to normalize the matrix so that every row sums to 1, since a transition matrix must be stochastic.
#Sum by row
row_sum = df2.sum(axis = 1).to_dict()
transition_mat = df2.values
final_mat = transition_mat/transition_mat.sum(axis = 1)[:, None]
final_mat.sum(axis = 1) #All rows add up to 1
Now we raise the matrix to a high power to find its limiting distribution.
raised_mat = matrix_power(final_mat, 10000)
Here is what it looks like after converging:
Note that:
- In cases where two teams played each other twice, we have taken the sum of PtsW and PtsL over both games.
- The images only display a partial matrix.
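As a cross-check on the matrix-power computation, \(\pi\) can also be obtained directly as the left eigenvector of the transition matrix associated with eigenvalue 1, normalized to sum to 1 (a sketch reusing final_mat and raised_mat from above):
import numpy as np

#Left eigenvector of final_mat for eigenvalue 1, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(final_mat.T)
idx = np.argmin(np.abs(eigvals - 1)) #eigenvalue closest to 1
pi = np.real(eigvecs[:, idx])
pi = pi / pi.sum()
#pi should match any row of raised_mat up to numerical error.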
After sorting the steady-state vector \(\pi\), this is the ranking we get:
steady_state = dict(enumerate(raised_mat[1])) #any row of the converged matrix is the steady-state vector
#Sort dictionary by value
rank = sorted(steady_state, key = steady_state.get, reverse = True)
#Result
for r in rank:
    for name, state in my_dict.items():
        if r == state:
            print(name)
- New England Patriots
- Indianapolis Colts
- Green Bay Packers
- Dallas Cowboys
- San Diego Chargers
- New York Giants
- Chicago Bears
- Detroit Lions
- Jacksonville Jaguars
- Minnesota Vikings
- Houston Texans
- Philadelphia Eagles
- New Orleans Saints
- Cincinnati Bengals
- Cleveland Browns
- Pittsburgh Steelers
- Denver Broncos
- Arizona Cardinals
- Miami Dolphins
- Washington Redskins
- New York Jets
- Seattle Seahawks
- Buffalo Bills
- Tampa Bay Buccaneers
- Baltimore Ravens
- Tennessee Titans
- Oakland Raiders
- St. Louis Rams
- Atlanta Falcons
- Kansas City Chiefs
- San Francisco 49ers
- Carolina Panthers
Approach 2: Using absolute wins and losses
The construct defined in Approach 1 carries over, the only difference being that instead of PtsW and PtsL we consider a team's wins and losses.
At each iteration,
\[f_{i,k} \longrightarrow f_{j,k+1}\]
- A fan moves from the losing team to the winning team with probability \(p \in (0.5, 1)\)
- A fan moves from the winning team to the losing team with probability \(1-p\)
Note that the probability of a fan moving to the winning team should be higher than that of moving to the losing team; mathematically, \(p > 1 - p\), which is why \(p \in (0.5, 1)\).
Secondly, if two teams have equal wins and losses against each other, then the transitions between them are equally likely in either direction, i.e., \(P_{i,j} = P_{j,i} = 0.5\).
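For instance, for two hypothetical teams \(i\) and \(j\) with \(p = 0.8\): if \(i\) won both meetings, the entries are \(P_{j,i} = 0.8\) and \(P_{i,j} = 0.2\); if the teams split the series, \(P_{i,j} = P_{j,i} = 0.5\).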
The chain is irreducible and aperiodic due to the same reasons as mentioned in Approach 1.
In this case we choose an arbitrary value of \(p = 0.8\).
#Function to create a transition matrix
p = 0.8 #arbitrary value selected
q = 1 - p #i.e. 0.2
def get_matrix(data_original, team, data_state, team_dict):
    '''Takes the original dataframe, a team, an empty transition matrix and the team dictionary,
    and fills in the team's row of the transition matrix based on p, q and 0.5'''
    t1 = data_original.loc[data_original.Winner == team, 'Loser'].unique() #teams the supplied 'team' WON against
    t2 = data_original.loc[data_original.Loser == team, 'Winner'].unique() #teams the supplied 'team' LOST against
    t3 = np.intersect1d(t1, t2) #teams 'team' both won and lost against; probability 0.5 for these
    t1 = np.setdiff1d(t1, t3) #removing the intersection from WON
    t2 = np.setdiff1d(t2, t3) #removing the intersection from LOST
    l = team_dict[team] #l is the state (row) number of the supplied team
    for i in t1:
        data_state.loc[l, i] = q #fans leave 'team' for a team it beat with probability q = 1 - p
    for j in t2:
        data_state.loc[l, j] = p #fans move to a team that beat 'team' with probability p
    for k in t3:
        data_state.loc[l, k] = 0.5 #split season series: equal probability in both directions
    return data_state
#Reset the transition matrix, then fill it for every team under Approach 2
df2.loc[:, :] = 0
for team in my_dict:
    get_matrix(data, team, df2, my_dict)
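The remaining steps mirror Approach 1; here is a sketch, assuming the same normalization and ranking procedure is reused on the Approach 2 matrix:
from numpy.linalg import matrix_power

#Normalize the rows so the Approach 2 matrix is stochastic, then take its limit.
transition_mat2 = df2.values
final_mat2 = transition_mat2 / transition_mat2.sum(axis = 1)[:, None]
raised_mat2 = matrix_power(final_mat2, 10000)

#Rank the teams by their steady-state probabilities, largest first.
steady_state2 = dict(enumerate(raised_mat2[0]))
rank2 = sorted(steady_state2, key = steady_state2.get, reverse = True)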
Using \(\pi = \pi P\) as before and sorting the steady-state vector \(\pi\), this is the ranking we get:
- New England Patriots
- Dallas Cowboys
- Green Bay Packers
- San Diego Chargers
- Indianapolis Colts
- Washington Redskins
- Jacksonville Jaguars
- Philadelphia Eagles
- New York Giants
- Chicago Bears
- Minnesota Vikings
- Tennessee Titans
- Denver Broncos
- Houston Texans
- Detroit Lions
- Cleveland Browns
- Pittsburgh Steelers
- Buffalo Bills
- Tampa Bay Buccaneers
- New Orleans Saints
- Seattle Seahawks
- Carolina Panthers
- Arizona Cardinals
- Cincinnati Bengals
- Kansas City Chiefs
- Baltimore Ravens
- New York Jets
- Oakland Raiders
- Atlanta Falcons
- Miami Dolphins
- San Francisco 49ers
- St. Louis Rams
All the code for this project can be found here.
References
- NFL league scheduling: https://www.youtube.com/watch?v=KGKwTnaV-rg
- Data from https://www.pro-football-reference.com/years/2007/games.html
- Background reading:
- “Discrete-Time Markov Chains: Limiting Behavior.” Introduction to Modeling and Analysis of Stochastic Systems, by Vidyadhar G. Kulkarni, Springer, 2011.
- Vaziri, B. (n.d.). Markov-based ranking methods.
- Mattingly, R. (n.d.). A Markov Method for Ranking College Football Conferences.