Application of Markov Chains to Analyze and Predict the Mathematical Achievement Gap between African American and White American Students

Using data drawn from the National Assessment of Educational Progress (NAEP), much research [1-4] has reported the alarming underachievement in mathematics of African American youths, ages 9, 13 and 17, as compared to their White counterparts. Reform efforts ([5-8], from the National Council of Teachers of Mathematics) in mathematics education in the United States have called for an agenda of mathematics teaching and learning that emphasizes students' engagement in the processes of critical mathematical thinking and problem solving analogous to that of mathematicians [9]. However, despite these efforts, African American students continue to exhibit underachievement and underrepresentation in mathematics.


Introduction
Mathematical achievement gap
What has not been delineated in educational policy or reform efforts in mathematics is a clear objective for the mathematical achievement of African American students. For example, public policy in the United States has not determined specifically how African Americans should achieve mathematically compared to their White counterparts, or by what date this achievement should be realized. To add to the literature on the mathematical achievement of African American students in this regard, this research examined the trends of mathematical achievement between African American and White American students over the past forty years. The authors' goal was to generate a probability statistic that would predict the future achievement of African Americans given historical mathematical achievement gains and declines. Specifically, this study sought to estimate the probability that the mathematical achievement gap that has historically existed between African Americans and White Americans will close within the next 50 years.

Mathematical model
In 1907, Andrei Markov, a Russian mathematician, began the study of an important new type of chance process. In this process, the outcome of a given experiment can affect the outcome of the next experiment.
This type of process is called a Markov Process or Markov Chain [10]. A Markov-Chain process is a stochastic, mathematical model with transition probabilities (defined below) that provide information about how to relate one stage of a process to the next [10].
To create a mathematical model for this research, the authors used a discrete Markov Chain. As delineated in the work of Kemeny and Snell [11], a discrete or finite Markov chain is a stochastic process with finitely many states on a nominal scale. Moreover, the stochastic process is a random process evolving in time. Since this research sought to determine the probability that the mathematical achievement gap between African American and White American students will close in a finite number of years, a discrete Markov chain was appropriate in modeling this phenomenon. Further, mathematical achievement gap scores examined during discrete times in history lend themselves better to stochastic modeling than to deterministic modeling, in which the impending state of the process completely depends on the past and the present states of the process. The authors used the following properties of a discrete Markov process in developing the framework of the stochastic model:
• The number of possible outcomes or states is finite.
• The outcome at any stage depends only on the outcome of the previous stage.
• The probabilities are constant over time.
Mathematically, we describe a discrete Markov chain following [12], where t denotes discrete time. In this paper, we define t in terms of years. We have a set of states $S(t) = \{s_{1t}, s_{2t}, \ldots, s_{nt}\}$. The process starts in one of these states and moves successively from one state to another. Each move is called a step. If the chain is currently in state $s_{it}$, then it moves to state $s_{jt}$ at the next step with a probability denoted by $m_{ij}(t)$. This probability does not depend upon which states the chain was in before the current state.
The probabilities $m_{ij}(t)$ are called transition probabilities. The matrix $M(t) = [m_{ij}(t)]$ is a one-step transition matrix [12]. Henceforth in this paper, $M$ denotes the transition matrix for our discrete Markov process, such that $m_{ij} \ge 0$ for all $i$ and $j$ [13], with each row of $M$ summing to 1. Each entry $m_{ij} \in M$ is defined as the probability of transitioning (moving) from state $i$ to state $j$. Further, we denote the states $\{s_1, s_2, \ldots, s_n\}$. Thus, $m_{ij}$ is the probability that an object in state $s_i$ transitions to state $s_j$ [14].
Further describing our mathematical model, we let $M$ be the transition matrix for our discrete Markov process such that $M^k$ has only positive entries for some $k$. Then there exists a steady-state probability vector $x_s$, written here as a row vector to match the definition of $m_{ij}$ above, such that $x_s M = x_s$ [10]. Moreover, every row of $M^k$ converges to $x_s$ as $k \to \infty$ [15]. The mathematical representation of our steady-state vector can be denoted with the following chain of equivalences: $x_s M = x_s \iff x_s M - x_s = 0 \iff x_s (M - I) = 0$ [15].
Hence, in our discrete Markov process, the steady-state vector $x_s$ is defined as the limiting vector or (left) eigenvector of the transition matrix $M$ that corresponds to the eigenvalue 1.
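As a small worked illustration of the steady-state condition, consider a hypothetical two-state chain. The parameters $a$ and $b$ and the numerical values below are assumptions chosen for illustration only; they are not estimated from NAEP data. The example takes $m_{ij}$ as the probability of moving from state $i$ to state $j$, so the steady-state vector is a row vector satisfying $x_s M = x_s$.

```latex
% Hypothetical two-state chain; a and b are illustrative, not NAEP estimates.
\[
M = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}, \qquad 0 < a, b < 1 .
\]
% Solve x_s (M - I) = 0 with x_s = (p, q) and p + q = 1.
\[
-a\,p + b\,q = 0, \quad p + q = 1
\;\Longrightarrow\;
x_s = \left( \frac{b}{a+b}, \; \frac{a}{a+b} \right).
\]
% Numerical instance with a = 0.2 and b = 0.3.
\[
x_s = (0.6,\; 0.4), \qquad
x_s M = (0.6 \cdot 0.8 + 0.4 \cdot 0.3,\; 0.6 \cdot 0.2 + 0.4 \cdot 0.7) = (0.6,\; 0.4).
\]
```

Because $M$ has only positive entries here, the regularity condition holds with $k = 1$, and both rows of $M^k$ approach $(0.6, 0.4)$ as $k$ grows.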

Data
The National Assessment of Educational Progress (NAEP) is the largest nationally representative assessment of what students in the United States know and can do in mathematics, as well as other subject areas. NAEP assessments are conducted periodically during particular calendar years and are administered uniformly across the United States. NAEP assessments are administered to US students at ages 9, 13, and 17. Results from NAEP data show the average achievement score at each age level and for various demographic groups based on descriptors such as race, socioeconomic status, and gender.
Provided in Figure 1 are NAEP data trends in mathematics achievement for African American and White American 9-year olds. The data reveal a wide achievement gap during the 1970s and early 1980s. However, the gap narrows during the 1990s but widens again during the late 2000s.
Similarly, as seen in Figures 2 and 3 for youths ages 13 and 17, although there is a persistent achievement gap in mathematics between African Americans and White Americans, the gap has widened and narrowed during particular time periods.

Results
To analyze the NAEP data, the authors developed tables of the mathematical achievement gaps between African American and White American youths at ages 9, 13, and 17 during the assessment years 1973-2012. To create our mathematical model, we defined g to be the random variable representing the score gaps between African American and White American students. As posited in the work of Anderson and Goodman [13], our model is created under the assumption that g can be grouped into classes that we define as states in our discrete Markov chain. Each of these classes or states represents a range of potential values for g. Thus, using this assumption, we developed ranges of score gaps during particular time periods. Given this analysis, we found it necessary to generate three separate probability statistics based on the mathematical achievement gap at each age level: 9, 13, and 17. Thus, we find a transition matrix for each of these age levels based on our analysis of the score gaps for each group.
As described above, one of the properties of the discrete Markov process is that the set of states is finite. Therefore, we first examined the mathematical achievement gaps at age 9 and assigned each range of gaps to a corresponding state $s_i$ to create the discrete Markov-Chain model (Tables 1 and 2).
As shown in Table 2, although smaller or larger intervals of the movement of g could be created, the authors created four states $\{s_1, s_2, s_3, s_4\}$ since gap scores have historically moved within these ranges. We now define $x_{ij}$ as the number of times g transitions from class i to class j [13]. Thus, we count the number of transitions or movements of g in Table 1 between the classes or states in Table 2. Based on our analysis, we describe the changes between the score gaps as a stochastic process. Hence, this stochastic process of counting the movements of g results in the relative frequency of times g began in a particular state and transitioned to each of the other states. Therefore, we define the transition probability $m_{ij} = x_{ij} / \sum_{k} x_{ik}$. We further describe $m_{ij}$ as the conditional probability of transitioning into one state, given the immediately preceding state. Hence, the outcome at any stage depends only on the outcome of the previous stage.
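The counting procedure described above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: it assumes each transition probability is the relative frequency of moves out of a state (the count of moves from i to j divided by the total moves out of i), and the state sequence below is hypothetical rather than taken from Tables 1 and 2. The function names are illustrative.

```python
def transition_counts(states, n_states):
    """Count x[i][j], the number of moves from state i to state j."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts: m[i][j] = x[i][j] / sum over k of x[i][k].

    Rows for states that are never left stay all-zero (an assumption
    made here to avoid dividing by zero).
    """
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical sequence of gap states over successive assessments,
# using 0-based indices for s1..s4. These are NOT NAEP values.
sequence = [3, 3, 2, 2, 1, 2, 1, 1, 0, 1, 0, 0]
counts = transition_counts(sequence, 4)
M = transition_matrix(counts)

for row in M:
    print([round(p, 3) for p in row])
```

Each row of the resulting matrix is the conditional distribution of the next state given the current one, which is the sense in which the outcome at any stage depends only on the previous stage.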
To complete the development of our discrete Markov-chain model, we must establish the dependence of g against the test hypothesis that g is statistically independent. According to Billingsley [16], the Chi-Square test provides a systematic way of addressing the statistical analysis of Markov chains. Therefore, we used a Chi-Square test of the data in Table 1, specifically treating a matrix of transition counts as if it were a contingency table [13]. We found that the result of the Chi-Square test was 13.063 (p=0.042) and was significant at the 95.8 percent confidence level. Thus, based on these results we proceeded in creating our discrete Markov-Chain model for age 9.
Given that the score gaps described above transition through four states $\{s_1, s_2, s_3, s_4\}$ for 9-year-old African American and White American students during the years 1973-2012, we define our transition matrix of the Markov process for this age level by determining the initial class in which g existed and counting g's movement from this class or state to each of the other classes or states. The entries in the resulting matrix M are the transition probabilities in our discrete Markov-Chain model.
Further, in our discrete Markov chain, the sequence of all successive vectors $x_n$ is linked by our transition matrix, M. Eventually, the successive probabilities stabilize or reach an equilibrium state, and converge over time [10]. To generate our probability statistic to predict whether the mathematical achievement gap will close within the next 50 years for students at age 9, we first find our transition matrix prior to fifty steps to determine if it reaches a steady state. We see that after ten years (or steps) the transition matrix reaches an equilibrium state or stabilizes, in which all rows of the matrix are the same. Thus, this matrix is our steady-state matrix, and we conclude that based on our Markov process, the achievement gap will likely close within the next 10 years for the 9-year-old age level.
Note that since our data analysis ends at year 2012, our ten-year prediction includes the years 2013 and 2014. Hence, it is important to investigate whether the steady-state matrix can be determined in fewer than ten years (or steps). Examining the transition matrix for this age level after nine years (or steps), we find that, to six decimal places, the transition matrix does not reach an equilibrium state until after ten steps.
Similar to the process described above for 9-year-olds, the authors created a discrete Markov-Chain model based on NAEP data score gaps for 13-year-olds (Tables 3 and 4). As shown in Table 4, although smaller or larger intervals of the movement of g could be created, the authors created four states $\{s_1, s_2, s_3, s_4\}$ since gap scores have historically moved within these ranges and defined $x_{ij}$ as the number of times g transitions from class i to class j [13].
To complete the development of the discrete Markov-chain model for age 13, as described above for age 9, we established the dependence of g against the test hypothesis that g is statistically independent by using the Chi-Square test. We found that the result of the Chi-Square test was 16.107 (p=0.013) and was significant at the 98.7 percent confidence level. Thus, based on these results we proceeded in creating our discrete Markov-Chain model for age 13. Given the four states $\{s_1, s_2, s_3, s_4\}$ defined above, we find the transition matrix M, whose entries are the transition probabilities in our discrete Markov-Chain model.
Again, to generate our probability statistic to predict whether the mathematical achievement gap will close within the next 50 years, we first find our transition matrix prior to fifty steps to determine if it reaches a steady state. Using Mathematica to perform power calculations of M, we find that the transition matrix reaches equilibrium after ten years (or steps). The authors then investigated whether the steady-state matrix could be determined prior to ten steps and found that, to six decimal places, the transition matrix reaches an equilibrium state after ten steps. Therefore, we conclude that based on our Markov process, the achievement gap will likely close within the next ten years for 13-year-olds.
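The Chi-Square test used for each age level, which treats a matrix of transition counts as a contingency table, can be sketched as follows. This is a hedged illustration in pure Python, not the authors' procedure: the counts are hypothetical, the function name is illustrative, and dropping rows or columns with zero totals is an assumption made here so that expected counts are never zero. The sketch returns the Pearson statistic and its degrees of freedom; a p-value would then be read from the Chi-Square distribution with that many degrees of freedom.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    # Assumption: rows and columns with zero totals are dropped before testing.
    rows = [r for r in table if sum(r) > 0]
    keep = [j for j in range(len(table[0])) if sum(r[j] for r in rows) > 0]
    rows = [[r[j] for j in keep] for r in rows]

    row_totals = [sum(r) for r in rows]
    col_totals = [sum(r[j] for r in rows) for j in range(len(keep))]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count under independence of current and next state.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical 3x3 matrix of transition counts (NOT the NAEP counts).
hypothetical_counts = [
    [3, 1, 0],
    [1, 2, 1],
    [0, 1, 3],
]
statistic, dof = chi_square_independence(hypothetical_counts)
print(statistic, dof)
```

A large statistic relative to the Chi-Square distribution with the computed degrees of freedom is evidence against independence, that is, evidence that the next state depends on the current state.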
Finally, the authors created a discrete Markov-Chain model based on NAEP data score gaps at age 17 (Tables 5 and 6).
As shown in Table 6, we created four states $\{s_1, s_2, s_3, s_4\}$. To complete the development of our discrete Markov-chain model for age 17, we again used a Chi-Square test. The result was 12.833 (p=0.046) and was significant at the 95.4 percent confidence level. Given the four states $\{s_1, s_2, s_3, s_4\}$ defined above, we find the transition matrix M, whose entries are the transition probabilities in our discrete Markov-Chain model.
To generate our probability statistic to predict whether the mathematical achievement gap will close within the next 50 years for this age level, we first find our transition matrix prior to fifty steps to determine if it reaches a steady state. Using Mathematica to perform power calculations of M, we found that the transition matrix does not stabilize after ten years (or steps), so the authors examined whether it would stabilize after fifteen years (or steps). Since the transition matrix reaches equilibrium after fifteen steps, we investigated whether the steady-state matrix could be determined prior to fifteen steps. We found that, to six decimal places, the transition matrix does not reach an equilibrium state until after fifteen steps. Therefore, we conclude that based on our Markov process, the achievement gap will likely close within the next fifteen years for 17-year-olds.
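The search for the number of steps at which the transition matrix stabilizes can be sketched as follows. The authors report using Mathematica for the power calculations; the pure-Python version below is only an illustrative stand-in. It assumes that "equilibrium" means all rows of the matrix power agree after rounding to six decimal places, and it uses a hypothetical two-state matrix rather than any matrix estimated from NAEP data. The function names are illustrative.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rows_agree(matrix, decimals=6):
    """True when every row equals the first row after rounding."""
    first = [round(v, decimals) for v in matrix[0]]
    return all([round(v, decimals) for v in row] == first for row in matrix[1:])


def steps_to_equilibrium(m, max_steps=50, decimals=6):
    """Return (k, M**k) for the smallest k <= max_steps whose rows agree.

    If no such k exists, return (None, M**max_steps).
    """
    power = m
    for step in range(1, max_steps + 1):
        if rows_agree(power, decimals):
            return step, power
        if step < max_steps:
            power = mat_mul(power, m)
    return None, power


# Hypothetical row-stochastic matrix (NOT estimated from NAEP data).
M_example = [
    [0.8, 0.2],
    [0.3, 0.7],
]
steps, limit = steps_to_equilibrium(M_example)
print(steps, [round(v, 6) for v in limit[0]])
```

Checking the step just before the reported one, as the authors do for nine and fourteen steps, corresponds to confirming that the function returns the smallest qualifying power.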

Limitation of the Study
In creating the mathematical model for this study, the authors chose a random variable g, namely the score gaps in mathematical achievement between African American and White American youths, ages 9, 13, and 17, and analyzed the transition of this variable through a unit of time t in years. This type of analysis creates a limitation of the study in that other factors that may impact the score gaps in mathematical achievement (such as socioeconomic status, enrollment patterns in mathematics courses, gender issues, teacher expectations, and test anxiety) are not considered. Rather than isolating each of these potential impact factors and analyzing separately their effect on the mathematical achievement gap, our random variable g represented the collective outcome of these factors. Moreover, modeling the random variable in this regard was necessary to create the discrete Markov-Chain model. In light of this limitation of the study, the following implications are made.

Implications of the study
As can be discerned from the National Assessment of Educational Progress (NAEP) data, historically there has existed an achievement gap in mathematics between African American students, ages 9, 13, and 17 and their White counterparts. This achievement gap has influenced educational processes in public and private school systems, as well as public policies in the United States. Although programs and policies in the United States such as Algebra for All, No Child Left Behind, and the Common Core Mathematics Standards were implemented to improve the mathematical achievement of all American students, the consistent achievement gap in mathematics among various demographic groups has influenced these implementations. Moreover, professional mathematics education organizations such as the National Council of Teachers of Mathematics have called for equitable mathematics classrooms for all students of mathematics [5][6][7][8].
Based on the discrete Markov process, described as a stochastic, mathematical model, presented in this article, the authors found that the mathematical achievement gap that exists between African Americans and White Americans in the United States will likely close within the next ten years for 9- and 13-year-olds and within the next fifteen years for 17-year-olds. Given that these predictions vary across age levels, they may prove useful as educational policy makers continue to develop public policy to address the mathematical achievement of all American youths, including those minority youths who consistently underachieve in mathematics.