Quick BKT

This is a quick helper interface for tracing, debugging, and developing Bayesian Knowledge Tracing models [1]. It covers computing skill mastery (the latent variable) and performance (the observation), as well as the forward and backward variables [2].

Computations are done with the help of the Numeric Javascript library.

Inputs: pInit, pLearn, pForget, pSlip, pGuess, and the observations (0's and 1's separated by spaces).

Output: the likelihood of the data given the model.

The results table has one row per time step, with the following columns:

- # – row number (time step)
- Right/wrong – the observation
- pKnown – probability that the skill is known
- pCorrect – probability of a correct response
- Forward variables $\alpha_{t,i=1}$, $\alpha_{t,i=2}$
- Backward variables $\beta_{t,i=1}$, $\beta_{t,i=2}$
- Scale $c_t$
- Scaled forward variables $\hat{\alpha}_{t,i=1}$, $\hat{\alpha}_{t,i=2}$
- Scaled backward variables $\hat{\beta}_{t,i=1}$, $\hat{\beta}_{t,i=2}$
† Estimates for row 0 are computed before any observations are considered: pKnown = pInit, and pCorrect is computed from pInit as well; the other variables are not defined. For rows 1 through T, all of the variables are computed assuming the observation has been revealed to the algorithm.
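The forward variables and scale factors in the table follow the standard scaled forward pass [2, 3]. A plain-JavaScript sketch (our own naming, not the tool's actual code), using the common convention $c_t = 1 / \sum_i \alpha_{t,i}$ — the tool may define $c_t$ differently:

```javascript
// Scaled forward pass for the 2-state BKT HMM.
// Pi: 1xN priors, A: NxN transitions, B: NxM emissions,
// obs: array of observation indices (0 = correct, 1 = incorrect).
// Convention: c_t = 1 / sum_i alpha_{t,i}; hatAlpha_t = c_t * alpha_t.
function scaledForward(Pi, A, B, obs) {
  const N = Pi.length;
  const hatAlpha = [], c = [];
  let prev = null;
  for (let t = 0; t < obs.length; t++) {
    const a = new Array(N);
    for (let j = 0; j < N; j++) {
      const inflow = t === 0
        ? Pi[j]                                        // alpha_1 uses the priors
        : prev.reduce((s, v, i) => s + v * A[i][j], 0); // sum_i hatAlpha_{t-1,i} a_ij
      a[j] = inflow * B[j][obs[t]];
    }
    const sum = a.reduce((s, v) => s + v, 0);
    c.push(1 / sum);
    prev = a.map(v => v / sum);                        // normalize to get hatAlpha_t
    hatAlpha.push(prev);
  }
  // log-likelihood of the data given the model: -sum_t log c_t
  const logLik = c.reduce((s, ct) => s - Math.log(ct), 0);
  return { hatAlpha, c, logLik };
}
```

The scaled variables sum to 1 in every row, which is what makes them numerically stable for long observation sequences.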

Definitions

Standard BKT considers every skill as an independent one. Computations are performed for a student-skill sequence of observations of length T. The number of hidden states is N. The number of observations is M. Indices for elements of vectors, as well as rows and columns of matrices, are 1-based. For matrices, the row index is given first and the column index second. For simplicity, we consider the number of hidden states and the number of observations to both be equal to 2.

- Hidden states: {known, unknown}; more formally, $S=\{s_i\},~i=1..N$.
- Observations: {correct, incorrect}; more formally, $O=\{o_t\},~o_t=\nu_m,~t=1..T,~m=1..M$.
- Priors vector, 1×N: $\Pi= \begin{bmatrix} p(L_0)&1-p(L_0) \end{bmatrix}$; more formally, $\Pi=\{\pi_i\},~i=1..N$.
- Transitions matrix, N×N: $A= \begin{bmatrix} 1&0\\ p(T)&1-p(T) \end{bmatrix}$; more formally, $A=\{a_{ij}\},~a_{ij}=p\{s_{t+1} = j|s_t = i\},~i,j=1..N$, i.e., the probability of transitioning from state i to state j.
- Emissions matrix, N×M: $B= \begin{bmatrix} 1-p(S)&p(S)\\ p(G)&1-p(G) \end{bmatrix}$; more formally, $B=\{b_{jm}\},~b_{jm}=p\{o_t=\nu_m|s_t=j\},~j=1..N,~m=1..M$, i.e., the probability of observing $\nu_m$ while being in state j.
- Correctness vector, 1×M: the probability distribution of observations; the first element is pCorrect.
- Knowledge vector, 1×N: the probability distribution of hidden states; the first element is pKnown.
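Assembling these matrices from the tool's five input parameters can be sketched in plain JavaScript (the function name `bktParams` is ours, not the tool's API). Note that the transition matrix shown above assumes no forgetting; the sketch keeps pForget explicit, and with pForget = 0 it reduces to the matrix above:

```javascript
// Build the BKT model matrices {Pi, A, B} from the five input parameters.
// State order: {known, unknown}; observation order: {correct, incorrect}.
function bktParams(pInit, pLearn, pForget, pSlip, pGuess) {
  const Pi = [pInit, 1 - pInit];      // priors over {known, unknown}
  const A = [
    [1 - pForget, pForget],           // transitions from "known"
    [pLearn, 1 - pLearn],             // transitions from "unknown"
  ];
  const B = [
    [1 - pSlip, pSlip],               // known   -> {correct, incorrect}
    [pGuess, 1 - pGuess],             // unknown -> {correct, incorrect}
  ];
  return { Pi, A, B };
}
```

Each row of A and B, as well as Pi itself, is a probability distribution and must sum to 1.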

Computations

- Knowledge L: $L_0=\Pi$; $L_t=p(L|E)_t \cdot A,~t=1..T$, with the posterior $p(L|E)_t= (L_{t-1} \circ B[,o_t] )~./~( L_{t-1} \cdot B[,o_t] ),~t=1..T$, where $\circ$ is element-wise multiplication, $\cdot$ is the dot product, $./$ is element-wise division, $B[,o_t]$ is the $o_t$-th column of matrix B, and $o_t$ is the observation at time t.
- Correctness C: $C_0=\Pi \cdot B$; $C_t=L_t \cdot B,~t=1..T$.
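The two recurrences above can be sketched directly in plain JavaScript for the 2-state case (variable and function names are ours; the tool itself uses Numeric Javascript):

```javascript
// One step of the BKT knowledge update:
// Bayes posterior p(L|E)_t from the observation, then advance through A.
// L: 1xN knowledge vector L_{t-1}, A: NxN transitions, B: NxM emissions,
// obs: observed column index (0 = correct, 1 = incorrect).
function bktStep(L, A, B, obs) {
  const col = B.map(row => row[obs]);               // B[, o_t]
  const evidence = L[0] * col[0] + L[1] * col[1];   // L_{t-1} . B[, o_t]
  const post = [                                    // (L_{t-1} o B[, o_t]) ./ evidence
    L[0] * col[0] / evidence,
    L[1] * col[1] / evidence,
  ];
  return [                                          // L_t = p(L|E)_t . A
    post[0] * A[0][0] + post[1] * A[1][0],
    post[0] * A[0][1] + post[1] * A[1][1],
  ];
}

// First element of C_t = L_t . B, i.e. pCorrect before the next observation.
function pCorrect(L, B) {
  return L[0] * B[0][0] + L[1] * B[1][0];
}
```

A correct observation raises pKnown (the first element of L) and an incorrect one lowers it; the transition through A then adds the chance of learning between opportunities.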

References

1. Corbett, A. T., & Anderson, J. R. (1995). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4), 253-278.
2. Forward—backward algorithm. (2015, April 3). In Wikipedia, The Free Encyclopedia. Retrieved 23:04, July 15, 2015, from http://en.wikipedia.org/w/index.php?title=Forward%E2%80%93backward_algorithm&oldid=654797958
3. Levinson, S. E., Rabiner, L. R., & Sondhi, M. M. (1983). An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition. Bell System Technical Journal, 62(4), 1035-1074.