Michael (Mikhail) V. Yudelson's Academic Page

Quick BKT

This is a quick helper interface for tracing/debugging/developing Bayesian Knowledge Tracing models [1]. It covers computing of the skill mastery (latent), performance (observation), as well as forward and backward variables [2].

Computations are done with the help of Numeric Javascript

Likelihood of the data given the model
Accuracy of the model
Root mean squared error of the model

#	Right/ wrong	pKnown	pCorrect	Forward Variables		Backward Variables		Scale $c_t$	Fwd. Variables Scaled		Bwd. Variables Scaled
#	Right/ wrong	pKnown	pCorrect	$\alpha_{t,i=1}$	$\alpha_{t,i=2}$	$\beta_{t,i=1}$	$\beta_{t,i=2}$	Scale $c_t$	$\hat{\alpha}_{t,i=1}$	$\hat{\alpha}_{t,i=2}$	$\hat{\beta}_{t,i=1}$	$\hat{\beta}_{t,i=2}$
† Estimates for row 0 are computed before any observations are considered: pKnown = pInit, pCorrect is computed based on pInit as well, other variables are not defined. For rows 1 through N, all of the variables are computed assuming the observation has been revealed to the algorithm.

Definitions

Standard BKT considers every skill as an independant one. Computations are performed for a student-skill sequence of observations of length T. The number of hidden states is N. The number of observations is M. Indices for elements of vectors, as well as rows and columns of matrices are 1-started. For the matrices, row index is given first, column – second. For simplicity, we consider the number of hidden states and the number of observations to be both equal to 2.

Hidden states	{known, unknown}
	more formally, $S=\{s_i\},~i=1..N$
Observations	{correct, incorrect}
	more formally, $O=\{o_t\},~o_t=\nu_m,~t=1..T,~m=1..M$
Priors vector, 1*N	$\Pi=\[p(L_0), 1-p(L_0)\]$
	more formally, $\Pi=\{\pi_i\},~i=1..N$
Transitions matrix, N*N	$A=\[1, 0; p(T), 1-p(T)\]$
	more formally, $A=\{a_{ij}\},~a_{ij}=p\{s_{t+1} = j\|s_t = i\},~i,j=1..N$ i.e., the probability of transferring from state i to state j
Emissions matrix, N*M	$B=\[1-p(S), p(S); p(G), 1-p(G)\]$
	more formally, $b_j(m)=p\{o_t=\nu_m\|s_t=j\}~j=1..N,~m=1..M$ i.e., the probability of observing $\nu_m$ , while being in state j
Correctness, 1*M	Vector containing the probability distribution of observations. The first element is pCorrect.
Knowledge, 1*N	Vector containing the probability distribution of hidden states. The first element is pKnown.

Computations

Knowledge L	$L_0=\Pi$
	$L_t=p(L\|E)_t \cdot A,~t=1..T$
	$p(L\|E)_t= (L_{t-1} \circ B[,o_t] )~./~( L_{t-1} \cdot B[,o_t] ),~t=1..T$ , where
	$\circ$ is by-element multiplication, $\cdot$ is dot-product, ./ is by-element division, $B[,o_t]$ is the $o_t$ -th column of matrix B, $o_t$ is the observation at time t.
Correctness C	$C_0=\Pi \cdot B$
	$C_t=L_t \cdot B,~t=1..T$

References

Corbett, A. T. and Anderson, J. R.: Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4), 253-278. (1995)
Forward—backward algorithm. (2015, April 3). In Wikipedia, The Free Encyclopedia. Retrieved 23:04, July 15, 2015, from http://en.wikipedia.org/w/index.php?title=Forward%E2%80%93backward_algorithm&oldid=654797958
Levinson, S. E., Rabiner, L. R., & Sondhi, M. M. (1983). An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition. Bell System Technical Journal, 62(4), 1035-1074.

Adapted from 960 grid

pInit
pLearn
pForget
pSlip
pGuess
Observations	0's and 1's separated by spaces