TY - JOUR
T1 - Implicit Heterogeneous Features Embedding in Deep Knowledge Tracing
AU - Yang, Haiqin
AU - Cheung, Lap Pong
N1 - Funding Information: The work described in this paper was partially supported by the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. UGC/IDS14/16).
Publisher Copyright:
© 2017, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2018/2/1
Y1 - 2018/2/1
N2 - Deep recurrent neural networks have been successfully applied to knowledge tracing, namely, deep knowledge tracing (DKT), which aims to automatically trace students’ knowledge states by mining their exercise performance data. Two main issues exist in current DKT models: First, the complexity of DKT models hinders psychological interpretation. Second, the input of existing DKT models consists only of exercise tags represented via one-hot encoding. The correlation between the hidden knowledge components and students’ responses to the exercises relies heavily on training the DKT models. Rich and informative existing features are excluded from training, which may yield sub-optimal performance. To utilize the information embedded in these features, researchers have proposed a manual method to pre-process the features, i.e., discretizing them based on the inner characteristics of individual features. However, this method requires substantial feature engineering effort and becomes infeasible when the number of selected features is large. To tackle the above issues, we design an automatic system to embed the heterogeneous features implicitly and effectively into the original DKT model. More specifically, we apply tree-based classifiers to predict whether the student can correctly answer the exercise given the heterogeneous features, an effective way to capture how the student deviates from others on the exercise. The predicted response and the true response are then encoded into a 4-bit one-hot encoding and concatenated with the original one-hot encoded features on the exercise tags to train a long short-term memory (LSTM) model, which outputs the probability that a student will answer the corresponding exercise correctly. We conduct a thorough evaluation on two educational datasets and demonstrate the merits of our proposal along with key observations.
KW - Knowledge tracing
KW - Recurrent neural networks
KW - Tree-based classifiers
UR - http://www.scopus.com/inward/record.url?scp=85038114072&partnerID=8YFLogxK
U2 - 10.1007/s12559-017-9522-0
DO - 10.1007/s12559-017-9522-0
M3 - Article
AN - SCOPUS:85038114072
SN - 1866-9956
VL - 10
SP - 3
EP - 14
JO - Cognitive Computation
JF - Cognitive Computation
IS - 1
ER -