Extracting Rule RF in Educational Data Classification: From a Random Forest to Interpretable Refined Rules
Published in: | 2015 International Conference on Advanced Computing and Applications (ACOMP) pp. 20 - 27 |
---|---|
Main Authors: | Lu Thi, Kim Phung; Vo Thi, Ngoc Chau; Phung, Nguyen Hua |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-11-2015 |
Subjects: | |
Abstract | Early detection of in-trouble students in an academic credit system is an emerging problem in educational data mining research. The problem has been treated as a multi-class educational data classification task. Although many existing supervised learning algorithms can provide acceptable classification models, the interpretability of these models needs to be investigated before they can be applied in practice. Random forests have been examined and appear to be an appropriate solution for effectively classifying students for early in-trouble student detection in a credit system. However, random forests are black-box ensemble models that cannot explain the reasoning behind their predictions. Therefore, in this paper, we define a rule extraction algorithm named ExtractingRuleRF that derives an interpretable refined classification rule set from a random forest for a multi-class data classification task. The proposed algorithm follows a greedy approach with two phases: rule refinement and rule extraction. In the first phase, we prepare a ranked, weighted rule set that is more interpretable than the input random forest and has equivalent classification power, because it retains the forest's classification scheme. In the second phase, the rule extraction process returns the best rules for the highest accuracy and/or full coverage, based on the priority of each ranked rule. The theoretical analysis of the algorithm and experimental results on real educational data sets show that ExtractingRuleRF can produce a more effective and interpretable rule-based classification model than its corresponding random forest. This result makes our knowledge-based educational decision support more practical by providing interpretable classification rules. |
---|---|
Author | Lu Thi, Kim Phung; Phung, Nguyen Hua; Vo Thi, Ngoc Chau |
Author_xml | – sequence: 1 givenname: Kim Phung surname: Lu Thi fullname: Lu Thi, Kim Phung email: lutkphung@gmail.com organization: Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam – sequence: 2 givenname: Ngoc Chau surname: Vo Thi fullname: Vo Thi, Ngoc Chau email: chauvtn@cse.hcmut.edu.vn organization: Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam – sequence: 3 givenname: Nguyen Hua surname: Phung fullname: Phung, Nguyen Hua email: phung@cse.hcmut.edu.vn organization: Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam |
CODEN | IEEPAD |
CitedBy_id | crossref_primary_10_1007_s10115_024_02069_8 crossref_primary_10_3390_electronics11132082 crossref_primary_10_1007_s12530_022_09434_4 |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/ACOMP.2015.13 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library Online; IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 1467382345 9781467382342 |
EndPage | 27 |
ExternalDocumentID | 7422370 |
Genre | orig-research |
GroupedDBID | 6IE 6IL ALMA_UNASSIGNED_HOLDINGS CBEJK RIB RIC RIE RIL |
ID | FETCH-LOGICAL-c167t-33c266874872ee2232863bbc5f9a392e0157dd8e7dc25a081cd916c74163b2233 |
IEDL.DBID | RIE |
IngestDate | Thu Jan 18 11:13:32 EST 2024 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c167t-33c266874872ee2232863bbc5f9a392e0157dd8e7dc25a081cd916c74163b2233 |
PageCount | 8 |
ParticipantIDs | ieee_primary_7422370 |
PublicationCentury | 2000 |
PublicationDate | 20151101 |
PublicationDateYYYYMMDD | 2015-11-01 |
PublicationDate_xml | – month: 11 year: 2015 text: 20151101 day: 01 |
PublicationDecade | 2010 |
PublicationTitle | 2015 International Conference on Advanced Computing and Applications (ACOMP) |
PublicationTitleAbbrev | ACOMP |
PublicationYear | 2015 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssib026765008 |
Score | 1.6761947 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 20 |
SubjectTerms | Cities and towns; Classification algorithms; Data mining; Data models; Decision trees; ensemble; interpretable classification model; multi-class classification; Prediction algorithms; random forest; rule extraction; Vegetation |
Title | Extracting Rule RF in Educational Data Classification: From a Random Forest to Interpretable Refined Rules |
URI | https://ieeexplore.ieee.org/document/7422370 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link.rule.ids | 310,311,782,786,791,792,798,27934,54767 |
linkProvider | IEEE |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2015+International+Conference+on+Advanced+Computing+and+Applications+%28ACOMP%29&rft.atitle=Extracting+Rule+RF+in+Educational+Data+Classification%3A+From+a+Random+Forest+to+Interpretable+Refined+Rules&rft.au=Lu+Thi%2C+Kim+Phung&rft.au=Vo+Thi%2C+Ngoc+Chau&rft.au=Phung%2C+Nguyen+Hua&rft.date=2015-11-01&rft.pub=IEEE&rft.spage=20&rft.epage=27&rft_id=info:doi/10.1109%2FACOMP.2015.13&rft.externalDocID=7422370 |
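
The two-phase idea summarized in the abstract (rank and weight the rules, then greedily extract the highest-priority rules until the data are covered) can be illustrated with a small Python sketch. This is not the authors' ExtractingRuleRF implementation, whose details are not given in this record. It is a generic greedy rule-covering example under stated assumptions: the `Rule` class, the toy student data, the feature names `gpa` and `failed`, the class labels, and the choice of "accuracy on covered instances" as the rule weight are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, float]
Labeled = Tuple[Instance, str]


@dataclass
class Rule:
    name: str
    condition: Callable[[Instance], bool]  # stands in for one root-to-leaf tree path
    label: str
    weight: float = 0.0  # assigned by rank_rules


def rank_rules(rules: List[Rule], data: List[Labeled]) -> List[Rule]:
    """Phase 1 (assumed weighting): score each rule by its accuracy on the
    instances it covers, then sort by descending weight."""
    for rule in rules:
        covered = [(x, y) for x, y in data if rule.condition(x)]
        correct = sum(1 for _, y in covered if y == rule.label)
        rule.weight = correct / len(covered) if covered else 0.0
    # sorted() is stable, so equally weighted rules keep their input order.
    return sorted(rules, key=lambda r: r.weight, reverse=True)


def extract_rules(ranked: List[Rule], data: List[Labeled]) -> List[Rule]:
    """Phase 2: walk the ranked rules and keep each one that covers at least
    one still-uncovered instance; stop once every instance is covered."""
    uncovered = set(range(len(data)))
    selected: List[Rule] = []
    for rule in ranked:
        if not uncovered:
            break
        newly = {i for i in uncovered if rule.condition(data[i][0])}
        if newly:
            selected.append(rule)
            uncovered -= newly
    return selected


def classify(selected: List[Rule], x: Instance, default: str = "unknown") -> str:
    """Return the label of the first (highest-priority) rule that fires."""
    for rule in selected:
        if rule.condition(x):
            return rule.label
    return default


# Toy multi-class data: (features, class label). Entirely made up.
data: List[Labeled] = [
    ({"gpa": 1.5, "failed": 4}, "in_trouble"),
    ({"gpa": 1.8, "failed": 3}, "in_trouble"),
    ({"gpa": 2.4, "failed": 1}, "at_risk"),
    ({"gpa": 2.6, "failed": 2}, "at_risk"),
    ({"gpa": 3.4, "failed": 0}, "normal"),
    ({"gpa": 3.8, "failed": 0}, "normal"),
]

# Candidate rules, as if collected from the trees of a forest (hypothetical).
rules = [
    Rule("gpa_low", lambda s: s["gpa"] < 2.0, "in_trouble"),
    Rule("many_failures", lambda s: s["failed"] >= 3, "in_trouble"),
    Rule("gpa_mid", lambda s: 2.0 <= s["gpa"] < 3.0, "at_risk"),
    Rule("gpa_not_low", lambda s: s["gpa"] >= 2.5, "normal"),
    Rule("no_failures", lambda s: s["failed"] == 0, "normal"),
]

selected = extract_rules(rank_rules(rules, data), data)
print([r.name for r in selected])
```

In this toy run the redundant rule `many_failures` is skipped because `gpa_low` already covers the same students, and the less accurate `gpa_not_low` is never reached because full coverage is achieved first. A real rule set extracted from a random forest would be far larger, and the weighting and stopping criteria would follow the paper rather than the assumptions made here.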