Automatic Classification of Swedish Metadata Using Dewey Decimal Classification: A Comparison of Approaches
Koraljka Golub, Johan Hagelbäck, Anders Ardö
Table 4 Accuracy of the Naïve Bayes classifier using different pre-processing.
Naïve Bayes
                      Accuracy, unigrams          Accuracy, unigrams + 2-grams
Dataset               Training set   Test set     Training set   Test set
T_KW_MC               95.42%         76.52%       99.66%         75.96%
T_KW_MC_rem           90.17%         76.79%       93.25%         78.21%
T_KW_MC_stm           94.32%         76.36%       99.59%         76.36%
T_KW_MC_stm_rem       89.62%         76.26%       92.95%         78.27%
T_KW_MC_sw            95.50%         76.46%       99.64%         76.62%
T_KW_MC_sw_rem        90.28%         77.09%       92.33%         78.60%
T_KW_MC_sw_stm        94.49%         76.59%       99.53%         76.95%
T_KW_MC_sw_stm_rem    89.79%         76.36%       91.96%         78.90%
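
The test-set columns of Table 4 are easier to compare as differences. The following is a minimal sketch, not part of the original study, that computes the percentage-point change in test accuracy when 2-grams are added to unigrams. The accuracy values are copied from the table; the names `TEST_ACCURACY`, `bigram_gain`, and `best_dataset` are hypothetical and introduced here only for illustration.

```python
# Test-set accuracies (%) copied from Table 4, per dataset variant:
# (unigrams only, unigrams + 2-grams).
TEST_ACCURACY = {
    "T_KW_MC": (76.52, 75.96),
    "T_KW_MC_rem": (76.79, 78.21),
    "T_KW_MC_stm": (76.36, 76.36),
    "T_KW_MC_stm_rem": (76.26, 78.27),
    "T_KW_MC_sw": (76.46, 76.62),
    "T_KW_MC_sw_rem": (77.09, 78.60),
    "T_KW_MC_sw_stm": (76.59, 76.95),
    "T_KW_MC_sw_stm_rem": (76.36, 78.90),
}


def bigram_gain(dataset: str) -> float:
    """Percentage-point change in test accuracy when 2-grams are added."""
    unigrams, with_bigrams = TEST_ACCURACY[dataset]
    return round(with_bigrams - unigrams, 2)


def best_dataset() -> str:
    """Dataset variant with the highest test accuracy using unigrams + 2-grams."""
    return max(TEST_ACCURACY, key=lambda name: TEST_ACCURACY[name][1])


if __name__ == "__main__":
    for name in TEST_ACCURACY:
        print(f"{name:<20} {bigram_gain(name):+.2f}")
    print("Highest test accuracy with 2-grams:", best_dataset())
```

On these figures, every variant whose name ends in `_rem` gains between 1.42 and 2.54 percentage points from the added 2-grams, whereas the other variants change by at most 0.56 points in either direction.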