|
|
Automatic Classification of Swedish Metadata Using Dewey Decimal Classification: A Comparison of Approaches
|
Koraljka Golub, Johan Hagelbäck, Anders Ardö
|
|
|
Table 4. Accuracy of the Naïve Bayes classifier under different pre-processing settings. |
|
Naïve Bayes |
Dataset | Accuracy, unigrams (training set) | Accuracy, unigrams (test set) | Accuracy, unigrams + 2-grams (training set) | Accuracy, unigrams + 2-grams (test set) |
T_KW_MC | 95.42% | 76.52% | 99.66% | 75.96% |
T_KW_MC_rem | 90.17% | 76.79% | 93.25% | 78.21% |
T_KW_MC_stm | 94.32% | 76.36% | 99.59% | 76.36% |
T_KW_MC_stm_rem | 89.62% | 76.26% | 92.95% | 78.27% |
T_KW_MC_sw | 95.50% | 76.46% | 99.64% | 76.62% |
T_KW_MC_sw_rem | 90.28% | 77.09% | 92.33% | 78.60% |
T_KW_MC_sw_stm | 94.49% | 76.59% | 99.53% | 76.95% |
T_KW_MC_sw_stm_rem | 89.79% | 76.36% | 91.96% | 78.90% |
|
|
|