Journal of Data and Information Science, 2020, Vol. 5, Issue 4: 19-34. doi: 10.2478/jdis-2020-0034
Research Paper
Xiaoli Chen1,2, Tao Han1,2
Received: 2020-02-08
Revised: 2020-05-20
Accepted: 2020-06-11
Online: 2020-09-20
Published: 2020-11-20
Contact: Xiaoli Chen, E-mail: chenxl@mail.las.ac.cn
Table 1. Topic-entity distribution of the 8th citation generation.
Topic | Top 10 Topic Entities |
---|---|
1 | numerical analysis, simulation, Monte Carlo, artificial intelligence, dynamic programming, probability, principal component analysis, experiment, Markov chain, controllers |
2 | algorithm, simulation, Markov chain, Monte Carlo method, Monte Carlo, artificial intelligence, principal component analysis, experiment, program optimization, artificial neural network |
3 | artificial neural network, algorithm, fingerprint, genetic programming, biological neural networks, CPU cache, backpropagation, neural network simulation, gradient, discontinuous Galerkin method |
4 | artificial neural network, Boltzmann machine, restricted Boltzmann machine, generative model, backpropagation, pixel, speech recognition, deep learning, MNIST database, mixture model |
5 | fault tolerance, data mining, artificial neural network, brute force search, algorithm, asymptotically optimal algorithm, backpropagation |
Table 2. Disappearing topics over each citation generation.
Citation generation | Top 10 Disappearing Topic Entities |
---|---|
1-2 | generative model, Boltzmann machine, restricted Boltzmann machine, algorithm, inference, pixel, latent variable, gradient, Markov chain, approximation algorithm |
3-18 | artificial neural network, algorithm, generative model, backpropagation, nonlinear system, deep learning, gradient, speech recognition, hidden Markov model, pixel |
19 | artificial neural network, generative model, machine learning, algorithm, restricted Boltzmann machine, convolutional neural network, image resolution, value ethics, Boltzmann machine, gradient |
20 | artificial neural network, hidden Markov model, Markov model, nonlinear system, backpropagation, unsupervised learning, speech recognition, time series, cluster analysis, cognition disorders |
21 | artificial neural network, nonlinear system, generative model, factor analysis, MNIST database, anatomical layer, deep learning, mixture model, unit, gradient |
22 | pixel, restricted Boltzmann machine, gradient, artificial neural network, speech recognition, Boltzmann machine, unsupervised learning, statistical model, deep learning, network architecture |
Table 3. Inherited topics over each citation generation.
Citation generation | Top 10 Inherited Topic Entities |
---|---|
1-7 | artificial neural network, algorithm, deep learning, backpropagation, speech recognition, hidden Markov model, neural network simulation, machine learning, test set, nonlinear system |
8-9 | algorithm, simulation, Markov chain, Monte Carlo method, Monte Carlo, artificial intelligence, principal component analysis, experiment, program optimization, artificial neural network |
10-14 | artificial neural network, backpropagation, generative model, Boltzmann machine, restricted Boltzmann machine, computer data storage, deep learning, speech recognition, feedforward neural network, nonlinear system |
15-16 | simulation, Monte Carlo method, Monte Carlo, algorithm, numerical analysis, Markov chain, dynamic programming, solutions, coefficient, experiment |
17 | artificial neural network, gradient, matching polynomial, nonlinear system, spline interpolation, hidden Markov model, generative model, approximation algorithm, Bayesian network, factor analysis |
18 | simulation, Monte Carlo method, Monte Carlo, computation, computation action, silicon, gradient, distortion, Markov chain, algorithm |
19 | artificial neural network, generative model, machine learning, algorithm, restricted Boltzmann machine, convolutional neural network, image resolution, value ethics, Boltzmann machine, gradient |
20 | artificial neural network, hidden Markov model, Markov model, nonlinear system, backpropagation, unsupervised learning, speech recognition, time series, cluster analysis, cognition disorders |
21 | artificial neural network, nonlinear system, generative model, factor analysis, MNIST database, anatomical layer, deep learning, mixture model, unit, gradient |
22 | artificial intelligence, mitral valve prolapse syndrome, greater than, power dividers and directional couplers, supervised learning, performance, meal occasion for eating, plasminogen activator, nominal impedance, platelet glycoprotein 4 human |
Table 4. Innovative topics over each citation generation.
Citation generation | Top 10 Innovative Topic Entities |
---|---|
1 | artificial intelligence, computation, machine learning, biological neural networks, experiment, neural tube defects, convolutional neural network, synthetic data, simulation, neural networks |
2 | machine learning, experiment, supervised learning, simulation, program optimization, sparse matrix, neural networks, neural network simulation, computation, unsupervised learning |
3 | greater than, solutions, classification, estimation theory, Eisenstein’s criterion, pattern recognition, cluster analysis, neural tube defects, feature selection, sensor |
4 | robot, Monte Carlo, Markov model, Eisenstein’s criterion, rule guideline, neural network simulation, coefficient, numerical analysis, dynamic programming, high and low level |
5 | numerical analysis, artificial intelligence, heuristic, experiment, solutions, Eisenstein’s criterion, computation, requirement, sensor, coefficient |
6-21 | artificial intelligence, Monte Carlo method, biological neural networks, neural network simulation, Bayesian network, Markov chain |
22 | principal component analysis, food, principal component, obesity, platelet glycoprotein 4 human, red meat, whole grains, eaf2 gene, diabetes mellitus, exercise |