Purpose: To reveal the typical features of text duplication in papers from four medical fields (basic medicine, health management, pharmacology and pharmacy, and public health and preventive medicine), to analyze the reasons for duplication, and to provide suggestions for the management of medical academic misconduct.
Design/methodology/approach: In total, 2,469 representative Chinese journal papers submitted by researchers in 2020 and 2021 were included in our research. A plagiarism check was carried out using the Academic Misconduct Literature Check System (AMLC). We generated a corrected similarity index based on the AMLC general similarity index for further analysis. We compared the similarity indices of papers in the four medical fields and examined their trends over time; differences in similarity index between review and research articles were also analyzed by field. The 143 papers suspected of plagiarism were further analyzed by the sections containing duplication and by field of research.
Findings: Papers in the field of pharmacology and pharmacy had the highest similarity index (8.67 ± 5.92%), which was significantly higher than that in the other fields, except health management. The similarity index of review articles (9.77 ± 10.28%) was significantly higher than that of research articles (7.41 ± 6.26%). In total, 143 papers (5.80%) had similarity indices ≥ 15% and were suspected of plagiarism; most were papers on health management (78, 54.55%), followed by public health and preventive medicine (38, 26.58%). Of the 143 papers, 90.21% had duplication in multiple sections, while only 9.79% had duplication in a single section. The distribution of sections with duplication varied among fields: papers in pharmacology and pharmacy were more likely to have duplication in the data/methods and introduction/background sections, whereas papers in health management were more likely to contain duplication in the introduction/background or results/discussion sections. Different structures for papers in different fields may have caused these differences.
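The screening step behind these counts can be illustrated with a minimal sketch. It assumes only what is stated above, namely that a paper is treated as suspected of plagiarism when its similarity index is at least 15%. The function name `summarize_suspected`, the tuple layout, and the toy data are hypothetical and are not taken from the study or from AMLC.

```python
from collections import Counter

# Threshold stated in the findings: similarity index >= 15% marks a
# paper as suspected of plagiarism.
THRESHOLD_PCT = 15.0


def summarize_suspected(papers, threshold=THRESHOLD_PCT):
    """Summarize suspected papers.

    papers: list of (field, similarity_index_percent) tuples.
    This layout is a hypothetical convenience, not the AMLC export format.
    """
    flagged = [(field, sim) for field, sim in papers if sim >= threshold]
    by_field = Counter(field for field, _ in flagged)
    # Share of all papers that are flagged, in percent.
    share_pct = 100.0 * len(flagged) / len(papers) if papers else 0.0
    # Share of flagged papers contributed by each field, in percent.
    field_share_pct = {
        field: 100.0 * count / len(flagged) for field, count in by_field.items()
    }
    return {
        "n_flagged": len(flagged),
        "share_pct": share_pct,
        "by_field": dict(by_field),
        "field_share_pct": field_share_pct,
    }


# Toy data for illustration only; these are not values from the study.
sample = [
    ("health management", 22.4),
    ("health management", 6.1),
    ("pharmacology and pharmacy", 15.0),
    ("basic medicine", 3.9),
]
result = summarize_suspected(sample)
```

Applied to the full set of 2,469 papers, the same two ratios would correspond to the overall proportion of suspected papers and the per-field proportions reported above.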
Research limitations: There were three limitations to our research. Firstly, we observed that a small number of papers had already been checked before our analysis. It is unknown who conducted these earlier plagiarism checks, because such checks can also form part of other evaluations, such as applications for science and technology projects or awards. If the authors carried out the check, text with high similarity may have been removed before submission, meaning the similarity index in our research may have been lower than the original value. Secondly, only four medical fields were included in our research; additional analysis on a wider scale is required in the future. Thirdly, only a general similarity index was calculated in our study; other similarity indices were not tested.
Practical implications: A comprehensive analysis of similarity indices in four medical fields was performed. We made several recommendations for supervising medical academic misconduct, for establishing criteria to define suspected plagiarism in medical papers, and for improving the accuracy of text duplication checks.
Originality/value: We quantified the differences between the AMLC general similarity index and the corrected index, described the situation around text duplication and plagiarism in papers from four medical fields, and revealed differences in similarity indices between different article types. We also revealed differences in the sections containing duplication for papers with suspected plagiarism among different fields.