Journal of Data and Information Science, 2020, Vol. 5, Issue (4): 116-125. doi: 10.2478/jdis-2020-0036

• Research Paper •

Priorities for Social and Humanities Projects Based on Text Analysis

Ülle Must

  1. Archimedes Foundation, Väike-Turu Street 8, Tartu 51004, Estonia
  • Received: 2020-01-20 Revised: 2020-06-15 Accepted: 2020-07-31 Online: 2020-09-20 Published: 2020-11-20
  • Contact: Ülle Must


Purpose: Changes in the world show that the role, importance, and coherence of SSH (social sciences and the humanities) will increase significantly in the coming years. This paper aims to monitor and analyze the evolution (or overlap) of SSH thematic patterns across three funding instruments since 2007.

Design/methodology/approach: The paper examines to what extent the EU Framework Program (FP) affects research at the national level, and highlights hot topics from the given period with the help of text analysis. Titles and abstracts of funded projects were taken from the EU FP and from the Estonian and Slovenian research information systems (ETIS and SICRIS, respectively). After removing punctuation marks, numeric values, articles, prepositions, conjunctions, and auxiliary verbs, 4,854 unique words were identified in ETIS, 4,421 in SICRIS, and 3,950 in FP. The final analysis and the comparisons between datasets were based on the 200 most frequent words.
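The preprocessing and ranking steps described above can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the study: the stop-word list, the rule that drops one-letter tokens, and the function names `tokenize`, `word_frequencies`, and `top_words` are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop list (articles, prepositions, conjunctions,
# auxiliary verbs). The list actually used in the study is not given.
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "for", "from", "by",
    "with", "and", "or", "but", "is", "are", "was", "were", "be", "been",
    "has", "have", "had", "will", "do", "does",
}


def tokenize(text):
    """Lowercase the text and keep runs of letters only, which drops
    punctuation and numeric values; then remove stop words."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Dropping one-letter tokens (e.g. a stray possessive "s") is an assumption.
    return [w for w in words if len(w) > 1 and w not in STOP_WORDS]


def word_frequencies(documents):
    """Count word occurrences over all project titles and abstracts."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def top_words(counts, n=200):
    """Return the n most frequent words, most frequent first."""
    return [word for word, _ in counts.most_common(n)]
```

With one such `Counter` per dataset, `len(counts)` gives the number of unique words and `top_words(counts)` gives the list on which the comparisons would be based.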

Findings: Across all funding instruments, about a quarter of the top words account for half of the word occurrences. The text analysis results show that in the majority of cases words do not overlap between FP and nationally funded projects. In some cases, this may be due to the use of different vocabulary. Words overlap more between Slovenia (SL) and Estonia (EE) and less between Estonia and the FP. At the same time, the overlapping words are those with a wider reach (culture, education, social, history, human, innovation, etc.). In nationally funded (bottom-up) projects, it was relatively difficult to observe changes in thematic trends over time. More specific results emerged from the comparison of the different programs within the FP (top-down).
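The two measurements behind these findings, the overlap between top-word lists and the concentration of occurrences among the highest-ranked words, could be computed along the following lines. This is a sketch under stated assumptions: the helper names `overlap` and `words_needed_for_share` are hypothetical, and the share is measured within the top-n list, which is one possible reading of the finding rather than a confirmed detail of the study.

```python
from collections import Counter


def overlap(top_a, top_b):
    """Words that appear in both top-word lists."""
    return set(top_a) & set(top_b)


def words_needed_for_share(counts, n=200, share=0.5):
    """How many of the highest-ranked words are needed to cover `share`
    of all occurrences among the n most frequent words (assumed reading)."""
    top = counts.most_common(n)
    total = sum(count for _, count in top)
    running = 0
    for rank, (_, count) in enumerate(top, start=1):
        running += count
        if running >= share * total:
            return rank
    return len(top)  # reached only when `counts` is empty


# Toy data for illustration only; these are not figures from the study.
toy_counts = Counter({"culture": 5, "social": 3, "history": 1, "human": 1})
shared = overlap(["culture", "social"], ["social", "law"])
needed = words_needed_for_share(toy_counts, n=4)
```

Dividing the returned rank by `n` gives the fraction of top words that carries the chosen share of occurrences.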

Research limitations: Only projects with English titles and abstracts were analyzed.

Practical implications: The specifics of SSH have to be taken into account: the one-to-one meaning of terms/words is not as important as, for example, in the exact sciences. Thus, even in co-word analysis, the final content may go unnoticed.

Originality/value: This was the first attempt to monitor the trends of SSH projects using text analysis. The text analysis of SSH projects from the two new EU Member States covered in the study showed that SSH’s thematic coverage is not much affected by the EU Framework Program. Whether this result is field-specific or country-specific remains to be shown in a follow-up study targeting SSH projects in the so-called old Member States.

Key words: Text analysis, SSH, Estonian Research Information System (ETIS), Slovenian Research Information System (SICRIS), Community Research and Development Information Service (CORDIS)