Journal of Data and Information Science

• Research Paper •

Peculiarities of gender disambiguation and ordering of non-English authors' names for Economic papers beyond core databases

Olesya Mryglod1†, Serhii Nazarovets2, Serhiy Kozmenko3   

  1 Institute for Condensed Matter Physics of the National Academy of Sciences of Ukraine, 1 Svientsitskii St., 79011 Lviv, Ukraine;
  2 Borys Grinchenko Kyiv University, 18/2 Bulvarno-Kudriavska Str., 04053 Kyiv, Ukraine;
  3 University of Social Sciences Spoleczna Akademia Nauk, 9 Sienkiewicza St., 90–113 Lodz, Poland
  • Received:2022-07-07 Revised:2022-10-06 Accepted:2022-11-16
  • Contact: Olesya Mryglod (ORCID: 0000-0003-4415-7061)

Abstract: Purpose: To supplement the quantitative portrait of the Ukrainian Economics discipline with gender and author-ordering analyses at the level of individual authors, special methods are used for working with bibliographic data in which non-English authors predominate. The properties of gender mixing, the likelihood of male and female authors occupying the first position in the authorship list, and the arrangement of names are studied.
Design/methodology/approach: A data set containing bibliographic records related to Ukrainian journal publications in the field of Economics is constructed using Crossref metadata. Partial semi-automatic disambiguation of authors' names is performed. First names, along with gender-specific ethnic surnames, are used for gender disambiguation required for further comparative gender analysis. Random reshuffling of data is used to determine the impact of gender correlations. To assess the level of alphabetization for our data set, both Latin and Cyrillic versions of names are taken into account.
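The random reshuffling mentioned above can be illustrated with a minimal sketch. It assumes that each paper is reduced to a list of author gender labels and that the null model redistributes all labels at random while keeping every paper's team size; the exact reshuffling scheme used in the study is not specified in this abstract, and the function names and toy data below are hypothetical.

```python
import random

def mixed_share(papers):
    """Fraction of multi-author papers that contain both 'F' and 'M' labels."""
    multi = [p for p in papers if len(p) > 1]
    if not multi:
        return 0.0
    return sum(1 for p in multi if "F" in p and "M" in p) / len(multi)

def reshuffle(papers, rng):
    """Redistribute all gender labels at random, preserving each paper's team size."""
    labels = [g for p in papers for g in p]
    rng.shuffle(labels)
    shuffled, start = [], 0
    for p in papers:
        shuffled.append(labels[start:start + len(p)])
        start += len(p)
    return shuffled

def null_mixed_share(papers, n_runs=1000, seed=0):
    """Average share of mixed-gender papers over many reshuffled copies of the data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return sum(mixed_share(reshuffle(papers, rng)) for _ in range(n_runs)) / n_runs

# Toy data invented for illustration: one list of gender labels per paper.
papers = [["F", "F"], ["M", "M"], ["F", "F", "F"], ["M", "F"], ["M"]]
observed = mixed_share(papers)
expected = null_mixed_share(papers)
print(f"observed mixed share: {observed:.2f}, reshuffled mean: {expected:.2f}")
```

An observed share well below the reshuffled mean would suggest that same-gender teams occur more often than chance alone explains. Note that this sketch shuffles label positions, not author identities, so an author who appears on several papers is not kept consistent across them.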
Findings: The lack of well-structured metadata and the poor use of digital identifiers lead to numerous problems with automating the pre-processing of bibliographic data, especially in the case of publications by non-Western authors. The described stages for working with such specific data make it possible to work at the level of authors and to analyse, in particular, gender issues. Despite the larger number of female authors, gender equality is more likely to be reported at the individual level for the discipline of Ukrainian Economics. The tendencies towards collaborative or solo publications and the gender mixing patterns are found to depend on the journal: differences are found for publications indexed in the Scopus and/or Web of Science databases. It has also been found that Ukrainian Economics research is characterized by a rather non-alphabetical order of authors.
Research limitations: Only partial disambiguation of authors' names is performed, in a semi-automatic way. Gender labels can be derived only for authors listed with full first names or gender-specific last names.
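The limitation above can be made concrete with a small sketch of rule-based gender labelling. The name sets and surname endings are tiny hypothetical samples in Latin transliteration, not the lists used in the study, and suffix rules of this kind will mislabel some surnames, so unmatched authors are left unlabelled.

```python
# Hypothetical lookup tables: illustrative samples only, not the study's lists.
FEMALE_FIRST = {"olena", "iryna", "tetiana"}
MALE_FIRST = {"andrii", "oleksandr", "mykola"}
# Transliterated surname endings that usually differ by gender (assumed examples).
FEMALE_ENDINGS = ("ska", "ova", "eva")
MALE_ENDINGS = ("skyi", "ov", "ev")

def guess_gender(first_name, last_name):
    """Return 'F', 'M', or None when neither name carries a usable signal."""
    first = first_name.strip().rstrip(".").lower()
    last = last_name.strip().lower()
    # A full first name is checked first; a bare initial carries no gender signal.
    if len(first) > 1:
        if first in FEMALE_FIRST:
            return "F"
        if first in MALE_FIRST:
            return "M"
    # Fall back to gender-specific surname endings.
    if last.endswith(FEMALE_ENDINGS):
        return "F"
    if last.endswith(MALE_ENDINGS):
        return "M"
    return None  # e.g. an initial combined with a gender-neutral surname

for first, last in [("O.", "Kovalska"), ("A.", "Kovalskyi"), ("I.", "Shevchenko")]:
    print(first, last, guess_gender(first, last))
```

Surnames such as those ending in -enko or -uk are the same for men and women, so an author recorded only by initials with such a surname stays unlabelled, which is the case the limitation describes.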
Practical implications: The typical features of the Ukrainian Economics discipline can be used for comparison with other countries and disciplines, and to develop an informed assessment procedure at the national level. The proposed way of processing publication data can be adopted to enrich metadata about other research disciplines, especially for non-English-speaking countries.
Originality/value: To our knowledge, this is the first large-scale quantitative study of the Ukrainian Economics discipline. The results obtained are valuable not only at the national level, but also contribute to general knowledge about Economics research, gender issues, and the ordering of authors' names. An example of the use of Crossref data is provided, although this data source is still used less often because of a number of drawbacks. Here, for the first time, attention is drawn to the explicit use of the features of Slavic authors' names.
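Why both Latin and Cyrillic versions of names matter for the alphabetization check can be shown with a short sketch. It assumes an author list counts as alphabetical when surnames are in non-decreasing order, and it uses a hand-written Ukrainian alphabet key because Unicode code point order does not follow that alphabet; the helper names and example surnames are hypothetical.

```python
UK_ALPHABET = "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"

def uk_key(name):
    """Sort key in Ukrainian alphabet order; characters outside it (apostrophes, hyphens) are skipped."""
    return [UK_ALPHABET.index(ch) for ch in name.lower() if ch in UK_ALPHABET]

def is_alphabetical(surnames, key=str.lower):
    """True if the surnames appear in non-decreasing order under the given key."""
    keys = [key(s) for s in surnames]
    return all(a <= b for a, b in zip(keys, keys[1:]))

# The same hypothetical author pair in Cyrillic and in Latin transliteration.
cyrillic = ["Гнатюк", "Данько"]
latin = ["Hnatiuk", "Danko"]

print(is_alphabetical(cyrillic, key=uk_key))  # ordered in the Ukrainian alphabet
print(is_alphabetical(latin))                 # not ordered once transliterated
```

The pair is alphabetical in Cyrillic (Г precedes Д) but not in Latin (H follows D), so a check run on only one script can misclassify an author list.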

Key words: Scholarly metadata, Economics, Crossref, Open Ukrainian Citation Index, Ukraine, Gender, non-Western authors