Journal of Data and Information Science ›› 2020, Vol. 5 ›› Issue (3): 161-177. doi: 10.2478/jdis-2020-0016

• Research Papers •

Co-occurrence of Cell Lines, Basal Media and Supplementation in the Biomedical Research Literature

Jessica Cox, Darin McBeath, Corey Harper, Ron Daniel

  1. Elsevier Labs, 230 Park Avenue, New York, NY 10169, USA
  • Received: 2020-01-31 Accepted: 2020-04-23 Online: 2020-07-20 Published: 2020-09-04
  • Contact: Jessica Cox, E-mail: j.cox@elsevier.com

Abstract:

Purpose: The use of in vitro cell culture and experimentation is a cornerstone of biomedical research; however, more attention has recently been given to the potential consequences of using artificial basal media and undefined supplements. As a first step towards better understanding and measuring the impact these systems have on experimental results, we use text mining to capture typical research practices and trends around cell culture.

Design/methodology/approach: To measure the scale of in vitro cell culture use, we analyzed a corpus of 94,695 research articles that appear in biomedical research journals published in ScienceDirect from 2000 to 2018. Central to our investigation is the observation that studies using cell culture describe conditions using a typical sentence structure: cell line, basal medium, and supplemented compounds. We tagged our corpus with a curated list of basal media and the Cellosaurus ontology using the Aho-Corasick algorithm. We also processed the corpus with Stanford CoreNLP to find nouns that follow the basal medium, in an attempt to identify the supplements used.
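The dictionary tagging step can be illustrated with a minimal, pure-Python sketch of Aho-Corasick matching. It is not the authors' implementation: the function names, the tiny term list, and the example sentence are hypothetical, and the whole-word filter is an added assumption that shows one way substring hits such as HEK293 inside HEK293T might be suppressed. The CoreNLP noun-extraction step is not covered here.

```python
from collections import deque


def build_automaton(terms):
    """Build a trie with failure links over the lowercased dictionary terms."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # terms recognised when a state is reached
    for term in terms:
        state = 0
        for ch in term.lower():
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(term)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # inherit the longest proper suffix that is also a trie prefix.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_terms(text, automaton):
    """Return (start, end, term) for every dictionary term found in text.

    Assumes lowercasing does not change the text length (true for ASCII).
    """
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text.lower()):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term in out[state]:
            hits.append((i - len(term) + 1, i + 1, term))
    return hits


def tag_whole_words(text, automaton):
    """Keep only hits that are not embedded in a longer alphanumeric token."""
    kept = []
    for start, end, term in find_terms(text, automaton):
        before = text[start - 1] if start > 0 else " "
        after = text[end] if end < len(text) else " "
        if not before.isalnum() and not after.isalnum():
            kept.append((start, end, term))
    return kept


# Hypothetical mini-dictionary and sentence, for illustration only.
automaton = build_automaton(["HEK293", "HEK293T", "HeLa", "DMEM", "RPMI"])
sentence = "HEK293T cells were cultured in DMEM supplemented with 10% FBS."
raw_hits = find_terms(sentence, automaton)
word_hits = tag_whole_words(sentence, automaton)
```

Comparing `raw_hits` with `word_hits` shows why plain substring matching tends to favor recall over precision: the raw pass reports both HEK293 and HEK293T for the same token, while the boundary filter keeps only the longer cell line name.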

Findings: Interestingly, we find that researchers frequently use DMEM even when a cell line’s vendor recommends less concentrated media. We see long-tailed distributions for the usage of media and cell lines, with DMEM and RPMI dominating the media used, and HEK293, HEK293T, and HeLa dominating the cell lines used.

Research limitations: Our analysis was restricted to documents in ScienceDirect. Our text mining method achieved high recall but low precision, which required manual inspection of many tokens.

Practical implications: Our findings document current cell culture practices in the biomedical research community, which can be used as a resource for future experimental design.

Originality/value: No other work has taken a text mining approach to surveying cell culture practices in biomedical research.

Key words: Cell culture, Biomedical research, Text mining