
Diving into Data: Informatics & Population Analytics at Marshall University

At Marshall University, we’re leveraging the power of big data and advanced computational tools to improve health outcomes, drive innovation and prepare the next generation of health professionals. Dr. Trupti Joshi, professor and senior associate dean for informatics and population analytics, explains what informatics is, why it matters and how it can be applied in both research and clinical practice. Discover how this growing field supports our mission to serve the Appalachian region and beyond through research, interdisciplinary collaboration and data-driven care.

What is informatics and population analytics, and how is it used in health care and research? 

Informatics and population analytics is an interdisciplinary field focused on leveraging large-scale clinical, research and/or population-based data (often referred to as “big data”), together with cutting-edge informatics and computational tools and techniques, to improve preventive, personalized and predictive care.

Many application areas exist in health care, including precision medicine, genomic epidemiology, health outcomes and public health studies, among others.

Why is this area important for a health sciences university like Marshall? 

The field of informatics and population analytics takes a comprehensive, data-centric approach to integrating diverse datasets and uncovering critical trends that enhance health care delivery, disease prevention and health promotion, particularly for the rural Appalachian population of West Virginia and beyond.

For many diseases and conditions, including cancer, obesity, cardiovascular disease, addiction and aging-related disorders, it is only by linking rich clinical electronic medical record (EMR) data with other large-scale datasets (e.g. genomic, multiomic, social, behavioral and economic data) that it becomes feasible to identify at-risk populations, design targeted interventions and uncover patterns from a population or patient cohort perspective.
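To make the idea of record linkage concrete, here is a minimal sketch in Python. It assumes two small, entirely made-up datasets that share a hypothetical `patient_id` field, and it uses illustrative cutoffs that are not clinical guidance. It is not drawn from any Marshall system or from Dr. Joshi’s tools; it only shows how linking clinical and non-clinical data can surface a cohort that neither dataset would reveal on its own.

```python
# Hypothetical, made-up records for illustration only; not real patient data.
emr_records = [
    {"patient_id": "P1", "bmi": 34.0, "systolic_bp": 150},
    {"patient_id": "P2", "bmi": 23.5, "systolic_bp": 118},
    {"patient_id": "P3", "bmi": 31.2, "systolic_bp": 142},
]

# Hypothetical social/geographic data keyed by the same patient_id.
social_records = [
    {"patient_id": "P1", "miles_to_clinic": 42},
    {"patient_id": "P2", "miles_to_clinic": 5},
    # P3 has no social record, so it drops out of the linked cohort.
]


def link_records(emr, social):
    """Inner-join two lists of records on patient_id."""
    social_by_id = {row["patient_id"]: row for row in social}
    linked = []
    for row in emr:
        match = social_by_id.get(row["patient_id"])
        if match is not None:
            linked.append({**row, **match})
    return linked


def flag_at_risk(linked, bmi_cutoff=30.0, bp_cutoff=140, distance_cutoff=30):
    """Return patient IDs that meet all three cutoffs.

    The cutoffs are assumed values chosen for the example,
    not clinical thresholds.
    """
    return [
        r["patient_id"]
        for r in linked
        if r["bmi"] >= bmi_cutoff
        and r["systolic_bp"] >= bp_cutoff
        and r["miles_to_clinic"] >= distance_cutoff
    ]


linked = link_records(emr_records, social_records)
at_risk_ids = flag_at_risk(linked)
print(at_risk_ids)  # ['P1']
```

In this toy example only one patient combines the clinical risk markers with a long travel distance to care, which is the kind of pattern that requires both data sources. Real projects would add de-identification, data governance and far more careful statistical modeling.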

For a health sciences university like Marshall, big data builds bridges across diverse disciplines and promotes interdisciplinary research. It strengthens the connection between Marshall’s clinical, research and education missions by improving clinical care and advancing research while training the next generation in much-needed informatics, data science and AI skills for promising careers in biomedical science and health care.

How can informatics enhance faculty research? 

The adoption and application of informatics approaches are essential for clinicians and researchers working to improve patient care and advance scientific discovery. Informatics tools leverage emerging concepts from computer science, statistics and mathematics, together with high-performance computing (HPC) and cloud-based storage solutions (such as AWS and Azure). These technologies support centralized, informatics-driven data management systems that streamline data acquisition, storage, access, analysis and visualization.

By integrating and overlaying large-scale datasets, researchers can extract meaningful insights and generate new knowledge. Advanced computational techniques—including machine learning, deep learning, large language models (LLMs), and predictive analytics—can then be applied to drive innovation and support novel discoveries. 

How can resident physicians or students get involved in this work? 

In today’s data-driven world, informatics skillsets are increasingly vital across clinical care, research and education. As the demand for generating and interpreting large volumes of data grows, students, residents and physicians can build their informatics expertise through training workshops, bootcamps, formal coursework in informatics and data science, lab rotations and interdisciplinary collaboration with informatics faculty. 

Many opportunities exist for students, residents and physicians both to use informatics tools and techniques as consumers and to develop new informatics methods as contributors to the field.

What kinds of resources and tools are available? 

Several informatics tools and methods exist that can be readily applied for multiomics data integration and predictive analytics for biomedical diseases, genomic epidemiology as well as plant science projects. Some publicly available resources and tools developed by Dr. Joshi’s team include:


Biomedical Diseases

  • Knowledge Base Commons (KBCommons) 
  • G2PDeep (deep learning method for phenotype prediction) 
  • IRnet (immunotherapy response prediction) 
  • CIR (cancer immunoprevention resource) 
  • CrossMP (Cross-Modality translation between scRNA-Seq and scATAC-Seq) 
  • IMPRes (Integrative Multiomics Pathway Resolution) https://impres.missouri.edu/impres 


Genomic Epidemiology

Plant Science Research

How does this initiative promote collaboration across campus? 

Informatics research is highly interdisciplinary by nature and adopts concepts and methods from computer science, medicine, biology, statistics, mathematics, chemistry, physics and other fields. Most projects today generate large amounts of data (e.g. next-generation sequencing (NGS), bulk and single-cell transcriptomics, spatial transcriptomics, proteomics, metabolomics, etc.) and often need to tap into other large-scale datasets generated by consortia and programs such as All of Us, UK Biobank and KPMP for certain diseases. These data need to be analyzed efficiently and mined in innovative ways to generate novel insights that would not be possible from looking at just one piece of the puzzle.

Many complex clinical and research problems that we face today, and will face in the future, need multi-site, multi-investigator approaches in which everyone involved is an expert in their core domain and collaborates with others from different fields to adopt tools and techniques that answer questions more efficiently and identify new solutions.

Informatics and data science initiatives fulfill this fundamental need by bringing together experts from diverse domains, including clinicians, researchers, computer scientists and statisticians, and by serving as a central backbone that facilitates seamless translation between the fields.


What are the first steps in getting started? 

A great starting point for students, residents, researchers and clinicians who are new to the field is to participate in an informatics workshop or bootcamp that introduces the various informatics application areas. This helps newcomers understand what the field can offer, how it can complement their current research and where applying informatics tools could open opportunities for further expansion.

Signing up for lab rotations and registering for undergraduate and graduate-level (MS and PhD) coursework will help build the technical skills needed to become a biomedical or health informatics and data science researcher. The best way to learn informatics is through hands-on experience!


Date Posted: Thursday, August 7, 2025