Appalachian Informatics Platform

The Appalachian Informatics Platform is a multi-institutional data warehouse with embedded data analytics and interactive visualization for clinical and translational research. The Platform was developed by the Appalachian Clinical and Translational Science Institute’s Division of Clinical Informatics and consists of four major components: a centralized clinical data warehouse (CDW), modeling (statistical and machine learning), visualization, and programming. Overall, the platform provides an inexpensive yet seamless way to translate clinical and translational research ideas into clinical applications for regions like Appalachia that have limited resources and a largely rural population.

1. Centralized Clinical Data Warehouse

The multi-institutional CDW contains more than nine years of billing and clinical data. It comprises relational tables as well as dimension and fact tables (an Online Analytical Processing [OLAP] cube), which enable secure data storage and data access. Designed from the start to facilitate information flow, the CDW can send out a stream of near real-time data that can be used for any authorized research purpose.
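To illustrate how dimension and fact tables work together, the following is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows are all hypothetical and are not taken from the actual CDW; the sketch only shows the general pattern of a fact table joined to its dimensions and aggregated.

```python
import sqlite3

# Hypothetical star schema; names and values are illustrative assumptions,
# not the real CDW layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, county TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_encounter (
    encounter_id INTEGER PRIMARY KEY,
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    charge_amount REAL
);
""")
conn.executemany("INSERT INTO dim_patient VALUES (?, ?)",
                 [(1, "County A"), (2, "County B")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                 [(20200115, 2020), (20210310, 2021)])
conn.executemany("INSERT INTO fact_encounter VALUES (?, ?, ?, ?)", [
    (1, 1, 20200115, 100.0),
    (2, 1, 20210310, 250.0),
    (3, 2, 20210310, 75.0),
])


def charges_by_county_and_year(connection):
    """Sum the charge measure from the fact table, grouped by two dimensions."""
    query = """
        SELECT p.county, d.year, SUM(f.charge_amount)
        FROM fact_encounter AS f
        JOIN dim_patient AS p ON p.patient_key = f.patient_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY p.county, d.year
        ORDER BY p.county, d.year
    """
    return connection.execute(query).fetchall()


rows = charges_by_county_and_year(conn)
for county, year, total in rows:
    print(county, year, total)
```

The fact table holds the measures (here, a charge amount) and the dimension tables hold the attributes used for grouping, which is what makes cube-style summaries straightforward to compute.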

2. Modeling (statistical and machine learning)

In machine learning, a machine is “taught” to recognize patterns without being explicitly programmed to do so. Once taught, the resulting model can extract information from existing clinical data sets to identify patterns that predict future medical outcomes and trends. The machine learning component of the informatics platform consists of open-source machine learning languages (e.g., R, Python) embedded in T-SQL code, using freely available developer editions of the database software.
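As a rough illustration of what "teaching" a model from examples means, the sketch below fits a very simple nearest-centroid classifier in plain Python. It is not the platform's modeling code; the feature choice (age and systolic blood pressure), the labels, and the function names `fit_centroids` and `predict` are assumptions made purely for illustration, and the data are made up.

```python
from collections import defaultdict


def fit_centroids(features, labels):
    """Learn one mean vector (centroid) per label from labelled examples."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest to the given row."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Made-up training data: [age, systolic blood pressure] -> readmitted?
train_x = [[45, 120], [50, 125], [70, 160], [75, 165]]
train_y = ["no", "no", "yes", "yes"]

model = fit_centroids(train_x, train_y)
prediction = predict(model, [72, 158])
print(prediction)
```

No rule such as "older patients with high blood pressure are readmitted" is written into the code; the decision comes entirely from the labelled examples, which is the sense in which the model is taught rather than programmed.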

3. Visualization

Dynamic visualization presents information in a form that members of any health care discipline can readily understand. Within the informatics platform, Tableau provides interactive drill-down and drill-up capabilities for specific projects.

4. Programming

A great deal of information is contained within the millions of unstructured notes and reports created during hospitalizations and medical visits. Applications written in C#, Python, and R provide additional methods of adding, manipulating, and analyzing data found both within and outside the electronic medical records. An example is an extraction program that converts complex unstructured reports and notes into simpler structures for use in machine learning classification and regression models, storing the resulting large volume of information in “Big Data” structures.
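The idea of turning free text into a simpler structure can be sketched with regular expressions, as below. This is only a minimal illustration and not the platform's extraction program: the patterns, the field names, the function `extract_vitals`, and the sample note are all hypothetical, and real clinical notes vary far more than these patterns allow for.

```python
import re

# Hypothetical patterns for two vital signs; assumed formats such as
# "BP: 142/88" and "HR 76" are illustrative, not a clinical standard.
BP_PATTERN = re.compile(r"\bBP\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})\b", re.IGNORECASE)
HR_PATTERN = re.compile(r"\b(?:HR|heart rate)\s*:?\s*(\d{2,3})\b", re.IGNORECASE)


def extract_vitals(note):
    """Convert a free-text note into a flat record usable as model input."""
    record = {"systolic": None, "diastolic": None, "heart_rate": None}
    bp = BP_PATTERN.search(note)
    if bp:
        record["systolic"] = int(bp.group(1))
        record["diastolic"] = int(bp.group(2))
    hr = HR_PATTERN.search(note)
    if hr:
        record["heart_rate"] = int(hr.group(1))
    return record


# Made-up example note.
note = "Pt seen in clinic. BP: 142/88, HR 76. Reports mild dyspnea."
record = extract_vitals(note)
print(record)
```

Fields that cannot be found are left as None so that downstream classification or regression code can decide how to handle missing values.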

We encourage you to read more on the Appalachian Informatics Platform at The Journal of Medical Internet Research.