Analytics is a process, much like any other aspect of healthcare. It begins with data collection and moves through phases of preparation, modeling and analysis to making predictions and even recommending actions.
The terms data mining and predictive analytics refer to the computational process of discovering patterns in large datasets using interdisciplinary methods such as artificial intelligence, machine learning and statistics. Ultimately, the goal of data mining is to discover new information from data residing in electronic datasets. That information can later be used to support healthcare decisions such as medical treatment, prognosis and diagnosis.
The execution of a good analytics strategy is, of course, no small feat. It takes a dedicated team of interdisciplinary professionals to gather the right information, glean actionable insights from it and use it to inform healthcare decision making.
With this in mind, Dr. Ali Yalcin created USF’s Healthcare Data Mining and Predictive Analytics course as part of the school’s Master of Science and Graduate Certificate offerings in Healthcare Analytics.
Yalcin is a professor and researcher at USF who specializes in data analytics, with a special emphasis on healthcare data. Having done extensive interdisciplinary research over the course of his career, he understands the need to cultivate highly skilled players for analytics teams.
“I don’t know of an MD that is an analytics guru,” Yalcin says. “When you look at individuals who are trying to solve problems in the healthcare space, they have a core expertise. MDs know the human body, pharmacists know the drugs and their effects, computer scientists know code and software, but it’s very difficult for these groups to communicate if they don’t speak a little bit of each other’s language. Through this course and the program overall, we’re trying to teach the students the language and concepts they need to know to be functional on such teams. Whatever their role is currently along that spectrum, hopefully by the time they’re done with this program, they are capable members of a healthcare analytics project group.”
Yalcin designed the course to expose students to one of the most widely used standard process models that data mining experts follow when executing data analytics projects. The process model includes six distinct steps, from business understanding all the way to deployment of an analytics project solution.
Following the discussion of business and data understanding, the course covers various data preparation techniques, including data imputation. It then proceeds to describe building and evaluating descriptive models, including principal component analysis and clustering. The course concludes with an in-depth discussion of more advanced topics related to predictive models and their assessment, with particular emphasis on decision trees.
Introducing major principles and techniques used in data mining from an algorithmic perspective helps students better understand how data mining technology can be applied to various kinds of data. The course also discusses the basic types of data, data quality, preprocessing techniques and measures of similarity.
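To make two of these ideas concrete, here is a minimal sketch of mean imputation (one simple preprocessing technique) followed by Euclidean distance (one simple measure of similarity). It is written in Python purely for illustration and is not taken from the course, which uses SAS. The function names `impute_mean` and `euclidean_distance`, the column layout and the patient values are all hypothetical.

```python
from math import sqrt
from statistics import mean


def impute_mean(rows):
    """Fill each missing value (None) with the mean of its column's observed values.

    Assumes every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric records."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy records: [age, systolic blood pressure, BMI].
# The values are invented for illustration only.
records = [
    [54, 130, 27.0],
    [61, None, 31.0],   # blood pressure missing
    [47, 118, None],    # BMI missing
]

completed = impute_mean(records)
distance = euclidean_distance(completed[0], completed[1])
```

A smaller distance means two records are more alike. In practice the columns would usually be rescaled first, since a variable with a wide numeric range, such as blood pressure, would otherwise dominate the distance.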
Proficiency in data mining methodologies is required for positions within the healthcare analytics field. This course is designed to provide theoretical and practical coverage of descriptive and predictive modeling and how to execute an analytics project from beginning to end. After completing the course, students should be able to:
- Communicate using data mining terminology
- Distinguish between predictive and descriptive methods
- Explain various data mining methods
- Apply commonly used data mining methods to healthcare datasets using SAS
But beyond the techniques and concepts learned, Yalcin hopes the course sparks intrigue as much as anything.
“I hope the students take curiosity from it,” Yalcin says. “When I designed the course, I did it with breadth in mind. Rather than teaching them one thing really well and in-depth, I decided I was going to teach as many topics as I can with just enough depth for them to be functional and curious about it. So let’s say they like decision trees. I wanted to give them enough information so that if they encounter a problem regarding that topic, they can research and understand it. But are they experts on decision trees or principal component analysis or any of the other topics we cover in the course? No, not necessarily, but they’ll have functional understanding of all these things and will hopefully be curious enough about them to keep refining their knowledge.”
You may wonder if these concepts can be self-taught or learned through experience. To some extent, they can; however, Yalcin feels that the challenges the class presents and the way the program is managed online give students a better setting and method for learning complex topics.
“I teach in several online programs between engineering and healthcare,” Yalcin says. “The programs for analytics and informatics are first class in terms of the way they’re designed, executed and continuously improved. The way students are managed in the program is fantastic and, because the process is well done, the student experience is smooth. When you’re not tripped up by technology issues, access problems and so on, learning becomes a pleasure.”
“In terms of the content, I can speak for my course and I spent a lot of time working on it. I wanted to do it right, do it well and have it be relevant. And I think it’s that way across the board.”