NLP Topic Modeling

This project is my Master’s Thesis focused on extracting hidden semantic structures from healthcare communication data. It explores how patients and health coaches interact through digital platforms.

Links

Technologies Used

Project Objective

The goal is to extract meaningful topics from patient–coach conversations to understand behavioral patterns, health concerns, and communication trends in chronic care programs.

BERTopic Results

50 BERTopic topics
All 50 topics extracted using BERTopic. Each bar represents a distinct latent topic found in the dataset.
Topic over time
Topic evolution over time showing how discussion themes change across the dataset timeline.

LDA Results

Topic 0
Topic 0: top words suggest themes around physical activity and exercise behavior.
Topic 8
Topic 8: focused on diet, nutrition, and food-related behavior patterns.
Topic diversity
Topic diversity metric used to evaluate optimal number of topics in LDA modeling.