We used the UCI Diabetes 130-US hospitals dataset, which contains 10 years (1999-2008) of clinical care data from 130 US hospitals. The dataset includes over 100,000 hospital admissions of diabetic patients.
Each record represents one hospital stay and includes patient demographics, diagnoses, medications, laboratory tests, and whether the patient was readmitted within 30 days.
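As a minimal sketch of this record structure, the snippet below parses a few made-up rows and derives a binary 30-day readmission label. It assumes the column names of the public UCI file (`encounter_id`, `patient_nbr`, `race`, `gender`, `age`, `readmitted`) and its `readmitted` codes `<30`, `>30`, and `NO`. The sample rows and the helper names `load_encounters` and `readmitted_within_30_days` are hypothetical. Treating `>30` as a negative case is also an assumption, not necessarily the labeling used in this work.

```python
import csv
import io

# Hypothetical three-row sample. Column names follow the public UCI file,
# but the values are invented purely for illustration.
SAMPLE_CSV = """encounter_id,patient_nbr,race,gender,age,readmitted
1001,501,Caucasian,Female,[60-70),<30
1002,502,?,Male,[50-60),>30
1003,503,AfricanAmerican,Female,[70-80),NO
"""


def load_encounters(text):
    """Parse CSV text into a list of dicts, one per hospital stay."""
    return list(csv.DictReader(io.StringIO(text)))


def readmitted_within_30_days(record):
    """Return 1 if the stay was followed by a readmission within 30 days, else 0.

    Assumption: '>30' (readmitted after 30 days) and 'NO' both count as 0.
    """
    return 1 if record["readmitted"] == "<30" else 0


encounters = load_encounters(SAMPLE_CSV)
labels = [readmitted_within_30_days(r) for r in encounters]
print(labels)  # [1, 0, 0]
```

The second sample row uses `?` for `race`, which is how the public file marks missing values, so a real loading step would need to treat that string as missing rather than as a category.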
The dataset required significant cleaning and preprocessing before it could be used for modeling:
We conducted extensive exploratory data analysis to understand patterns and relationships in the data:
We took several steps to identify and mitigate potential biases in our model:
Patient data protection was a priority throughout our research:
For responsible implementation in clinical settings: