Real Heartbeats, Real Insights: QvosAgent Analyzes MIT-BIH ECG Data
Date: 2026-05-08
Tags: MIT-BIH, ECG, HRV, Arrhythmia, Real Data, NeuroKit2, PhysioNet, Biomedical
From Synthetic to Real
In our previous post, QvosAgent generated synthetic ECG signals and extracted Heart Rate Variability (HRV) features using NeuroKit2. But synthetic data, while useful for validation, lacks the complexity and unpredictability of real human physiology.
This time, QvosAgent was challenged to work with real clinical ECG data — specifically, the legendary MIT-BIH Arrhythmia Database from PhysioNet.
The MIT-BIH Arrhythmia Database
The MIT-BIH database is one of the most widely used datasets in cardiac research:
- Source: PhysioNet (https://physionet.org/content/mitdb/1.0.0/)
- Records: 48 recordings, each approximately 30 minutes
- Sampling Rate: 360 Hz
- Channels: Two per record (MLII plus a precordial lead in most records; Record 100 pairs MLII with V5)
- Annotations: Expert-labeled arrhythmia types
It has been the gold standard for testing ECG analysis algorithms for decades.
The Experiment
QvosAgent autonomously:
- Downloaded the database from PhysioNet using the wfdb Python library
- Selected two contrasting records for comparison
- Processed 5 minutes of ECG data from each record
- Extracted 70+ HRV features using NeuroKit2
- Compared the results to identify clinically meaningful differences
Record 100: Normal Sinus Rhythm
Record 100 is almost entirely normal sinus rhythm and serves as the baseline for comparison.
Record 200: Ventricular Premature Beats
Record 200 contains ventricular premature beats (VPBs) — extra heartbeats originating in the ventricles, a common arrhythmia that disrupts the normal rhythm pattern.
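The expert annotations shipped with the database make it possible to check how many beats in each window are actually labeled as ventricular premature beats. The sketch below is an illustration, not the code QvosAgent ran: the helper names `count_beat_types` and `load_symbols` are hypothetical, and `load_symbols` assumes the `wfdb` package is installed and that PhysioNet is reachable. In the MIT-BIH annotation scheme, `N` marks a normal beat and `V` a premature ventricular contraction.

```python
from collections import Counter


def count_beat_types(symbols):
    """Tally annotation symbols, e.g. 'N' (normal) or 'V' (ventricular premature)."""
    return Counter(symbols)


def load_symbols(record_name, n_samples=108_000):
    """Read reference annotations for the first n_samples of a record.

    Streams the 'atr' annotation file from the 'mitdb' directory on PhysioNet,
    so it needs network access. wfdb is imported lazily so that the tally
    helper above stays usable without it.
    """
    import wfdb

    annotation = wfdb.rdann(record_name, "atr", sampto=n_samples, pn_dir="mitdb")
    return annotation.symbol


# Offline demonstration on a hand-written symbol list ('+' is a rhythm-change marker)
demo = count_beat_types(["N", "N", "V", "N", "+"])
print(demo.most_common())
```

With network access, `count_beat_types(load_symbols("200"))` would give the per-symbol counts for the first 5 minutes of Record 200, which can then be compared against the number of detected R-peaks.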
Key Findings
Heart Rate Comparison
| Metric | Record 100 (Normal) | Record 200 (VPB) |
|---|---|---|
| Heart Rate | ~74 BPM | ~92 BPM |
| R-peaks (5 min) | 370 | 433 |
| Mean RR Interval | 808 ms | 670 ms |
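The arrhythmia record shows a higher heart rate and shorter inter-beat intervals. This may reflect the stress response often associated with arrhythmias, but the two records come from different patients, so their baseline heart rates differ for other reasons as well.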
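The three rows of the table are linked by simple arithmetic, which makes a quick consistency check possible. The sketch below derives heart rate from the mean RR interval and from the R-peak count, using only the figures in the table; the function names are illustrative.

```python
def bpm_from_mean_rr(mean_rr_ms: float) -> float:
    """Heart rate implied by a mean RR interval given in milliseconds."""
    return 60_000.0 / mean_rr_ms


def bpm_from_peak_count(n_peaks: int, duration_s: float) -> float:
    """Heart rate implied by the number of R-peaks in a window of duration_s seconds."""
    return n_peaks * 60.0 / duration_s


# Values copied from the table above (5-minute windows)
table = {
    "Record 100": {"mean_rr_ms": 808, "r_peaks": 370},
    "Record 200": {"mean_rr_ms": 670, "r_peaks": 433},
}

for name, row in table.items():
    from_rr = bpm_from_mean_rr(row["mean_rr_ms"])
    from_count = bpm_from_peak_count(row["r_peaks"], 300.0)
    print(f"{name}: {from_rr:.1f} BPM from mean RR, {from_count:.1f} BPM from peak count")
```

For Record 100 both estimates land near the reported ~74 BPM. For Record 200 they come out a few BPM below the reported ~92 BPM. One possible explanation, stated here as an assumption, is that the reported figure is the mean of beat-by-beat instantaneous rates: that average is never lower than the rate computed from the mean interval, and the gap widens as the rhythm becomes more irregular.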
HRV Metrics: The Telling Differences
| Metric | Record 100 (Normal) | Record 200 (VPB) | Ratio |
|---|---|---|---|
| SDNN | 28.1 ms | 173.8 ms | 6.2× |
| RMSSD | 30.1 ms | 314.8 ms | 10.5× |
| HF Power | 86.25% | 45.3% | — |
| LF/HF Ratio | 0.16 | 1.21 | 7.6× |
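These differences are large, and each has a direct reading:
- SDNN (overall HRV) is 6.2× higher in the arrhythmia record, reflecting the irregular timing caused by premature beats
- RMSSD (short-term variability) is 10.5× higher, indicating extreme beat-to-beat irregularity: each premature beat and the pause that follows it produce several very large successive differences
- HF power drops from 86% to 45%. The values appear to be in normalized units (HF as a share of LF + HF), since 13.75 / 86.25 ≈ 0.16 and 54.7 / 45.3 ≈ 1.21 reproduce the reported LF/HF ratios. A lower HF share is conventionally read as reduced vagal tone
- LF/HF ratio increases 7.6×, conventionally read as a shift toward sympathetic dominance. Both spectral readings call for caution here: if ectopic beats are left in the RR series, as the elevated SDNN suggests, they distort the spectrum independently of autonomic activity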
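To see why a handful of premature beats can move SDNN and RMSSD this much, it helps to compute both on a toy RR series. The sketch below uses the standard definitions (SDNN as the sample standard deviation of the intervals, RMSSD as the root mean square of successive differences). The RR values are invented for illustration and are not taken from either record.

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation (ddof = 1) of RR intervals, in ms."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Perfectly regular rhythm at 800 ms
regular = [800] * 10

# Same rhythm with one premature beat (500 ms) followed by a long pause (1100 ms)
ectopic = [800, 800, 800, 500, 1100, 800, 800, 800, 800, 800]

for label, rr in (("regular", regular), ("one premature beat", ectopic)):
    print(f"{label}: SDNN = {sdnn(rr):.1f} ms, RMSSD = {rmssd(rr):.1f} ms")
```

A single short-long pair is enough to push both metrics far above zero, and RMSSD rises more than SDNN because the premature beat creates three large successive differences in a row. That matches the pattern in the table, where RMSSD exceeds SDNN for Record 200.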
Visualizations
Real ECG signals from MIT-BIH: Record 100 (normal sinus rhythm) vs Record 200 (ventricular premature beats)
Record 100: Normal sinus rhythm with regular heartbeat pattern. Heart rate ~74 BPM, SDNN 28.1 ms — representing healthy cardiac variability.
Record 200: Ventricular premature beats (VPBs) showing an irregular heartbeat pattern. Heart rate ~92 BPM, SDNN 173.8 ms; SDNN is 6.2× higher because of the arrhythmia.
Comparison: The two plots above let you visually compare a normal heartbeat (Record 100) with an arrhythmic one (Record 200). Notice how Record 100 has a regular, rhythmic pattern with consistent R-wave peaks, while Record 200 shows irregular spikes and premature beats that disrupt the normal rhythm. This visual difference corresponds to the dramatic HRV differences: SDNN jumps from 28.1 ms (normal) to 173.8 ms (arrhythmia), and RMSSD increases from 30.1 ms to 314.8 ms — a 10.5× increase reflecting extreme beat-to-beat irregularity.
HRV feature comparison between normal and arrhythmic rhythms — the differences are striking
Poincaré plots: Record 100 shows a compact distribution (regular rhythm), while Record 200 shows a widely scattered pattern (irregular rhythm)
Comparing synthetic data (previous experiment) with real clinical data — real signals show significantly higher HRV
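The compact versus scattered shapes in the Poincaré plots can be summarized with two numbers, SD1 and SD2. A Poincaré plot places each RR interval against the next one; SD1 measures the spread across the identity line (beat-to-beat change) and SD2 the spread along it (longer-term change). The sketch below computes both by rotating the point cloud by 45 degrees. It uses invented RR values and illustrative names, and is not the code that produced the figures.

```python
import math
import statistics


def poincare_sd(rr_ms):
    """Return (SD1, SD2) for a series of RR intervals in ms.

    Each point is (RR[n], RR[n+1]). Projecting onto the axes rotated by
    45 degrees gives the spread across and along the identity line.
    """
    pairs = list(zip(rr_ms[:-1], rr_ms[1:]))
    across = [(nxt - cur) / math.sqrt(2) for cur, nxt in pairs]
    along = [(nxt + cur) / math.sqrt(2) for cur, nxt in pairs]
    return statistics.stdev(across), statistics.stdev(along)


regular = [800] * 10
ectopic = [800, 800, 800, 500, 1100, 800, 800, 800, 800, 800]

for label, rr in (("regular", regular), ("one premature beat", ectopic)):
    sd1, sd2 = poincare_sd(rr)
    print(f"{label}: SD1 = {sd1:.1f} ms, SD2 = {sd2:.1f} ms")
```

A regular rhythm collapses to a single point (both values zero), while one premature beat throws several points far from the identity line and inflates SD1 in particular.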
Synthetic vs. Real: A Critical Comparison
One of the most interesting findings is the comparison between our synthetic data (70 BPM, 10-second recording) and real clinical data:
| Metric | Synthetic (10s) | Real Normal (5min) | Real VPB (5min) |
|---|---|---|---|
| SDNN | 11.5 ms | 28.1 ms | 173.8 ms |
| RMSSD | 13.2 ms | 30.1 ms | 314.8 ms |
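Real physiological signals show markedly higher HRV than the synthetic signal, reflecting the complexity of the human autonomic nervous system that simple models cannot fully capture. The windows are not directly comparable, however: SDNN generally grows with recording length, so part of the gap between a 10-second and a 5-minute window is a window effect rather than physiology. Either way, the result underscores the importance of validating algorithms on real clinical data.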
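Window length matters when a 10-second recording is set beside a 5-minute one, because a short window cannot see slow oscillations in heart rate. The sketch below makes this concrete with a toy RR series built from one slow and one fast sinusoid. The series, its amplitudes, and its periods are all invented for illustration and do not model either the synthetic or the MIT-BIH data.

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation (ddof = 1) of RR intervals, in ms."""
    return statistics.stdev(rr_ms)


def toy_rr(n_beats):
    """RR series around 800 ms with a slow (300-beat) and a fast (4-beat) oscillation."""
    return [
        800
        + 40 * math.sin(2 * math.pi * i / 300)  # slow drift, amplitude 40 ms
        + 10 * math.sin(2 * math.pi * i / 4)    # fast variation, amplitude 10 ms
        for i in range(n_beats)
    ]


full = toy_rr(370)   # about 5 minutes at roughly 74 BPM
short = full[:12]    # about 10 seconds of the same series

print(f"SDNN over {len(short)} beats:  {sdnn(short):.1f} ms")
print(f"SDNN over {len(full)} beats: {sdnn(full):.1f} ms")
```

The same underlying signal yields a much smaller SDNN over the short window, because the slow component barely moves in 12 beats. Comparing equal-length windows would separate this effect from genuine differences between synthetic and real data.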
Technical Notes
Data Acquisition
import wfdb

# Download the MIT-BIH Arrhythmia Database into a local directory
wfdb.dl_database('mitdb', dl_dir='./mitbih_data')

# Load the first 5 minutes of Record 100: 300 s × 360 Hz = 108,000 samples
record = wfdb.rdrecord('./mitbih_data/100', sampto=108000)
Signal Processing
- Sampling rate: 360 Hz (native to MIT-BIH)
- Analysis window: 5 minutes (300 seconds, 108,000 samples)
- Peak detection: NeuroKit2's ecg_findpeaks() with default parameters
- HRV computation: Full 70+ feature pipeline from NeuroKit2
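Put together, the steps listed above might look like the following sketch. It is a reconstruction under assumptions, not QvosAgent's actual code: the function names are hypothetical, the first channel is assumed to be the one analyzed, and the `ecg_clean` step is an assumption since the post does not say which filter was used. The function `hrv_for_record` needs the `wfdb` and `neurokit2` packages plus network access, so both are imported lazily.

```python
def window_samples(duration_s: float, fs: int) -> int:
    """Number of samples in a window of duration_s seconds at fs Hz."""
    return int(round(duration_s * fs))


def hrv_for_record(record_name: str, duration_s: float = 300.0, fs: int = 360):
    """Load one MIT-BIH record from PhysioNet and return NeuroKit2's HRV table.

    Assumptions: channel 0 is analyzed, and the signal is cleaned with
    NeuroKit2's default filter before peak detection.
    """
    import neurokit2 as nk
    import wfdb

    record = wfdb.rdrecord(
        record_name,
        sampto=window_samples(duration_s, fs),
        channels=[0],
        pn_dir="mitdb",  # stream from PhysioNet instead of a local copy
    )
    ecg = record.p_signal[:, 0]

    cleaned = nk.ecg_clean(ecg, sampling_rate=record.fs)
    peaks = nk.ecg_findpeaks(cleaned, sampling_rate=record.fs)

    # Pass the R-peak sample indices; nk.hrv returns a one-row DataFrame
    return nk.hrv(peaks["ECG_R_Peaks"], sampling_rate=record.fs)


# 5 minutes at the database's native sampling rate
print(window_samples(300, 360))
```

Calling `hrv_for_record("100")` and `hrv_for_record("200")` would return one row per record, with columns such as `HRV_SDNN`, `HRV_RMSSD`, and `HRV_LFHF` available for a side-by-side comparison.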
Clinical Implications
The dramatic differences in HRV metrics between normal and arrhythmic rhythms demonstrate why HRV analysis is a cornerstone of cardiac diagnostics:
- Arrhythmia Detection: Elevated SDNN and RMSSD can flag irregular rhythms
- Autonomic Assessment: LF/HF ratio shifts reveal sympathetic-parasympathetic balance
- Risk Stratification: Abnormal HRV patterns correlate with cardiac events
- Treatment Monitoring: HRV changes can track response to anti-arrhythmic therapy
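As a concrete example of the first point, irregular beats can be flagged directly from the RR series before any HRV metric is computed. The sketch below marks an interval as premature when it is much shorter than the median of the intervals just before it. The 20% tolerance and the 5-beat history are illustrative assumptions rather than validated clinical thresholds, and the function name is hypothetical.

```python
import statistics


def flag_premature(rr_ms, tolerance=0.2, history=5):
    """Return indices of RR intervals that are much shorter than recent ones.

    An interval is flagged when it falls more than `tolerance` (a fraction)
    below the median of up to `history` preceding intervals.
    """
    flagged = []
    for i in range(1, len(rr_ms)):
        recent = rr_ms[max(0, i - history):i]
        baseline = statistics.median(recent)
        if rr_ms[i] < (1.0 - tolerance) * baseline:
            flagged.append(i)
    return flagged


# Invented series: one premature beat (500 ms) followed by a long pause (1100 ms)
rr = [800, 800, 800, 500, 1100, 800, 800]
print(flag_premature(rr))
```

Using the median rather than the mean as the baseline keeps a single premature beat and its pause from shifting the reference for the beats that follow.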
What This Demonstrates
QvosAgent completed this entire workflow autonomously:
- ✅ Database discovery — Found and downloaded MIT-BIH from PhysioNet
- ✅ Data preprocessing — Loaded, filtered, and segmented real ECG recordings
- ✅ Feature extraction — Computed 70+ HRV metrics using NeuroKit2
- ✅ Comparative analysis — Identified clinically meaningful differences
- ✅ Visualization — Generated publication-quality figures
- ✅ Interpretation — Provided clinical context for the findings
Conclusion
Working with real clinical data transforms abstract algorithms into meaningful medical insights. The 6× to 10× differences in SDNN, RMSSD, and LF/HF between the normal and arrhythmic records aren't just numbers: they represent the difference between a regular rhythm and one repeatedly disrupted by premature beats.
This experiment demonstrates that AI agents can now contribute to biomedical research workflows, from data acquisition through analysis to interpretation. The next frontier: integrating these capabilities into clinical decision support systems.
This analysis was conducted entirely autonomously by QvosAgent, a local, open-source AI agent. All data from the MIT-BIH Arrhythmia Database is used for research purposes only.