Debugging R-peak Detection: How QvosAgent Investigated a NeuroKit2 Anomaly

Published: 2026-05-08 | Tags: ECG, Signal Processing, NeuroKit2, Debugging, Open Source AI

The Mystery

While analyzing real ECG data from the MIT-BIH Arrhythmia Database, a user noticed something puzzling in the visualization generated by QvosAgent: several red dots marking R-peaks did not align with the actual peaks of the ECG waveform. Some dots appeared to sit in the valleys between waves rather than at the tops.

Was this a calculation error? A bug in the algorithm? Or something more subtle?

This is exactly the kind of question that showcases what autonomous AI agents can do — not just run code, but investigate, research, compare, and resolve complex technical issues.

The Investigation

Step 1: Reproduce and Analyze

QvosAgent first reproduced the analysis on a 10-second segment of MIT-BIH Record 100 (normal sinus rhythm), sampled at 360 Hz. The original code used NeuroKit2's ecg_process() function:

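import neurokit2 as nk
# ecg_process() cleans the signal and detects R-peaks in a single call
df, info = nk.ecg_process(ecg_signal, sampling_rate=fs)
peaks = info['ECG_R_Peaks']  # sample indices of the detected R-peaks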

Comparing the detected peaks against the official MIT-BIH annotations revealed the problem:

Detected Peak (sample)	Official Annotation (sample, beat type)	Deviation
@2092	@2044 (Atrial premature)	133ms
@2375	@2402 (Normal)	75ms
@2686	@2706 (Normal)	56ms
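
The Deviation column follows from the sample offset and the 360 Hz sampling rate. A minimal sketch of that arithmetic, using the three rows above; the helper name deviation_ms is illustrative and not part of NeuroKit2:

```python
FS = 360  # MIT-BIH sampling rate in Hz, as stated above

# (detected sample, annotated sample) pairs taken from the table
pairs = [(2092, 2044), (2375, 2402), (2686, 2706)]


def deviation_ms(detected, annotated, fs=FS):
    """Absolute offset between two sample indices, in milliseconds."""
    return abs(detected - annotated) / fs * 1000.0


deviations = [deviation_ms(d, a) for d, a in pairs]
for (d, a), dev in zip(pairs, deviations):
    print(f"@{d} vs @{a}: {dev:.1f} ms")
```

At 360 Hz one sample is about 2.8 ms, so offsets of 48, 27, and 20 samples give the 133, 75, and 56 ms shown in the table.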

The detected points sat in the Q-wave/S-wave valleys (negative values) instead of on the R-wave peaks (positive values), a difference of over 800 µV (0.8 mV) in signal amplitude.
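
This kind of misplacement can be caught automatically, without looking at a plot. The sketch below flags any detected index that is not the highest sample in a small window around it. It assumes upright (positive) R-waves, and both the function name flag_off_peak and the 100 ms window are hypothetical choices, not something taken from the original analysis:

```python
import numpy as np


def flag_off_peak(signal, peaks, fs, window_ms=100.0):
    """Return (peak, local_max) pairs where a detected peak is not the
    highest sample within +/- window_ms of itself.

    Assumes R-waves point upward; invert the signal first if they do not.
    """
    signal = np.asarray(signal, dtype=float)
    half = int(round(window_ms / 1000.0 * fs))  # window half-width in samples
    flagged = []
    for p in peaks:
        lo = max(0, int(p) - half)
        hi = min(len(signal), int(p) + half + 1)
        local_max = lo + int(np.argmax(signal[lo:hi]))
        if local_max != int(p):
            flagged.append((int(p), local_max))
    return flagged


# Toy signal: one upright spike at sample 500 and a dip at sample 480
toy = np.zeros(1000)
toy[500] = 1.0
toy[480] = -0.5

print(flag_off_peak(toy, [480], fs=360))  # peak placed in the dip is flagged
print(flag_off_peak(toy, [500], fs=360))  # peak on the spike is not
```

A non-empty result means at least one marker is off the local maximum, which is the symptom described above.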

Step 2: Community Research

QvosAgent searched the NeuroKit2 GitHub repository and found multiple related issues:

Issue #620: Users reported T-waves being falsely detected as R-peaks with the default neurokit method
Issue #752: T-waves with higher amplitude than R-peaks were incorrectly marked, doubling the beat-to-beat interval
Discussion #843: Movement artifacts causing R-peak misplacement during long recordings
Issue #1064: Even with correct_artifacts=True, artifacts were still detected as R-peaks

The community consensus was clear: NeuroKit2's default gradient-based detection has known limitations with abnormal waveforms.

Step 3: Method Comparison

QvosAgent tested three different peak detection methods:

# Method 1: Direct ecg_findpeaks with neurokit method
peaks_neurokit = nk.ecg_findpeaks(clean_signal, sampling_rate=fs, method='neurokit')

# Method 2: Nabian 2018 (community recommended)
peaks_nabian = nk.ecg_findpeaks(clean_signal, sampling_rate=fs, method='nabian2018')

# Method 3: Pan-Tompkins (classic algorithm)
peaks_pantompkins = nk.ecg_findpeaks(clean_signal, sampling_rate=fs, method='pantompkins1985')

The Surprising Result

Method	Detected	Matched	Mean Deviation	Max Deviation
NeuroKit (ecg_findpeaks)	12	12/12	0.5ms	2.8ms
Nabian2018	11	11/11	3.0ms	5.6ms
Pan-Tompkins	12	12/12	14.4ms	44.4ms
Old (ecg_process)	12	9/12	22.5ms	133.3ms

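The root cause was not the algorithm itself but the API usage pattern. When ecg_process() runs peak detection internally, it takes a different code path than a direct call to ecg_findpeaks(). One likely contributor, not confirmed in this analysis, is the artifact-correction step that ecg_process() applies after detection, which can relocate peaks. The direct call produced near-perfect results, with a mean deviation of only 0.5ms.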
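
The Detected, Matched, Mean Deviation, and Max Deviation columns in the comparison table can be computed with a few lines of NumPy. The following is a sketch only: the article does not state the matching rule that was used, so the nearest-annotation pairing and the 50 ms tolerance here are assumptions, and compare_to_annotations is a hypothetical helper name:

```python
import numpy as np


def compare_to_annotations(detected, reference, fs, tol_ms=50.0):
    """Summarize how closely detected peaks track reference annotations.

    Each detected peak is paired with its nearest reference sample.
    A peak counts as matched when that distance is within tol_ms
    (an assumed tolerance, not taken from the article).
    """
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Pairwise sample distances, shape (n_detected, n_reference)
    distances = np.abs(detected[:, None] - reference[None, :])
    nearest_ms = distances.min(axis=1) / fs * 1000.0
    return {
        "detected": int(len(detected)),
        "matched": int((nearest_ms <= tol_ms).sum()),
        "mean_dev_ms": float(nearest_ms.mean()),
        "max_dev_ms": float(nearest_ms.max()),
    }


# The three misplaced peaks from Step 1 against their annotations
stats = compare_to_annotations(
    detected=[2092, 2375, 2686],
    reference=[2044, 2402, 2706],
    fs=360,
)
print(stats)
```

Passing the 'ECG_R_Peaks' array returned by each detector together with the annotation sample indices for the same segment would yield one row per method.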

[Figure: Method Comparison]

Key Takeaways

Separate signal cleaning from peak detection — Use ecg_process() for cleaning, then ecg_findpeaks() with an explicit method for detection
Community knowledge matters — GitHub issues revealed this was a known limitation with documented workarounds
Multiple methods should be compared — Testing three algorithms revealed the best approach
Autonomous investigation works — QvosAgent independently reproduced the issue, researched community discussions, compared alternatives, and identified the root cause

Best Practice Code

import neurokit2 as nk

# Step 1: Clean the signal
df, info = nk.ecg_process(ecg_signal, sampling_rate=fs)
clean_signal = df['ECG_Clean'].values

# Step 2: Detect peaks with explicit method
peaks = nk.ecg_findpeaks(clean_signal, sampling_rate=fs, method='neurokit')
r_peaks = peaks['ECG_R_Peaks']

# Alternative: Nabian 2018 method
peaks = nk.ecg_findpeaks(clean_signal, sampling_rate=fs, method='nabian2018')

This analysis was performed entirely by QvosAgent, an open-source local AI agent that autonomously investigates and resolves technical challenges. The full code and data are available for reproducibility.