Wearable_TimeSeries_Health_Monitor
A multi-user health monitoring solution for wearable devices: one model and one configuration provide personalized anomaly detection for different users. The model combines **Phased LSTM + Temporal Fusion Transformer (TFT)** with adaptive baselines, factor features, and sliding-window processing at second-level resolution. It can be deployed as a HuggingFace model or integrated quickly into an internal enterprise service.
🌟 Model Highlights
| Capability | Description |
|---|---|
| Plug-and-Play | The built-in WearableAnomalyDetector wrapper predicts as soon as the model is loaded; after a single initialization it can monitor multiple users continuously |
| Configuration-Driven Features | configs/features_config.json defines all features, default values, and category mappings; adding or removing features such as blood oxygen or respiratory rate only requires a configuration change |
| Multi-User Real-Time Service | FeatureCalculator plus a lightweight data_storage cache handle user history management, baseline evolution, and batch inference |
| Multi-Scenario Demo | test_wearable_service.py ships with 3 real "client" cases (complete sensors, missing fields, anonymous device), so you can try the model without any raw data |
| Adaptive Baseline Support | The extensible UserDataManager plugs personal/group baselines into the inference pipeline, continuously improving per-user sensitivity |
⚡ Core Features & Technical Advantages
🎯 Adaptive Baseline: Blending Personal and Group Statistics
The model uses an adaptive baseline strategy that picks the baseline according to how much history a user has:
- Personal Baseline Priority: When a user has sufficient history (e.g., ≥7 days), the personal HRV mean/std serves as the baseline, capturing individual physiological rhythms
- Group Baseline Fallback: For new users or sparse data, the model switches automatically to the group statistical baseline, so detection stays stable during cold start
- Smooth Transition Mechanism: Weighted mixing (e.g., `final_mean = α × personal_mean + (1-α) × group_mean`) moves the baseline gradually from group to personal
- Real-Time Baseline Updates: User data keeps accumulating during inference, and the baseline adjusts as the user's state evolves, improving long-term monitoring accuracy
Advantage: Compared with fixed thresholds or a pure group baseline, the adaptive baseline balances personalized sensitivity (fewer false positives) with cold-start robustness (usable for new users), which suits multi-user, long-term monitoring scenarios.
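The weighted mixing described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `Baseline` class, the `blend_weight` function, the linear ramp for α over 7 days, and blending the standard deviation the same way as the mean are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Baseline:
    mean: float
    std: float


def blend_weight(days_of_history: float, full_trust_days: float = 7.0) -> float:
    """Hypothetical ramp: alpha grows linearly from 0 to 1 over `full_trust_days`."""
    if full_trust_days <= 0:
        raise ValueError("full_trust_days must be positive")
    return max(0.0, min(1.0, days_of_history / full_trust_days))


def adaptive_baseline(
    personal: Optional[Baseline], group: Baseline, days_of_history: float
) -> Baseline:
    """Blend personal and group statistics; fall back to the group for new users."""
    if personal is None:
        return group
    alpha = blend_weight(days_of_history)
    # final_mean = alpha * personal_mean + (1 - alpha) * group_mean
    return Baseline(
        mean=alpha * personal.mean + (1 - alpha) * group.mean,
        # Assumption: the std is mixed with the same weight as the mean.
        std=alpha * personal.std + (1 - alpha) * group.std,
    )
```

With 3.5 days of history this ramp gives α = 0.5, so the result sits halfway between the personal and group statistics; at 7 days or more the personal baseline is used unchanged.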
⏱️ Flexible Time Windows & Periods
- 5-Minute Granularity: Each data point is a 5-minute aggregate, and features can cover time scales from seconds to hours
- Configurable Window Size: The default is 12 points (1 hour); it can be changed to 6 points (30 minutes) or 24 points (2 hours) to match business needs
- Uneven Interval Tolerance: The Phased LSTM architecture handles missing data points natively, so inference stays stable even when data is sparse (e.g., the sensor disconnects at night)
- Multi-Time-Scale Features: Short-term fluctuations (RMSSD), medium-term trends (rolling mean), and long-term patterns (daily/weekly cycles) are extracted together to capture anomaly signals at different time scales
Advantage: The model adapts to different device sampling frequencies and wearing habits without forcing timestamp alignment, which reduces data preprocessing complexity.
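As a rough illustration of windowing without timestamp alignment, the sketch below selects the most recent records and records the gap before each one instead of resampling. The function name `latest_window` and the idea of passing gaps alongside the window are assumptions for the example; only the `timestamp` field and the 12-point default come from this README.

```python
from datetime import datetime


def latest_window(records, window_size=12):
    """Return the newest `window_size` records and the gap (minutes) before each.

    Records are dicts with an ISO-8601 "timestamp" field. Gaps are kept as-is
    rather than filled, so a missing 5-minute slot shows up as a 10-minute gap.
    """
    ordered = sorted(records, key=lambda r: datetime.fromisoformat(r["timestamp"]))
    window = ordered[-window_size:]
    if len(window) < window_size:
        raise ValueError(f"need {window_size} records, got {len(window)}")
    times = [datetime.fromisoformat(r["timestamp"]) for r in window]
    # The first point has no predecessor inside the window, so its gap is 0.
    gaps = [0.0] + [(b - a).total_seconds() / 60.0 for a, b in zip(times, times[1:])]
    return window, gaps
```

Changing `window_size` to 6 or 24 corresponds to the 30-minute and 2-hour windows mentioned above.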
🔄 Multi-Channel Data Synergy
The model integrates 4 major feature channels, achieving cross-channel information fusion through factor features and attention mechanisms:
Physiological Channel (HR, HRV series, respiratory rate, blood oxygen)
- Directly reflects cardiovascular and respiratory system status
- Factor features: `physiological_mean`, `physiological_std`, `physiological_max`, `physiological_min`
Activity Channel (steps, distance, energy consumption, acceleration, gyroscope)
- Captures exercise intensity and body load
- Factor features: `activity_mean`, `activity_std`, etc.
Environmental Channel (light, time period, data quality)
- Provides contextual information, distinguishing exercise-induced heart rate elevation vs. resting anomalies
- Categorical features: `time_period_primary` (morning/day/evening/night)
Baseline Channel (adaptive baseline mean/std, deviation features)
- Provides a personalized reference baseline for computing relative anomaly indicators such as `hrv_deviation_abs` and `hrv_z_score`
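The baseline-relative indicators can be illustrated with a short sketch. The feature names `hrv_deviation_abs` and `hrv_z_score` come from this README; the formulas below (absolute difference from the baseline mean, and a standard z-score) are the conventional definitions and are assumed here rather than taken from the repository code.

```python
def baseline_deviation_features(hrv_value, baseline_mean, baseline_std, eps=1e-6):
    """Compute relative anomaly indicators against a (personal or group) baseline."""
    deviation = hrv_value - baseline_mean
    return {
        "hrv_deviation_abs": abs(deviation),
        # eps guards against division by zero when the baseline has no spread.
        "hrv_z_score": deviation / max(baseline_std, eps),
    }
```

For example, an HRV reading of 30 against a baseline of mean 40 and std 5 gives an absolute deviation of 10 and a z-score of -2.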
Synergy Mechanism:
- Factor Feature Aggregation: Use statistical measures (mean/std/max/min) of similar channels as high-level features, enabling the model to learn association patterns between channels
- TFT Attention: Temporal Fusion Transformer's variable selection network automatically identifies which channels are most important at specific time points
- Known Future Features: Time features (hour, day of week, is_weekend) help the model understand periodicity, distinguishing normal fluctuations from anomalies
Advantage: Multi-channel synergy can reduce false positives that a single indicator would trigger (e.g., exercise-induced heart rate elevation) and makes anomaly detection more context-aware, which suits multi-sensor fusion on wearable devices.
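A minimal sketch of the factor-feature aggregation and the known-future time features listed above. It assumes the per-channel values have already been normalized to a common scale, and the helper names `factor_features` and `time_features` are hypothetical; only the output feature names (`<channel>_mean`, `_std`, `_max`, `_min`, hour, day of week, is_weekend) follow this README.

```python
import statistics
from datetime import datetime


def factor_features(prefix, values):
    """Aggregate one channel's (already normalized) values into factor features."""
    present = [v for v in values.values() if v is not None]
    if not present:
        return {}
    return {
        f"{prefix}_mean": statistics.fmean(present),
        f"{prefix}_std": statistics.pstdev(present),  # population std
        f"{prefix}_max": max(present),
        f"{prefix}_min": min(present),
    }


def time_features(timestamp):
    """Calendar features that are known in advance for any future time step."""
    t = datetime.fromisoformat(timestamp)
    return {
        "hour": t.hour,
        "day_of_week": t.weekday(),  # Monday = 0
        "is_weekend": t.weekday() >= 5,
    }
```

Missing sensors (passed as `None`) are simply skipped, so a device without blood oxygen still yields physiological factor features from the remaining signals.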
📊 Core Metrics (Short-Term Window)
- F1: 0.2819
- Precision: 0.1769
- Recall: 0.6941
- Optimal Threshold: 0.53
- Window Definition: 12 data points of 5-minute intervals (1-hour time window, predicting 0.5 hours ahead)
The model favors recall, which suits "alert first, then human-machine collaborative review" workflows. Precision and recall can be traded off by adjusting the threshold or the sampling strategy.
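The reported F1 is the harmonic mean of the reported precision and recall, and the threshold trade-off can be explored with a small helper. The sketch below uses made-up scores and labels purely for illustration; `precision_recall_at` is a hypothetical helper, not part of this repository.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged as an anomaly."""
    predicted = [s >= threshold for s in scores]
    truth = [bool(y) for y in labels]
    tp = sum(p and y for p, y in zip(predicted, truth))
    fp = sum(p and not y for p, y in zip(predicted, truth))
    fn = sum((not p) and y for p, y in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Consistency check on the metrics listed above.
print(round(f1_score(0.1769, 0.6941), 4))  # 0.2819
```

Lowering the threshold flags more windows, which generally raises recall at the cost of precision; raising it does the opposite.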
🚀 Quick Start
1. Clone or Download the Model Repository
git clone https://huggingface.co/oscarzhang/Wearable_TimeSeries_Health_Monitor
cd Wearable_TimeSeries_Health_Monitor
pip install -r requirements.txt
2. Run Built-in Demo (No Additional Data Required)
# Run ab60 case by default
python test_wearable_service.py
# Run all predefined clients
python test_wearable_service.py --case all
# Sample from raw stage1 CSV for testing
python test_wearable_service.py --from-raw
test_wearable_service.py will automatically:
- Load `WearableAnomalyDetector`
- Read configuration-driven features
- Build windows and execute predictions
- Output anomaly scores, thresholds, and prediction details for each "client"
3. Call in Business Code
from wearable_anomaly_detector import WearableAnomalyDetector
detector = WearableAnomalyDetector(
    model_dir="checkpoints/phase2/exp_factor_balanced",
    threshold=0.53,
)
result = detector.predict(data_points, return_score=True, return_details=True)
print(result)
`data_points` should be the 12 most recent 5-minute records; if static features or device information are missing, the system fills them in automatically from the configuration or cache.
🔧 Input & Output
Input (Single Data Point)
{
    "timestamp": "2024-01-01T08:00:00",
    "deviceId": "ab60",  # Optional; an anonymous ID is created if missing
    "features": {
        "hr": 72.0,
        "hrv_rmssd": 30.0,
        "time_period_primary": "morning",
        "data_quality": "high",
        ...
    }
}
- Each window requires 12 data points (default 1 hour)
- Whether a feature is required is controlled by `configs/features_config.json`
- Missing values automatically fall back to the values defined by default or category_mapping
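The fallback behaviour can be pictured with the sketch below. The keys `default` and `category_mapping` are named in this README, but the surrounding layout of the spec dictionary, the `fill_features` helper, and the rule that unknown categories map to the default's code are assumptions; check `configs/features_config.json` for the actual schema.

```python
def fill_features(raw, feature_specs):
    """Fill missing features from per-feature specs and encode categorical values.

    `feature_specs` is a hypothetical mapping such as:
        {"hr": {"default": 70.0},
         "data_quality": {"default": "medium",
                          "category_mapping": {"low": 0, "medium": 1, "high": 2}}}
    """
    filled = {}
    for name, spec in feature_specs.items():
        value = raw.get(name)
        if value is None:
            value = spec["default"]
        mapping = spec.get("category_mapping")
        if mapping is not None:
            # Assumption: unknown categories fall back to the default's code,
            # so the default itself must be a key of the mapping.
            value = mapping.get(value, mapping[spec["default"]])
        filled[name] = value
    return filled
```

With this layout, adding a new sensor only means adding one entry to the spec; records that lack the field keep working because the default is substituted.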
Output
{
    "is_anomaly": True,
    "anomaly_score": 0.5760,
    "threshold": 0.5300,
    "details": {
        "window_size": 12,
        "model_output": 0.5760,
        "prediction_confidence": 0.0460
    }
}
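In the sample output, `prediction_confidence` equals the absolute distance between the score and the threshold (0.5760 - 0.5300 = 0.0460). The sketch below reproduces the output under that reading; it is an inference from this single example, and both the `summarize_prediction` helper and the use of `>=` for the anomaly decision are assumptions.

```python
def summarize_prediction(anomaly_score, threshold=0.53, window_size=12):
    """Assemble a result dict shaped like the sample output above."""
    return {
        "is_anomaly": anomaly_score >= threshold,
        "anomaly_score": round(anomaly_score, 4),
        "threshold": round(threshold, 4),
        "details": {
            "window_size": window_size,
            "model_output": round(anomaly_score, 4),
            # Assumed definition: distance of the score from the decision boundary.
            "prediction_confidence": round(abs(anomaly_score - threshold), 4),
        },
    }
```

Under this reading, a small `prediction_confidence` means the window sits close to the decision boundary and is a good candidate for manual review.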
🧱 Model Architecture & Training
- Model Backbone: Phased LSTM handles unevenly-spaced sequences + Temporal Fusion Transformer aggregates temporal context
- Anomaly Detection Head: Enhanced attention, multi-layer MLP, optional contrastive learning/type auxiliary head
- Feature System:
  - Physiological: HR, HRV (RMSSD/SDNN/PNN50…)
  - Activity: Steps, distance, energy consumption, acceleration, gyroscope
  - Environmental: Light, day/night labels, data quality
  - Baseline: Adaptive baseline mean/std + deviation features
- Label Source: High-confidence questionnaire labels + low-confidence adaptive baseline labels
- Training Pipeline: Stage1/2/3 data processing ➜ Phase1 self-supervised pre-training ➜ Phase2 supervised fine-tuning ➜ Threshold/case calibration
📦 Repository Structure (Partial)
├─ configs/
│ └─ features_config.json # Feature definitions & normalization strategies
├─ wearable_anomaly_detector.py # Core wrapper: loading, prediction, batch processing
├─ feature_calculator.py # Configuration-driven feature construction + user history cache
├─ test_wearable_service.py # HuggingFace Demo script (includes predefined cases)
└─ checkpoints/phase2/... # Model weights & summary
📚 Data Source & License
- Training data is based on "A continuous real-world dataset comprising wearable-based heart rate variability alongside sleep diaries" (Baigutanova et al., Scientific Data, 2025) and its Figshare dataset doi:10.1038/s41597-025-05801-3 / dataset link.
- This dataset is released under Creative Commons Attribution 4.0 (CC BY 4.0) license, allowing free use, modification, and distribution, but attribution and license link must be retained.
- This repository follows CC BY 4.0 requirements for original data; if you further process or publish based on this, please continue to retain the above attribution and license information.
- Code/models can use MIT/Apache or other licenses as needed, but any parts involving data must still follow CC BY 4.0.
🤝 Contributions & Extensions
Contributions are welcome:
- Add new features or data sources ⇒ Update `features_config.json` + submit a PR
- Integrate new user data management/baseline strategies ⇒ Extend `FeatureCalculator` or contribute a `UserDataManager`
- Share feedback on cases or real deployment experience ⇒ Open an Issue or Discussion
📄 License
- Model & Code: Apache-2.0. Can be used/modified/distributed freely while retaining copyright and license notices.
- Training Data: Original wearable HRV dataset uses CC BY 4.0; please continue to retain author attribution and license information when reusing.
🔖 Citation
@software{Wearable_TimeSeries_Health_Monitor,
  title = {Wearable\_TimeSeries\_Health\_Monitor},
  author = {oscarzhang},
  year = {2025},
  url = {https://huggingface.co/oscarzhang/Wearable_TimeSeries_Health_Monitor}
}