Advanced Calorie Estimator
A personalized calorie burn estimator powered by a machine learning model trained on thousands of real-world workout sessions. It goes beyond simple linear formulas by capturing non-linear and interaction effects, and it includes safeguards that keep predictions realistic.
How to Use
- Enter your Age, Gender, Weight (kg), Session Duration (hours), and Average Heart Rate (BPM).
- Click "Calculate Calories".
- The estimated calories burned will appear below the form.
The Data Science Journey
The Goal
Build a model that accurately predicts calories burned from simple, measurable inputs while capturing more complex physiological relationships than basic linear equations can.
Data & Cleaning
The model was trained on ~5,000 sessions from two Kaggle datasets. A key step was standardizing duration units across sources (minutes vs. hours). Left uncorrected, the mixed units produced misleadingly high accuracy, so careful cleaning was critical for trustworthy results.
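The unit standardization step can be sketched as follows. This is a minimal illustration, not the project's actual cleaning code: the function name `to_hours`, the `unit` labels, and the sample records are all hypothetical.

```python
def to_hours(duration, unit):
    """Convert a session duration to hours, given the unit its source uses."""
    if unit == "hours":
        return float(duration)
    if unit == "minutes":
        return duration / 60.0
    raise ValueError(f"unknown duration unit: {unit!r}")


# Hypothetical records from two sources that report duration differently.
sessions = [
    {"source": "a", "duration": 90, "unit": "minutes"},
    {"source": "b", "duration": 1.5, "unit": "hours"},
]

# After conversion, both records describe the same 1.5-hour session.
standardized = [to_hours(s["duration"], s["unit"]) for s in sessions]
```

Raising an error on an unrecognized unit, rather than guessing, keeps a mislabeled source from slipping silently into the training set.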
Feature Engineering: Beyond the Basics
- Squared terms (e.g., Age², Weight²) for non-linear relationships
- Interaction terms (e.g., Weight × Avg_BPM) to capture combined effects
- Expanded 5 base inputs to ~20 engineered features
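One way the expansion above could be implemented is sketched here. The exact feature set used by the project is not listed, so this is an assumption: it keeps the 5 base inputs, adds a squared term for each, and adds every pairwise product, which happens to give 20 features. The column names and the 0/1 encoding of Gender are also assumptions.

```python
from itertools import combinations

# Assumed column names; Gender is assumed to be encoded as 0 or 1.
BASE = ["Age", "Gender", "Weight", "Duration", "Avg_BPM"]


def engineer_features(row):
    """Expand base inputs into base, squared, and pairwise interaction features."""
    features = {name: float(row[name]) for name in BASE}
    # Squared terms let a linear model bend with each input.
    for name in BASE:
        features[f"{name}^2"] = features[name] ** 2
    # Pairwise products capture combined effects such as Weight x Avg_BPM.
    for a, b in combinations(BASE, 2):
        features[f"{a}*{b}"] = features[a] * features[b]
    return features


example = engineer_features(
    {"Age": 30, "Gender": 1, "Weight": 70, "Duration": 1.0, "Avg_BPM": 140}
)
```

With a 0/1 Gender column the squared Gender term duplicates the base column, so a real pipeline would likely drop it or leave it for the regularizer to zero out.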
Model Selection: Why Lasso Regression?
Lasso (L1) regularization combats overfitting by shrinking the coefficients of less useful features, often all the way to zero, so it doubles as feature selection. It performed better than plain linear regression on validation data.
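The shrink-to-zero behavior can be seen in a small coordinate-descent sketch. This is a teaching example on made-up data, not the project's training code, and it omits the intercept and feature scaling that a real pipeline would need. The function names and the `alpha` value are illustrative assumptions.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 one coefficient at a time."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the residual that ignores j
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Made-up data: y = 2.0 * x1 + 0.1 * x2, so the second feature barely matters.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
y = [2.1, 3.9, 6.1, 7.9]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
```

On this toy data the weak second coefficient is driven to exactly zero while the first is only slightly shrunk below 2.0, which is the feature-selection effect described above.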
Key Insights from the Model
- The Power of Interaction: The Duration × Avg_BPM term dominates; maintaining a higher heart rate for longer has a compounding effect on calories burned.
- Non-Linear Effect of Weight: Energy expenditure increases at an accelerating rate as body weight rises.
- Diminishing Returns of Heart Rate: Additional calories per BPM level off at very high intensities.
Honest Performance
The model reaches R² ≈ 0.75 on the cleaned data, a strong and realistic result.
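For readers unfamiliar with the metric, R² is the share of variance in the actual values that the predictions explain. The sketch below computes it from scratch; the sample numbers are invented for illustration and are not results from this project.

```python
def r_squared(y_true, y_pred):
    """Return 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented calorie values, used only to exercise the formula.
actual = [300.0, 450.0, 600.0, 750.0]
predicted = [330.0, 420.0, 630.0, 720.0]
score = r_squared(actual, predicted)
```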
Handling Model Limitations: Extrapolation Safeguard
The training data is capped at ~2-hour sessions. For longer durations, the app computes a baseline prediction at 2 hours and extends it with a linear burn rate, adjusted by a fatigue factor, to avoid physiologically impossible predictions.
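The safeguard described above might look roughly like this. It is a sketch under stated assumptions: the fatigue factor value of 0.85, the way it scales the hourly rate, and the names `safeguarded_estimate` and `toy_model` are all hypothetical, and `toy_model` merely stands in for the trained model.

```python
MAX_TRAINED_HOURS = 2.0  # from the text: training data is capped at ~2 hours
FATIGUE_FACTOR = 0.85    # hypothetical value; the app's real factor is not stated


def safeguarded_estimate(model, duration_hours):
    """Use the model inside its trained range and extend linearly beyond it."""
    if duration_hours <= MAX_TRAINED_HOURS:
        return model(duration_hours)
    # Baseline: the model's prediction at the edge of the training range.
    baseline = model(MAX_TRAINED_HOURS)
    # Average hourly burn over the baseline session, reduced for fatigue.
    hourly_rate = (baseline / MAX_TRAINED_HOURS) * FATIGUE_FACTOR
    extra_hours = duration_hours - MAX_TRAINED_HOURS
    return baseline + hourly_rate * extra_hours


def toy_model(duration_hours):
    """Stand-in for the trained model: a flat 500 kcal per hour."""
    return 500.0 * duration_hours


three_hour_estimate = safeguarded_estimate(toy_model, 3.0)
```

Because the model is never evaluated past 2 hours, squared and interaction terms involving duration cannot blow up the estimate for very long sessions.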