Applied machine learning extension of an academic GLM project

Bike Sharing Demand Prediction

Model comparison platform for operational demand forecasting

The project extends a GLM-based academic study into a broader machine learning workflow. It compares linear, tree-based, and boosting models under a consistent evaluation setup and keeps the code organized for reuse.

Models tested: 6

Best R2: 0.85+

RMSE: 180-220 rentals

Problem

Demand forecasting for urban mobility needs a reproducible process for comparing models, validating feature engineering, and exposing the best model for downstream use.

Approach

Prepared the data, added feature engineering and target transformations, tuned several models with GridSearchCV, logged the comparison outputs, and packaged the project structure with Docker and serialized artifacts.
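
As a rough illustration of how a Box-Cox target transformation and GridSearchCV tuning can fit together, here is a minimal sketch. It is not the project's actual code: the data is synthetic, the parameter grid is arbitrary, and scikit-learn's GradientBoostingRegressor stands in for the XGBoost and LightGBM models so the example needs only scikit-learn. Box-Cox requires a strictly positive target, so real rental counts containing zeros would need a shift first.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import PowerTransformer

# Synthetic stand-in data (assumption): a skewed, strictly positive target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))
y = 100 * np.exp(1.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, 300))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The wrapper fits the regressor on the Box-Cox-transformed target and
# inverts the transform at predict time, so metrics stay in original units.
model = TransformedTargetRegressor(
    regressor=GradientBoostingRegressor(random_state=0),
    transformer=PowerTransformer(method="box-cox"),
)

# Hypothetical grid; parameters of the wrapped model use the "regressor__" prefix.
param_grid = {
    "regressor__n_estimators": [50, 100],
    "regressor__max_depth": [2, 3],
}

search = GridSearchCV(
    model, param_grid, cv=3, scoring="neg_root_mean_squared_error"
)
search.fit(X_train, y_train)

best_params = search.best_params_
# RegressorMixin.score returns R^2 on the held-out split.
test_r2 = search.best_estimator_.score(X_test, y_test)
print(best_params, round(test_r2, 3))
```

Wrapping the transform inside the estimator keeps it inside each cross-validation fold, so the Box-Cox parameter is never fitted on validation data.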

Results

The strongest models achieved an R-squared above 0.85 with RMSE around 180 to 220 rentals, improving on a more classical GLM baseline.
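
For readers unfamiliar with the two metrics, the sketch below shows how R-squared and RMSE can be computed with scikit-learn. The values are made up for illustration and are not the project's results.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative values only (assumption): hourly rental counts and predictions.
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 310.0, 390.0])

# RMSE is the square root of the mean squared error, in units of rentals.
rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))

# R-squared is 1 minus the ratio of residual to total sum of squares.
r2 = float(r2_score(y_true, y_pred))

print(f"RMSE = {rmse:.1f}")  # RMSE = 10.0
print(f"R2 = {r2:.3f}")      # R2 = 0.992
```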

What is in the repository

Benchmarked six models under a shared evaluation framework.

Applied Box-Cox normalization and feature engineering before model selection.

Saved trained artifacts and figures for reproducibility.

Organized the codebase for reuse instead of leaving it in notebooks only.
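
The sketch below illustrates one way a shared evaluation framework and artifact saving could look. It rests on several assumptions: the data is synthetic, the four scikit-learn models are stand-ins for the six models in the repository (which are not named here), and the artifact is written to a temporary directory rather than the project's real layout.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_validate

# Synthetic stand-in data (assumption), with one non-linear effect.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 5))
y = 300 * X[:, 0] + 150 * np.sin(6 * X[:, 1]) + rng.normal(0, 10, 200)

# Hypothetical candidate set covering linear, tree-based, and boosting models.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=42),
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
}

# Every model sees the same folds and the same metrics.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scoring = {"r2": "r2", "rmse": "neg_root_mean_squared_error"}

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {
        "r2": scores["test_r2"].mean(),
        "rmse": -scores["test_rmse"].mean(),  # flip sign back to a positive error
    }

# Pick the model with the lowest mean RMSE, refit on all data, and serialize it.
best_name = min(results, key=lambda n: results[n]["rmse"])
best_model = models[best_name].fit(X, y)

artifact_path = Path(tempfile.mkdtemp()) / f"{best_name}.joblib"
joblib.dump(best_model, artifact_path)

# Reloading confirms the saved artifact reproduces the same predictions.
reloaded = joblib.load(artifact_path)
```

Fixing the fold splitter and the scoring dictionary in one place is what makes the comparison fair: any difference in the results table comes from the model, not from the evaluation setup.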

Role and scope

Forecasting workflow design, model benchmarking, and packaging

Project context

Applied machine learning extension of an academic GLM project

Main stack

Python, scikit-learn, XGBoost, LightGBM, Pandas, Docker