Bike-Rental-Demand-Prediction

Bike Rental Demand Prediction

Overview

Welcome to our Bike Rental Demand Prediction Project! This project aims to forecast the number of bike rentals based on various environmental and temporal features using a data-driven approach. We utilize machine learning algorithms to make accurate predictions that can help in efficient bike fleet management.

Key Objectives

Technologies Used

Dataset

We use a rich dataset containing hourly rental data, along with weather and seasonal information. Here’s a glimpse into the data we use:

Our dataset captures the pulse of urban mobility, featuring:

This comprehensive dataset allows us to analyze how various factors influence bike rental behaviors.

Data Preprocessing: Crafting Quality Inputs

Quality data leads to quality insights. We meticulously cleaned and prepared our dataset, transforming raw data into a format suitable for analysis. Here’s how we tackled missing data and extracted new features:

# Handling missing values and extracting the 'Hour' feature
import pandas as pd

train['DateHour'] = pd.to_datetime(train['DateHour'])
train['Hour'] = train['DateHour'].dt.hour
train = train.ffill()  # fillna(method='ffill') is deprecated in recent pandas releases
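To make the effect of these steps concrete, here is a minimal sketch on a tiny hand-made table. The toy rows and the `Temperature` column are assumptions for illustration only and are not taken from the project's dataset; the `DateHour`, `Hour`, and `RENTALS` names mirror the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three hourly rows with one missing temperature reading.
toy = pd.DataFrame({
    "DateHour": ["2024-01-01 08:00", "2024-01-01 09:00", "2024-01-01 10:00"],
    "Temperature": [3.5, np.nan, 4.1],  # assumed column, for illustration
    "RENTALS": [120, 210, 180],
})

# Parse the timestamp strings and pull out the hour of the day.
toy["DateHour"] = pd.to_datetime(toy["DateHour"])
toy["Hour"] = toy["DateHour"].dt.hour

# Forward fill copies the last known value into each gap, so the rows
# should be in chronological order before this step.
missing_before = int(toy.isna().sum().sum())
toy = toy.ffill()
missing_after = int(toy.isna().sum().sum())

print(toy)
print(f"Missing values: {missing_before} -> {missing_after}")
```

Forward fill suits hourly weather readings because consecutive hours tend to be similar, but it cannot fill a gap in the very first row, so a check like the one above is worth keeping.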

Insights from Exploratory Data Analysis (EDA)

Our EDA revealed fascinating trends:

Visualizations from our EDA helped us understand the cyclical nature of bike rentals:

# Plotting bike rentals over different hours of the day
import matplotlib.pyplot as plt
import seaborn as sns

sns.lineplot(x='Hour', y='RENTALS', data=train, marker='o')
plt.title('Impact of Hour on Bike Rentals')
plt.xlabel('Hour of the Day')
plt.ylabel('Number of Rentals')
plt.show()
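When several rows share the same hour, seaborn's lineplot draws their mean by default. The numbers behind such a line can be computed directly with a groupby, as in this small sketch. The toy values are assumptions for illustration, not project data.

```python
import pandas as pd

# Hypothetical toy data with the same 'Hour' and 'RENTALS' columns as above.
toy = pd.DataFrame({
    "Hour": [8, 8, 12, 12, 18, 18],
    "RENTALS": [300, 340, 150, 170, 420, 460],
})

# Average rentals per hour of the day: the quantity the line plot traces.
hourly_mean = toy.groupby("Hour")["RENTALS"].mean()

# Hour with the highest average demand in this toy data.
peak_hour = hourly_mean.idxmax()

print(hourly_mean)
print(f"Peak hour in this toy data: {peak_hour}")
```

Having the aggregated table at hand makes it easy to quote exact figures alongside the chart.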

Choosing the Best Model

After experimenting with multiple models, the Decision Tree Regressor emerged as the best performer thanks to its precision and its ability to capture non-linear relationships without overfitting:

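# Model evaluation with Decision Tree (5-fold cross-validation)
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

decision_tree = DecisionTreeRegressor()
cv_scores = cross_val_score(decision_tree, X_train, y_train, scoring='neg_mean_squared_log_error', cv=5)
mean_rmsle = np.sqrt(-cv_scores.mean())  # scores are negated MSLE values
print(f"Optimized RMSLE: {mean_rmsle}")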
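A comparison between candidate models can be run with the same cross-validation call. The sketch below is not the project's actual experiment: the synthetic data, the mean-predicting baseline, and the `max_depth=6` setting are all assumptions chosen to keep the example self-contained.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic, non-negative demand with morning and evening peaks (assumed shape).
rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=500)
temperature = rng.uniform(-5, 30, size=500)
X = np.column_stack([hours, temperature])
y = (
    50
    + 200 * np.exp(-((hours - 8) ** 2) / 8)
    + 250 * np.exp(-((hours - 18) ** 2) / 8)
    + 3 * np.clip(temperature, 0, None)
    + rng.uniform(0, 20, size=500)
)

# Hypothetical candidates: a constant baseline and a depth-limited tree.
candidates = {
    "mean baseline": DummyRegressor(strategy="mean"),
    "decision tree": DecisionTreeRegressor(max_depth=6, random_state=0),
}

results = {}
for name, model in candidates.items():
    # The scorer returns negated MSLE, so flip the sign before the square root.
    scores = cross_val_score(model, X, y, scoring="neg_mean_squared_log_error", cv=5)
    results[name] = float(np.sqrt(-scores.mean()))
    print(f"{name}: RMSLE = {results[name]:.4f}")
```

Including a trivial baseline gives the tree's score a reference point, and limiting the depth is one common way to keep a tree from memorizing the training folds.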

Unveiling the Final Model’s Performance

Our chosen model was put to the ultimate test on unseen data, showcasing its robustness and accuracy:

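# Final evaluation on the test set
from sklearn.metrics import mean_squared_log_error

decision_tree.fit(X_train, y_train)  # cross_val_score only fits clones, so fit the estimator itself first
y_pred = decision_tree.predict(X_test)
final_rmsle = np.sqrt(mean_squared_log_error(y_test, y_pred))
print(f"Final Test RMSLE: {final_rmsle}")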

This final RMSLE shows how well the model generalizes to data it did not see during training. Lower values are better, and because the metric compares logarithms of predicted and actual counts, it penalizes relative rather than absolute errors.
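For reference, RMSLE is the square root of the mean of (log(1 + predicted) - log(1 + actual))^2. The sketch below computes it by hand and checks the result against scikit-learn. The `rmsle` helper and the sample values are illustrative assumptions, not part of the project code.

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # log1p(x) computes log(1 + x), which stays defined when a count is zero.
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


# Hypothetical rental counts and predictions.
y_true = [100, 200, 50]
y_pred = [110, 180, 60]

manual = rmsle(y_true, y_pred)
library = float(np.sqrt(mean_squared_log_error(y_true, y_pred)))

print(f"Manual RMSLE:  {manual:.6f}")
print(f"Library RMSLE: {library:.6f}")
```

Both values should agree up to floating-point precision, which confirms that taking the square root of `mean_squared_log_error` yields RMSLE.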

Conclusion: Navigating the Future of Urban Mobility

Our predictive model not only forecasts bike rental demand but also illuminates the dynamics of urban transportation. Details of the code can be found in the GitHub repository.
