About CalPredict

A machine learning-powered tool for predicting California housing prices, built on real census data and modern ensemble methods.

Accurate Predictions

Our champion model, Gradient Boosting, reaches a 97.8% R² score and captures complex non-linear price patterns.
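
For readers unfamiliar with the metric, R² (the coefficient of determination) compares a model's squared error against that of always predicting the mean. The sketch below is a minimal, dependency-free illustration of the formula; the function name `r2_score` is chosen here for illustration, and this is not the app's own evaluation code.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot
```

A score of 1.0 means perfect predictions, while 0.0 means the model does no better than predicting the mean for every block group.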

California-Specific

Trained exclusively on California housing data, capturing regional nuances from Bay Area tech hubs to the Central Valley.

Three-Model Comparison

Compare predictions across Gradient Boosting, Random Forest, and Ridge Regression.

Confidence Intervals

The tree-based models (Gradient Boosting and Random Forest) report 95% confidence intervals so you understand the range of likely values.
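
This page does not say how the intervals are computed. One common heuristic for a forest-style model is to take percentiles of the individual trees' predictions, and the sketch below assumes that approach. The function name `percentile_interval` and its inputs are hypothetical, and the resulting range is a rough spread of tree opinions rather than a calibrated statistical interval.

```python
from typing import Sequence, Tuple


def percentile_interval(
    tree_predictions: Sequence[float], coverage: float = 0.95
) -> Tuple[float, float]:
    """Return (low, high) percentiles of per-tree predictions.

    Assumption: each element is one tree's prediction for the same input.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if len(tree_predictions) < 2:
        raise ValueError("need at least two tree predictions")

    ordered = sorted(tree_predictions)

    def quantile(q: float) -> float:
        # Linear interpolation between the two nearest ordered values.
        pos = q * (len(ordered) - 1)
        lower = int(pos)
        upper = min(lower + 1, len(ordered) - 1)
        frac = pos - lower
        return ordered[lower] + frac * (ordered[upper] - ordered[lower])

    tail = (1.0 - coverage) / 2.0
    return quantile(tail), quantile(1.0 - tail)
```

With the default `coverage=0.95`, the function returns the 2.5th and 97.5th percentiles of the supplied predictions.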

Deep Insights

Explore feature importance, regional patterns, and market drivers to understand pricing.

Production Ready

Deployed on Vercel with Next.js for fast, reliable predictions at scale.

Technology Stack

Next.js: React framework & frontend

Vercel: Hosting & deployment

scikit-learn: Machine learning models

Tailwind CSS: Utility-first styling

Recharts: Data visualization

pandas / NumPy: Data processing pipeline

How to Use

Pick a city

Select from 20 California cities — area data is auto-filled.

Adjust your property

Use the sliders to set rooms, bedrooms, and household size.

See your estimate

Get both 1990 census and inflation-adjusted 2024 values instantly.
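
As a rough illustration of how the two figures relate, the sketch below scales a 1990 estimate by a single multiplier. The names `Estimate` and `adjust_for_inflation` are hypothetical, and the factor used is an illustrative placeholder, not the multiplier the app actually applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    census_1990: float    # predicted median value in 1990 dollars
    adjusted_2024: float  # same value scaled to 2024 dollars


def adjust_for_inflation(value_1990: float, factor: float) -> Estimate:
    """Scale a 1990-dollar value by a cumulative inflation factor."""
    if factor <= 0:
        raise ValueError("inflation factor must be positive")
    return Estimate(census_1990=value_1990, adjusted_2024=value_1990 * factor)


# Illustrative placeholder only; the app's real factor is not stated here.
ILLUSTRATIVE_FACTOR = 2.0

estimate = adjust_for_inflation(200_000.0, ILLUSTRATIVE_FACTOR)
print(f"1990: ${estimate.census_1990:,.0f}  2024: ${estimate.adjusted_2024:,.0f}")
```

A single multiplier treats every region the same, so the adjusted figure is best read as a general price-level conversion rather than a current market valuation.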

Explore insights

Visit Insights for model performance and feature importance.

Compare cities

Use the Explorer to compare prices across California.

About the Dataset

This model is trained on the California Housing dataset from scikit-learn, derived from the 1990 U.S. Census. It contains 20,640 block-group level observations with 8 features including median income, housing age, average rooms, population, and geographic coordinates. The target variable is the median house value for each block group. While based on historical data, the model captures fundamental relationships between economic, demographic, and geographic factors that continue to drive California real estate pricing.
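
For reference, the sketch below lists the eight feature names in the column order used by scikit-learn's copy of the dataset and shows how a raw target value maps to dollars (the target is stored in units of $100,000). The helper names `to_feature_row` and `target_to_dollars` are hypothetical and are not part of scikit-learn or of this app.

```python
from typing import Dict, List

# Column order of the scikit-learn California Housing dataset.
FEATURE_NAMES = (
    "MedInc",      # median income in the block group (tens of thousands of dollars)
    "HouseAge",    # median house age
    "AveRooms",    # average rooms per household
    "AveBedrms",   # average bedrooms per household
    "Population",  # block group population
    "AveOccup",    # average household members
    "Latitude",
    "Longitude",
)


def to_feature_row(block_group: Dict[str, float]) -> List[float]:
    """Order a dict of feature values the way a trained model would expect."""
    missing = [name for name in FEATURE_NAMES if name not in block_group]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(block_group[name]) for name in FEATURE_NAMES]


def target_to_dollars(med_house_val: float) -> float:
    """Convert the dataset's target (units of $100,000) to dollars."""
    return med_house_val * 100_000
```

Keeping the feature order in one place helps avoid silently feeding a model columns in the wrong positions.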