Blog • February 2026
By Cemhan Biricik — Founder of ZSky AI
Everyone asks about GPU costs and model performance. Nobody asks about the electricity bill. But if you are self-hosting AI infrastructure, power consumption is your largest ongoing expense. Here are the real numbers from running 7 RTX 5090 GPUs 24/7 for ZSky AI.
Each RTX 5090 draws roughly 30-50W at idle with models loaded in VRAM but no active inference. Under full AI inference load, draw jumps to 350-450W depending on the model and operation. That is roughly a tenfold gap between idle and load, which means your average power consumption depends heavily on utilization patterns.
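To make the utilization point concrete, here is a minimal sketch of a time-weighted average. It assumes a GPU is either idle or fully loaded, uses the midpoints of the ranges above (40W idle, 400W load) as stand-ins, and covers GPU draw only. The function name and the utilization levels are illustrative, not measurements from my cluster.

```python
def average_power_w(idle_w: float, load_w: float, utilization: float) -> float:
    """Time-weighted average draw for one GPU, in watts.

    utilization is the fraction of time spent under load (0.0 to 1.0).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_w * (1.0 - utilization) + load_w * utilization


NUM_GPUS = 7
IDLE_W = 40.0   # assumed midpoint of the 30-50W idle range
LOAD_W = 400.0  # assumed midpoint of the 350-450W load range

# Illustrative utilization levels, not measured values.
for utilization in (0.1, 0.5, 0.9):
    per_gpu = average_power_w(IDLE_W, LOAD_W, utilization)
    cluster_kw = per_gpu * NUM_GPUS / 1000
    print(f"{utilization:.0%} busy: {per_gpu:.0f}W per GPU, {cluster_kw:.2f}kW for {NUM_GPUS} GPUs")
```

Host systems, cooling, and power supply losses sit on top of whatever this returns, so treat it as a floor rather than a meter reading.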
At a typical US residential electricity rate of $0.12-0.15/kWh, running the full cluster averages $500-800 per month. This varies significantly by region: European rates can be 2-3x higher, while some US states with cheap hydro power could halve the bill.
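The bill scales linearly with both average draw and rate, which is why region matters so much. Below is a minimal sketch of that arithmetic, assuming a constant average draw measured at the wall (GPUs plus host hardware and cooling) and a flat per-kWh rate with no tiers, fees, or taxes. The function name, the 730-hour month, the 2.0kW draw, and the $0.06 and $0.30 rates are illustrative placeholders, not figures from my setup.

```python
HOURS_PER_MONTH = 730  # 8,760 hours in a year divided by 12


def monthly_cost_usd(avg_kw: float, rate_per_kwh: float, hours: float = HOURS_PER_MONTH) -> float:
    """Energy cost in dollars for a constant average draw over one month."""
    return avg_kw * hours * rate_per_kwh


AVG_KW = 2.0  # hypothetical placeholder; substitute your own metered average

# Illustrative rates: cheap hydro, the typical US range, and a higher European-style rate.
for rate in (0.06, 0.12, 0.15, 0.30):
    print(f"${rate:.2f}/kWh -> ${monthly_cost_usd(AVG_KW, rate):,.0f} per month")
```

Plug in a reading from a wall meter or a smart PDU rather than a spec-sheet number. The figure that matters is what the whole rack pulls, not just the GPUs.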
Compare this to cloud GPU rental at $15,000-30,000/month for equivalent compute. Even at the highest electricity rates in the US, self-hosting is dramatically cheaper for sustained workloads.
I have implemented several strategies to reduce power waste without sacrificing performance:
Running 3.5kW on a residential electrical system is non-trivial. A standard 15A circuit at 120V provides about 1,800W. My setup requires dedicated 20A circuits and careful load balancing to avoid tripping breakers. If you are planning a multi-GPU build, talk to an electrician before buying hardware. The GPU budget is not the only budget.
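A rough way to size this before calling the electrician is sketched below. It assumes 120V circuits and the common North American practice of loading a breaker to no more than 80% of its rating for continuous loads. Both are assumptions to confirm for your own panel and local code, and the function names are mine.

```python
import math


def circuit_capacity_w(breaker_amps: float, volts: float = 120.0,
                       continuous_factor: float = 0.8) -> float:
    """Usable watts on one circuit after derating for continuous load.

    continuous_factor=0.8 is an assumed rule of thumb; pass 1.0 for the
    raw breaker rating (15A * 120V = 1,800W).
    """
    return breaker_amps * volts * continuous_factor


def circuits_needed(load_w: float, breaker_amps: float, volts: float = 120.0,
                    continuous_factor: float = 0.8) -> int:
    """Minimum number of dedicated circuits needed to carry load_w."""
    usable = circuit_capacity_w(breaker_amps, volts, continuous_factor)
    return math.ceil(load_w / usable)


CLUSTER_W = 3500  # the 3.5kW figure from this post

for amps in (15, 20):
    usable = circuit_capacity_w(amps)
    count = circuits_needed(CLUSTER_W, amps)
    print(f"{amps}A circuit: {usable:.0f}W usable, {count} circuits for {CLUSTER_W}W")
```

The count is a minimum. It says nothing about how the load splits across circuits in practice, which is where the load balancing comes in.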