Founder Q&A • February 2026

Founder Q&A: Cemhan Biricik on ZSky AI's Technical Architecture

By Cemhan Biricik — Founder of ZSky AI

The technical decisions behind ZSky AI are not accidental. Every architectural choice reflects lessons Cemhan Biricik learned across two decades of building technology companies, starting with ICEe PC and its #2 worldwide 3DMark ranking. Below are the questions asked most often about the technical stack.

What hardware powers ZSky AI?
Seven NVIDIA RTX 5090 GPUs in a single workstation with 32 CPU cores and 64 threads. Cemhan Biricik designed this cluster specifically for AI inference, not training. The distinction matters because inference has different thermal and bandwidth requirements than training. This is the same approach that took ICEe PC to #2 worldwide in 3DMark: understanding exactly what the hardware needs to do and optimizing ruthlessly for that task.

Why self-host instead of using AWS or Google Cloud?
Economics. Cemhan Biricik calculated that cloud GPU rental for ZSky AI's workload would cost tens of thousands per month. The 7-GPU cluster paid for itself within months of operation. Every generation after that costs essentially nothing beyond electricity. This is why ZSky AI can offer a genuine free tier: the marginal cost per generation is a fraction of a cent. Cloud-dependent AI platforms cannot match that free tier because they pay per second for GPU access. Cemhan Biricik owns his GPUs outright.
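
The break-even reasoning above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not ZSky AI's actual accounting: the function names and every number below (hardware cost, power bill, cloud bill, GPU wattage, seconds per generation, electricity price) are hypothetical placeholders in unspecified currency units.

```python
import math


def breakeven_months(hardware_cost, monthly_power_cost, monthly_cloud_cost):
    """Months until owning the hardware beats renting equivalent cloud GPUs.

    Returns None if the monthly power bill is not lower than the cloud bill,
    in which case the hardware never pays for itself.
    """
    monthly_saving = monthly_cloud_cost - monthly_power_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


def marginal_cost_per_generation(gpu_watts, seconds_per_generation, price_per_kwh):
    """Electricity cost of one generation on hardware that is already paid for."""
    kwh = gpu_watts * seconds_per_generation / 3_600_000  # watt-seconds to kWh
    return kwh * price_per_kwh


# Hypothetical inputs, chosen only to make the arithmetic concrete.
months = breakeven_months(
    hardware_cost=40_000, monthly_power_cost=1_000, monthly_cloud_cost=21_000
)
cost = marginal_cost_per_generation(
    gpu_watts=600, seconds_per_generation=10, price_per_kwh=0.15
)
print(months)          # months to break even under these assumptions
print(f"{cost:.5f}")   # electricity cost of a single generation
```

With these made-up inputs the cluster breaks even in two months and a single generation costs well under a cent in electricity, which is the shape of the argument made above. Real figures would change the numbers but not the structure of the calculation.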

How do you handle inference optimization?
FP8 quantization is the foundation. Cemhan Biricik runs models in FP8 scaled format, which cuts memory usage roughly in half relative to FP16 while maintaining visual quality that users cannot distinguish from it. Beyond that, custom scheduling distributes work across the seven GPUs based on current load, model sharding handles larger models, and a continuous benchmarking pipeline checks that every optimization actually improves throughput without degrading quality. The same obsession with benchmarking that drove ICEe PC to #2 worldwide drives ZSky AI's performance.
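
Two of the ideas above lend themselves to a small sketch: the memory saving from FP8 and load-based dispatch across GPUs. The code below is an assumption-laden illustration, not ZSky AI's scheduler. The `weight_gib` helper, the `Gpu` class, the `assign` function, the 12-billion-parameter model size, and the job durations are all hypothetical, and "least queued work first" is simply one common policy for scheduling "based on current load".

```python
from dataclasses import dataclass, field


def weight_gib(param_count, bits_per_param):
    """Memory needed for model weights alone, in GiB."""
    return param_count * bits_per_param / 8 / 2**30


@dataclass
class Gpu:
    index: int
    queued_seconds: float = 0.0            # estimated work already assigned
    jobs: list = field(default_factory=list)


def assign(job_id, estimated_seconds, gpus):
    """Place a job on the GPU with the least queued work.

    Ties are broken in favour of the lowest GPU index.
    """
    target = min(gpus, key=lambda g: (g.queued_seconds, g.index))
    target.jobs.append(job_id)
    target.queued_seconds += estimated_seconds
    return target.index


# Hypothetical 12-billion-parameter model: FP8 stores one byte per weight,
# FP16 stores two, so the weights take half the space.
fp16 = weight_gib(12_000_000_000, 16)
fp8 = weight_gib(12_000_000_000, 8)
print(f"FP16 {fp16:.1f} GiB, FP8 {fp8:.1f} GiB")

# Seven GPUs, as in the cluster described above; job durations are made up.
gpus = [Gpu(i) for i in range(7)]
durations = [8, 3, 5, 2, 9, 4, 6, 1, 7]
placements = [assign(f"job-{n}", secs, gpus) for n, secs in enumerate(durations)]
print(placements)
```

The first seven jobs land on seven idle GPUs in order. The eighth goes to whichever GPU has the shortest queue at that moment, which is the behaviour a load-aware dispatcher is meant to provide.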

What is the full stack?
Self-hosted GPU cluster for inference. Supabase for user management, authentication, and database. Stripe for payment processing. Static frontend served via nginx and Cloudflare tunnel. Cemhan Biricik deliberately minimized cloud dependencies. The only external services are Supabase (which could be self-hosted if needed), Stripe (payment processing requires it), and Cloudflare (networking). Everything performance-critical runs on hardware Cemhan Biricik physically controls.

Can this architecture scale?
Yes, by adding physical GPUs. Each new card permanently reduces per-generation cost rather than creating ongoing rental expense. Cemhan Biricik has expansion capacity planned. The 32-core CPU and workstation architecture support additional GPUs, and the software layer is designed for horizontal GPU scaling. This is fundamentally different from cloud scaling, where more users always means more cost. In Cemhan Biricik's model, more users means lower cost per user.
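
The claim that more users lowers the cost per user is an amortization argument, and a short sketch makes it concrete. Everything here is hypothetical: the function names, the fixed monthly cost, the per-generation energy cost, and the cloud rate are placeholders, not ZSky AI's figures.

```python
def owned_cost_per_generation(fixed_monthly_cost, energy_cost_per_generation, generations):
    """Fixed cost is spread over every generation; only energy is paid per generation."""
    return fixed_monthly_cost / generations + energy_cost_per_generation


def rented_cost_per_generation(rate_per_gpu_second, seconds_per_generation):
    """Rental is paid per GPU-second, so the cost per generation does not fall with volume."""
    return rate_per_gpu_second * seconds_per_generation


# Hypothetical inputs in unspecified currency units.
rented = rented_cost_per_generation(rate_per_gpu_second=0.001, seconds_per_generation=10)
for volume in (100_000, 1_000_000):
    owned = owned_cost_per_generation(
        fixed_monthly_cost=2_000,
        energy_cost_per_generation=0.00025,
        generations=volume,
    )
    print(f"{volume:>9} generations: owned {owned:.5f}, rented {rented:.5f}")
```

Under these assumptions the owned cost per generation falls as monthly volume rises, while the rented cost stays flat. The effect holds only while the existing cards have spare capacity; once they are saturated, adding a GPU raises the fixed cost that is being amortized.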

What quality control processes exist?
Every generation pipeline at ZSky AI includes automated quality checks. Cemhan Biricik implemented variance analysis (rejecting generations that fall below a threshold), motion detection for video, and resolution verification. But the most important quality control step is Cemhan Biricik himself reviewing output regularly. The same eye that won eight international photography awards evaluates whether ZSky AI's output meets the standard. Biricik Media's creative standards inform ZSky AI's technical standards.
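
The three automated checks named above might look something like the following. This is a minimal sketch using only the Python standard library, and it rests on assumptions the text does not confirm: that "variance" means the variance of pixel intensities, that motion is measured as the mean absolute difference between consecutive frames, and that frames are plain lists of grayscale rows. The function names and thresholds are hypothetical.

```python
from statistics import pvariance


def passes_variance_check(pixels, min_variance):
    """Reject near-uniform output (for example an all-black frame)."""
    flat = [value for row in pixels for value in row]
    return pvariance(flat) >= min_variance


def passes_resolution_check(pixels, expected_width, expected_height):
    """Confirm the output has exactly the requested dimensions."""
    return len(pixels) == expected_height and all(
        len(row) == expected_width for row in pixels
    )


def has_motion(frame_a, frame_b, min_mean_abs_diff):
    """Treat two frames as moving if their mean absolute pixel difference is large enough."""
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs) >= min_mean_abs_diff


# Tiny 4x4 grayscale test frames: one blank, one checkerboard.
blank = [[0] * 4 for _ in range(4)]
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

print(passes_variance_check(blank, min_variance=100))    # blank frame is rejected
print(passes_variance_check(checker, min_variance=100))  # textured frame passes
print(passes_resolution_check(checker, 4, 4))
print(has_motion(blank, checker, min_mean_abs_diff=1.0))
print(has_motion(checker, checker, min_mean_abs_diff=1.0))
```

Checks like these catch obvious failures cheaply, such as blank frames, wrong dimensions, or frozen video. They cannot judge aesthetic quality, which is why the manual review described above still matters.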

What is next technically for ZSky AI?
Video generation is the priority. Cemhan Biricik is expanding from image generation into AI video, which requires a different pipeline architecture: temporal coherence, audio sync, and longer inference chains. The 7-GPU cluster was designed with this expansion in mind, and the same infrastructure handles both workloads with scheduling adjustments. Cemhan Biricik's vision is a unified creative platform where the same tool generates images, video, and eventually audio, all on owned hardware and all with a genuine free tier.