Power and Cooling for Multi-GPU AI Rigs
Power and cooling for multi-GPU AI: PSU sizing for 1-4 GPUs, transient spikes, blower vs liquid cooling, slot spacing, and the wall-circuit ceiling.
Most people building a multi-GPU AI rig spend weeks agonizing over which cards to buy and almost no time on the two things most likely to actually stop the machine from working: feeding it enough clean power and getting the heat back out. Power and cooling are the underestimated half of a multi-GPU build. A dual-GPU system can pull about as much sustained wattage as a space heater, spike far above its rated draw for fractions of a second, and dump enough heat into a closed case to throttle itself within minutes. Get this wrong and you do not get a dramatic failure; you get a rig that reboots under load, throttles silently, or quietly trips a breaker in the next room.
This guide covers how to size a power supply for one, two, or four GPUs, why transient spikes matter on current cards, how to choose between blower, open-air, and liquid cooling, and the wall-circuit reality that puts a hard ceiling on how big a rig you can run on a normal household outlet.
Sizing the power supply
Start with the math, then add a margin for reality. Add up the rated TDP of every GPU, add roughly 150 to 200 watts for a modern CPU under load, another 50 to 100 watts for the motherboard, drives, fans, and RAM, and then add headroom on top so the unit is not running at its absolute limit. A power supply is most efficient and longest-lived running at around 50 to 80 percent of its rating, not pinned at 100 percent. As a rough planning anchor, an RTX 4090 is a 450-watt card and an RTX 5090 is rated around 575 watts.
The wrinkle that catches people is transient spikes. Current high-end cards draw far more than their rated TDP for very brief moments. An RTX 4090 has been measured pulling well over 600 watts in short bursts, and an RTX 5090 can spike toward 900 watts for under a millisecond. A power supply that only just covers the average draw can shut down when two cards spike at once. This is why an ATX 3.1 unit matters: that standard is built to tolerate transient loads up to roughly twice its continuous rating for short windows, exactly the behavior these GPUs produce.
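The spike check can be sketched in a few lines. This is a rough illustration, not a model of any particular power supply: the function name is made up for the example, the 300-watt figure for the rest of the system is an assumption taken from the upper end of the ranges above, and the 2x excursion factor is the approximate tolerance described in the text, not a guarantee for every unit.

```python
def survives_simultaneous_spike(psu_rating_watts, gpu_spike_watts,
                                other_watts=300, excursion_factor=2.0):
    """Compare a worst-case simultaneous spike with the PSU's brief tolerance.

    other_watts and excursion_factor are illustrative assumptions:
    CPU plus the rest of the system, and the rough "twice the continuous
    rating" transient tolerance described for ATX 3.1 units.
    """
    worst_case = sum(gpu_spike_watts) + other_watts
    tolerated = psu_rating_watts * excursion_factor
    return worst_case, worst_case <= tolerated


# Two cards each spiking toward 900 W at the same instant.
for rating in (850, 1200):
    worst_case, ok = survives_simultaneous_spike(rating, [900, 900])
    print(f"{rating} W PSU, {worst_case} W spike: {'ok' if ok else 'likely shutdown'}")
```

Under these assumptions, two cards spiking together reach 2,100 watts, which a 1200-watt unit can ride through briefly and an 850-watt unit cannot.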
For a dual-GPU workstation, a single high-quality 1200 to 1600 watt ATX 3.1 unit is the clean answer. For a four-GPU rig, you are usually past what one consumer power supply can deliver, and the practical approach is two power supplies linked with an add2psu adapter or a chassis built for dual PSUs, splitting the GPUs across both.
- Sum every GPU TDP, add ~150-200W for the CPU and ~50-100W for the rest of the system.
- Add headroom so the unit runs near 50-80% of its rating, not at 100%.
- Single GPU: a quality 850-1000W unit is plenty.
- Dual GPU: one 1200-1600W ATX 3.1 unit handles the load and the spikes.
- Quad GPU: plan for dual power supplies; one consumer unit rarely covers four cards plus transients.
- Prefer 80+ Gold or Platinum, ATX 3.1, with native 12V-2x6 connectors to avoid daisy-chained adapters.
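The sizing rule above reduces to simple arithmetic. The sketch below is one way to write it down, assuming the upper end of the CPU and system ranges (200 and 100 watts) and a target of 80 percent load; the function names and the 50-watt rounding step are choices made for this example only.

```python
def round_up(value, step):
    """Round an integer up to the next multiple of step."""
    return -(-value // step) * step


def recommend_psu_watts(gpu_tdps, cpu_watts=200, system_watts=100,
                        target_load_percent=80):
    """Return (sustained DC load, minimum PSU rating) in watts.

    The defaults are planning assumptions, not measurements.
    """
    sustained = sum(gpu_tdps) + cpu_watts + system_watts
    # Smallest rating that keeps the sustained load at or below the target.
    minimum = -(-sustained * 100 // target_load_percent)
    return sustained, round_up(minimum, 50)


for label, tdps in [("1x 450 W GPU", [450]),
                    ("2x 450 W GPU", [450, 450]),
                    ("4x 450 W GPU", [450] * 4)]:
    sustained, rating = recommend_psu_watts(tdps)
    print(f"{label}: {sustained} W sustained, PSU rating >= {rating} W")
```

With these assumptions the result is 950 watts for one 450-watt card, 1,500 watts for two, and 2,650 watts for four, which is why the four-card case lands in dual-PSU territory.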
Rating, rails, and connectors
The 80+ rating measures efficiency, not safety, but it is a useful proxy for build quality. A higher tier wastes less energy as heat and tends to use better internal components. For a rig that runs many hours a day, Gold is the sensible floor and Platinum pays for itself slowly through lower waste heat and a smaller power bill. Efficiency also varies with load, which is another reason to leave headroom rather than running the unit at its ceiling.
Single-rail versus multi-rail is the other old debate. A multi-rail design splits the 12-volt output into separate over-current-protected groups, which is safer in theory but can trip if one rail is overloaded by a hungry GPU. For a high-draw multi-GPU build, a strong single-rail unit is simpler: the full current capacity is available to whichever connector needs it, with no risk of an individual rail tripping under a card's spike. Most quality high-wattage units today are single-rail for exactly this reason.
Connectors deserve a final word. Modern cards use the 12V-2x6 connector, and you want one your power supply provides natively rather than through a bundle of adapters. A poorly seated high-current connector is a genuine fire risk; seat it fully, route it without tight bends, and check it after the first few heat cycles.
Moving the heat back out
Cooling style decides whether GPUs can sit next to each other at all. Open-air cards, the standard three-fan design, dump heat sideways into the case and rely on case airflow to carry it away. They run cool and quiet in isolation but choke when stacked, because each card breathes the hot exhaust of the one below it. Blower cards push their heat straight out the back of the case through the bracket, which makes them far better for dense multi-GPU layouts even though they are louder and run a little hotter individually. Liquid cooling, whether all-in-one coolers or a custom loop, moves heat to radiators away from the cards and is often the only way to pack three or four GPUs without severe throttling.
Slot spacing is the constraint most builds get wrong. Two open-air cards jammed into adjacent slots will heat-soak: the top card's intake fans pull in air already warmed by the card beneath it, temperatures climb, and both throttle. You want at least one empty slot between cards, a motherboard with well-spaced PCIe slots, and ideally a case tall enough to give each card its own breathing room. If the cards must sit close, blower or liquid designs are no longer optional.
Case airflow ties it together. Aim for clear front-to-back or bottom-to-top flow, with enough intake to feed the cards and enough exhaust to clear their output. A few well-placed high-static-pressure fans beat a case crammed with fans fighting each other. In a stacked build, the heat-soak problem is cumulative, so over-provision airflow rather than tuning it to the edge.
- Open-air (3-fan): coolest and quietest alone, but heat-soaks badly when stacked.
- Blower: exhausts out the back, the better choice for dense multi-GPU layouts.
- Liquid (AIO or custom loop): often the only way to run 3-4 cards without throttling.
- Leave at least one empty slot between cards to avoid heat-soak.
- Plan clear directional case airflow with balanced intake and exhaust.
- Watch memory junction temperature, not just the core, on stacked cards.
Power-limiting to cut heat and draw
One of the highest-leverage tricks in a multi-GPU rig is simply telling the cards to use less power. With a single command per card (on NVIDIA hardware, nvidia-smi with its -pl power-limit option), you can cap each GPU's power limit below its default. The payoff is large: trimming the limit by 10 to 20 percent typically costs under 5 percent in performance, because these cards spend the top of their power band chasing tiny clock-speed gains. Inference is especially forgiving here, since token generation is bound more by memory bandwidth than by raw compute, so the clock reduction barely registers in tokens per second.
The second-order effects are where this pays off in a dense build. Less power drawn means less heat produced, which means lower temperatures, less throttling, quieter fans, and a smaller total load on your power supply and wall circuit. A four-card rig power-limited to 80 percent can be the difference between tripping a breaker and running stably, while giving up only a sliver of throughput. For an always-on machine, this is close to a free win.
It is worth measuring rather than guessing. Set a limit, run your actual workload, and compare tokens per second and temperatures against the default. Most people find a sweet spot well below the factory power limit where speed is virtually unchanged but heat and noise drop noticeably.
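As a starting point for that experiment, the sketch below builds one nvidia-smi command per card, using the tool's -i (GPU index) and -pl (power limit in watts) options. It assumes NVIDIA cards and an 80 percent cap; the function name and the default wattages passed in are illustrative. It only prints the commands rather than running them, since changing the limit typically requires administrator privileges.

```python
def power_limit_commands(default_limits_watts, percent=80):
    """Build one nvidia-smi command per GPU, capped at percent of its default.

    default_limits_watts lists each card's factory power limit, in GPU index
    order. The 80 percent default is an assumption to tune per workload.
    """
    commands = []
    for index, default_watts in enumerate(default_limits_watts):
        cap = default_watts * percent // 100
        commands.append(["nvidia-smi", "-i", str(index), "-pl", str(cap)])
    return commands


# Two 450 W cards capped at 80 percent of their default limit.
for command in power_limit_commands([450, 450]):
    print(" ".join(command))
```

After applying a cap, rerun the real workload and record tokens per second and temperatures before lowering the limit further.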
The wall-circuit reality and where to put the rig
Here is the limit nobody mentions until a breaker trips. A standard US household circuit is 15 amps at 120 volts, which is 1,800 watts maximum, but electrical code treats anything running three or more hours as a continuous load and caps it at 80 percent, or about 1,440 watts. A 20-amp circuit raises that continuous ceiling to roughly 1,920 watts. An AI rig left training or serving overnight is the definition of a continuous load, so a big multi-GPU machine plus its monitor and accessories can brush right up against that 1,440-watt limit and trip the breaker. Note that the power your rig pulls from the wall is higher than its DC output, because the power supply is not perfectly efficient, which eats into your margin further.
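The circuit arithmetic is worth running for your own numbers. The sketch below assumes a 120-volt circuit, the 80 percent continuous-load rule described above, and a PSU efficiency of 90 percent; the efficiency figure is a placeholder, so substitute the value for your own unit. The function names are made up for the example.

```python
def continuous_limit_watts(amps, volts=120, continuous_percent=80):
    """Continuous-load ceiling of a circuit, in watts."""
    return amps * volts * continuous_percent // 100


def wall_draw_watts(dc_load_watts, efficiency_percent=90):
    """Wall draw for a given DC load, rounded up.

    The 90 percent efficiency is an assumed placeholder.
    """
    return -(-dc_load_watts * 100 // efficiency_percent)


def fits_circuit(dc_load_watts, amps, other_devices_watts=0):
    """True if the rig plus anything else on the circuit stays under the ceiling."""
    total = wall_draw_watts(dc_load_watts) + other_devices_watts
    return total <= continuous_limit_watts(amps)


print(continuous_limit_watts(15), continuous_limit_watts(20))
print(wall_draw_watts(1200), fits_circuit(1200, amps=15))
print(wall_draw_watts(2100), fits_circuit(2100, amps=20))
```

Under these assumptions a 1,200-watt DC load pulls about 1,334 watts at the wall, which fits a 15-amp circuit only if little else shares it, while a 2,100-watt load exceeds even a 20-amp circuit's continuous ceiling.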
This is why four-GPU rigs often need a dedicated 20-amp circuit, or even a 240-volt outlet of the kind used for large appliances, which doubles the available power at the same amperage. Before building a large rig, find out what circuit the outlet is on and what else shares it; discovering that the room's lights and your rig are on the same 15-amp breaker is a bad way to learn this lesson. The cloud-versus-buy math should account for this too: the power a rig draws is a real monthly cost, and a rig that needs an electrician's visit before it can run safely is part of the true price of buying.
Finally, think about where the machine lives. A multi-GPU rig under sustained load is loud and produces real heat, enough to noticeably warm a small room. An always-on machine belongs somewhere the fan noise will not bother anyone and the heat has somewhere to go, ideally a basement, utility room, or closet with airflow, not on the desk beside your head. Plan the location with the same seriousness as the parts list; it is the difference between a rig you actually keep running and one you quietly unplug after a week.
Frequently asked questions
- What size power supply do I need for two GPUs?
- For a dual-GPU AI workstation, a single high-quality 1200 to 1600 watt unit is the right target. Add up both GPU TDPs, roughly 150 to 200 watts for the CPU, and 50 to 100 watts for the rest of the system, then leave headroom so the supply runs around 50 to 80 percent of its rating rather than at its limit. Choose an ATX 3.1 unit so it can absorb the brief transient spikes modern cards produce.
- Why do I need a bigger PSU than the GPU's rated wattage suggests?
- Because current high-end cards draw far more than their rated TDP for very short bursts. An RTX 4090 can spike past 600 watts and an RTX 5090 toward 900 watts for under a millisecond. A supply sized only to the average draw can shut down when those spikes hit, especially with two cards spiking together. An ATX 3.1 unit is designed to tolerate transients up to about twice its continuous rating, which is exactly what these spikes require.
- Will a multi-GPU rig trip a normal household breaker?
- It can. A standard US 15-amp, 120-volt circuit maxes out at 1,800 watts, but code limits continuous loads, anything running three or more hours, to about 1,440 watts. An always-on training or inference rig counts as continuous, so a large multi-GPU machine can hit that ceiling and trip the breaker. Big rigs often need a dedicated 20-amp circuit or a 240-volt outlet, and power-limiting the cards helps stay under the limit.
- Does power-limiting a GPU hurt inference speed much?
- Surprisingly little. Cutting a card's power limit by 10 to 20 percent usually costs under 5 percent in performance, because the top of the power band buys only tiny clock gains. Inference is especially forgiving since it leans on memory bandwidth more than raw compute. In a dense rig the benefits, lower heat, less throttling, quieter fans, and a smaller load on your PSU and wall circuit, are well worth the small speed trade. Measure your own workload to find the sweet spot.
- Can I stack two GPUs in adjacent slots?
- Not comfortably with open-air cards. Two three-fan cards in adjacent slots heat-soak each other, since the top card breathes the hot exhaust of the one below it, and both end up throttling. Leave at least one empty slot between cards, use a motherboard with well-spaced PCIe slots, and ensure strong case airflow. If the cards must sit close together, switch to blower-style or liquid-cooled cards, which handle dense layouts far better.