Most server room outages don't happen because of hardware failure. They happen because nobody noticed the air conditioning died on a Sunday afternoon, and by Monday morning the servers had been cooking themselves for fourteen hours.
Monitoring your server environment is the cheapest insurance in IT. If you already have a rack somewhere in your building, you need this. Here's what to measure and why.
Temperature — the obvious one, done badly
Everyone knows to measure temperature. What most people get wrong is where they measure it.
A single sensor dangling from the ceiling tells you the average temperature of the room. What you actually care about is the temperature of the air being drawn into the servers — specifically at the front of each rack, near the top. That's where heat accumulates. A ceiling sensor can read 22°C while the top of your densest rack sits at 38°C and your CPUs quietly throttle.
What to measure: intake temperature at the top, middle and bottom of each rack. Three sensors per rack is ideal, one per rack is acceptable, zero per rack is how outages happen.
What to alert on: any reading above 27°C for more than 5 minutes. That's not a hard limit — servers will run hotter — but it's the point where something is wrong and you want to know before it becomes very wrong.
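The "above 27°C for more than 5 minutes" rule is easy to get wrong: a naive check fires every time someone opens a rack door. Here is a minimal sketch of the sustained-threshold logic in Python, assuming a hypothetical read_intake_temps() that returns your sensors' current readings and an alert() that hands off to your escalation path:

```python
import time

THRESHOLD_C = 27.0
SUSTAIN_SECONDS = 5 * 60
POLL_INTERVAL = 30  # seconds between sensor polls

def read_intake_temps():
    """Hypothetical: return {"rack1-top": 24.5, ...} from your sensors."""
    raise NotImplementedError

def alert(sensor, temp):
    """Hypothetical: hand off to your SMS/phone escalation."""
    print(f"ALERT: {sensor} at {temp:.1f} C for over 5 minutes")

# Track when each sensor first crossed the threshold, so a single
# hot reading doesn't page anyone.
first_breach = {}

while True:
    for sensor, temp in read_intake_temps().items():
        if temp > THRESHOLD_C:
            first_breach.setdefault(sensor, time.monotonic())
            if time.monotonic() - first_breach[sensor] >= SUSTAIN_SECONDS:
                alert(sensor, temp)
        else:
            first_breach.pop(sensor, None)  # reading recovered, reset the timer
    time.sleep(POLL_INTERVAL)
```

The reset-on-recovery step is the part people forget: without it, one hot afternoon leaves a stale timer that pages someone for a reading that has long since cooled.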
Humidity — the one everyone skips
Humidity is boring until it isn't. Too dry (below 30% relative humidity) and static discharge starts damaging components. Too wet (above 70%) and condensation forms on cold metal surfaces.
In coastal Croatia, humidity is the bigger risk. Check the actual readings in your server room for a week — if they ever cross 65%, you have a ventilation problem waiting to cause corrosion.
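If your sensor can log to a file, a week's worth of readings takes seconds to check. A rough sketch, assuming a simple CSV of timestamped readings (the filename and format here are made up; adapt to whatever your sensor exports):

```python
import csv

# Assumed log format: one row per reading, e.g. "2024-05-12T14:00,58.2"
HIGH, LOW = 65.0, 30.0

breaches = []
with open("humidity_log.csv", newline="") as f:
    for timestamp, value in csv.reader(f):
        rh = float(value)
        if rh > HIGH or rh < LOW:
            breaches.append((timestamp, rh))

if breaches:
    print(f"{len(breaches)} readings outside {LOW}-{HIGH}% RH:")
    for timestamp, rh in breaches[:10]:  # show the first few
        print(f"  {timestamp}: {rh:.1f}%")
else:
    print("Humidity stayed within range all week.")
```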
Power — where the real money is
Temperature alerts save your hardware. Power monitoring saves your revenue.
What to monitor:
- UPS state — battery percentage, input voltage, output load. A UPS running at 85% load will not deliver the 20 minutes of runtime its spec sheet claims; those figures assume a lighter load and a fresh battery.
- PDU per-outlet current — this tells you which specific server is drawing more than expected. Sudden jumps are usually malware or a failing power supply.
- Mains voltage stability — brownouts under 200V cause intermittent crashes that look like software bugs. They aren't.
The cheapest mistake you can make in an IT budget is buying a UPS and never looking at its status page again. The battery starts dying the day it's installed, and you won't know until you need it.
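Checking it doesn't have to mean opening a browser. If the management card speaks SNMP, a small script can pull the numbers that matter on a schedule. A sketch using pysnmp's classic synchronous API (v4.x); the OIDs are from the standard UPS-MIB (RFC 1628), the address is a placeholder, and many vendors use private MIBs instead, so check your card's documentation:

```python
from pysnmp.hlapi import (
    getCmd, SnmpEngine, CommunityData, UdpTransportTarget,
    ContextData, ObjectType, ObjectIdentity,
)

# Standard UPS-MIB (RFC 1628) objects. Vendor cards (APC, Eaton)
# often expose richer data under their own private MIBs.
OIDS = {
    "battery_charge_pct": "1.3.6.1.2.1.33.1.2.4.0",     # upsEstimatedChargeRemaining
    "runtime_minutes":    "1.3.6.1.2.1.33.1.2.3.0",     # upsEstimatedMinutesRemaining
    "input_voltage":      "1.3.6.1.2.1.33.1.3.3.1.3.1",  # upsInputVoltage, line 1
    "output_load_pct":    "1.3.6.1.2.1.33.1.4.4.1.5.1",  # upsOutputPercentLoad, line 1
}

def snmp_get(host, oid, community="public"):
    err_ind, err_stat, _, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData(community, mpModel=1),  # SNMPv2c
        UdpTransportTarget((host, 161), timeout=2.0, retries=1),
        ContextData(),
        ObjectType(ObjectIdentity(oid)),
    ))
    if err_ind or err_stat:
        raise RuntimeError(f"SNMP error for {oid}: {err_ind or err_stat}")
    return int(var_binds[0][1])

for name, oid in OIDS.items():
    print(f"{name}: {snmp_get('192.0.2.10', oid)}")  # placeholder UPS address
```

Run it from cron every few minutes and feed the results into the same sustained-threshold logic as the temperature sensors.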
Access and physical security
Who opened the server room door, and when? This is not about distrust — it's about having a timestamp to correlate with problems. When something breaks at 14:37 and the access log shows someone was in the room at 14:35, you've just saved yourself an hour of debugging.
A simple magnetic door sensor logging open/close events is enough. Anything more elaborate is nice-to-have, not need-to-have.
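One cheap way to build this is a reed switch wired to a Raspberry Pi GPIO pin. A sketch, assuming BCM pin 17 and a switch wired between the pin and ground (both arbitrary choices here):

```python
import time
import RPi.GPIO as GPIO

DOOR_PIN = 17  # BCM numbering; whichever pin your reed switch is wired to
LOG_FILE = "/var/log/server-room-door.log"

GPIO.setmode(GPIO.BCM)
# Reed switch between the pin and ground; the internal pull-up keeps the
# pin HIGH when the door is open (circuit broken) and LOW when closed.
GPIO.setup(DOOR_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

def log_event(channel):
    state = "OPEN" if GPIO.input(channel) else "CLOSED"
    with open(LOG_FILE, "a") as f:
        f.write(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} door {state}\n")

# Fire on both edges; the 300 ms debounce absorbs switch chatter.
GPIO.add_event_detect(DOOR_PIN, GPIO.BOTH, callback=log_event, bouncetime=300)

try:
    while True:
        time.sleep(3600)  # callbacks do the work; just stay alive
finally:
    GPIO.cleanup()
```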
Putting it together
For a small server room — two or three racks — the full monitoring kit looks like this:
- 2–9 temperature sensors (front of each rack: at least one at the top, three per rack if you can)
- 1 humidity sensor (middle of room)
- 1 UPS with network management card (not the USB-only consumer kind)
- 1 networked PDU per rack (for per-outlet monitoring)
- 1 door sensor
- 1 small dashboard showing all of it, pulling from whichever protocol your gear speaks (SNMP, Modbus, HTTP)
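The dashboard's core is just a polling loop. A minimal sketch in Python; the reader functions are placeholders standing in for the protocol-specific code each device needs, and the output path is an arbitrary choice:

```python
import json
import time

# Each reader stands in for whatever protocol that device speaks:
# SNMP for the UPS and PDUs, Modbus or HTTP for the sensors.
def read_temps():      return {"rack1-top": 24.8, "rack2-top": 26.1}  # placeholder
def read_humidity():   return 52.0                                    # placeholder
def read_ups():        return {"charge_pct": 100, "load_pct": 41}     # placeholder
def read_door():       return "CLOSED"                                # placeholder

while True:
    snapshot = {
        "time": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "temperatures_c": read_temps(),
        "humidity_pct": read_humidity(),
        "ups": read_ups(),
        "door": read_door(),
    }
    # One JSON file that the dashboard (or anything else) can read.
    with open("/var/lib/monitoring/latest.json", "w") as f:
        json.dump(snapshot, f, indent=2)
    time.sleep(60)
```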
Total investment for a small-to-medium room: under €2000. The first time it catches a failing AC unit on a Saturday, it pays for itself several times over.
The rule that matters most
Whatever you monitor, make sure someone gets an SMS or phone call when it alerts. An email to an inbox nobody reads on weekends is not monitoring — it's bookkeeping.
Good monitoring wakes the right person at 3 AM on a Sunday. Anything less is theatre.
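For the SMS part, any gateway with an API will do; Twilio is one common option. A sketch with placeholder credentials and numbers:

```python
from twilio.rest import Client  # pip install twilio; one gateway among many

# Credentials and numbers below are placeholders: use your own account values.
client = Client("ACXXXXXXXXXXXXXXXX", "your_auth_token")

def page_on_call(message):
    client.messages.create(
        to="+385991234567",    # the on-call person's mobile
        from_="+15005550006",  # your gateway-issued number
        body=message,
    )

page_on_call("rack2-top intake at 31.4 C for 5 minutes - check the AC")
```

Wire something like page_on_call() into the alert paths above and the 3 AM part takes care of itself.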