SMSR: cooling

The server rooms have cooling systems, and these occasionally fail. When this happens, computing staff may contact you or power down your machines; see the emergencies page for details.

How to check the temperature

You can monitor the temperature of the room yourself. Here are two ways:

1. The server rooms have temperature probes.

These are not accessible from outside the Informatics network.

2. Servers might be equipped with their own sensors.

If your server does have sensors, you can typically query them using IPMI. For example:

# ipmitool sdr type Temperature
Inlet Temp       | 04h | ok  |  7.1 | 18 degrees C
Exhaust Temp     | 01h | ok  |  7.1 | 26 degrees C
Temp             | 0Eh | ok  |  3.1 | 32 degrees C
Temp             | 0Fh | ok  |  3.2 | 31 degrees C
# ipmi-sensors -t Temperature
ID | Name         | Type        | Reading    | Units | Event
20 | Inlet Temp   | Temperature | 18.00      | C     | 'OK'
21 | Exhaust Temp | Temperature | 26.00      | C     | 'OK'
22 | Temp         | Temperature | 32.00      | C     | 'OK'
23 | Temp         | Temperature | 30.00      | C     | 'OK'

Look for "Inlet Temp", "Ambient" or the like.
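
If you want to pick that reading out automatically, something like the following could work. It is only a sketch: the function name inlet_temp is made up for this example, and it assumes your server's ipmitool output has the same column layout as the sample above and reports readings in whole "degrees C". Sensor names vary between models, so check the pattern against your own output.

```shell
#!/bin/sh
# Hypothetical helper: print the first inlet/ambient reading (degrees C)
# found in `ipmitool sdr type Temperature` output given on standard input.
inlet_temp() {
    awk -F'|' '
        # Column 1 is the sensor name, column 5 is the reading.
        tolower($1) ~ /inlet|ambient/ && $5 ~ /degrees C/ {
            split($5, parts, " ")   # " 18 degrees C" -> parts[1] is "18"
            print parts[1]
            exit
        }
    '
}
```

Usage, as root: ipmitool sdr type Temperature | inlet_temp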

Automation

If you like, you can script a regular check of the temperature detected by each machine, and your script can shut the machine down if the temperature is too high.
DICE servers do this using a script called toohot. It queries the air temperature, compares it with the maximum permissible temperature for that model of server, and shuts the server down if the temperature is too high. It's explained in a blog post:

If you manage servers, you might want to run something similar on them.
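
As a rough illustration of the idea, here is a minimal sketch of such a check. It is not the DICE toohot script. The names is_too_hot and check_temperature and the default limit of 35 are assumptions made up for this example; take the real limit from your server's documentation, and adjust the sensor-name pattern to match your hardware.

```shell
#!/bin/sh
# Sketch of a periodic temperature check. This is NOT the DICE toohot script.
# MAX_TEMP is a hypothetical limit in degrees C; use your server model's real one.
MAX_TEMP="${MAX_TEMP:-35}"

# Succeed (exit status 0) if reading $1 is a whole number at or above limit $2.
is_too_hot() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # missing or non-numeric reading: do not act on it
    esac
    [ "$1" -ge "$2" ]
}

# Read the inlet/ambient temperature via IPMI and power off if it is too high.
check_temperature() {
    temp=$(ipmitool sdr type Temperature |
        awk -F'|' 'tolower($1) ~ /inlet|ambient/ && $5 ~ /degrees C/ {
            split($5, p, " "); print p[1]; exit }')
    if is_too_hot "$temp" "$MAX_TEMP"; then
        logger "Inlet temperature ${temp}C is at or above ${MAX_TEMP}C; shutting down"
        shutdown -h now
    fi
}
```

To use it, add a call to check_temperature at the end of the script and run it regularly as root, for example from cron. A missing or unreadable sensor value is ignored rather than treated as an emergency, so you may want to add your own alert for that case.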

More info

Find out more about the self-managed server rooms:

Last reviewed: 18/02/2021
