The server rooms have cooling systems, and these can fail. If they do, computing staff may contact you, or they may power down your machines.
See the emergencies page for details:
- ⇒ Emergencies.
How to check the temperature
You can monitor the temperature of the room yourself. Here are two ways:
1. The server rooms have temperature probes.
- The probe for B.Z14 is at bz14.env.f.net.inf.ed.ac.uk
- The probe for B.01 is at b01.env.f.net.inf.ed.ac.uk
- The probe for B.02 is at b02.env.f.net.inf.ed.ac.uk
- Readings are logged at netmon.inf.ed.ac.uk/cometLogs.txt
These links only work from computers on the Informatics network.
2. Some servers have their own temperature sensors.
If your server has sensors, you can typically query them using IPMI. For example:
# ipmitool sdr type Temperature
Inlet Temp       | 04h | ok  |  7.1 | 18 degrees C
Exhaust Temp     | 01h | ok  |  7.1 | 26 degrees C
Temp             | 0Eh | ok  |  3.1 | 32 degrees C
Temp             | 0Fh | ok  |  3.2 | 31 degrees C

# ipmi-sensors -t Temperature
ID | Name         | Type        | Reading | Units | Event
20 | Inlet Temp   | Temperature | 18.00   | C     | 'OK'
21 | Exhaust Temp | Temperature | 26.00   | C     | 'OK'
22 | Temp         | Temperature | 32.00   | C     | 'OK'
23 | Temp         | Temperature | 30.00   | C     | 'OK'
Look for "Inlet Temp" or "Ambient" or the like.
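If you want to pick out the inlet reading from a script rather than by eye, the following is a minimal sketch in Python. It assumes the pipe-separated layout of the ipmitool output shown above; sensor names and columns vary between server models, so treat the function name and the matching rule as illustrative rather than definitive.

```python
import re
from typing import Optional

# Sample output copied from the ipmitool example above.
SAMPLE = """\
Inlet Temp       | 04h | ok  |  7.1 | 18 degrees C
Exhaust Temp     | 01h | ok  |  7.1 | 26 degrees C
Temp             | 0Eh | ok  |  3.1 | 32 degrees C
Temp             | 0Fh | ok  |  3.2 | 31 degrees C
"""


def inlet_temperature(sdr_output: str) -> Optional[float]:
    """Return the first inlet/ambient reading in degrees C, or None.

    Assumes five pipe-separated fields per line, with the sensor name
    first and the reading ("18 degrees C") last.
    """
    for line in sdr_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # not a sensor line in the expected layout
        name, reading = fields[0], fields[4]
        if not re.search(r"inlet|ambient", name, re.IGNORECASE):
            continue  # skip exhaust and unnamed sensors
        match = re.match(r"(-?\d+(?:\.\d+)?) degrees C", reading)
        if match:
            return float(match.group(1))
    return None


print(inlet_temperature(SAMPLE))  # prints 18.0
```

On a real server you would feed the function the output of `ipmitool sdr type Temperature` instead of the sample text.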
Automate your temperature checks
If you like, you can script a regular check of the temperature detected by each machine, and your script can shut the machine down if the temperature is too high.
DICE servers do this using a script called toohot. It queries the air temperature, compares it to the maximum permissible temperature for that model of server, and shuts down the server if the temperature is looking too high. It's explained in a blog post:
If you manage servers, you might want to run something similar on them.
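As a rough starting point, here is a minimal Python sketch of such a check. It is not the toohot script itself and makes several assumptions: the limit `MAX_INLET_C` is a made-up figure that you should replace with the value for your server model, the inlet reading comes from `ipmitool sdr type Temperature` in the layout shown earlier, and the machine is halted with `shutdown -h now`. All function names are hypothetical.

```python
import re
import subprocess
from typing import Callable, Optional

# Hypothetical limit in degrees C; check the specification for your model.
MAX_INLET_C = 35.0


def parse_inlet(sdr_output: str) -> Optional[float]:
    """Return the first inlet/ambient reading in degrees C, or None."""
    for line in sdr_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue
        if not re.search(r"inlet|ambient", fields[0], re.IGNORECASE):
            continue
        match = re.match(r"(-?\d+(?:\.\d+)?) degrees C", fields[4])
        if match:
            return float(match.group(1))
    return None


def read_temp_ipmi() -> Optional[float]:
    """Query the local sensors with ipmitool (usually needs root)."""
    result = subprocess.run(
        ["ipmitool", "sdr", "type", "Temperature"],
        capture_output=True, text=True, check=True,
    )
    return parse_inlet(result.stdout)


def shut_down_now() -> None:
    """Halt the machine immediately (needs root)."""
    subprocess.run(["shutdown", "-h", "now"], check=True)


def check_once(
    read_temp: Callable[[], Optional[float]],
    shut_down: Callable[[], None],
    limit: float = MAX_INLET_C,
) -> bool:
    """Shut down if the reading exceeds the limit; return True if it did.

    A missing reading (None) is treated as "do nothing" here; you may
    prefer to raise an alert instead.
    """
    temp = read_temp()
    if temp is not None and temp > limit:
        shut_down()
        return True
    return False
```

To use it, call `check_once(read_temp_ipmi, shut_down_now)` at regular intervals, for example from a root cron job. Passing the reader and the shutdown action in as arguments lets you try the logic with dummy functions before letting it near a real machine.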
More info
Find out more about the self-managed server rooms: