Power and Heating, Ventilation, and Air Conditioning (HVAC) are equally important to the reliable operation of your data center. Maintaining power to your server racks matters little if your cooling system fails and the room temperature passes 105°F (40°C). As with all aspects of data center design, you start with a risk assessment and then consider the relevant controls that can be used to reduce the risk to an acceptable level. You also need to weigh building a single, particularly resilient data center against building two geographically separated, less resilient data centers.
Having only enough UPS units to handle the load (say, N) means some equipment will have to be disconnected during maintenance.
Having a spare UPS (N+1) and appropriate switching gear means that units can be removed from service for maintenance without disrupting operations. But should a unit be out of service when the power fails, each of the other UPS units is a single point of failure. It also means that the switching gear is a single point of failure.
Having completely redundant UPS systems from separate utility feeds all the way to the rack is referred to as 2N and eliminates any single point of failure.
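The difference between N, N+1, and 2N can be illustrated with a small capacity check. This is only a sketch under stated assumptions: the function names, the 300 kW load, and the 100 kW unit size are hypothetical, all units are assumed identical, and the model looks at capacity alone. It does not capture shared switching gear or utility feeds, which is exactly where N+1 and 2N differ in practice.

```python
import math


def units_needed(load_kw: float, unit_kw: float) -> int:
    """N: the minimum number of identical UPS units that can carry the load."""
    return math.ceil(load_kw / unit_kw)


def carries_load(load_kw: float, unit_kw: float, units: int, units_out: int) -> bool:
    """True if the units still in service can carry the full load."""
    in_service = max(units - units_out, 0)
    return in_service * unit_kw >= load_kw


# Hypothetical figures for illustration only.
load_kw, unit_kw = 300.0, 100.0
n = units_needed(load_kw, unit_kw)  # 3

# N: taking one unit out for maintenance leaves the load uncovered.
print(carries_load(load_kw, unit_kw, units=n, units_out=1))          # False
# N+1: one unit can be out, but a second loss drops the load.
print(carries_load(load_kw, unit_kw, units=n + 1, units_out=1))      # True
print(carries_load(load_kw, unit_kw, units=n + 1, units_out=2))      # False
# 2N: an entire side (N units) can be lost and the load is still carried.
print(carries_load(load_kw, unit_kw, units=2 * n, units_out=n))      # True
```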
The degree of redundancy within a single data center is typically characterized by a tier level. The exact requirements for each tier vary somewhat between the different standards but generally provide availability and redundancy similar to what is shown in Table 3.5.
Smaller data centers or server rooms without direct and dedicated connections to the power utility’s distribution network need to consider who their neighbors (electrically speaking) might be. A server room located near a pump room, air conditioning, or refrigeration compressor, or an industrial facility with large electrical motors might need special power conditioning equipment to remove the interference and voltage spikes introduced into the power circuits by the noisy neighbors.
There are many types of UPS systems which vary in their design and features. Battery UPS systems can differ in a number of important aspects:
- Load: The capacity of the unit to deliver a specified level of continuous power
- Capacity: The time during which the unit can maintain the load
- Filtering: The ability of the unit to isolate the equipment from noise, surges, and other problems with the utility power
- Reliability: Some designs trade low cost for reliability
Non-battery UPS systems also exist; these use large-mass rotating flywheels to provide short-term backup power. They are appropriate for larger loads (> 200 kW) and can provide higher reliability and lower lifetime costs than comparable battery UPS systems.
Typically, a UPS is intended only to carry the load during short outages and for the short time it takes a backup generator to start and take the full load. Most data centers or server rooms will therefore need a generator to handle the load should the power interruption last longer than the UPS can handle.
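As a rough illustration of that sizing relationship, the sketch below estimates how long a battery UPS can hold a load and whether that comfortably covers generator start-up. All names and numbers are hypothetical, the `safety_factor` margin is an assumption rather than a figure from any standard, and the linear energy-over-load estimate ignores the fact that real battery runtime falls off faster at high load. Vendor runtime tables should be used for actual design.

```python
def ups_runtime_minutes(usable_energy_kwh: float, load_kw: float) -> float:
    """Simplified runtime estimate: usable stored energy divided by load."""
    if load_kw <= 0:
        raise ValueError("load_kw must be positive")
    return usable_energy_kwh / load_kw * 60.0


def bridges_generator_start(
    usable_energy_kwh: float,
    load_kw: float,
    generator_start_minutes: float,
    safety_factor: float = 2.0,  # assumed margin, not taken from a standard
) -> bool:
    """True if the UPS can hold the load for the generator start time plus margin."""
    runtime = ups_runtime_minutes(usable_energy_kwh, load_kw)
    return runtime >= generator_start_minutes * safety_factor


# Hypothetical example: 50 kWh usable, 200 kW load, generator ready in 2 minutes.
print(ups_runtime_minutes(50.0, 200.0))           # 15.0
print(bridges_generator_start(50.0, 200.0, 2.0))  # True
```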
Generators are available in a wide range of capacities and use different fuels (gasoline, diesel, natural gas, and hydrogen). The advantage of natural gas is the elimination of the need to store fuel. The risk is that certain natural disasters can cause both power and gas distribution outages.
Cooling systems must be designed with multiple units of sufficient capacity so that, even when one or more units fail or are taken out of service for maintenance, the data center stays below the maximum safe operating temperature and the rate of temperature change stays within permitted limits (for example, less than 5°C/hour if magnetic tapes are being used, 20°C/hour otherwise).
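Both conditions (an absolute temperature ceiling and a rate-of-change limit) can be checked against logged sensor readings. The following is a minimal sketch, assuming readings arrive as (minutes, °C) pairs sorted by time. The function names and the sample data are hypothetical, and `max_temp_c` is a parameter you would set from your own equipment specifications. The 5 and 20 °C/hour limits are the example figures quoted above.

```python
def max_rate_c_per_hour(readings: list[tuple[float, float]]) -> float:
    """Largest absolute temperature change rate between consecutive readings.

    readings: (minutes_since_start, temperature_c) pairs, sorted by time.
    """
    worst = 0.0
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        if t1 <= t0:
            raise ValueError("readings must be strictly increasing in time")
        worst = max(worst, abs(c1 - c0) * 60.0 / (t1 - t0))
    return worst


def within_limits(
    readings: list[tuple[float, float]], max_temp_c: float, tapes_in_use: bool
) -> bool:
    """True if both the temperature ceiling and the rate limit are respected."""
    rate_limit = 5.0 if tapes_in_use else 20.0  # example limits from the text
    hottest = max(temp for _, temp in readings)
    return hottest <= max_temp_c and max_rate_c_per_hour(readings) <= rate_limit


# Hypothetical log during a cooling-unit failure: 22 -> 24 -> 29 °C over one hour.
log = [(0.0, 22.0), (30.0, 24.0), (60.0, 29.0)]
print(max_rate_c_per_hour(log))                                  # 10.0
print(within_limits(log, max_temp_c=32.0, tapes_in_use=False))   # True
print(within_limits(log, max_temp_c=32.0, tapes_in_use=True))    # False
```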
Finally, humidity also needs to be managed. Low humidity leads to increased static electricity, and high humidity can lead to condensation. Both conditions will lead to lower equipment reliability.
With both power (UPS and generator) and HVAC systems, due consideration has to be given to:
- Regularly scheduled maintenance
- Regular testing under full load (of UPS and generators, and backup HVAC equipment if not used in production)
- System fault detection and alerting (and regular tests of those subsystems)
- Periodic checks and audits to ensure all of the above are being properly and regularly performed
Without the above, the risk mitigations from your expensive backup systems might be more imaginary than real.
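The periodic audit in the last item above can be partly automated by comparing the date each activity was last performed against its required interval. This is a sketch only: the activity names and the intervals are hypothetical placeholders, not recommended schedules, and an activity with no recorded date is treated as overdue.

```python
from datetime import date, timedelta


def overdue_items(
    last_done: dict[str, date], intervals_days: dict[str, int], today: date
) -> list[str]:
    """Return the activities that are overdue or have never been recorded."""
    overdue = []
    for name, interval in intervals_days.items():
        last = last_done.get(name)
        if last is None or today - last > timedelta(days=interval):
            overdue.append(name)
    return sorted(overdue)


# Hypothetical intervals and records for illustration only.
intervals = {"ups_load_test": 180, "generator_load_test": 30, "alerting_test": 90}
records = {
    "ups_load_test": date(2024, 1, 10),
    "generator_load_test": date(2024, 5, 1),
    # "alerting_test" has never been recorded
}
print(overdue_items(records, intervals, today=date(2024, 6, 1)))
# ['alerting_test', 'generator_load_test']
```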
As discussed earlier in the “Industrial Control Systems” section, the system that monitors and manages your HVAC or UPS system can be a vulnerability itself. If an attacker can remotely access your HVAC system and disable the cooling, possibly on a weekend when your data center is not staffed, the servers will overheat, and that can be just as effective a denial of service as one launched by 100,000 bots. Similarly, unauthorized remote admin access to your UPS can result, at best, in your UPS being disabled and the next power outage not being mitigated; at worst, it can result in a direct power outage triggered by taking the UPS offline.
A variety of industry, national, and international standards cover HVAC, utilities, and environmental systems. These include:
- The Uptime Institute’s Data Center design certification tiers assess areas including facility mechanical and electrical systems, as well as environmental and design considerations. Specifications for the tiers can be found at https://uptimeinstitute.com/resources/.
- The International Data Center Authority (IDCA) provides open standards for data centers, facilities, and infrastructure at https://www.idc-a.org/data-center-standards.
- The ASHRAE standards for ventilation, refrigeration, building automation, and a variety of other related topics can be found at https://www.ashrae.org/technical-resources/standards-and-guidelines.
Other standards, including the LEED standard, BICSI-001, and those created by one of the three European standards organizations (CEN, CENELEC, or ETSI), may all be useful or even required in the design of utilities and HVAC systems.