Evaluating the Availability of UPS Architecture
Summary of a white paper produced by Centiel UK Ltd and Università della Svizzera Italiana, Lugano, in collaboration with ETH Zurich
Ensuring a continuous supply of electrical power is vital in safety-critical environments such as hospitals, data centres and commercial institutions, where even the shortest interruption may cause significant financial losses or even endanger lives. An Uninterruptible Power Supply (UPS) system provides power when the main source is interrupted or fails, and it also ensures a high level of power quality. Ensuring the highest possible availability of a UPS system is therefore of paramount importance.
Redundancy is the most common way to improve the availability of a UPS system. However, different architectural solutions with the same number of redundant components may result in significantly different levels of availability.
The white paper, entitled Evaluating Availability of Centiel UPS Architecture, was developed through a collaboration between Università della Svizzera Italiana, Lugano and ETH Zurich, and can be downloaded in full from www.centiel.co.uk. In the white paper, different UPS architectures were examined, ranging from a single UPS to fully redundant, parallel-path topologies, to compare their availability.
Availability is defined as the readiness of a system to provide correct service. Steady-state availability is the most commonly used metric for quantifying availability and is defined as the fraction of time a system is operational during its expected lifetime.
Availability equals Mean Time to Failure (MTTF) divided by MTTF plus Mean Time to Repair (MTTR).
MTTF is the mean operating time between two consecutive failures, whereas MTTR is the mean time needed for the repair. In other words, MTTF is the average time during which the system is up after it has been repaired and before it fails again.
To make the comparison between architectures straightforward, steady-state availability is usually expressed as a number of nines. For example, "five nines" availability means that steady-state availability is 0.99999 or 99.999%.
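The two definitions above can be sketched in a few lines of Python. The function names and the MTTF and MTTR values used here are hypothetical, chosen only so that the result lands on exactly five nines; they are not taken from the white paper.

```python
import math


def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def number_of_nines(availability: float) -> float:
    """Express availability as a (possibly fractional) count of nines."""
    return -math.log10(1.0 - availability)


# Illustrative values only (not from the white paper):
# up for 99,999 hours on average, then 1 hour to repair.
a = steady_state_availability(mttf_hours=99_999.0, mttr_hours=1.0)
print(f"availability = {a:.5f}, nines = {number_of_nines(a):.2f}")
```

Allowing a fractional number of nines makes it possible to express figures such as "six and a half nines" on the same scale.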
[Fig 1] A simplified Reliability Block Diagram (RBD) of a UPS module
A single UPS unit comprises a Rectifier, a Battery and an Inverter. The Mean Time to Failure is, on average, 50,000 hours for the Rectifier, 100,000 hours for the Battery and 50,000 hours for the Inverter. Knowing these component figures, we used a hierarchical modelling tool called SHARPE to calculate the Mean Time to Failure of a single UPS unit as 20,000 hours (more details about this calculation can be found in the white paper). Assuming a Mean Time to Repair of 6 hours, the availability of a single standalone UPS unit is 0.9997. Therefore, in one year the expected downtime for a single UPS is 2 hours 37 minutes.
To improve the availability of a single UPS unit, it can be placed in parallel with the main power source using a Static Bypass Switch (SBS). Then, if any of the UPS components fails, the load can be switched to the main source, ensuring there is no interruption.
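As a rough cross-check of the single-unit figures, the sketch below treats the three components as a series system with constant failure rates, so the rates simply add. This is a simplifying assumption and not the SHARPE model used in the white paper. The battery MTTF is taken as 100,000 hours, the value that reproduces the stated 20,000-hour unit MTTF; the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760  # 365 days


def series_mttf(component_mttfs):
    """MTTF of a series system, assuming constant (exponential) failure rates.

    In a series arrangement any component failure brings the unit down,
    so the individual failure rates (1 / MTTF) add up.
    """
    return 1.0 / sum(1.0 / m for m in component_mttfs)


# Rectifier, Battery, Inverter (hours); battery value is an assumption.
mttf = series_mttf([50_000, 100_000, 50_000])
mttr = 6.0  # hours, as assumed in the text

availability = mttf / (mttf + mttr)
downtime_hours = (1.0 - availability) * HOURS_PER_YEAR

print(f"MTTF = {mttf:.0f} h")
print(f"availability = {availability:.4f}")
print(f"annual downtime = {downtime_hours:.2f} h")
```

Under these assumptions the sketch gives an MTTF of 20,000 hours, an availability of roughly 0.9997 and a little over 2.6 hours of downtime per year, in line with the 2 hours 37 minutes quoted above.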
We compared the availability of a UPS with the main power source and Static Bypass Switch.
[Fig 2] A Reliability Block Diagram of a UPS with the main source.
By calculation, a customer using only the mains supply without a UPS system will experience 17.5 hours of power interruption per year, compared with 2.6 hours per year for a single UPS module. With the Static Bypass Switch included, availability improves significantly, to 0.99998 or just 1 second of annual downtime.
To get higher power outputs, several UPS units must be combined. Next we analysed these architectures.
A system with higher output requires several single UPS units to be connected in parallel to the main power via a Static Bypass Switch. The UPS units must also be connected to each other to ensure synchronous operation. Communication is established via a parallel bus (PBUS). However, although the parallel bus is a highly reliable component, in a typical configuration the failure of any of the buses will cause the entire system to fail. An example system with 160kVA output therefore has a reduced availability of 0.999993, equivalent to 3 minutes and 40 seconds of downtime per year.
To boost dependability, redundant UPS units are introduced, in what is known as an n+1 configuration. Here, if one unit fails, the system may continue operation with the remaining n modules. The MTTR of a modular architecture decreases to only 0.5 hours. Such systems may tolerate the failure of a single UPS unit. However, the failure of any of the parallel buses still makes the entire system fail.
Centiel's n+1 modular architecture introduces numerous changes to improve the overall dependability. Specifically, the control logic allows communication between UPS units to be maintained even when one of the parallel buses has failed.
In addition, the MTTR of a single UPS is further decreased by placing the Static Bypass Switch fuse of each module at the frame level, outside the module. In this way, the fuse may be replaced without the need to pull out or open the entire module. We have assumed that the MTTR of the fuse is five minutes and that the fuse causes 5% of the overall module failures.
In Centiel's architecture, there are several changes that increase the system's MTTF and decrease its MTTR. In this white paper we consider only two: the improved parallel bus configuration and the isolation of the Static Bypass Switch module fuse. With these changes, Centiel's architecture significantly outperforms a typical n+1 architecture: availability increases from six and a half nines to ten and a half nines, and annual downtime decreases from 12 seconds to 0.0006 seconds.
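One simple way to see the effect of the fuse assumption is a weighted average of repair times. The sketch below assumes that the remaining 95% of module failures take the 0.5-hour modular MTTR mentioned earlier, and that the two repair times can be averaged by their share of failures. Both points are assumptions made for illustration; the white paper's model may combine them differently, and the function name is hypothetical.

```python
def effective_mttr(fuse_share: float, fuse_mttr_h: float, module_mttr_h: float) -> float:
    """Average repair time when a share of failures is a quick fuse swap."""
    return fuse_share * fuse_mttr_h + (1.0 - fuse_share) * module_mttr_h


mttr_hours = effective_mttr(
    fuse_share=0.05,          # fuse causes 5% of module failures (from the text)
    fuse_mttr_h=5.0 / 60.0,   # five-minute fuse replacement (from the text)
    module_mttr_h=0.5,        # assumed repair time for all other failures
)
print(f"effective MTTR = {mttr_hours * 60:.2f} minutes")
```

Under these assumptions the average repair time falls from 30 minutes to just under 29 minutes.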
However, the 1+1 configuration has a limited output of 40kVA. As the output increases, the availability of both architectures decreases slightly, but the improvement offered by the Centiel architecture (when compared to a typical one with the same level of redundancy) becomes even greater. For example, when the output is 160kVA (4+1 configuration), a typical architecture has an availability of six nines (annual downtime of 32 seconds), whereas the Centiel architecture achieves nine nines, or 0.03 seconds of downtime per year.
In conclusion, the white paper demonstrates that Centiel's unique approach, which combines innovative features with modular architectures, achieves the highest levels of availability even for high-output configurations. With an advanced architecture it is therefore possible to improve availability by multiple orders of magnitude.
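The relationship between annual downtime and the number of nines can be checked with a short conversion. The sketch below assumes a 365-day year; the function name and the labels are hypothetical, and the downtime values are the ones quoted in the text.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, assuming a 365-day year


def nines_from_annual_downtime(downtime_seconds: float) -> float:
    """Convert annual downtime into a (fractional) number of nines."""
    unavailability = downtime_seconds / SECONDS_PER_YEAR
    return -math.log10(unavailability)


# Annual downtime figures quoted in the text, in seconds.
quoted = [
    ("typical 1+1", 12.0),
    ("Centiel 1+1", 0.0006),
    ("typical 4+1", 32.0),
    ("Centiel 4+1", 0.03),
]
for label, seconds in quoted:
    print(f"{label}: {nines_from_annual_downtime(seconds):.1f} nines")
```

The conversion gives about 6.4 and 10.7 nines for the 1+1 figures and about 6.0 and 9.0 nines for the 4+1 figures, consistent with the rounded values quoted in the text.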
Article originally produced for Data Centre Dynamics Magazine