Energy Conservation And Reliability

Reliability Assessments for Data Centers and the Liability Pitfalls

By Peter V.K. Funk, Jr.
Mission Critical magazine
November 30, 2011

As data centers seek increased levels of availability and efficiency, they are turning to manufacturers and consultants to provide reliability and efficiency assessments. Reliability may require a review of the HVAC, power supply, and protection infrastructure. Assessment providers should be aware that giving a data center a passing grade in connection with a reliability assessment may be an open door to provider liability if the assessment misses a vulnerability that results in a downtime event. 

The history of reliability assessments in one mission-critical industry — utility electrical power plants — underlines the importance of contracting parties agreeing upon the specific scope of reliability review, the extent to which the data center can rely upon the assessment, and the limits of provider liability.

Engineering reviews of equipment and systems during utility plant scheduled outages must be performed carefully, yet expeditiously, since getting the plant back on-line is mission critical. Every minute that the plant is inoperative can represent a loss of sales of power since utilities typically have ready markets for power from baseload plants.

An illustrative example involved a scheduled outage for a 500-megawatt (MW) oil- and coal-fired plant. (The facts are changed and names are omitted for confidentiality purposes, but the facts are similar to those in the actual case.) 

A well-known engineering firm was engaged by a utility to perform an engineering review of a power plant, including an inspection of a few rivets that had come loose from among the numerous rivets supporting the economizer assembly (heat exchange devices that use exhaust gases from the boiler to preheat the water used to produce steam). Elements of power plant steam production systems are suspended vertically from overhead girders within the power plant building, and the economizer was part of the hanging burner and steam production facility.

The engineering report stated that the missing rivets could be replaced at the next scheduled outage and did not present any immediate threat of failure. This advice was dramatically wrong. Relying on that engineering advice, the utility did not replace the rivets; the plant went back on-line and, a short time later, the remaining rivets failed. The economizer collapsed, severely damaging all the equipment hanging below it and causing the plant to shut down. Since this was a failure of equipment and not a casualty, the utility’s insurance did not cover the cost of repairs.

The utility sued the engineering firm for $40 million, based upon direct damages and lost sales, on grounds of breach of contract and of negligence. The engineering firm defended itself on the grounds that its contract capped breach-of-contract liability at the cost of the contract, approximately $75,000. In addition, the firm argued that breach-of-contract damages did not include consequential damages such as lost profits. The utility claimed that the firm’s failure to properly advise the utility was so egregious that it amounted to negligence and that damages for negligence were not specifically limited by the provisions of the contract. The utility also alleged that the damages in question were not consequential but were actual, based on lost sales, inasmuch as the utility could sell all the baseload power it generated from that plant, a fact that was known to the engineering firm.

In light of a possible adverse outcome of litigation due to the contract’s scope of work, and the extremely adverse consequences of the engineering firm’s erroneous advice, the engineering firm agreed to settle. It also appeared that the firm’s concern with its reputation was a major factor.

An important lesson of this incident is that, even with protective provisions in place, if the scope of work involves a reliability assessment, if the error is sufficiently egregious, and if the facility relies upon the erroneous assessment to its detriment, an assessment provider may perceive no choice but to pay a substantial settlement to avoid litigation.

Data Center Reliability Assessments

A possible danger to providers of assessment services to data centers is that the language of the contract may focus on reassuring the customer that the assessment will be accurate without sufficient regard to the scope of that promise. Overly reassuring language may lead to liability. In addition, any ambiguity in the contract as to intent may lead a court or arbitrator to interpret the contract based upon “outside” information such as on-line marketing materials. The following excerpts, taken from the websites of companies that provide or arrange for data center assessments, appear to give broad assurances of reliability:

  • We have designed a proactive approach to ensuring system reliability and risk reduction knowing that your systems are working as designed and will provide peace of mind and added confidence in your system infrastructure.
  • [Our data center reliability analysis] report helps to prioritize upgrades and creates a plan to maintain a reliable data center far into the future.
  • A comprehensive facilities assessment can help you determine how well your physical infrastructure meets your current and planned networking needs. The XYZ Data Center Assessment Service evaluates your physical data center infrastructure, including the overall site, power, cooling, physical security, and operational practices. The power and cooling assessment can help you make sure that your facility provides the power and cooling that your network requires for high reliability. The power and cooling assessment provides the following deliverables: (i) Electrical distribution system capacity, redundancy, and points of failure; (ii) Air conditioning system capacity, redundancy, and points of failure; (iii) UPS capacity and redundancy; and (iv) Generator capacity and redundancy.

A provider of assessment services may focus primarily upon services other than the reliability assessment, such as energy-efficiency assessments and optimization of existing equipment. The energy system reliability assessment portion of the work scope may be viewed as providing an added benefit but not constituting the core of the services and may, for that reason, not be specific as to the extent to which the data center can rely upon the energy elements of the reliability assessment or address the potential liability of the provider. Notwithstanding the focus of the assessment provider, so long as any part of the scope of work includes a reliability assessment, the potential for an “economizer” situation exists.

How can providers of data center reliability assessments protect themselves against unexpected liability? How can data centers be certain of the extent to which they can rely upon reliability assessments? What contractual remedies may be available if an incorrect assessment causes liability?

Protection for the Assessment Provider

  • A disclaimer informs the data center that an assessment, although helpful, is not guaranteed to be exhaustive and that an omission will not lead to provider liability or provide a remedy: “Our intent is to perform a careful and comprehensive reliability assessment of the power and cooling systems, but we do not provide a guarantee that we will identify every possible point of system failure.”
  • Language qualifying the scope of work, such as “best efforts” and “seek to identify” rather than words such as “will identify,” serves a similar purpose to the disclaimer. During the course of my career I have seen egregious language in contracts, the worst of which came from a contractor who promised that the quality of the work would be “perfect.”
  • A provider may include specific guarantees of performance and exclude all others.
  • A provider may include covenants that limit the potential damages to some predetermined amount and that provide that damages do not include indirect or consequential damages, which terms must be defined in the contract.

Protection for the Data Center

  • The data center should understand the scope of work offered and analyze whether it is sufficient for the desired purpose.
  • The price of the assessment should be consistent with the degree to which the assessment can be relied upon. The principle “you get what you pay for” is worth considering in order to arrive at the desired level of availability, meaning the proportion of time the facility is available for operation (for a power plant, the time it is available for generating electricity or gas).

One index used by both utility power plants and data centers measures availability of power in terms of nines. A six nines (99.9999 percent) level of uptime means 31.5 seconds of downtime per year; five nines (99.999 percent) means 5.26 minutes of downtime per year; and four nines (99.99 percent) means 52.56 minutes of downtime per year.
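
The arithmetic behind these figures can be checked with a short sketch. It assumes a 365-day year, and the function name and constant are illustrative only and do not come from any standard or contract.

```python
# Illustrative sketch: yearly downtime implied by an availability of N "nines".
# Assumption: a 365-day (non-leap) year of 31,536,000 seconds.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(nines: int) -> float:
    """Return the downtime per year, in seconds, allowed by `nines` nines.

    Four nines means 99.99 percent availability, so the unavailable
    fraction is 10 ** -4, and so on.
    """
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    unavailable_fraction = 10.0 ** (-nines)
    return SECONDS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for n in (4, 5, 6):
        seconds = downtime_seconds_per_year(n)
        print(f"{n} nines: {seconds:.1f} s ({seconds / 60:.2f} min) per year")
```

Under that assumption, the results agree with the figures quoted above: roughly 52.56 minutes for four nines, 5.26 minutes for five nines, and 31.5 seconds for six nines.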

A data center seeking an assessment that supports a six nines level of availability, together with a guarantee of performance and significant remedies against the provider for defective performance, should make certain that both a defined scope of work and corresponding guarantees are in the contract so that the data center is protected. Such a high level of services will necessarily be priced at a correspondingly high level. Conversely, if a basic assessment is sought, one that supports a much lower level of reliability, the contract’s scope of work, potential liability, and cost should be correspondingly low.