Learning Ceph (Second Edition)

Power requirements-amps, volts, and outlets

In the US, standard outlets are 110-120V, 60 Hz, with NEMA 5-15 plugs, rated at ~1,875 watts. Other parts of the world use different standards, but the same principles apply. Many (but by no means all) small to medium and even large data centers and server installations use the local common standard. Servers today often incorporate power supplies that can handle a range of input voltages and frequencies, requiring only a localized power cable to match the proper plug/socket type.

Some larger storage appliances, disk arrays, and even certain dense servers, however, may require (US) 240V power that may not be readily available in your data center. Even servers that accept conventional power may pull so much current that you cannot fill a rack with them. Say you have 40 RU of rack space available and choose a 4 RU server that pulls 800 watts. If your data center racks are provisioned for only 5,000 watts each, then you can stack only six of these servers per rack, and might be paying dearly for space occupied only by blanking panels: 16 RU of expensive air.
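
This back-of-the-envelope arithmetic is worth capturing as you compare chassis options. Here is a minimal sketch in Python; the constants are the figures from the example above, so substitute your own rack and server numbers.

# A sketch of the rack-fill arithmetic: how many servers fit, and which
# constraint (power or space) binds first? Figures are from the example above.
RACK_POWER_W = 5_000    # power provisioned per rack
RACK_SPACE_RU = 40      # usable rack units
SERVER_POWER_W = 800    # draw per server
SERVER_SIZE_RU = 4      # height per server

power_limited = RACK_POWER_W // SERVER_POWER_W    # 6 servers
space_limited = RACK_SPACE_RU // SERVER_SIZE_RU   # 10 servers
servers = min(power_limited, space_limited)

wasted_ru = RACK_SPACE_RU - servers * SERVER_SIZE_RU
print(f"{servers} servers per rack, {wasted_ru} RU of expensive air")
# -> 6 servers per rack, 16 RU of expensive air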

Some data centers similarly provide only a limited number of outlets per rack, and with lightweight servers (say, a 1 RU model using SSDs) you may run out of outlets long before you use up the power budget for a given rack. Some data centers, especially those that favor telco gear, offer 48V DC power. Major server vendors usually offer models or power supplies compatible with 48V power, but that may not extend across the entire product line. Some data centers favor DC power for efficiency, avoiding the losses inherent in one or more AC-DC or DC-AC conversion steps through the stages of building power, conditioning, uninterruptible power supply (UPS) batteries, and server power supplies. Often, this is an all-or-nothing proposition for an entire data center; do not assume that yours can accommodate these niche servers without consulting your data center management people.

In practice, one often balances power-hungry and leaner gear within a rack to stay within outlet and current budgets, say, by mixing in IDF/ODF panels and network gear to average out the draw. Factor these layout decisions into your Ceph cluster's logical topology so that a single rack's loss does not cripple a failure domain.

Within the server itself, you may have multiple options for the power supplies you order. You must evaluate the trade-offs among the following factors:

  • Local data center standards
  • Startup, peak, and steady-state server wattage requirements
  • Power supply efficiency
  • Redundancy

Many Ceph installations find that a 2 RU server is the sweet spot; this may or may not be true for your data center and use case. It is also very common in an enterprise setting to provision servers and other gear with redundant power supplies, both to guard against component failure and to allow electrical maintenance without disrupting operations. Some data center managers even require redundant power.

This author experienced the meltdown of a power inductor within a Ceph OSD server, which caused the entire server to shut down. The monitoring system alerted the on-call team to the failure of that node, and the cluster was rebalanced to remove it from service. It was not until the next day that this author learned that the failure had also taken down half of the power for the entire row of 20 racks. The servers were all provisioned with dual power supplies, which allowed them to survive the outage.

Server vendors often allow one to select from an array of power supply unit (PSU) choices:

  • Single, dual, or quad
  • Multiple levels of efficiency
  • Multiple current ratings
  • Voltage compatibility

Your choice can significantly affect your cluster's reliability and data center infrastructure costs. Dual power supplies also mean dual power sockets, which may be in short supply in your racks. They also mean dual power cords, a commodity on which server vendors may enjoy considerable markup. Always ensure that your power cords are rated for enough current.

A single PSU architecture simplifies cabling and often costs less than a dual solution. The downside is that the failure of that PSU takes down the entire server, and the failure of the power circuit feeding it can take down an entire rack or row. A single power cable is also vulnerable to becoming unseated by a technician brushing against it while walking down the aisle or working on a nearby server, or even by trucks rumbling down the street. In large clusters of smaller servers, distributed among multiple racks, rows, or even rooms, this may be an acceptable risk with careful logical topology choices and settings. This author, however, suggests that the operational cost of failure or a protracted outage usually outweighs the additional cost and layer 1 complexity.

With any PSU architecture, you must ensure that you provision adequate capacity for the server's present and future needs. For example, as I write, one prominent server vendor offers no fewer than ten PSU options, with ratings of 495, 750, and 1,100 watts. Server specs and configurators often provide a sum of the current/wattage needs of the selected chassis and components. Be aware that the startup, peak, and typical power consumption of components, especially rotational drives, can vary considerably. Drive spin-up can in some cases be staggered to de-align the peaks.
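
To illustrate why startup versus steady-state draw matters, here is a rough sketch that sums per-component figures for a hypothetical 12-drive chassis; all of the wattages are illustrative assumptions, not any vendor's specifications.

# Hypothetical per-component draws in watts: (steady-state, startup peak).
# All figures are illustrative assumptions, not vendor specifications.
COMPONENTS = {
    "chassis_fans": (60, 60),
    "cpus_x2":      (190, 250),
    "ram":          (40, 40),
    "hba_nic":      (30, 30),
    "hdds_x12":     (12 * 8, 12 * 25),   # ~8 W spinning vs ~25 W at spin-up
}

steady = sum(s for s, _ in COMPONENTS.values())
peak = sum(p for _, p in COMPONENTS.values())
print(f"steady-state: {steady} W, aligned startup peak: {peak} W")
# Staggered spin-up lets you size PSUs nearer the steady-state figure plus
# one drive's surge, instead of the full aligned peak.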

Plan also for future upgrades. If you order a chassis model with 24 drive bays but populate only 12 today, you will want to provision a power infrastructure that can accommodate the additional load. Two years from now you may also wish to replace your 4 TB drives with 8 TB models, which itself may require additional RAM. Those adjustments may add non-trivially to your power needs. That said, if your foreseeable power needs per server add up to, say, 600 watts, dual 1,000-watt PSUs may be overkill, and dual 750-watt units would serve just as well. Incremental price increases or savings per unit add up rapidly in clusters with dozens or even hundreds of servers.
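
Here is a sketch of that sizing exercise, assuming a hypothetical build-out from 12 to 24 drives and a modest RAM bump; the PSU ratings are the three from the vendor example above, while the per-drive wattage, RAM increment, and 20% headroom factor are guesses for illustration.

# Pick the smallest PSU rating that covers the projected future load with
# headroom. Ratings are from the vendor example; loads are assumptions.
PSU_RATINGS_W = [495, 750, 1100]
CURRENT_LOAD_W = 600
FUTURE_LOAD_W = CURRENT_LOAD_W + 12 * 10 + 15   # 12 more drives + extra RAM
HEADROOM = 1.2                                  # 20% safety margin

needed = FUTURE_LOAD_W * HEADROOM
rating = next(r for r in PSU_RATINGS_W if r >= needed)
print(f"projected {FUTURE_LOAD_W} W -> provision {rating} W PSUs")
# -> projected 735 W -> provision 1100 W PSUs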

Also, be sure that if you lose a single power supply, those that remain have enough capacity to run the server properly. In the preceding example, this is why dual 495-watt units would not be a wise choice. These days, many servers used for Ceph offer only single or dual PSU configurations, but there are still models that offer three or even four. Say you need 900 watts of power and have four PSU slots. You might provision a single or dual 1,000-watt PSU configuration, three 495-watt units, or even four 300-watt units. With any multi-PSU choice, if the server cannot keep running after the loss of one unit, you have actually decreased reliability compared to a single-PSU architecture.
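
The survivability check is simple enough to encode. This sketch walks the configurations from the 900-watt example above; note that four 300-watt units pass, but with zero margin.

def survives_one_psu_loss(count: int, rating_w: int, load_w: int) -> bool:
    """True if the remaining PSUs can carry the server after one fails."""
    return (count - 1) * rating_w >= load_w

LOAD_W = 900   # the example's requirement
for count, rating in [(1, 1000), (2, 1000), (3, 495), (4, 300)]:
    verdict = "survives" if survives_one_psu_loss(count, rating, LOAD_W) else "FAILS"
    print(f"{count} x {rating} W: {verdict} a single PSU loss")
# 1 x 1000 W fails trivially; 2 x 1000 W and 3 x 495 W survive with margin;
# 4 x 300 W survives, but with exactly zero headroom (3 x 300 = 900 W).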

Some server vendors offer multiple power supply efficiencies. A PSU with a greater efficiency rating may result in lower overall costs for data center power and cooling, at the trade-off of a higher purchase price. Consider household natural gas furnaces: a traditional model may be rated at 80% efficiency, while a condensing model recovers heat from the effluent, saving on monthly gas bills at the expense of a higher up-front purchase and installation cost. For example, as I write, one prominent server vendor offers Silver PSUs at 90% efficiency, Gold at 92%, and Platinum at 94%. Consider your local CapEx and OpEx trade-offs as well as corporate green guidance and data center standards.
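
To make the OpEx side concrete, here is a sketch comparing those three efficiency tiers at a constant load; the 600-watt load and $0.12/kWh electricity rate are assumptions, and reduced cooling load would add further savings on top.

# Wall draw and annual energy cost per PSU tier for a constant DC load.
# Efficiencies are the Silver/Gold/Platinum figures cited above; the load
# and electricity rate are assumptions.
LOAD_W = 600
RATE_PER_KWH = 0.12
HOURS_PER_YEAR = 24 * 365

for tier, eff in [("Silver", 0.90), ("Gold", 0.92), ("Platinum", 0.94)]:
    wall_w = LOAD_W / eff
    annual_usd = wall_w / 1000 * HOURS_PER_YEAR * RATE_PER_KWH
    print(f"{tier}: draws {wall_w:.0f} W at the wall, ~${annual_usd:.0f}/year")
# Multiply the per-server delta by the number of servers (and PSUs) in the
# cluster to weigh the savings against the higher purchase price.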