Round Robin Scheduling Not Power-Efficient

Posted by kdawson on Fri May 09, 2008 10:01 AM
from the toward-cooler-server-farms dept.
Via_Patrino writes "When distributing load among several servers, round robin, or any other technique that balances load equally, is the most common approach because of its simplicity. But a recent study shows that concentrating load on some servers can improve energy efficiency, because the other servers will be mostly unused during off-peak periods and can then make better use of power-saving methods. Especially where load involves lots of concurrent, power-consuming TCP connections, which was the case in the study, a new load-balancing algorithm resulted in an overall 30% power savings. Here's the paper (PDF)."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Logical conclusion (Score:5, Insightful)

    by DoofusOfDeath (636671) on Friday May 09 2008, @10:04AM (#23350294)
    So if we're willing to sacrifice speed for energy savings, shouldn't we just use the bare minimum number of computers that can handle the workload without crashing?
    • by TooMuchToDo (882796) on Friday May 09 2008, @10:12AM (#23350402)
      What this means is someone needs to architect an intelligent loading system. Ideally, it would manage the load on your base load servers (that are on all the time), and when those servers reach 85-95% of capacity (numbers from my ass) other servers should be brought out of low power/sleep mode to start serving.

      Of course, if you use Amazon EC2, this is all moot, as they can shift load around to have their cluster run at peak efficiency.

      • It can be done with your standard job scheduling/load balancing systems and, yes, about half a dozen shell scripts.
         
      • It's not clear from the article that they're referring only to persistent connections which remain open but which don't have much activity. The one they analyze is Windows Live Messenger.

        They talk about 30% savings in these applications, but also give 59bn kWh as the figure for total power usage for all data centers, the majority of which probably wouldn't benefit from tweaks suited to persistent connections.
    • BTW Dynamic workflow based provisioning of VMs can (or will eventually) allow you to do this without sacrificing speed.
    • Re: (Score:2, Informative)

      I think that's the point of the study and solution. Round robin doesn't account for under-utilization of resources, so it still balances between multiple servers when that isn't needed. What their new algorithm does is allow the servers that are not needed to use their power saving features, and maximize utilization of only the needed resources (servers).
    • by MozeeToby (1163751) on Friday May 09 2008, @10:23AM (#23350562)
      We've been sacrificing computing power for efficiency for years. New server CPUs tout their energy savings at least as much as, and quite often more than, their computational power. As electricity gets more expensive and data centers continue to grow, this trend can only continue; it's simply too expensive to run a warehouse full of server racks unless you focus on efficiency.

      I'm waiting for the first company to put a data center a few hundred feet under water, where the water temp is low. You'd be surrounded by the world's biggest heat sink. The environmentalists would have a hissy fit but that's never stopped industry before, and of course you could argue that you are saving electricity on cooling.
      • I would totally be willing to have a site hosted in a data center that's under the water table and depends on a reliable source of power to keep its pumps going. It sounds like nothing could go wrong. I'll just throw in a few extra servers for failover if anything happens.

        Screw you, environmentalists!
        • I'm not quite sure if this is a positive comment, or a negative comment wrapped in positive sounding sarcasm. I'm responding as though it were the latter.

          Your comment assumes that losing power to the facility is a catastrophe. I say it's not, if it's unmanned.

          Yeah, your cooling setup is unpowered, but so are your heat sources.

          I bet you could put a set of stilts into the ground, and build a computing environment wrapped around the stilts. Lower the equipment into the water when it's running, and ra
      • 1: Then put turbine blades above your data center, so that the upwelling heated water spins them, generating electricity.

        2: Use some of the electricity to power your data center, and the rest to power other thermodynamically impossible projects.

        3: Profit!! (no hidden step needed here, just an impossible one)
        • Re: (Score:3, Informative)

          Enwave Energy Corporation in Toronto, Ontario is already doing this. They have a 59K ton integrated district cooling plant using deep lake water as an energy sink. Chicago is thinking of doing something similar with the huge volume of water they already draw from the lake for other purposes. The Toronto project probably kept another coal plant from coming online because it's got a cooling capacity of 207MW which would require about 400MW of electricity between transmission losses and cooling system ineffici
    • Re: (Score:2, Interesting)

      by Anonymous Coward
      You aren't sacrificing speed as long as you have properly benchmarked your servers, and understand where the performance hockeystick starts.

      Apply a connection limit slightly below the performance hockeystick in your load balancers / content switches and you will get maximum power utilization with minimum performance impact.

      One other way I see customers getting maximum utilization out of their servers is by using dynamic resource scheduling and VMware ESX to move virtual servers around behind a load balancer. At
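The connection-limit suggestion in the comment above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer implements it: the names `assign_pack_first` and `CONNECTION_CAP` are made up here, and the cap of 100 is an assumed figure that would really come from benchmarking where the hockey stick starts.

```python
from typing import List

CONNECTION_CAP = 100  # assumed value; set slightly below the benchmarked knee


def assign_pack_first(active: List[int], cap: int = CONNECTION_CAP) -> int:
    """Place one new connection on the first server that still has room.

    `active` holds the current connection count per server, in fill order.
    Returns the index of the chosen server.
    """
    for i, count in enumerate(active):
        if count < cap:
            active[i] += 1
            return i
    raise RuntimeError("all servers are at the connection cap")


servers = [0, 0, 0, 0]
for _ in range(150):
    assign_pack_first(servers)
print(servers)  # [100, 50, 0, 0]
```

With 150 connections, two of the four servers carry everything and the other two stay idle, which is the state in which power-saving features can kick in.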
  • by athloi (1075845) on Friday May 09 2008, @10:05AM (#23350306) Homepage Journal

    Confronted with distributing food rations to hungry orphans, people would rather be fair than efficient, even if it means letting some of the food go to waste, a US study shows.

    But the tests demonstrated that most people preferred equity in distributing food -- that all the hungry mouths got fed equally -- rather than an efficiency that perhaps meant that one orphan got almost nothing but also that no food went to waste.

    http://news.yahoo.com/s/afp/20080509/ts_alt_afp/ussciencepsychologymoralityresearch_080509123210 [yahoo.com]


    This problem shows up in many places.
  • by dsginter (104154) on Friday May 09 2008, @10:08AM (#23350340)
    I don't think that we should go down this road again - why don't we talk about religion or politics, instead?
  • by Colin Smith (2679) on Friday May 09 2008, @10:11AM (#23350382)
    Just switch them off...

    If the load on your boxes is below a threshold, remove one of them from the load balance list, wait for connections to end, or migrate the processes off to another machine, and switch it off. When the load is above a certain threshold, you power on an additional node, configure it for whichever service and add it to the load balancer.

    Oh come on people, you call yourselves engineers? It really isn't that difficult.
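A rough Python sketch of the threshold rule described above, for anyone who wants to see it spelled out. The function name `desired_nodes` and every threshold in it are assumptions for illustration (the 85% upper bound borrows the rough figure quoted elsewhere in this thread); draining connections and actually powering machines on or off are left out.

```python
def desired_nodes(current: int, avg_load: float,
                  low: float = 0.30, high: float = 0.85,
                  min_nodes: int = 1, max_nodes: int = 10) -> int:
    """Return the node count to run next, given average load per node (0..1)."""
    if avg_load > high and current < max_nodes:
        return current + 1   # power on one more node and add it to the pool
    if avg_load < low and current > min_nodes:
        return current - 1   # drain one node, then switch it off
    return current           # inside the band: leave the pool alone


nodes = 2
history = []
for load in [0.9, 0.9, 0.5, 0.2, 0.2, 0.2]:
    nodes = desired_nodes(nodes, load)
    history.append(nodes)
print(history)  # [3, 4, 4, 3, 2, 1]
```

Keeping a gap between the low and high thresholds stops the pool from flapping when load hovers near a single cut-off.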

     
    • by russotto (537200) on Friday May 09 2008, @10:18AM (#23350484) Journal

      If the load on your boxes is below a threshold, remove one of them from the load balance list, wait for connections to end, or migrate the processes off to another machine, and switch it off. When the load is above a certain threshold, you power on an additional node, configure it for whichever service and add it to the load balancer.


      Sure, that's not too difficult to do. But it does add complexity. And it does mean your system can't respond to increased load as quickly, as you have to wait for your additional boxes to boot up. If the increased load is predictable, you can anticipate, but that adds more complexity. It doesn't save you on capital costs as you still have to size your power and A/C systems for peak load. Powering the boxes on and off may shorten their lives or reduce their reliability. The question isn't whether it can be done; it's whether it's worth it.
      • If you have a fairly "dumb" system where you're running a webapp across an array of web servers, and you have one DB server, adding the complexity to save power is probably not worth it. If you're Google, Amazon, etc. and your power bill every year is bigger than the real estate bill for some medium sized companies, then you probably should be integrating power efficiency architecture into your process somewhere.
      • Powering the boxes on and off may shorten their lives or reduce their reliability.
        I thought this was debunked a while ago?
      • It isn't a problem. By that, I mean... Watch where you put your state...

        Powering the boxes on and off may shorten their lives or reduce their reliability.
        Who cares, they are disposable $300 boxes. When one dies you take it out and put another one in its place, and send the old one back to the manufacturer to be replaced under warranty.

         
      • or have the "stand-by" system(s) in a sleep mode so they can be ready for the extra load more quickly. This would trigger bringing in/powering on another box which would go into "stand-by" mode if load keeps going up.

        it would be silly to have all your boxen running at 5% load because of a dumb load balancing scheme. Energy wasteful to say the least.

        LoB
    • Re: (Score:3, Informative)

      Agreed. How hard is it to understand that if you use 50% load on 10 servers, you will probably be using more energy than a 100% load on 5 servers? It's common sense when you realize that a 50% load != 50% power consumption.

      I am starting to think I didn't miss much by not going to a big name computer science school.
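To put numbers on "50% load != 50% power consumption", here is a small worked example in Python. The wattages are invented for illustration, and the linear model (a fixed idle draw plus a part that grows with load) is a common simplification rather than anything taken from the paper.

```python
IDLE_WATTS = 150.0   # assumed draw of an idle server
PEAK_WATTS = 250.0   # assumed draw at 100% load


def server_watts(utilization: float) -> float:
    """Linear power model: idle draw plus a load-proportional part."""
    return IDLE_WATTS + (PEAK_WATTS - IDLE_WATTS) * utilization


spread = 10 * server_watts(0.5)   # ten servers at 50% load
packed = 5 * server_watts(1.0)    # five servers at 100% load, the rest switched off
print(spread, packed)  # 2000.0 1250.0
```

Under these assumed figures the same work costs 2000 W spread out and 1250 W packed, because each half-loaded server still pays its full idle draw.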
      • What you describe is more closely related to electrical engineering than computer science.
    • Re: (Score:3, Insightful)

      Oh come on people, you call yourselves engineers? It really isn't that difficult.

      You'd be surprised how much of engineering is taking "obvious" ideas and banging your head against them for months/years trying to get all the details to work out right.
  • Round robin doesn't take any account of the load on each box. If one is dying, round robin will still hand it, say, half the work.

    Load balancing is where you actually check the load and then make an informed decision about where to allocate the work.

    OK, rant over. Now back to your scheduled programming.

    • It depends on what you consider load. If server process load is the load that you're balancing, then yes, you have to check that load to balance it. If connections, bandwidth or people are your load, then round robin is best. For balancing something like serving static files, round robin is probably faster, cheaper and more reliable.
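The difference the two comments above are arguing about fits in a few lines of Python. This is a toy sketch with made-up server names and connection counts: `round_robin` hands out servers in a fixed rotation without looking at them, while `least_loaded` checks the current load first.

```python
from itertools import cycle
from typing import Dict, Iterator


def round_robin(servers: Dict[str, int]) -> Iterator[str]:
    """Yield server names in a fixed rotation, ignoring their load."""
    return cycle(servers)


def least_loaded(servers: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections right now."""
    return min(servers, key=servers.get)


load = {"a": 2, "b": 95, "c": 3}   # "b" is struggling
rr = round_robin(load)
print([next(rr) for _ in range(3)])  # ['a', 'b', 'c']
print(least_loaded(load))            # a
```

Round robin still sends every third request to the struggling server "b"; the load-aware pick avoids it, at the cost of having to track load at all.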
  • by Animats (122034) on Friday May 09 2008, @10:14AM (#23350428) Homepage

    Operators of multiple steam boilers have been dealing with this problem for a century. The number of boilers fired up is adjusted with demand, with the need for some demand prediction because it takes time to get steam up. This was done manually for decades; now it's often automated.

    The same thing applies to multiple HVAC compressors. Usually there's a long-term round-robin switch so that the order of compressor start is rotated on a daily or weekly basis to equalize wear.

    More and more, IT is becoming like stationary engineering.
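Carried over to servers, the compressor-style rotation above might look like the following Python sketch. The names `start_order` and `units_to_run` are hypothetical, and `period_index` stands in for whatever day or week counter the operator uses.

```python
from typing import List


def start_order(units: List[str], period_index: int) -> List[str]:
    """Rotate which unit leads so that wear evens out over time.

    `period_index` is a day or week counter; each period the lead moves on by one.
    """
    k = period_index % len(units)
    return units[k:] + units[:k]


def units_to_run(units: List[str], period_index: int, needed: int) -> List[str]:
    """Run only as many units as demand needs, taken from the rotated order."""
    return start_order(units, period_index)[:needed]


print(units_to_run(["s1", "s2", "s3", "s4"], period_index=1, needed=2))  # ['s2', 's3']
```

Demand decides how many units run; the rotation only decides which ones, so the two concerns stay separate.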

    • Re: (Score:3, Interesting)

      Similar idea to modern fuel efficient engines shutting down cylinders when you're idling as well (probably oversimplifying there but you know what I mean)
    • by russotto (537200) on Friday May 09 2008, @10:27AM (#23350648) Journal

      Operators of multiple steam boilers have been dealing with this problem for a century. The number of boilers fired up is adjusted with demand, with the need for some demand prediction because it takes time to get steam up. This was done manually for decades; now it's often automated.


      Which, alas, won't stop someone from patenting it with respect to servers. Even if it's already been done with computers too.

      Incidentally, I've seen descriptions of currently available HVAC control systems for office buildings which take into account the season, the direction the building faces, the thermal mass of the building, demand, etc, and even learn some of these parameters while running, rather than forcing the installer to calculate them. But every office building I've worked in has had crappy systems which amount to running the compressors on a timer and using individually controlled dampers to provide even cooling (poorly). It seems that we have the technology, but not the will (or the capital) to use it.
    • Similar issues in electricity generation too. They have big (coal/nucular) power stations to satisfy the base demand in electricity and then less efficient gas turbine stations that can fire up (and down) quickly to meet the peak demands.

      It just seems too obvious for there to not be a solution to this in computing already, let alone it requiring a study to come to this conclusion.

      This being so suggests there's more to the story than the summary itself suggests; but to test that I'd have to follow all the li
    • The only thing that makes this hard is a metric of what "fully loaded" means for a server. With generators and boilers, you have a single number which represents output, and you know what the capacity of each unit is, so you know when to start up the next unit. Computer servers are more difficult to characterize.

      So you have to measure some values of server load, convert that to a single number, and use it for load measurement purposes. Then it all works just like boiler scheduling.

      You don't even nee
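One possible way to "convert that to a single number", sketched in Python. Taking the busiest resource as the score is an assumption made here, not something the comment specifies, and the resource names and capacities are invented.

```python
from typing import Dict


def load_score(usage: Dict[str, float], capacity: Dict[str, float]) -> float:
    """Collapse several resource readings into one number (1.0 means full).

    The busiest resource decides the score, since that is the one
    that saturates first.
    """
    return max(usage[name] / capacity[name] for name in capacity)


capacity = {"cpu_cores": 8.0, "connections": 10000.0, "mbit_per_s": 1000.0}
usage = {"cpu_cores": 2.0, "connections": 9000.0, "mbit_per_s": 300.0}
print(load_score(usage, capacity))  # 0.9
```

Here the server looks idle on CPU but is 90% full on connections, so a scheduler using this score would treat it as nearly loaded.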

  • Pound, haproxy (Score:4, Insightful)

    by QuoteMstr (55051) <dan.colascione@gmail.com> on Friday May 09 2008, @10:21AM (#23350538)
    We're running a no-frills OpenBSD load balancer at work. Right now, it's running Pound (the quickest thing we could get up once traffic spiked a few weeks ago), but we're considering other approaches too. haproxy's load balancing knobs look interesting. It looks like you can configure it so the maximum number of clients scales with the current load. The problem is that there's no feedback system.

    Some kind of loadavg-based, or even response-time, feedback mechanism would be great! Pound has that (I believe), but since Pound requires downtime for every configuration change, we want to move away from it ASAP.
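A response-time feedback loop of the kind wished for above could be sketched like this in Python. It is a generic illustration and does not use or imitate the configuration of Pound, haproxy or any other balancer; the class name, the smoothing factor and the backend names are all assumptions.

```python
from typing import Dict, Iterable


class ResponseTimeFeedback:
    """Track a smoothed response time per backend and derive weights from it."""

    def __init__(self, backends: Iterable[str], alpha: float = 0.2):
        self.alpha = alpha  # smoothing factor (assumed value)
        self.avg_ms: Dict[str, float] = {b: 0.0 for b in backends}

    def record(self, backend: str, elapsed_ms: float) -> None:
        """Fold one measured response time into the moving average."""
        old = self.avg_ms[backend]
        if old == 0.0:
            self.avg_ms[backend] = elapsed_ms  # first sample seeds the average
        else:
            self.avg_ms[backend] = self.alpha * elapsed_ms + (1 - self.alpha) * old

    def weights(self) -> Dict[str, float]:
        """Faster backends get proportionally larger shares; shares sum to 1.

        Backends with no samples yet are treated as 1 ms so that they get tried.
        """
        inverse = {b: 1.0 / max(ms, 1.0) for b, ms in self.avg_ms.items()}
        total = sum(inverse.values())
        return {b: v / total for b, v in inverse.items()}


fb = ResponseTimeFeedback(["web1", "web2"])
fb.record("web1", 100.0)
fb.record("web2", 300.0)
print(fb.weights())
```

In this run web1 answers three times faster than web2, so it ends up with about three times the share of new requests.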
    • Re: (Score:3, Informative)

      pen [siag.nu] can perform some configuration changes on the fly using an optional control service; you can set server weightings at least. It's also event driven rather than the thread-per-connection model I believe pound uses, so it should scale better.
  • more obvious statements

    A cluster of computers doing a job is less efficient than a single server doing the same job. On top of that, having a cluster creates more points of failure, and more overhead from communication between those points.

    If you have the option to run the DB & the application on the same server, try to do so.
  • by mlwmohawk (801821) on Friday May 09 2008, @10:27AM (#23350638)
    This is a very cool idea, and I don't think it will affect usability too much either. As long as the load balancer keeps tabs on system loading, via SNMP or something, it can turn machines on and off based on need.

    This assumes your system scales smoothly, i.e. gets proportionally slower as the system load starts to exceed processing capacity. For example, a process will always take 100ms as long as there is CPU time to spare, but once the CPU gets to 100% utilization and you have to start time slicing more processes, that 100ms starts to be 150ms. The load balancer can then spin up a new server and start bringing down the processing times.

    This is an obvious solution to an obvious problem, but until now, we've just never had to examine it.
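The 100ms-becomes-150ms example above follows from a very simple time-slicing model, sketched here in Python. The model (response time stretches in proportion to oversubscription) and the function names are assumptions for illustration; real servers degrade less neatly.

```python
def response_ms(base_ms: float, demand: float, capacity: float) -> float:
    """Processing time under a simple time-slicing model.

    While demand fits within capacity the request takes `base_ms`;
    beyond that it stretches in proportion to the oversubscription.
    """
    return base_ms * max(1.0, demand / capacity)


def servers_needed(base_ms: float, demand: float, per_server: float,
                   target_ms: float) -> int:
    """Smallest server count that keeps the response time at or under target."""
    if target_ms < base_ms:
        raise ValueError("target cannot be faster than the unloaded time")
    n = 1
    while response_ms(base_ms, demand, n * per_server) > target_ms:
        n += 1
    return n


print(response_ms(100.0, demand=1.5, capacity=1.0))                        # 150.0
print(servers_needed(100.0, demand=1.5, per_server=1.0, target_ms=100.0))  # 2
```

With demand at 1.5 times one server's capacity, one server answers in 150ms and a second one brings the time back to 100ms.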
  • Everybody knows that it should be Round Batman, it is soo much better :P

  • BigIP (Score:3, Interesting)

    by ZonkerWilliam (953437) * on Friday May 09 2008, @11:08AM (#23351270) Journal
    BigIP's can use round robin and use prioritizing, in other words one server receives the most connections over the others. So how is this new?
    • More smarts, I think.

      Does your setup allocate ZERO connections to certain servers over some length of time, with those servers set up to reduce energy use upon such zero connections? If not, this looks like it might help.

      They're claiming real-world energy efficiency gains, so it looks like it's an improvement somehow.

      I would assume it's because this now adds dynamic adjustment, which could be based on total system-stack metrics of peak_load_capability, energy_minimization, acceptable_response_time, etc. Somet

  • Amazing that all these discoveries can now be repeated with Green Tech phrasing & sound like they're new. Now a new discovery. Busy waits R not energy efficient. Where's my Nobel prize?

  • by bill_kress (99356) on Friday May 09 2008, @12:15PM (#23352310)
    I think it's probably simplistic to simply distribute a load to all cores of a CPU evenly. Although asymmetrical might be tougher, I could see a system with one low-power always-on core to deal with system requests and organization (maybe even low enough power to remain on during a suspend), one to handle all GUI threads and interact with the GPU on a private bus, a couple of normal cores to handle typical user threading, one of which doesn't come on until the first is like 50% loaded, and one or two high-speed high-power cores that run all-out when the system is plugged in and needs them for intensive processing.

    It would take some targeted software design to take advantage of this, but I think we could be looking at a Moore's-law-style increase in power...

  • by viking80 (697716) on Friday May 09 2008, @12:26PM (#23352464) Journal
    14 soccer moms are taking the team of 14 kids to a game. They have two options:
    A. Spread the kids among all the cars, and drive all the cars (14 cars)
      or
    B. Fill up a car, and send off. Repeat until done. (6 cars)

    What is more energy efficient?

    Soccer moms have solved this without statistical analysis or engine torque curves.
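The car counts above check out if every mom rides along, as this bit of Python arithmetic shows. The five-seat car is an assumption; the post itself does not state the car size.

```python
import math

people = 14 + 14          # 14 moms plus 14 kids
seats_per_car = 5         # assumed: an ordinary five-seat car

spread_out = 14                                  # option A: every mom drives
filled_up = math.ceil(people / seats_per_car)    # option B: fill each car first
print(spread_out, filled_up)  # 14 6
```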
    • Re: (Score:3, Interesting)

      Parent post has it all.
      Car analogy? Check.
      Soccer Moms? Check. Check. (no mention of how many are single though)

      But... a lot of soccer moms don't care. They're busy with their other kids and errands too (each server runs more than just apache), so they want the flexibility of driving their own car. Show me a website where the hardware is designed to be energy efficient, and I'll see a site that can't handle a good slashdotting.
    • by wombert (858309) on Friday May 09 2008, @01:08PM (#23352978)
      I believe your calculations are wrong. It's understandable, though, since soccer parenting is a fairly unique branch of mathematics.

      First off, you're assuming a standard car with 1 adult driver and 4 passengers; instead, you should be using an SUV with a capacity of 6-8, including driver.
      (Result: 4-5 vehicles)

      Next, you have to consider that not all parents will attend every game. The primary reason that soccer moms drive SUVs is that they must occasionally transport several of their child's teammates to a game (or, worse, to practice!) when their turn comes up in the rotation. Therefore, you only need enough SUVs to cover the number of child passengers, and the number of adults will follow.
      (Result: 2-3 vehicles)

      However, you might recall that the other reason that soccer moms drive SUVs is that they often have additional children that have not yet reached sports playing age, and must be transported along with the parent, in a car seat (which, in the case of a standard car, would reduce passenger capacity by at least 20% by rendering the back center seat useless.) Assume that approximately 1 in 3 soccer moms have an additional child to transport, and the child adds to the overall passenger count.
      (Result: 3-4 vehicles)

      Finally, realizing that the overloaded schedule and priorities of child + parent create scheduling conflicts, it is impossible to get optimal performance. At least 1 child per SUV will be late, leaving a seat empty and requiring another parent with car to transport them.
      (Result: 6-8 vehicles)

      The result is a range of possible values, but your initial calculation of 6 vehicles is optimistic at best.

  • by perlith (1133671) on Friday May 09 2008, @12:59PM (#23352882)

    "Round Robin Scheduling Not Power-Efficient when using Windows Live Messenger"

    RTFA, in the abstract, "In this paper, we characterize unique properties, performance, and power models of connection servers, based on a real data trace collected from the deployed Windows Live Messenger."

    The research itself appears pretty solid. I'd be interested if they publish a followup paper where the model was based off of a variety of applications which utilize round-robin, not just one.

  • by cabazorro (601004) on Friday May 09 2008, @02:57PM (#23354438) Homepage Journal
    Here is the solution. In the winter, run your web farm in the Northern Hemisphere. In the summer, migrate to the Southern Hemisphere. Run it in the basements of large apartment complexes. Charge for the heating. Heating oil is going through the roof.