
Amazon's Cloud May Provision 50,000 VMs a Day (Comments: 122)

Posted by kdawson on Wednesday September 30, @07:08AM
from the golden-lining dept.
internet
hardware
news
Dan Jones writes "It has been estimated that Amazon Web Services is provisioning some 50,000 EC2 server instances per day, or more than 18 million per year. But that may not be entirely accurate. A single Amazon Machine Image (the virtual machine) may be launched multiple times as an EC2 instance, so the true number of individual Amazon servers may be lower, perhaps much lower, than 50,000 per day. Even if the estimate is out by a factor of 10, that's still 1.8 million VMs per year. Is that sustainable? By way of comparison, in February of this year, Amazon announced S3 contained 40 billion objects. By August, the number was 64 billion objects. This indicates a growth of 4 billion S3 objects per month, or about 133 million new S3 objects per day. How big can the cloud get before it starts to rain?"
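The arithmetic in the summary can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, growth is assumed to be linear, and a month is assumed to be 30 days, which matches the summary's rounding.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
DAYS_PER_YEAR = 365
DAYS_PER_MONTH = 30  # rough assumption, matching the summary's rounding


def yearly_instances(per_day: int) -> int:
    """Instances launched per year at a constant daily rate."""
    return per_day * DAYS_PER_YEAR


def s3_growth(start_objects: int, end_objects: int, months: int) -> tuple[float, float]:
    """Return (objects per month, objects per day), assuming linear growth."""
    per_month = (end_objects - start_objects) / months
    return per_month, per_month / DAYS_PER_MONTH


if __name__ == "__main__":
    per_year = yearly_instances(50_000)
    print(f"{per_year:,} instances per year")             # "more than 18 million"
    print(f"{per_year // 10:,} if off by a factor of 10")  # roughly 1.8 million
    # February to August is taken as six months.
    per_month, per_day = s3_growth(40_000_000_000, 64_000_000_000, months=6)
    print(f"{per_month:,.0f} objects per month, {per_day:,.0f} per day")
```

Under those assumptions the numbers in the summary hold up: about 18.25 million launches per year, and 4 billion new S3 objects per month.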

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • If history tells us anything, it is that there _will_ be a failure.

  • tag: Dumbquestion (Score:4, Insightful)

    by drinkypoo (153816) <martin.espinoza@gmail.com> on Wednesday September 30, @07:27AM (#29591773) Homepage Journal

    "How big can the cloud get before it starts to rain?"

    Clouds don't work like that. They let go of their rain when they enter a pressure zone where they can no longer hold water.

    If Amazon is centrally dispatching, then they deserve to fail. If not, then there's no reason why getting larger would necessarily cause any particular problem.

    • Re: (Score:3, Interesting)

      This. Maybe instead of atmospheric clouds, they're talking about the Oort Cloud [wikipedia.org].

      • Those comet thingamabobs don't rain now do they?
        • Re: (Score:3, Funny)

          If EC2 has the same uptime as bits of that cloud destroying life on earth, I think it'll be around for a while.

          And if one does hit us, I guess it won't matter anyway.

  • Please stop... (Score:5, Insightful)

    by broken_chaos (1188549) on Wednesday September 30, @07:28AM (#29591783) Homepage

    Cloud is bad enough. Starting up bullshit analogies with clouds and rain just muddies whatever you're talking about far, far more than is necessary.

    • by Josh04 (1596071) on Wednesday September 30, @07:34AM (#29591855)
      I agree, the rain does muddy the waters somewhat. Not to mention the flood of comments deriding it as such.
    • Yes, the summary didn't make any sense to me either. "How big can the cloud get before it starts to rain?" Huh? Is it asking how long it will be until the cloud starts making a profit for Amazon, or until the cloud collapses under its own weight? I still remember when AOL signed up too many customers, and the result was a service that was slow and unresponsive.

      • I still remember when AOL signed up too many customers, and the result was a service that was slow and unresponsive.

        Yeah, I remember their grand opening, too.

      • AOL did that several times as I recall; it seemed they never learned to scale properly. I also remember that they sent out so many floppies, back when they were still useful, that I never had to buy any. I often wondered how they could send that many floppies out in pretty packaging and stay in business. Later they started using CD-ROMs, so then they became a nuisance as opposed to something useful.
        • Re: (Score:3, Interesting)

          Oh man, I was in art school in the early '90s. All those AOL CDs were great material for art projects and stuff.

    • Cloud is indeed bad enough. But then talking about the mud that is caused by the rain just entirely washes away any sense it could have made.
    • by suso (153703) * on Wednesday September 30, @08:04AM (#29592153) Homepage Journal

      Oh, stop raining on everyone's parade.

    • Re:Please stop... (Score:5, Insightful)

      by moon3 (1530265) on Wednesday September 30, @08:42AM (#29592601)
      Managers love this kind of terminology, because from their point of view the Internet just 'happens' somehow. They do not have a real clue how, but the cloud fits perfectly into this kind of thinking. That is why cloud hosting is so popular: they just order a 4GB/100Mbit/s cloud and the hosting company creates one for them. They do not have to worry about setting up DNS, SQL, multiple servers, domains, or SMTP, or get schooled by some lowlife nerdy IT guys. They understand the dumbed-down cloud interface well enough themselves; they just interact with the web interface and are happy it is all working for them... somehow, somewhere, in the cloud.
      • This is so well put and describes where I work perfectly.
      • Re:Please stop... (Score:5, Informative)

        by slim (1652) <john AT hartnup DOT net> on Wednesday September 30, @10:02AM (#29593757) Homepage

        Managers love this kind of terminology, because from their point of view the Internet just 'happens' somehow.

        And cloud computing makes them right. You pay some money, and the entity you're paying makes it happen.

        Just like when I buy a tin of soup from a supermarket, I don't need to understand anything about the supply chain that got it there.

  • by Viol8 (599362) on Wednesday September 30, @07:33AM (#29591827)

    I've never really understood the fuss around VMs. Sure, they're useful if you want to test run an OS install or run a different OS on top of another. But otherwise what's the point? Instead of having app + OS you end up with app + VM + OS, so how exactly is that benefiting anyone other than the power company for the extra electricity used?

    • by SappoMan (51574) on Wednesday September 30, @07:47AM (#29591987)
      OK, you don't work in IT, right? At least not on the admin side.
      VMs are mainly about server consolidation. Because servers are usually underutilized, you can put quite a number of VMs on each core. For server workloads the number is usually around 2: 2 VMs per core * 4 cores * 2 CPUs (a typical blade) yields 16 VMs. You see, in the end the power company gets paid for only one physical server per 16 OS instances. Not bad.
      Server consolidation is not the only reason to use virtualization. Other issues you can solve include high availability and fault tolerance, quick deployment of new servers, hardware abstraction, and many others.
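The consolidation arithmetic in the comment above can be written out as a small Python sketch. The function names and the 100-instance workload are hypothetical, and the fixed VMs-per-core ratio is the commenter's rule of thumb rather than a hard limit.

```python
def vms_per_host(vms_per_core: int, cores_per_cpu: int, cpus_per_host: int) -> int:
    """VMs one physical host can carry under a fixed VMs-per-core ratio."""
    return vms_per_core * cores_per_cpu * cpus_per_host


def hosts_needed(os_instances: int, per_host: int) -> int:
    """Physical hosts required, rounding up to a whole machine."""
    return -(-os_instances // per_host)  # ceiling division on integers


# The "typical blade" from the comment: 2 VMs per core, 4 cores, 2 CPUs.
blade_capacity = vms_per_host(vms_per_core=2, cores_per_cpu=4, cpus_per_host=2)
print(blade_capacity)                     # 16
# Hypothetical workload of 100 OS instances (not a figure from the thread).
print(hosts_needed(100, blade_capacity))  # 7
```

Real capacity planning would also have to account for memory and I/O, which this sketch ignores.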
      • Hardware abstraction being one of the more compelling features IMO.

        How long do you think it would take you to move all the services to that new server hardware you just received, because the current server hardware warranty just expired?

        Well, with VMs you can do that _A LOT_ faster.

        • Takes about 10 seconds and happens without dropping so much as a packet. Of course that will depend on your back-end storage. I'm fortunate enough to run with NetApp though so it's all gravy. On a side-note, Citrix Essentials for XenServer is pretty cool with their SAN integration technology. Thin-provisioning on the fly is the way to be! When I make a storage repository it is 20megs until I create a VM and even then it will only take up space that is actually taken up. Combined with volume level deduplicat
      • Re: (Score:3, Insightful)

        The point is that multi-tasking operating systems already support server consolidation by protecting processes from each other so you can run multiple processes on a host safely. And they do it FAR more efficiently than VMs, which have an entire OS instance for every process, and memory partitioned statically between them.

        However, the OS doesn't quite finish the job. The need for VMs arises from design shortcomings at the OS level and above. Here are a few:

        1. You can't install an app and all its dep
        • by bertok (226922) on Wednesday September 30, @09:09AM (#29592971)

          I think you're missing my point - why have multiple OSes if they're all the same type of OS and the apps could all happily run on the same OS instance? As for deployment - have you never heard of a tarball? OS dies - take app tarball to new server, untar. How's that different to copying a VM machine file over?

          In the real world, people run apps like Exchange or Oracle, which take hours to install to a vanilla state, and that's not counting the potentially terabytes of data associated with them.

          Even the most primitive "tar ball" Linux app will have dependencies on the OS, and those can and will eventually break, unless you freeze your OS version forever. If you have enough apps and servers, that will become a nightmare to manage. Do I upgrade or not upgrade? Will this patch or that patch break one of the apps? This is how people end up running Linux 2.2, or 32-bit Windows on 64-bit platforms, because migrating 1 app is hard enough, but migrating a server with 20 apps on it is a recipe for disaster.

          Virtualization lets you quite literally drag & drop a running guest OS from server to server. During maintenance time, that's like magic. No more 3am hardware replacement jobs for me! You can clone a machine while it's running, isolate the clone onto a virtual network, and test an upgrade without interrupting users. Sure, you can do that with most backup & restore tools, but VM platforms do it quicker, and with fewer admin steps. You don't even need spare hardware.

          I once replaced every single hardware component of a running VM farm, servers, cables, switches, even the SAN, while it was running. During the day. Zero outage, no packets lost, no TCP/IP connections closed or user sessions disconnected. We even had terminal server (Citrix) and console (SSH) users on. Not one user even noticed what was going on. I'd love to see you try that with 'tar'.

    • Re: (Score:3, Informative)

      we have used VMware for a few years. Our devs would write a Java app and it would require its own server but it would use maybe 20% if not less of the resources. Now we just provision a VM. less server clutter in the datacenter and smaller electricity bills. Also great for DR. we ship the entire VM to a DR site so all we have to do is bring it up, change the IP and we're ready to go. otherwise we would spend days trying to configure all the apps, find the source, etc.

      i have my own server i used to test a S

      • Our devs would write a Java app and it would require its own server but it would use maybe 20% if not less of the resources.

        This is the part I don't get, that is left out of the answers above (the migration issue makes sense independently, though!)

        My question is simple: how on Earth do you write an app that "would require its own server" but only use 10 or 20 percent of the machine's resources? I Just Don't Get It when you say an app would "require its own server" but not max out the server's resources.

        W

        • Re: (Score:3, Informative)

          Security/Separation of Duties.

        • Re: (Score:3, Interesting)

          having one app conflict with another app. 10 years ago we had a few apps. today there are too many to count and constant point releases where minor functionality is added by user request or small bugs fixed.

          and it's not just java apps. weblogic instances, other apps we might buy or code internally. then there is QA since they need everything production has. Moving QA to VMWare was one of the first things we did when we bought it. the QA and Dev SQL servers are still physical, but a lot of their apps are now

    • VMs are great for many things. First off, know that most hardware is severely underutilized. Then factor in the ease of replication, testing, security (via sandboxing and other methods), and the ability to scale horizontally quickly. There are downsides too, of course, which is why we prefer to run our own Xen setup, then use http://www.eucalyptus.com/ [eucalyptus.com] to light up more VMs in case of load need or disaster.

      VMs are a huge cost saver, and the fastest development environment.

    • Change "OS" to "hardware", so it's:

      app/OS + hardware vs app/OS + VM + hardware. The fuss is you get to disassociate your app and OS from a specific piece of hardware. If the hardware fails all you have to do is move the VM "image" to new hardware.

      Or, if the needs of the app go up or down, you can move it to less powerful (cheaper) or more powerful (more expensive) hardware as needed without much effort.

    • Basically you're right.

      But there are some neat tricks you can do with VMs, like taking an instant snapshot and using that for debugging.
      Migrating VMs to another (hardware) server is a non-issue (just copy over the image).
      If you're working with a cluster anyway, creating another node is also mainly a matter of copying the image.

    • There are plenty of reasons why you might choose to host two services on two different machines, even if one machine would have enough power. Things like being able to take one down without affecting the other.

      VMs let you keep some of that model, while consolidating down to less hardware.

      Plus it makes deployment easy: get your system how you want it, then save it as an image. Now you can clone it as much as you like. Now that there are OSS VM hosts, the commercial virtualisation companies are concentrating

    • by teshuvah (831969) on Wednesday September 30, @08:04AM (#29592149)

      I've never really understood the fuss around VMs. Sure, they're useful if you want to test run an OS install or run a different OS on top of another. But otherwise what's the point? Instead of having app + OS you end up with app + VM + OS, so how exactly is that benefiting anyone other than the power company for the extra electricity used?

      Because for the most part, most servers don't run anywhere near full capacity (and if they do, then they are probably not good candidates for virtualization, except possibly for high availability purposes, which I will go over in the second paragraph). I forget the study, but I read once that on average a typical server sits at 5-15% utilization. So the idea behind products like VMware ESX is that if you need 5 unique servers, instead of buying 5 servers at $5,000 apiece, you buy 1 server for $5,000 + 1 $5,000 VMware license, and run the 5 virtual servers on that. So you spend $10,000 instead of $25,000, and your footprint is 1/5th of what it was before, meaning fewer racks, less cooling, less power, etc. And the numbers I gave are very conservative. A lot of people do 10-20 VMs per server easily.

      So cost, power, and cooling issues aside, there are other issues. In a typical server environment, if a physical server suffers from a catastrophic hardware failure, that server is down until someone can walk over and swap the hardware. With VMware, if a VM is running on a server and that server fails, the VM is cold booted on another ESX server automatically, and is typically up in 30-60 seconds. With the newest release of ESX server, called vSphere, they take it a step further. You can optionally choose to have a VM mirror itself onto another physical ESX server. So in the event of a hardware failure, the VM keeps running on the mirrored host. And then, it becomes the primary VM and sets itself up to mirror automatically on another ESX server. So you have ZERO downtime and the app re-mirrors itself. These are just some of the many useful features in VMware.

      And no, I do not work for VMware. I am a contractor for the Air Force and over the past 2 years I have converted almost 200 physical servers to VMs. We are a relatively small program, but our projections show that we will save $2,000,000 over 10 years just on the cost of servers (and yes, I have added the cost of VMware licenses and support into that equation), and that doesn't even account for power and cooling savings. We've gone from almost 200 physical servers distributed over 7 full racks down to 28 servers in 2 racks (2 racks only because they are in two separate facilities; each rack contains only a single HP c-Class chassis).

      I think the real question is, how can you NOT understand the fuss around VMs?
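The cost comparison in the comment above reduces to a couple of multiplications. The Python sketch below uses the prices and counts quoted in the comment. The function names are hypothetical, and charging one license per physical host is an assumption taken from the comment's example rather than from any actual price list.

```python
def physical_cost(servers: int, server_price: int) -> int:
    """Cost of buying one physical server per workload."""
    return servers * server_price


def virtualized_cost(hosts: int, server_price: int, license_price: int) -> int:
    """Cost of buying hosts plus one hypervisor license per host (assumed)."""
    return hosts * (server_price + license_price)


before = physical_cost(servers=5, server_price=5_000)
after = virtualized_cost(hosts=1, server_price=5_000, license_price=5_000)
print(before, after, before - after)  # 25000 10000 15000

# Consolidation ratio for the migration described above: about 200 servers onto 28 hosts.
print(round(200 / 28, 1))  # 7.1
```

The sketch leaves out support contracts, power, and cooling, which the commenter says push the savings further.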

    • There are two things that appeal to me about VMs.

      The first is the ease of backup/recovery/spawning new VMs. Want to play with altering ProgramA, no problem, let me just copy ProgramA's VM and start it up.

      The other is less hardware. Perhaps ProgramA and ProgramB don't want to run on the same server... they will generally run in separate VMs fine. Perhaps ProgramA requires a different version of SQL, or some other dependency; no problem in VMs. Things that were before going on underutilized servers can now be

    • I used them for two reasons: application isolation and disaster recovery.

      Many of the apps we used (hello, Oracle Collaboration Suite, looking at you) require really messing with system files to make them work decently. This makes other programs very unhappy, so apps like these really need to run on their own box. Since it wasn't disk or CPU intensive, it was easy enough to just stick it in a VM, so I could do other things with the server too. Secondly, it's kinda nice when you need to restart a machine to fix a p
      • by reashlin (1370169) on Wednesday September 30, @07:52AM (#29592039)
        It's more than that.

        Most machines run at around 10% of all possible utilisation. Often web servers will run at less than this. In a datacenter you have two options: a) run hundreds of very slow cheap machines, each running one instance of your webserver, or b) consolidate lots of machines onto one powerful box and run it at 70-80% utilisation.

        Option b) has the advantage that should a website get hit heavily (maybe because it's been linked to on /.) then you still have the beefy hardware to cope with it. You will also find heating bills go down. You'll usually even get the costs of the hardware down as well.

        If you're still not convinced, then look at the work by most VM software manufacturers who are making it so the VM can move around on physical hardware. Now if your hardware fails, the VM and OS do not. It just moves off somewhere else and continues to operate with little/no drop in performance or uptime.
        • by hodet (620484) on Wednesday September 30, @08:58AM (#29592825)
          It makes perfect sense. His clients want a dedicated host for their server. 10 clients, 10 virtual servers on one powerful box instead of 10 servers running at minimum capacity. More profit for parent. Data Centers are using virtualization big time because it saves money. Very easy to move the guest OS around if needed, even geographically.
            • Re: (Score:3, Insightful)

              So use 1 server and have 10 client logins on it FFS.

              1 client wants RHEL 4.
              1 client wants RHEL 5.
              2 clients want Windows Server, both want a weekly reboot, but during different maintenance slots.
              2 clients want stable Debian, but one wants a weekly 'apt-get dist-upgrade', the other wants it monthly ... etc.

              Give each one a VM, and you can deliver all this on one physical machine very, very easily.

        • by bertok (226922) on Wednesday September 30, @09:29AM (#29593223)

          Sorry, that makes no sense. By definition you could do it on the same hardware without a VM unless your VM somehow magics processing power out of the ether.

          Except that unless you have a magic crystal ball, you'll never be able to predict application load ahead of time. Hence, some servers will be underutilized, and some will be sitting at 100% half the time. The only alternative is to install every application onto every server you have, and load balance everything - but that requires that every app is compatible with every other app, and that every app can operate as a cluster. In practice, that's impossible for typical businesses.

          What the latest virtualization platforms do is load balance, on the fly. A large VMware cluster will analyze the load pattern and redistribute virtual machines around the cluster to balance things out, so that each host is evenly loaded. I've seen clusters set to an average of 70% CPU load, and it was just fine. If one host starts heading towards 100%, a few VMs are shuffled around until the load is evened out again. Users can't really tell the difference between, say, 20% and 70% load. It's only at 90% or higher that you get contention and increases in response latency. It takes about 5 seconds to move a VM, but the actual outage is only a few milliseconds, if that, so users never notice.

          One thing I noticed with VM deployments is that most apps get faster on less hardware. This is counterintuitive, but I've seen it before in well designed Terminal Server / Citrix deployments. The basic concept is that you can afford much better hardware if you need less of it. You can buy beefier servers, 10Gb ethernet, SAN storage, etc... When 1 app needs lots of power, it gets it, and then it gives up its share when it doesn't to other apps that do.

          So yeah, in a sense, virtualization does magic processing power out of the ether, because it actually lets you use the processing power you paid Intel or AMD thousands of dollars for.
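The on-the-fly load balancing described in the comment above can be illustrated with a toy greedy rebalancer in Python. This is not VMware's algorithm, which the thread does not describe. It is a minimal sketch under stated assumptions: each VM has a fixed CPU load in percent of a host, hosts are identical, and the host names, VM names, and 70% target are invented for the example.

```python
def rebalance(hosts: dict[str, dict[str, float]], threshold: float = 70.0) -> list[tuple[str, str, str]]:
    """Greedy toy rebalancer (an assumption, not a real hypervisor's algorithm).

    hosts maps host name -> {vm name: CPU load in percent of one host}.
    Moves the smallest VM off any host above the threshold onto the currently
    least-loaded host, and repeats until the host fits or no safe move exists.
    Mutates hosts in place and returns the moves as (vm, source, destination).
    """
    def load(host: str) -> float:
        return sum(hosts[host].values())

    moves = []
    # Visit the busiest hosts first; the ordering is computed once up front.
    for src in sorted(hosts, key=load, reverse=True):
        while load(src) > threshold and hosts[src]:
            dst = min(hosts, key=load)
            if dst == src:
                break  # nowhere less loaded to move to
            vm = min(hosts[src], key=hosts[src].get)
            if load(dst) + hosts[src][vm] > threshold:
                break  # the move would only overload the destination
            hosts[dst][vm] = hosts[src].pop(vm)
            moves.append((vm, src, dst))
    return moves


# Hypothetical three-host cluster; esx1 starts at 95% load.
cluster = {
    "esx1": {"web": 40, "db": 45, "mail": 10},
    "esx2": {"app": 20},
    "esx3": {"qa": 15},
}
for vm, src, dst in rebalance(cluster):
    print(f"move {vm}: {src} -> {dst}")
```

A production scheduler would also weigh memory, migration cost, and affinity rules; the point here is only that a few small moves can bring an overloaded host back under the target.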

            • When did installing multiple apps on 1 server go out of fashion?

              When it became clear it's a management headache.

              "Hi it's ops. You know your foo server sits on the same box as the bar server? Yeah, well the bar guys have found out they need a kernel with a higher filehandle limit, so we're going to be rebooting that box. You'll need to tell your users about the outage. Oh, and you'd better have QA test the foo server with the new kernel too."

  • by nweaver (113078) on Wednesday September 30, @07:41AM (#29591927) Homepage

    Let's assume a 12-hour lifespan, which gives 25K VMs running at the same time.

    At 5 VMs/physical host (I suspect it is MUCH denser actually), that's only 5K servers. At 50 servers/rack, it's 100 racks.

    Or, in translation, not THAT much.
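The estimate above is an application of Little's law: the average number of VMs running equals the launch rate times the average lifespan. A short Python sketch, using the commenter's guesses as inputs (the 12-hour lifespan, 5 VMs per host, and 50 servers per rack are all his assumptions, not Amazon figures):

```python
import math

HOURS_PER_DAY = 24


def concurrent_vms(launches_per_day: float, lifespan_hours: float) -> float:
    """Little's law: average VMs running = launch rate * average lifespan."""
    return launches_per_day * lifespan_hours / HOURS_PER_DAY


running = concurrent_vms(launches_per_day=50_000, lifespan_hours=12)
servers = math.ceil(running / 5)  # assumed 5 VMs per physical host
racks = math.ceil(servers / 50)   # assumed 50 servers per rack
print(running, servers, racks)    # 25000.0 5000 100
```

Changing the assumed lifespan or density shifts the result linearly, so a denser packing of 20 VMs per host would cut the estimate to 25 racks.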

  • I call shenanigans (Score:3, Interesting)

    by Anonymous Coward on Wednesday September 30, @08:02AM (#29592131)

    My company tried to provision 10,000 Amazon instances to perform scalability testing of our software that runs on many computers. The math was simple - 10,000 servers * $0.15 / hour = $1,500 / hour for testing. We liked the multiple OSes & versions (Linux - Redhat, SLES, Windows - 2000, 2003, 2008?) and software stacks (mysql, apache, websphere, sql server, iis, etc...) that were all available out of the box.

    However, if you need more than 20 servers, you have to fill out a form. A sales rep and tech guy called to discuss our needs. It turns out that they could only handle around 1,000 instance requests across all data centers unless we "reserve" the machines at $300 each, which blew the math - 10,000 servers * $300 = $3,000,000 to start.

    Looking at the article, it is likely that people are re-requesting the same machine be started & stopped multiple times per day - 50,000 is probably off by a factor of 10.
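The two price points in the comment above can be compared directly. The Python sketch below uses only the figures quoted there. The function names are hypothetical, and it ignores any hourly charge that might apply on top of the reservation fee, since the comment does not mention one.

```python
def on_demand_cost_per_hour(instances: int, hourly_rate: float) -> float:
    """Hourly bill for running every instance on demand."""
    return instances * hourly_rate


def reservation_upfront(instances: int, fee_per_instance: float) -> float:
    """One-time fee to reserve every instance."""
    return instances * fee_per_instance


hourly = on_demand_cost_per_hour(10_000, 0.15)
upfront = reservation_upfront(10_000, 300)
print(f"${hourly:,.0f} per hour on demand")
print(f"${upfront:,.0f} up front to reserve")
# How many hours of on-demand testing the reservation fee alone would have bought.
print(f"{upfront / hourly:,.0f} hours of on-demand testing")
```

On these figures the reservation fee equals about 2,000 hours of the originally planned test, which is why it "blew the math" for a short scalability run.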

      • Re: (Score:3, Insightful)

        Even if it was $300/machine with 20VMs/machine it would be quite costly to reserve 500 machines.

        They raise the price because they can't scale that much on a dime. They probably have to add hundreds of machines a day in order to keep up with the demand for EC2 instances, you can't expect them to keep thousands of machines ready in case someone wants to figure out how high the cloud really scales. It would simply cost too much.

        No matter the cloud-hype, in the end Amazon and every other hosting supplier have t

  • Stock Exchange (Score:3, Interesting)

    by MyDixieWrecked (548719) on Wednesday September 30, @08:30AM (#29592453) Homepage Journal

    I went to an Amazon AWS talk in NYC a couple of months ago where they brought some start-ups in to talk about their projects, the cloud, and how the cloud helped them build their applications faster and better. During the opening talk, the speaker showed some use cases, one including the New York Stock Exchange and how, at the closing bell, they provision over 3000 EC2 instances to crunch numbers overnight to be ready for the next morning.

    A guy from a startup that I was talking to before we were seated was talking about how his company keeps between 5 and 10 instances up all the time for their application (dynamically bringing them up and down to scale with demand) and how they frequently had 4 and 5 sets of these servers running on the side for testing (20-40 instances at a time). He was talking about the metrics they were using to keep track of their use and how it was flawed due to the fact that they had hundreds of instances a day going up and down all the time.

    Just because 50,000 instances are started per day doesn't mean that those 50,000 instances are running for any period of time. I frequently bring up an instance, tweak some things, create an image, then bring it down... or bring up an instance to test something for 20 minutes, then bring it down. EC2 has really benefited my QA/testing/experimentation in that I really have an unlimited pool of resources to play with. It's a much more robust system than I have at home with VMware... VMware was a gamechanger for me, since before that I had 2 physical servers at home and stacks of 40GB and 60GB HDs with multiple versions of OSs on them.

    Of course AWS isn't for everyone. EC2 can be expensive for what they offer, and the biggest advantage of AWS's services is that they are on-demand and work really well with applications that need to scale up AND down in real time. If you've got an application that doesn't require to-the-minute scaling responses, it's less expensive to get a physical dedicated server with Xen on it and create your own virtual infrastructure... although if you don't have the skills or time to learn the tools, then AWS offers a much better learning curve.

  • Define "Objects" (Score:3, Informative)

    by Dersaidin (954402) on Wednesday September 30, @09:22AM (#29593119)
    Objects?

    "Objects" doesn't mean VMs, objects can be files, processes, etc.

    • by RealityProphet (625675) on Wednesday September 30, @07:36AM (#29591877)

      who cares how many potential VMs the "cloud" can host. it's methadone for most end users'/devs' real problem: inefficient code. the "just pitch machines at it until it runs fast!" mentality will catch up to us.

      That's not true. We use Amazon's cloud to host some of our servers. We do it for two main reasons. (1) We don't need to worry about equipment maintenance. Let me repeat that lest you think it's not a big deal: We don't need to worry about equipment maintenance! (That is a big deal when you leave your basement but don't necessarily have a dedicated IT staff.) (2) We are in a rapid growth phase. We cannot estimate well enough what our computing needs, our storage needs, are going to be 1, 2, or 6 months down the road. We also don't have $50k to drop on equipment and storage that may be utilized 6 months from now, but we sure as hell know if we bought it now it wouldn't be used immediately. Amazon's cloud makes it trivial to keep up with our growing demand without paying up front for it. Sure, we pay more to "rent" the stuff from Amazon, but it's simply the big-O argument: Amazon's pricing scales worse than the classic alternatives, but the constants out front are tiny.

      • by commodore64_love (1445365) on Wednesday September 30, @07:46AM (#29591975)

        So to use a car analogy (cough)

        - It's the same reason why people lease cars instead of buying them. It's cheaper in the short term, and easier to come up with $300 for rent than $20,000 for purchase. Plus adding extra cars as new employees join the company is trivially easy.

        • by afidel (530433) on Wednesday September 30, @08:45AM (#29592655)
          But it's even better than a car lease, because you can end the lease on the VM with no penalty. If you have a really big batch job that needs to run once a month, then you just spin up the VMs for the duration of the batch job, paying for your usage, and then deprovision them for the rest of the month.
        • not really

          a lot of businesses don't have cash on hand to meet their needs even if they are profitable. Even Best Buy has to borrow short term to buy up enough inventory to meet demand. Suppliers want to be paid right away. Amazon's solution is ideal for cash poor companies

      • We don't need to worry about equipment maintenance.

        For the scenario you described, I think S3 would be a good choice. Likewise if a bigger company had a division or department with out-sized or highly variable data storage needs, might work in that situation as well. Judging by the number of objects, a lot of people are finding uses for that capacity.

        I know for a while Walmart was using some paint-by-numbers hosted application provider that was based on ASP. Don't know if they still do, but for those

    • Re: (Score:2, Interesting)

      It's too early to predict if the Amazon cloud will do anything meaningful or if it's going to be a spectacular failure.

      Considering 64 billion objects and counting, if the latter is to happen it's bound to give a whole new meaning to "when it rains, it pours".
