Few of our clients understand the difference between operating a cloud infrastructure and operating a traditional datacenter. It's not that they're dumb; it's just that the whole idea of cloud is new and different. There aren't a lot of fully functioning cloud infrastructures out there so, naturally, there aren't a lot of people experienced in running them. With this post I want to explain what it means to run a cloud infrastructure: the difference between what you know now and what you will need to know, and change, later, when you're faced with operating one of those beasts.
The first thing you have to do is basically unlearn everything you know about operating your datacenter…ha ha, OK, not exactly everything, but a lot of what goes on in your datacenter now is irrelevant to, or inadequate for, operating a cloud infrastructure. The whole idea of people doing "stuff" (e.g. provisioning servers, configuring switches, carving up storage) has to be completely rethought because, in a cloud infrastructure, you rely on the systems to do those things. In a cloud infrastructure, you are basically decoupling the business benefit of your compute assets (storage, servers, networks, etc.) from the management and operation of those assets so that you can optimize each independently. You have the freedom to focus on the business aspects over the technology or, vice versa, the technology over the business aspects.
This is done through the extensive use of automation and orchestration of commodity IT functions (i.e. the "stuff" mentioned above), directly integrated with change and configuration management systems that are governed by rules and policies set by you and the business. The face of these aggregated systems, the cloud infrastructure, is a self-service portal with a catalog of options that you allow users to choose from (based on their role and responsibility within the organization). You can also see, from a single pane of glass (i.e. a dedicated program or web page), the entire operation of the infrastructure, including performance metrics, problem areas and alerts, usage and utilization, as well as the ongoing costs of all of the services being delivered…
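To make the "catalog of options based on role" idea a little more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the item names, the roles and the `visible_items` function are assumptions, not part of any particular portal product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogItem:
    name: str
    allowed_roles: frozenset[str]


# Hypothetical catalog entries; names and roles are illustrative only.
CATALOG = [
    CatalogItem("dev-stack-small", frozenset({"developer", "admin"})),
    CatalogItem("prod-database", frozenset({"admin"})),
    CatalogItem("shared-storage-100gb", frozenset({"developer", "analyst", "admin"})),
]


def visible_items(role: str) -> list[str]:
    """Return the names of catalog items a user with this role may request."""
    return [item.name for item in CATALOG if role in item.allowed_roles]


if __name__ == "__main__":
    for role in ("developer", "analyst", "guest"):
        print(role, visible_items(role))
```

The point of the sketch is only that the policy (who may order what) lives in data that the business controls, rather than in the head of whoever happens to answer the ticket.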
…and that's the point right there: once you decouple the business benefit from the operation and management of your technology, you are essentially delivering IT services to the business in the form of well-defined business benefits. An example of a business benefit is the delivery of a set of provisioned servers (database, app and web servers), in a very specific configuration, with an explicit set of software development tools installed and a certain amount of dedicated storage and network bandwidth…all at the simple and direct request of a user, without involving IT personnel. And…maybe 90 days later…the user is emailed a request to verify that the servers are still in use and, if they are not, to give permission to decommission them and return all of the resources they were using to the pools they came from.
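The provision, review and decommission cycle in that example can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Pool` and `Lease` classes, the two tracked resources and the fixed 90-day review period are all hypothetical, and a real system would send the verification email instead of just returning a flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta

LEASE_DAYS = 90  # taken from the example above; a real policy would be configurable


@dataclass
class Pool:
    """A shared pool of resources (hypothetical: only CPU and storage are tracked)."""
    cpu: int
    storage_gb: int


@dataclass
class Lease:
    owner: str
    cpu: int
    storage_gb: int
    start: date

    def due_for_review(self, today: date) -> bool:
        """True once the lease is old enough that the owner should be asked about it."""
        return today >= self.start + timedelta(days=LEASE_DAYS)


def provision(pool: Pool, owner: str, cpu: int, storage_gb: int, today: date) -> Lease:
    """Take resources out of the pool and record who is using them."""
    if cpu > pool.cpu or storage_gb > pool.storage_gb:
        raise ValueError("insufficient capacity in pool")
    pool.cpu -= cpu
    pool.storage_gb -= storage_gb
    return Lease(owner, cpu, storage_gb, today)


def decommission(pool: Pool, lease: Lease) -> None:
    """Return the leased resources to the pool they came from."""
    pool.cpu += lease.cpu
    pool.storage_gb += lease.storage_gb
```

Because every request goes through `provision`, the system always knows who holds what and for how long, which is what makes the automated follow-up possible in the first place.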
Yes, it is that simple, but creating, managing and ultimately delivering that type of capability isn't easy at all. It takes a lot of upfront work to get right, which is why you have to stop thinking about your datacenter the way you do now and start thinking like a service provider. The basic process is this: look at the things your business (your consumers) wants to buy from you (more on that in a bit); analyze all of the processes, the people involved and the tools and resources you will use to provide each of those things; translate all of that into an automated process (a workflow) that will be carried out by your systems (and test and QA those workflows); and then add that menu item to the list on the self-service portal that you present to your "consumers," who can then use that process over and over again. And, with respect to your consumers buying from you: because you have direct visibility into all of the various processes involved and completely understand the usage and utilization of all of the supporting resources (CPU, memory, storage, networks, licenses, etc.), you can associate a cost with every service you deliver…both the start-up and the ongoing monthly costs…very easily.
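As a rough illustration of attaching a cost to a service, here is a small Python sketch. The unit rates, the start-up fee and the function names are all invented for the example; real numbers would come out of your own usage and utilization data.

```python
from __future__ import annotations

# Hypothetical monthly unit rates; not real prices.
MONTHLY_RATES = {"cpu": 20.0, "memory_gb": 5.0, "storage_gb": 0.25}
SETUP_FEE = 50.0  # hypothetical one-time start-up charge per service


def monthly_cost(usage: dict[str, float]) -> float:
    """Sum rate * quantity over every resource the service consumes.

    Raises KeyError if the usage mentions a resource with no defined rate.
    """
    return sum(MONTHLY_RATES[resource] * qty for resource, qty in usage.items())


def quote(usage: dict[str, float], months: int) -> dict[str, float]:
    """Break a service's price into start-up, monthly and total cost."""
    monthly = monthly_cost(usage)
    return {
        "start_up": SETUP_FEE,
        "monthly": monthly,
        "total": SETUP_FEE + monthly * months,
    }


if __name__ == "__main__":
    dev_stack = {"cpu": 4, "memory_gb": 16, "storage_gb": 500}
    print(quote(dev_stack, months=3))
```

Once a service is defined as a fixed bundle of resources like this, showing the consumer a price next to the menu item is just arithmetic.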
As you can see, this is much, much different from filling out that form for requesting hardware from management; waiting for approval; sending hardware specifications to procurement; waiting for approval; waiting several weeks to several months for delivery; racking and stacking the hardware; waiting for approval from Security to plug into the network; plugging into various switches and networks; using many different interfaces (to storage, to core switching, to virtual infrastructure, etc.) to configure the hardware; using another completely different set of interfaces to load system images, software stacks, etc.; and following various procedures found in different places managed by different divisions. And, after all that work, you only know what the hardware and labor originally cost to buy and set up; you have few options for figuring out what a monthly cost might be (so you don't bother), so your consumers never know the true cost and, as a result, over-provision.
So, as you can hopefully see, the systems that have grown up around, and been optimized for, managing the processes used in traditional datacenters (as just noted) are completely inadequate for managing the processes used in a cloud infrastructure, because you are delivering IT services, which have distinct and explicit requirements that current systems were simply not built to handle.