Private Cloud Design Rules

The design goals are to deliver maximum total IT efficiency: a combination of cost savings, agility, redundancy, and operational efficiency. All the rules are meant to be followed together. The first rule of Private Cloud is to follow all the rules: the more you follow, the better the result, and following fewer gets exponentially worse results. The rules below create a virtual infrastructure meant to support a Private Cloud management application that delivers Cloud functionality. Virtual infrastructure alone is not a Private Cloud, although it delivers much of the benefit of one. My current recommendations for Private Cloud management applications are Abiquo, Embotics, and VMware vCloud Automation Center (the software formerly known as DynamicOps).

General
Rule 1 Virtualize Everything
Routers, firewalls, load balancers, switches, backup …
You might wonder how to virtualize backup. Another way to think of virtual is multi-tenant, so look for backup apps that support notification by unique backup job, and assign backup jobs by service / client. Some virtual devices, like load balancers, are much less expensive than physical ones and enable better scalability and availability.

Rule 2 Monitor (alerting and reporting) Everything
Every server, every service, every port, categorize alerts and reports by service / client.

Rule 3 No Tech Silos
All admins manage all components. Assign admins to services / clients. Develop admins as well as engineers.

Rule 4 Full Redundancy for all components
Redundant internet connections, routers, firewalls, switches, server ports, storage adapters, SAN controllers, datacenters…

Rule 5 Every component requires a web management interface
Enables remote administration from any device without installing admin software; develop administration expertise, not command-line expertise.

Rule 6 Holistic Design
Include all components in the design. Every component, from remote access to backup, should be selected to be Cloud-design specific, particularly regarding VPN support and a web admin interface.

Rule 7 Design for Resource Density
Select devices for as much resource per rack U as available. Remember to consider CPU core density as much as socket density. For backup I suggest 2U, 16-cartridge LTO libraries.
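The per-rack-U comparison behind this rule is simple arithmetic. A minimal sketch; the server shapes and core counts below are illustrative assumptions, not vendor specifications:

```python
# Hedged sketch: comparing CPU core density per rack U.
# The server configurations are illustrative assumptions only.

def cores_per_u(sockets: int, cores_per_socket: int, rack_units: int) -> float:
    """Total CPU cores delivered per rack U consumed."""
    return sockets * cores_per_socket / rack_units

# A 2U, 4-socket server with 16-core CPUs vs a 1U, 2-socket server with 8-core CPUs.
dense = cores_per_u(sockets=4, cores_per_socket=16, rack_units=2)  # 32.0 cores/U
small = cores_per_u(sockets=2, cores_per_socket=8, rack_units=1)   # 16.0 cores/U
print(f"dense: {dense} cores/U, small: {small} cores/U")
```

The same per-U normalization applies to storage (TB per U) and to backup (cartridges per U, as with the 2U 16-cartridge library above).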

Rule 8 Design for Functional Density
Select devices that offer multiple functions per rack U, for example Juniper SSG devices that combine router, firewall, and VPN. Do not use dedicated router hardware unless necessary. Select virtual instead of physical devices, NetScaler load balancers for example.
I also suggest combining vCenter and the backup server on the same physical server, VLAN tagged to all subnets for all Layer 2 backup traffic.

Data Center
Rule 9 Datacenter not Server room
Private Clouds should be in datacenters, not server rooms, to provide cheaper and redundant bandwidth, cooling, power, and other infrastructure.

Rule 10 Standardized Datacenter Design – Datacenter Pods
Standardize datacenter hardware design on 3-, 5-, and 7-rack pod designs, with standardized management and capacity racks. Even a 3-rack design can provide capacity for approximately 800 vServers at 800 GHz, plus a reasonable amount of associated storage and networking.
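The ~800 vServers at ~800 GHz figure can be reproduced with back-of-the-envelope math. The sketch below assumes five 4-socket, 16-core, 2.5 GHz hypervisors and a 1 GHz allocation per vServer; those inputs are my own assumptions, chosen only to show how such a pod budget could add up:

```python
# Hedged sketch: back-of-the-envelope pod capacity with assumed inputs.

def pod_capacity(servers, sockets, cores_per_socket, ghz_per_core, ghz_per_vserver):
    """Aggregate compute in GHz and vServer count at a fixed GHz allocation."""
    total_ghz = servers * sockets * cores_per_socket * ghz_per_core
    return total_ghz, int(total_ghz / ghz_per_vserver)

# Assumed: five 4-socket, 16-core-per-socket, 2.5 GHz servers; 1 GHz per vServer.
total_ghz, vservers = pod_capacity(5, 4, 16, 2.5, 1.0)
print(total_ghz, vservers)  # 800.0 GHz, 800 vServers
```

Changing any assumption (server count, cores, clock, or GHz per vServer) rescales the pod linearly, which is the point of a standardized pod design.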

Rule 11 More smaller datacenters rather than fewer bigger datacenters
A primary / backup datacenter design is RAID 1, a 100% redundancy cost. Six smaller datacenters in a RAID 5 style design carry approximately a 15% redundancy cost.
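The comparison is the same arithmetic as disk RAID. A sketch; note that the ~15% figure corresponds to counting the spare site against total capacity (1/6 ≈ 16.7%), while counting it against usable capacity gives 20% for six sites:

```python
# Hedged sketch: datacenter redundancy cost, RAID 1 vs RAID 5 style.

def mirror_overhead() -> float:
    """Primary/backup (RAID 1 style): one full duplicate of usable capacity."""
    return 1.0  # 100% redundancy cost

def parity_overhead(sites: int, spare_sites: int = 1):
    """RAID 5 style: spare capacity as a fraction of usable and of total."""
    usable = sites - spare_sites
    return spare_sites / usable, spare_sites / sites

print(mirror_overhead())   # 1.0 -> 100% redundancy cost
print(parity_overhead(6))  # (0.2, 0.166...) -> 20% of usable, ~16.7% of total
```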

Hypervisors
Rule 12 Use Large Servers
Larger servers require less redundancy cost than smaller servers.
For example: 4 sockets x 16 cores for core density, 2U, 6 I/O slots, 4 x 10Gb network interfaces, 16Gb FC.

Rule 13 No local hypervisor vServer storage
RAID 1 + hot spare or USB only for hypervisor hosts; all vServers on SANs.

Rule 14 Full Virtual Infrastructure
Use virtual infrastructure like VMware vCenter and enable automatic high availability and distributed resource scheduling.

Virtual Servers
Rule 15 Configure all vServers on SAN Storage
vServers should be on SAN storage for redundant controllers and disks, scalability, and efficiency. Local storage offers no redundancy.

Rule 15.1 Mix vServer types on the same hosts
Place multiple types of servers (app, DB, mail, web, etc.) on the same hosts to level and balance utilization and reduce spikes.

Rule 16 DHCP IP addressing
Use DHCP IP Addressing, yes even for servers

Networking
Rule 17 VLAN everything
Select only devices that support VLAN tagging, not just the obvious ones but also WAN optimizers, load balancers, and remote access devices.

Rule 18 10Gb network
Select 10Gb physical server interfaces (redundant, of course) and divide the capacity with virtual switches for capacity density and cable reduction. Use 1U, 24-port 10Gb switches.

Rule 19 Switch Clusters
Select switches that support clustering: a virtual chassis composed of multiple switches, and LACP that spans switches.

Storage
Rule 20 All SAN Storage
See vServers above; worth repeating, and the same benefits for servers apply to virtual devices like load balancers as well.

Rule 21 Standardized Datastores
Configure all datastores at 2TB. vServer IO is increased by striping across multiple datastores.
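One way to apply this rule is to place each vServer's virtual disks round-robin across the standardized 2TB datastores, so IO spreads over several datastores. A minimal sketch; the disk and datastore names are illustrative:

```python
# Hedged sketch: round-robin placement of virtual disks across datastores.
from itertools import cycle

def stripe_disks(disks, datastores):
    """Assign virtual disks to datastores round-robin to spread IO load."""
    placement = {}
    rotation = cycle(datastores)
    for disk in disks:
        placement[disk] = next(rotation)
    return placement

# Four virtual disks striped across three standardized 2TB datastores.
print(stripe_disks(["vmdk0", "vmdk1", "vmdk2", "vmdk3"], ["ds1", "ds2", "ds3"]))
# {'vmdk0': 'ds1', 'vmdk1': 'ds2', 'vmdk2': 'ds3', 'vmdk3': 'ds1'}
```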


Cloud Vendor Watch List

These are the vendors I follow, categorized by functionality, to build a Private Cloud. They are not listed in order of preference. I will be adding reasons for these vendors / products shortly; the Why is more important than the What.

Bandwidth
TW Telecom, Clear, TowerStream, Level3, Abovenet, Cogent Communications

Datacenters
Equinix, Digital Realty Trust, SuperNAP, BlueLock, Verizon (Terremark), Net2EZ

Monitoring
Quest vFoglight, UpTime Software, Cloudability, Nimsoft, FireScope, NetForensics, Nodeable

Security software
Catbird, Juniper Virtual Gateway (previously Altor Networks)

Cloud Management Software
Platform, Embotics, Abiquo, Nimbula

Routers /Firewalls
I prefer to use combination router / firewall systems to maximize functional density: no dedicated router hardware unless the capacity is needed.
Juniper SSG, SourceFire, Palo Alto Networks, Brocade MLX

WAN Optimizers
SilverPeak, Riverbed

Load Balancers
NetScaler (virtual), A10 Networks (virtual)

Switches
Brocade VDX 10Gb, Juniper QFabric, Force10, Plexxi, BigSwitch, Nicira, Embrane
notice no Cisco, HP…

Servers
Dell (makes a 2U, 4-socket, 12-core server; HP does not), SeaMicro

Storage
Compellent, 3Par, Equallogic, Pure Storage, Nexsan, Nexstor


Fluid Computing What and Why

The name Fluid comes from the ability of resources (CPU, memory, storage) to flow between functionalities (router, firewall, servers, and clients). This is done by using virtual servers, virtual appliances (routers, etc.), and virtual desktops. Fluid Computing is built on a specific configuration of commodity servers, 10Gb networking, and 100% SAN-based storage, as is Hybrid RAID Cloud.

The primary use case of Fluid Computing is Enterprise IT: 24×7 Windows networks supporting Exchange, SQL Server database apps, SharePoint, Active Directory, ERP, etc. While Fluid Computing can support QA and development itself, it might be best to support those activities with Public Cloud services through a Hybrid Cloud model.

Fluid Computing is designed to be the most cost effective, reliable, scalable, and efficient IT Architecture possible. The goal is not an incremental improvement but a radical one: a 2-3x cost improvement and continuous 100% availability.

Cost Effectiveness is improved by only using commodity hardware to provide compute resources; CPU, memory, and storage and virtual appliances to define functionality.

Reliability is improved by using hypervisor redundancy to provide functional redundancy. When hypervisor redundancy is used, only one virtual appliance is needed, and the complexity and configuration of device redundancy is eliminated as a cause of downtime. Reliability is also improved through the reduction of physical device points of failure. Fluid Computing is built on top of a RAID Cloud Architecture to supply the architecture benefits of RAID Cloud as described in earlier posts.

Scalability is also improved by use of virtual appliances because they can be quickly increased (or decreased) in capacity by simply and quickly changing the virtual appliance configuration.

Efficiency is improved because virtual appliances can be tuned to match resource requirements, and freed resources can be applied to other functionality as needed.

This is more than just a theoretical idea. I started to run a partial version of this architecture successfully at a previous company using Citrix NetScaler virtual appliances. Just today, 12/12/11, the company Embrane launched to offer virtual firewall, load balancer, and VPN services, in addition to the existing Vyatta and Big Switch virtual router appliances.

Stay tuned, next post on Fluid Cloud tablet / remote desktop client design


A short post on Cost Scale

Cost Scale is the function of Capacity to Cost for servers, networking, storage, etc. The idea is to evaluate Cost Scale when selecting components. Storage systems in particular can vary greatly in cost per TB depending on the size of the system. It's important to define datacenter component resource requirements and evaluate Cost Scale when selecting vendors and models to develop the most cost-efficient design.
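In practice this means normalizing each candidate to cost per TB at the required capacity and comparing. A sketch with entirely made-up models and prices, to show the shape of the evaluation:

```python
# Hedged sketch: picking the best Cost Scale among storage candidates.
# Model names, capacities, and prices below are hypothetical examples.

def best_cost_scale(models, required_tb):
    """Among models meeting the capacity requirement, pick the lowest cost per TB.

    models: list of (name, capacity_tb, price) tuples.
    Returns (name, cost_per_tb).
    """
    fits = [(name, price / tb) for name, tb, price in models if tb >= required_tb]
    return min(fits, key=lambda m: m[1])

catalog = [  # hypothetical systems
    ("small-array", 20, 40_000),    # $2000/TB
    ("mid-array", 100, 120_000),    # $1200/TB
    ("large-array", 400, 600_000),  # $1500/TB
]
print(best_cost_scale(catalog, required_tb=80))  # ('mid-array', 1200.0)
```

Note how the cheapest cost per TB is neither the smallest nor the largest system, which is exactly why the requirement must be defined before the comparison.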


Cloud 3.0 and Fluid Computing (Cloud 4.0)

To summarize: Cloud 1.0 functionality is server provisioning; Cloud 2.0 is server provisioning and management; Cloud 3.0 is virtual server, networking, and client (VDI / Remote Desktop) provisioning and management. Fluid Computing will be the next generation of architecture after Cloud. Fluid Computing replaces all function-specific hardware with commodity hardware. Functionality is managed by virtual appliances (routers, switches, etc.) and virtual servers and clients.

I think this will be the timeline for Cloud and Fluid Computing architectures to become standard IT practice
Cloud 1.0 2012
Cloud 2.0 2013
Cloud 3.0 2014
Fluid Computing 2015


Cloud Architecture Versions

Cloud 1.0 Functionality – Server Provisioning
Service Catalog, Provisioning, Charge Back

Cloud 2.0 Functionality – Server Management
Ability and information to self manage: federated Service Catalog, federated hybrid Provisioning, Charge Back, vServer Modification, Alert Configuration, Snapshot Management, Capacity Planning, Manage Access / Security, vServer Scheduling, Capacity Usage vs. Available information and reports, and Client management

Cloud 3.0 Functionality – IT Architecture Management
Ability and information to self manage: federated Service Catalog, federated hybrid Provisioning, Charge Back, vServer Modification, Alert Configuration, Snapshot Management, Capacity Planning, Manage Access / Security, vServer Scheduling, Capacity Usage vs. Available information and reports, and Network and Client provisioning and management

Fluid Computing 1.0 – IT Architecture in Software
An Enterprise IT Cloud Architecture that allows commodity compute resources, CPU, etc. to change functionality and capacity, from server to network to client and back as needed between multiple datacenters.


Hybrid RAID Cloud Architecture

Hybrid RAID (Redundant Array of Inexpensive Datacenters) Cloud Architecture
An IT Architecture for the Cloud Era

RAID Cloud Architecture (RCA) is a revolutionary IT Architecture (a collection of one or more data centers) based on the same proven design as storage RAID systems. The design goal is not incremental improvement but radical improvement: 3-4x cost savings, continuous 100% availability, rapid and massive scalability, and maximum efficiency. A major difference of RAID Cloud is that it is not designed ad hoc; the complete Architecture design is known from the start, before a single server or router is purchased.

Current IT Architecture practice is to order the number of racks needed to support initial requirements, plus some additional empty racks with a Right of First Refusal, and to add a backup data center once the company has enough resources. The problem with this process is that the end result is usually not a cost-effective design; it just continues to grow ad hoc as projects are added and gets less cost efficient, more complicated, less reliable, and more difficult to manage. Most companies never get big enough to afford a backup data center, a 100% redundancy cost, and if they do, the backup datacenter is designed completely differently than the primary, so it's useless for testing against the primary.

A RAID Cloud Architecture STARTS with a design combining 3 or more (preferably 6) Datacenter Modules (DM). A DM is a 1-to-N-rack complete configuration of routers, switches, servers, storage, etc. DMs are linked by Layer 2 connections to create a single logical Architecture. In a six-DM design, the first two DMs should be located very close together and close to the main Enterprise office, both for low cost point-to-point connectivity from the office and to each other, and to create the first Datacenter Cluster for RAID1c redundancy. Assuming for budgetary reasons they must be built as needed, the third datacenter should be in the Central US, the fourth in the East US, the fifth in the Central US, and the sixth in the East US, creating 3 Datacenter Clusters and a 16% redundancy cost instead of the usual 100% redundancy cost. The Datacenter Clusters are created to support live vMotion and storage replication for 100% availability.

RAID Cloud Architecture (RCA) is designed to be Cost Effective due to higher virtualized resource and datacenter space utilization and smaller, less expensive components. Using multiple data center colocation and bandwidth vendors for each Datacenter Module provides leverage for cost negotiation and price protection. The modules are a small cookie-cutter design meant to be inexpensive.

RCA is designed for continuous 100% Reliability by using multiple colocation and bandwidth sources. Using Global Server Load Balancing, an entire data center can be down for maintenance or a service failure and only cause a loss of capacity, not availability.

RCA is Scalable. As more capacity is needed, more DMs are added until 6 DMs are created; beyond six, each DM can be quickly increased in size. Because DMs are generally built as needed, they are contracted over time, so if less capacity is needed DMs can be decommissioned.

RCA is Efficient. DMs are designed to be quickly configured, usually in under a week, and all the components are the same, so the configurations can be duplicated. Multiple duplicate DMs allow testing of new versions of system software without affecting the entire Architecture. Multiple DMs also provide swap space to upgrade systems; in a single datacenter, space has to be left open to install new systems, or significant downtime has to be scheduled to support in-place upgrades. RCA is based on a heavily virtualized infrastructure, so capacity can be quickly added. In addition to low cost connections, DMs are located close together and close to the office, so multiple DMs can be visited with minimum travel; I estimate $20k+ annual savings in driving time alone for the LA area.

New Technologies and Services that support RAID Cloud Architecture
DCB, TRILL, SSG, AWS, live vMotion, live storage vMotion
