Friday, August 22, 2008

Oversold

So, the question was asked, how much does hosting a website like theguncounter.com cost?

Well, thankfully, Combat Controller is piggybacking our hosting on his colo account, and we're using open source software, so it isn't costing us anything but time; but generally speaking for a site like ours, you'd expect a minimum of $20 to a maximum of about $80 a month for hosting charges, including bandwidth.

At least from a reputable company that actually charges what a service is worth, and then actually provides that level of service.

The followup question was of course "why does it cost that much, when companies like dreamhost sell high bandwidth hosting for $8 a month?"

Well... yes and no.

Application services hosting is an interesting business. When most people think of web hosting, they think there's some server somewhere chugging out your pages, just for you.

In reality, all application service hosting (web services are just one of the applications you can host) is virtualized; meaning in this case, that there are many different applications, and sites, running from a single piece of hardware.

Of course it's not like one box can serve an infinite number of applications; so you have to carefully balance how many applications you provision on a given set of computing resources, and a given bandwidth and storage allocation, against how much you charge; to both remain competitive, and make a profit.

It's a tricky balancing act really, and it's one that unfortunately most companies do badly.

Part of this balancing act, is the concept of overselling; which means selling more resources than you actually have (it's also called oversubscribing, which is the same thing, except without the selling part).

This isn't a bad thing. In fact, it's the industry standard; because most applications don't need 100% power, all the time. In fact most only need 10% power or less 90% of the time, and 100% power 10% of the time.

I use the term "power" there advisedly, because oversubscribing isn't just part of the hosting business, and it isn't a rare thing; service providers in every area oversubscribe or oversell. Phone companies don't have enough circuits for every customer to call all at once, power companies don't produce enough power for every household to draw 200 amps all at once etc...

Every hosting provider oversubscribes; some are more honest about it, some less.

Ok, so how does oversubscription work exactly? I mean how can you sell four people 100% power on the same computer? Didn't Mel Brooks write a movie about a play about that kind of thing.. and then a play about a movie... Ok, it's complicated...

Actually, it's pretty simple. Let's say you have 1 cpu, and a particular application may need 1 full cpu worth of power when it's doing heavy processing; however, you know that it only needs that much power 25% of the time or less. If you have four applications like that, so long as they don't need the CPU all at the same time, you can sell each of them 1 cpu's worth of power; and everyone will GET one CPU worth of power.

At that point you've oversold by 4 to 1, but each application is still getting 100% of their needs met, and they're paying MUCH less than they would if you had to provision a full cpu for each of them; just to have that CPU sit idle 75% of the time.
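
If you want to see that arithmetic spelled out, here's a minimal Python sketch. The variable names are just mine for illustration; the numbers are the ones from the example above (four applications, one CPU, each busy 25% of the time at most).

```python
# Figures from the example above: four applications share one physical CPU,
# and each needs a full CPU at most 25% of the time.
physical_cpus = 1
apps = 4
cpus_sold_per_app = 1
busy_fraction = 0.25  # share of the time each app needs its full CPU

# How much CPU has been sold compared to how much physically exists.
oversell_ratio = apps * cpus_sold_per_app / physical_cpus

# Average demand across all apps, measured in CPUs.
average_demand = apps * cpus_sold_per_app * busy_fraction

# How idle a dedicated CPU would be if each app had its own.
idle_if_dedicated = 1 - busy_fraction

print(f"Oversold {oversell_ratio:.0f} to 1")
print(f"Average demand: {average_demand:.2f} CPUs on {physical_cpus} physical CPU")
print(f"A dedicated CPU would sit idle {idle_if_dedicated:.0%} of the time")
```

The catch, of course, is the "so long as they don't all need it at the same time" part; the sketch only looks at averages, not at collisions.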

Now remember, every hosting business and every ISP does this, for all but the busiest sites they host.

Only when a customer pays for a true dedicated server will they ever NOT be virtualized, and only very rarely will virtualized hosts not be oversold.

Oh and in my experience, most customers who pay for a dedicated server don't actually need one; and many customers that DO need them try to economize by NOT paying for one.

When you first try to explain all of this to customers, they often feel a sense of outrage, or like they are being cheated. What they don't realize is just how expensive it is to host applications, and that without oversubscription, they would be paying FAR more for their services.

Oversubscription isn't cheating, it's just how the game is played.

I play this game for a living; except the game I play has higher stakes than small scale commercial web hosting, because instead of doing it for web hosts, I do it for business critical applications in a bank (I'm the chief architect for one of the major divisions of one of the world's top 10 banks). If we have outages, we can lose hundreds of thousands of dollars a minute in revenue, and receive fines in the millions.

Because of this, though we do virtualize many applications, we don't oversubscribe anything above a tier 3 application (there are five tiers), and we do a minimum oversubscription on anything lower. No more than 2.5 to 1 on anything other than low load internal web pages, ever.

Of course we have a LOT of low load internal web pages (thousands), and to save server costs, we oversubscribe them as far as we can get away with.

There's no way we could ever make money doing that as a business; because the market pressures are insane. Every competitor in the hosting market is being pressured to offer more and more services at lower and lower prices; and they could never oversell at such a low ratio and still stay in business.

In this business you have what's called the 60/40 rule. You try and load your hardware up to 40% sustained utilization at least 60% of the time, and 60% peak utilization no more than 40% of the time. You peak to 100% actually, but you only peak for a few seconds or a minute at a time, and it gets averaged out over 15 minute increments, so it comes out to 60%.

When you hit 60% sustained utilization (more than 60% utilized for more than 40% of the time) it's time to upgrade, because you don't have enough capacity to maintain consistent quality of service for the applications on those systems.
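
Here's a rough sketch of what that upgrade check might look like in Python. To be clear, this is my own illustration, not anybody's monitoring product: the function name `needs_upgrade` and the sample data are made up, and I'm assuming you already have utilization averaged into 15 minute windows.

```python
def needs_upgrade(samples, threshold=0.60, max_fraction=0.40):
    """Return True when more than `max_fraction` of the samples exceed `threshold`.

    `samples` holds utilization averages (0.0 to 1.0), one per 15 minute window.
    """
    if not samples:
        return False
    over = sum(1 for s in samples if s > threshold)
    return over / len(samples) > max_fraction


# Hypothetical data: ten 15 minute averages for two servers.
healthy = [0.35, 0.40, 0.42, 0.38, 0.65, 0.70, 0.41, 0.39, 0.62, 0.37]
strained = [0.65, 0.70, 0.72, 0.61, 0.66, 0.40, 0.45, 0.68, 0.38, 0.42]

print(needs_upgrade(healthy))   # 3 of 10 windows are above 60%
print(needs_upgrade(strained))  # 6 of 10 windows are above 60%
```

In real life you'd feed it a week or a month of windows rather than ten, but the test is the same.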

So long as you obey the 60/40 rule, and make intelligent decisions about how to allocate applications across your hardware, you should be OK. In a shared web hosting environment you would be amazed at how many relatively busy websites can be hosted on a single server without hitting that 40% mark never mind the 60% mark.

It may seem odd to someone not familiar with the business, that you could theoretically host hundreds of applications (usually small web pages) on a single server, and every site will work very well; but as long as you keep to the 60/40 rule, it really works.

It's when you start trying to stretch beyond 60/40 that you start running into problems.

So let's look at what the numbers really are like shall we?

The bread and butter building block of hosting services is the 1u or 2u mid-range rack server. They're generally built with 2 processors, and most of them now have 4 cores each processor (basically 8 processors total), and these days 8 or 16gb of RAM (you try to make it 2gb of RAM per core these days).

They take up just 1.75"x19"x32" (or 3.5" instead of 1.75" depending); which are industry standard rack units; and generally use less than 500 watts of power at normal load (though you have to account for twice that for power and cooling, because you have two PSUs that COULD draw 500 each).

Generally speaking, the servers cost anywhere from $2,000 to $5,000 each (with a hardware service contract) if you buy them in bulk. Let's presume it's $3000 (a good price from a top tier vendor to a high volume customer), and you're running on the industry standard depreciation cycle of 36 months; for a cost of about $85 a month.

It is entirely possible to fit several THOUSAND websites on a single mid priced server; though a few dozen small to medium sites, or 4-8 larger sites is more common. The resources provisioned for each site are called a virtual host.

Basically a standard "large" virtualhost will have 1 or 2 full cpus worth of resources available to it at peak, and the associated memory and I/O. You can push a lot of data with a 2 cpu 4 gig server.

Actually, it's MUCH better if you use a pair of servers, both of which are managed to never be at more than 40% capacity; so if one fails it can take all the load for the other.

Better hosting companies will offer that service, or simply build it in; so they don't have downtime and contract penalties. It's a lot less expensive to build services that don't break as easily, and fail gracefully when they do; than to try and save cost by building cheap, to minimum standards.

Spreading the load across 4 servers is even better, because a single failure will have a lower impact with the load spread across three additional systems than one, and because you could sustain multiple failures and still maintain service at reduced performance.

Importantly, you aren't really adding capacity, and you aren't adding additional hardware cost; because you're not buying more hardware than you would have if you were provisioning to the same hardware utilization level on single servers. What you're doing by clustering is adding resiliency and improved availability.
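
A quick sketch shows why the 40% number matters for failover. I'm assuming here that when a server dies its load gets spread evenly across the survivors, which is a simplification; the function name is mine.

```python
def utilization_after_failures(servers, utilization, failures=1):
    """Utilization of each surviving server once the load is spread evenly."""
    survivors = servers - failures
    if survivors <= 0:
        raise ValueError("no servers left to carry the load")
    return servers * utilization / survivors


for servers in (2, 4):
    after = utilization_after_failures(servers, 0.40)
    print(f"{servers} servers at 40%: one failure leaves survivors at {after:.0%}")

# Two failures in a four server cluster, still assuming an even spread.
print(f"{utilization_after_failures(4, 0.40, failures=2):.0%}")
```

With a pair, one failure puts the survivor at 80%. With four, one failure leaves the other three in the low 50s, and even two failures only gets you to where the pair was after one.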

If you architect it properly though, there is no reason why you can't fit 4-8 very large very busy web sites, for which you charge at least $80, and up to $200 a month in hosting fees if you're a good outfit; on that one server... Or even better 32 sites, at $80 a month each virtual host; for 32 sites on a 4 server cluster, generating $2560 a month in revenue, and costing $340 a month in raw hardware costs.
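
For anyone who wants to check my math, here it is as a few lines of Python. The numbers are the ones above; the only judgment call is that I use the rounded $85 a month per server rather than the exact depreciation figure.

```python
server_price = 3000        # dollars per server, from above
depreciation_months = 36   # depreciation cycle, from above
servers = 4
sites = 32
price_per_site = 80        # dollars per month

# Exact straight line depreciation, and the rounded planning number I use.
exact_monthly_per_server = server_price / depreciation_months
planning_monthly_per_server = 85

hardware_cost = servers * planning_monthly_per_server
revenue = sites * price_per_site

print(f"Exact depreciation: ${exact_monthly_per_server:.2f} per server per month")
print(f"Cluster hardware cost: ${hardware_cost} a month")
print(f"Revenue: ${revenue} a month")
```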

Of course the raw hardware costs are only a small fraction of the total costs of providing hosting.

The hosting business has such tight margins, that they tend to operate at a 60/80 rule, or even an 80/60, 80/80, or 80/90 rule, which seriously strains systems... but even then, so long as you keep on top of things, you should be alright (well.. not really with the 80/90... but if you're clustered and REALLY on top of things, you can usually get away with it).

Funny thing is, if you actually are really on top of things, it's pretty easy to manage very highly utilized servers. The real problems come in when you're overselling disk space and bandwidth; but again, if you manage it intelligently, you can still maintain quality of service.

I think you can see where I'm going with this.

There is no industry standard for provisioning storage; because needs vary WILDLY. Generally speaking though, we presume a MONTHLY cost of about $4 per gigabyte minimum, for datacenter storage (it can easily get up over $40 a gig for high performance highly available storage with high performance archive and long retention times etc...).

Why so high when you can buy a 1tb hard drive at your local electronics store for $200?

...Because enterprise class storage doesn't use 1tb consumer grade drives, it uses 146gb or 320gb enterprise class drives that cost $300-400 each in bulk; and you provision it in 8 drive RAID groups, that have an available space of about 2tb (after the filesystem is put on; the raw space is 2560 gigs), for $3200 raw storage cost. Then you have to put those drives in something; and enterprise class storage arrays are not cheap. Figure $32,000 minimum (plus disk) for an array that can hold 6 of those RAID groups. Then you need to set up the storage network to get the storage to your servers, and that's not cheap either, at a minimum about $16,000 for enough ports to fill up the array.

That's $67,200 for 12 terabytes, not $2400; and a cost of about $5.60 a gig (not including cabling or backup).
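
Here's that storage math as a short Python sketch, so you can plug in your own prices. The variable names are mine; the figures are the ones quoted above, using the $400 end of the drive price range and 320gb drives.

```python
# All prices are the planning figures quoted above.
drive_price = 400            # dollars per 320 GB enterprise drive
drive_gb = 320
drives_per_group = 8
usable_gb_per_group = 2000   # about 2 TB usable per RAID group
groups_per_array = 6
array_price = 32_000         # the array itself, without disks
san_price = 16_000           # storage network ports

raw_gb_per_group = drives_per_group * drive_gb
group_price = drives_per_group * drive_price
total_price = groups_per_array * group_price + array_price + san_price
usable_gb = groups_per_array * usable_gb_per_group
price_per_gb = total_price / usable_gb

# The naive comparison: twelve 1 TB consumer drives at $200 each.
consumer_price = 12 * 200

print(f"Raw space per RAID group: {raw_gb_per_group} GB for ${group_price}")
print(f"Total: ${total_price} for {usable_gb} GB usable")
print(f"Cost per gig: ${price_per_gb:.2f}")
print(f"Consumer drives for the same space: ${consumer_price}")
```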

You can cut that price a bit by using SAS or SATA drives instead of standard SCSI, and by using larger arrays (up to a point the cost per slot goes down. Get beyond that point, it doubles, then starts going down again); and get it down to around $4 a gig, especially now that 1tb SAS drives are becoming available; but most aren't using them yet.

So yeah, $4 a gig is a hard minimum right now; unless you want to go to dumb storage (storage that doesn't have intelligent management and caching), and use consumer grade drives. It's a bad idea, but some will try and save on costs by doing so.

Above, I noted that it is cheaper to build resilient systems than it is to fix non-resilient systems when they break. This is one of the prime examples. Companies wouldn't pay for enterprise storage if it weren't saving them money in the long run.

So, if we presume each of our very busy 32 sites is using 5GB (most likely a drastic overestimation, but that is a common allocation for a low priced hosting account) then it would cost us $20 per site per month just for the storage; or about $640 per month. That's without overselling.

The industry standard for overselling bandwidth for shared services while maintaining quality of service is 7 to 1 for median usage, 4 to 1 for heavy usage, and 12 to 1 for light usage.

That means that generally, if you sell 7 customers a maximum bandwidth of 3 megabit per second, and a bandwidth cap of 500 gigabytes a month; you only need to provision 3 megabit and 500 gigabytes; not 21 megabits and 3500 gigabytes.
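
In code, the oversell calculation is about as simple as it gets. This is just a sketch; `capacity_to_provision` is a name I made up, and the ratios are the 4, 7 and 12 to 1 figures from above.

```python
def capacity_to_provision(customers, sold_per_customer, oversell_ratio):
    """Capacity you actually need to buy for what you've sold at a given ratio."""
    return customers * sold_per_customer / oversell_ratio


customers = 7
sold_mbps = customers * 3      # 3 megabit per second sold to each customer
sold_gb = customers * 500      # 500 gigabytes a month sold to each customer

needed_mbps = capacity_to_provision(customers, 3, 7)
needed_gb = capacity_to_provision(customers, 500, 7)
print(f"Sold {sold_mbps} megabit and {sold_gb} GB; "
      f"provision {needed_mbps:.0f} megabit and {needed_gb:.0f} GB")

# The same seven customers at the heavy, median and light usage ratios.
for label, ratio in (("heavy", 4), ("median", 7), ("light", 12)):
    mbps = capacity_to_provision(customers, 3, ratio)
    print(f"{label}: {ratio} to 1 needs {mbps:.2f} megabit")
```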

Bandwidth costs can vary wildly; but as a general planning number, it currently costs about $250 a month to provide that level of service (raw service charges).

Let's get back to our heavily used 32 virtual host cluster, and say we can oversell them at 4-1. They each want 3 megabits per second of maximum bandwidth, and 1 terabyte a month of transfer. It costs us about $1000 a month to provide enough bandwidth to oversell those 32 virtual servers at 4 to 1 on 3mbit; so about $30 per host.

Now, let's add up the costs so far, to provide excellent quality of service and oversell within industry standards. It's about $2000 a month aggregate RAW COST, not including staffing, overhead, or profit, just for those 32 virtual hosts.

The general IT industry standard admin to server ratio for web servers (real servers not virtual hosts) is 32 to 1 (hosting providers tend to operate at a higher ratio though); and a basic level full time server admin costs you a minimum of $5000 a month. Then you need at least two NOC monkeys for that same server load, at a minimum of about $3000 a month each. So that's $11,000 a month in raw staffing. Industry standard overhead calculation is to double the staffing cost (it's a minimum of 1.6 times up to a maximum of about 2.4 so double is a good average) so $22,000 a month to admin 32 servers; and we're admining 4 servers, so our portion of that cost is $2750.

So, our total incremental cost to provide those 32 high utilization virtual hosts, at 3mbit maximum bandwidth, 1TB transfer, and 5gb storage; at industry standard QOS for overselling and staffing, is $4750; or just under $150 a month per virtual host.
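
Here's the whole roll-up in one place as a Python sketch. The names are mine; the figures are all from the paragraphs above. Note that hardware, storage and bandwidth actually add up to $1980, which I've been rounding to $2000.

```python
hosts = 32
servers_in_cluster = 4
servers_per_admin_team = 32   # the 32 to 1 server to admin ratio

# Raw monthly costs from above.
hardware = 340
storage = hosts * 5 * 4       # 5 GB per site at $4 a gig
bandwidth = 1000
raw_exact = hardware + storage + bandwidth
raw_rounded = 2000            # the "about $2000" planning number

# Staffing: one admin, two NOC staff, then doubled for overhead.
admin = 5000
noc = 2 * 3000
overhead_multiplier = 2
staffing_for_32_servers = (admin + noc) * overhead_multiplier
staffing_share = staffing_for_32_servers * servers_in_cluster / servers_per_admin_team

total = raw_rounded + staffing_share
per_host = total / hosts
loss_at_80 = per_host - 80

print(f"Raw costs: ${raw_exact} (call it ${raw_rounded})")
print(f"Our share of staffing: ${staffing_share:.0f}")
print(f"Total: ${total:.0f}, or ${per_host:.2f} per virtual host")
print(f"Shortfall at $80 a month: ${loss_at_80:.2f} per host")
```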

Presuming we take $80 a month for them, we're losing about $70 a month on each one.

This is why good hosting companies charge so much more than the $10 a month places; because services cost money, and a good host will give you the service you pay for.

Ok, let's say we stretch it out, go to the 80/60 rule and provision 48 virtual servers on the same hardware; then go to a minimum QOS oversell of 12 to 1; and we oversell the storage at 10 to 1 (unfortunately typical).

Our hardware cost remains the same at $340 a month, our storage costs go down to about $100 a month, we go to $500 a month on bandwidth, and our staffing cost remains the same at $2750... $3690, divided by 48 is $76.88.
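
Same exercise for the stretched version, as a sketch. The names are mine, and I'm using the rounded planning numbers from the paragraph above (the storage works out to $96 before rounding to about $100).

```python
hosts = 48

hardware = 340                     # same four servers
storage_exact = hosts * 5 * 4 / 10 # 5 GB at $4 a gig, oversold 10 to 1
storage = 100                      # the rounded figure used above
bandwidth = 500
staffing = 2750                    # unchanged, it's still four servers

total = hardware + storage + bandwidth + staffing
per_host = total / hosts
profit_at_80 = 80 - per_host

print(f"Storage before rounding: ${storage_exact:.0f}")
print(f"Total: ${total}, or ${per_host:.2f} per virtual host")
print(f"Profit at $80 a month: ${profit_at_80:.2f} per host")
```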

Wow, we can just about cover it at $80 a month, with a meager $3.12 a month profit each on 48 accounts... and of course we haven't even discussed the marketing costs of getting those accounts in the first place; which is typically 5 times the monthly cost of provisioning the customer, so actually your first five months for that customer cost you double. At $3.12 a month, it will take years (roughly ten, on these numbers) for you to see a profit on just that one customer.

Remember when I said the hosting business runs on very tight margins?

If as a hosting provider you choose to use the 80/60 rule, you're already stretching your QOS minimums. If you then choose to provision all your customers at the 12 to 1 level, you're stretching things further.

..but, you're still being ethical, because there is a reasonable expectation that you will be able to maintain quality of service across all your customers; providing you have a top notch NOC and datacenter staff, and keep on top of capacity and maintenance.

Now, look at a company like Dreamhost (I chose them because they are one of the few companies that admit to overselling). They are theoretically selling that same level of service, only they are selling it at $8 a month, not $80 a month.

Obviously that is impossible. They cannot possibly provide the minimum QOS that they are selling. However, for $8 a month, who in their right mind would expect them to?

Basically, they are lying to the customer, because everyone else competing in the market is lying to the customer. The customer pretends to believe he can get $80 a month worth of service for $8 a month, and the host pretends to sell it to him.

In order to make any money at all on that price, they need to do some crazy things.

Note, I won't say these exact numbers apply to dreamhost in particular, because I don't know what their provisioning looks like; I'm just using their pricing example to show you how it has to be done to reach that price.

They will sell HUNDREDS of sites on a single server, and generally they don't cluster for their lower tier offerings (they do for higher tier, higher usage sites etc...). For high utilization systems, instead of running at 8 to one, they run at 24 to 1.

For bandwidth and storage, they follow the same 24 to 1 ratio.

Most importantly, they deliberately keep staff levels to a minimum, to cut overhead. Staffing costs are by far the largest proportion of the cost of providing service, so instead of the industry standard 32 to 1, they run at 256 to 1.

So, for our original 32 virtual hosts, this brings their cost down to $113 a month for the raw hardware, $167 for bandwidth, $27 for storage, and about $115 per month for staffing.
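
In case those numbers look like they came out of thin air, here is one way to get them, as a Python sketch. The scaling factors are my reading of the ratios above (24 hosts per server instead of 8, 24 to 1 on bandwidth instead of 4 to 1, 24 to 1 on storage instead of none, and 256 servers per admin team instead of 32, on a third as many servers), applied to the baseline costs for the 32 hosts.

```python
# Baseline monthly costs for the 32 virtual hosts, from earlier.
baseline = {"hardware": 340, "bandwidth": 1000, "storage": 640, "staffing": 2750}

density_factor = 24 / 8       # hosts per server: 24 instead of 8
bandwidth_factor = 24 / 4     # bandwidth oversell: 24 to 1 instead of 4 to 1
storage_factor = 24 / 1       # storage oversell: 24 to 1 instead of none
# Staffing shrinks twice: fewer servers, and more servers per admin team.
staffing_factor = (256 / 32) * density_factor

budget = {
    "hardware": baseline["hardware"] / density_factor,
    "bandwidth": baseline["bandwidth"] / bandwidth_factor,
    "storage": baseline["storage"] / storage_factor,
    "staffing": baseline["staffing"] / staffing_factor,
}

for item, cost in budget.items():
    print(f"{item}: ${cost:.0f} a month")
```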

That's still $13.75 a month per virtual host.

So for every customer that they have to provision even that degraded quality of service for, they're losing $6 a month.

The only way they are able to sustain this model, is because they have tens of thousands of accounts at $8 a month, literally 90% (or more) of which never use more than one 12,000th of a server (less than 1 minute of CPU time per day on an 8 core box); and require essentially zero admin time. For those sites, they put a couple hundred of them on a box, not just 24; they use 1/100th the bandwidth of the larger virtual hosts, do 1/100th the transfer, and rarely use more than about 10mb of storage.
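
The "one 12,000th of a server" figure is easy to sanity check. A tiny sketch, assuming an 8 core box and counting capacity in core minutes per day (the names are mine):

```python
cores = 8
minutes_per_day = 24 * 60

# Total CPU time one box can deliver in a day, in core minutes.
core_minutes_per_day = cores * minutes_per_day

# One 12,000th of that, which is the ceiling for a typical light account.
light_account_minutes = core_minutes_per_day / 12_000

print(f"Box capacity: {core_minutes_per_day} core minutes a day")
print(f"One 12,000th of that: {light_account_minutes:.2f} minutes a day")
```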

Basically they take the possible but not very ethical 80/60 level I describe above, and they multiply the loading by 5 (or more).

Again, I should emphasize; for the vast majority of their users, this is just fine; and they are completely satisfied, because they are paying a very low price, and getting their needs met.

The host's cost to provision those 90% of customers is about $3 a month; and those 90% are perfectly happy, because they think they're getting enterprise class service and in effect they do because their needs are never more than 1/100th that of a "real" server.

So if 9 out of 10 of your accounts cost you $3, the tenth costs you $13.75 and you take in $80 for the whole lot... hey, that's a decent profit. You're averaging $4 a month profit across all your accounts... which is a little bit more than the company who stretches it out to 80/60, that I described above (they're making $3.12 an account).
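
Here's the blended math as a quick sketch, and it also shows how fragile it is. The names are mine; the costs are the $3 and $13.75 figures from above, and the second loop is just me varying the mix to see what happens.

```python
price = 8            # dollars per account per month
light_cost = 3       # monthly cost of a light account
heavy_cost = 13.75   # monthly cost of a heavy account


def profit_per_account(light, heavy):
    """Average monthly profit per account for a given mix of accounts."""
    accounts = light + heavy
    revenue = accounts * price
    cost = light * light_cost + heavy * heavy_cost
    return (revenue - cost) / accounts


blended = profit_per_account(9, 1)
print(f"9 light, 1 heavy: ${blended:.2f} profit per account")

# Shift the mix toward heavy accounts and watch the margin go.
for heavy in (2, 4, 5):
    light = 10 - heavy
    print(f"{light} light, {heavy} heavy: ${profit_per_account(light, heavy):.2f}")
```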

However, in order to sustain that model, no account can EVER exceed the usage needed to ensure the 24-1 oversell, and you have to keep that 9 to 1 ratio of low utilization accounts to high. Otherwise, you're sunk.

So, if you're a home user, that $8 a month is attractive, because in all likelihood you'll never use 1% of what you're theoretically paying for never mind 10%.

But if you're a business, or a web site used by hundreds or thousands of people, and you want to actually get anywhere near what you're theoretically paying for; you have to actually pay for what it costs, plus a little bit extra to allow the company to make a profit. That means that $80 hosting account is the real world minimum, for an application that needs a guaranteed quality of service.

'Course thanks to better virtualization technologies, and higher density servers, we're putting four times as much computing power into the same physical space, and at the same cost, as we were 2 years ago. Labor costs aren't going down, so it's not a 75% reduction; but two years from now, we might be able to provision that same quality of service for $60 instead of $80.