Tuesday, March 2, 2010

Two approaches on Multi-tenancy in Cloud

Continue on my previous blog on how multi-tenancy related to cloud computing

My thoughts has changed that now I think both the Amazon approach (Hypervisor isolation) and Salesforce approach (DB isolation) are both valid but attract a different set of audiences.

First of all, increase efficiency through sharing is a fundamental value proposition underlying all cloud computing initiatives, there is no debate that ...
  • We should "share resource" to increase utilization and hence improve efficiency
  • We should accommodate highly dynamic growth and shrink requirement rapidly and smoothly
  • We should "isolate" the tenant so there is no leakage on sensitive information
But at which layer should be facilitate that ? Hypervisor level or DB level.

Hypervisor level Isolation

Hypervisor is a very low-level layer of software that maps the physical machine to a virtualized machine on which a regular OS runs on. When the regular OS issue system calls to the VM, it is intercepted by the Hypervisor which maps to the underlying hardware. The hypervisor also provide some traditional OS functions such as process scheduling to determine which VM to run. Hypervisor can be considered to be a very lean OS that sits very close to the bare hardware.

Depends on the specific implementation, Hypervisor introduces an extra layer of indirection and hence incur a certain % of overhead. If we need a VM with capacity less than a physical machine, Hypervisor allows us to partition the hardware into finer granularity and hence improve the efficiency by having more tenants running on the same physical machine. For light-usage tenant, such increment in efficiency should offset the lost from the overhead.

Since Hypervisor focus on low-level system level primitives, it provides the cleanest separation and hence lessen security concerns. On the other hand, by intercepting at the lowest layer, Hypervisor retain the familar machine model that existing system/network admin are familiar with. Since Application is now completely agnostic to the presence of Hyervisor, this minimize the change required to move existing apps into the cloud and makes cloud adoption easier.

Of course, the downside is that virtualization introduce a certain % of overhead. And the tenant still need to pay for the smallest VM even none of its user is using it.

DB level Isolation

Here is another school of thought, if tenants are running the same kind of application, the only difference is the data each tenant store. Why can't we just introduce an extra attribute "tenantId" in every table and then append a "where tenantId = $thisTenantId" in every query ? In other words, add some hidden column and modify each submitted query.

In additional, the cloud provider usually need to re-architect the underlying data layer and move to a distributed and partitioned DB. Some of the more sophisticate providers also need to invest in developing intelligent data placement algorithm based on workload patterns.

In this approach, the degree of isolating is as good as the rewritten query. In my opinion, this doesn't seem to be hard, although it is less proven than the Hypervisor approach.

The advantage of DB level isolation is there is no VM overhead and there is no minimum charge to the tenant.

However, we should compare these 2 approach not just from a resource utilization / efficiency perspective, but also other perspectives as well, such as ...

Freedom of choice on technology stack

Hypervisor isolation gives it tenant maximum freedom of the underlying technology stack. Each tenant can choose the stack that fits best to its application's need and inhouse IT skills. The tenant can also free to move to latest technologies as they evolve.

This freedom of choice comes with a cost though. The tenant need to hire system administrators to configure and maintain the technology stack.

In a DB level isolation, the tenants are live within a set of predefined data schemas and application flows. So their degree of freedom is limited to whatever the set of parameters that the cloud provider exposes. Also the tenants' applications are "lock-in" to the cloud provider's framework, and a tight coupling and dependency is created between the tenant and the cloud provider.

Of course, the advantage is that there is no administration needed in the technology stack.

Reuse of Domain Specific Logic

Since it focus in the lowest layer of resource sharing, Hypervisor isolation provides no reuse at the app logic level. Tenants need to build their own technology stack from the ground up and write their own application logic.

In the DB isolation approach, the cloud provider pre-defines a set of templates in DB schemas and Application flow logic based on their domain expertise (it is important that the cloud provider must be the recognized expert in that field). The tenant can leverage the cloud provider's domain expertise and focus in purely business operation.

Conclusion

I think each approach will attract a very different (and clearly disjoint) set of audiences.

Notice that DB-level isolation commoditize everything and make it very hard to create product feature differentiations. If I am a technology startup company trying to develop a killer product, then my core value is my domain expertise. In this case, I won't go with the DB-level isolation which impose too much constraints on me to distinguish my product from "anyone else". Hypervisor level isolation much better because I can outsource the infrastructure layer and focus in my core value.

On the other hand, if I am operating a business but not building a product, then I would like to outsource all supporting functions including my applications as well. In this case, I would pick the best app framework provided by the market leader and follow their best practices (also very willing to live by their constraints), the DB level isolation is more compelling in this case.

2 comments:

Anonymous said...

If I understand your classifications correctly, these seem to map to the terms "IaaS" (infrastructure as a service, hypervisor-level isolation) and "PaaS" (platform as a service, database-level isolation) being used elsewhere.

Andrew said...

Actually I would think that the comparison is Iaas to Saas. Only the software as a service can provide code logic sharing.