EMC Software Solutions Blog


Storage for Web Applications and Why Sharding Doesn't Scale




One of the most frequent use cases for Atmos is deployment behind web applications, where it provides an easy-to-manage storage platform that is scale-out in nature and resistant to failure. Scaling and managing hundreds of terabytes, or even petabytes, of data is incredibly difficult. Reliably replicating and protecting that data is harder still.


"Why can't I just use NAS behind my web app?"

That's one of the questions I hear most often from customers. Traditional NAS may work perfectly well for smaller web applications, but managing NFS and CIFS mounts for web applications at scale is untenable.

Sharding is part of the problem rather than the solution.  The concept of sharding involves physically spreading one or more copies of data out amongst multiple server nodes and disks.  The web application knows where the object was written and therefore also knows where to ask for the object to read it.
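To make the idea concrete, here is a minimal sketch of how a homegrown sharding layer might decide where an object lives. It assumes placement by hashing the object key, which is only one possible scheme (a lookup table in a database is another). The mount paths and function names are hypothetical.

```python
import hashlib

# Hypothetical shard list: each entry stands for one NFS mount on a web server.
SHARDS = [f"/mnt/filer{i:02d}" for i in range(20)]

def shard_for(object_key: str, shards=SHARDS) -> str:
    """Pick a shard deterministically from the object key."""
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

def path_for(object_key: str) -> str:
    # The same function serves writes and reads, which is how the
    # application "knows" where an object was written.
    return f"{shard_for(object_key)}/{object_key}"
```

Note that the placement logic lives in the web application itself, so every web server has to agree on the shard list and be able to reach every shard.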

But sharding is difficult to manage at scale. Imagine having 20 web servers and 20 NFS exports that contain objects. Each web server needs to mount each and every export. An administrator would need to manage 400 NFS mounts (20 × 20) in this case!
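The arithmetic is simple, but it is worth seeing how it grows. This small sketch (the function name is mine, for illustration) counts the mounts an administrator has to look after when every web server mounts every shard.

```python
def mounts_to_manage(web_servers: int, nfs_shards: int) -> int:
    """Every web server mounts every shard, so the count is the product."""
    return web_servers * nfs_shards

print(mounts_to_manage(20, 20))  # 400, the example above
print(mounts_to_manage(21, 20))  # one more web server adds 20 mounts
print(mounts_to_manage(20, 21))  # one more shard adds a mount on all 20 servers
```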


Adding capacity

How does an administrator add capacity easily in the sharding scenario? They can’t, not without significant disruption or inordinate effort. The usual counter is “just add another filer.” Adding another filer at small scale is fine, but each additional filer increases complexity disproportionately, and management becomes very difficult. In other words, it doesn’t scale.
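One way to see the disruption is to assume the application places objects by hashing the key modulo the number of shards. That is an assumption for illustration, not something every homegrown setup does. Under that scheme, adding a single filer changes the answer for most existing objects, so they must either be moved or tracked by extra logic. With a uniform hash, going from 20 to 21 shards relocates roughly 95% of keys.

```python
import hashlib

def shard_index(object_key: str, shard_count: int) -> int:
    """Naive placement: hash the key, then take it modulo the shard count."""
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_index(k, old_count) != shard_index(k, new_count)
    )
    return moved / len(keys)

# Hypothetical object keys, just to have something to measure.
keys = [f"object-{i}" for i in range(10_000)]
print(f"{fraction_moved(keys, 20, 21):.1%} of objects change shard")
```

Schemes such as consistent hashing or a placement database reduce the movement, but they are more machinery that the application team then has to build and operate.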


Replicating objects

How do you replicate objects to disparate locations across the country or the world in the sharding scenario? You can’t, not without great difficulty and complex, brittle deployments built on third-party software that is not native to the technology stack.

How does an application intelligently control the replication of objects in an automated manner? It can’t, not without relying on file system metadata such as creation time and last access time, and that information doesn’t properly describe the value of the content to the application and the business. Atmos user metadata allows more descriptive elements to be associated with objects, which in turn lets the application developer drive storage policies. This simply cannot be accomplished easily in homegrown sharding scenarios. Atmos makes this functionality turnkey.
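The general idea of metadata-driven policy can be sketched in a few lines. This is not the Atmos policy engine or its API. The metadata keys, policy names, and locations below are all hypothetical, and the point is only that business-level tags, rather than timestamps, decide how an object is protected.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    name: str
    replicas: int
    locations: tuple

# Hypothetical policies and metadata keys, for illustration only.
DEFAULT = Policy("default", 1, ("local",))
RULES = [
    ({"content-type": "invoice"}, Policy("compliance", 3, ("us-east", "us-west", "eu"))),
    ({"tier": "premium"}, Policy("premium", 2, ("us-east", "eu"))),
]

def select_policy(user_metadata: dict) -> Policy:
    """Return the first policy whose required tags all match the object."""
    for required, policy in RULES:
        if all(user_metadata.get(k) == v for k, v in required.items()):
            return policy
    return DEFAULT

print(select_policy({"tier": "premium", "owner": "alice"}).name)  # premium
```

In a homegrown sharding setup, every piece of this (the rules, the matching, and the actual copying between sites) would have to be written and maintained by the application team.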


When NFS goes down

NFS doesn’t always fail gracefully, either; how it behaves depends on the NFS version and the mount configuration. When web applications use the Atmos RESTful web services interface instead, systems administrators no longer need to manage NFS mounts on the web servers, and the failure of a single node or disk does not bring down the whole application stack.
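As a rough illustration of why an HTTP interface degrades more gently than a mount, here is a sketch of a client that simply tries the next node when one is unreachable. The host names and object path are hypothetical, and the sketch leaves out authentication and everything else specific to the real Atmos REST API.

```python
import urllib.request

def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Plain HTTP GET using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()

def fetch_object(object_path: str, nodes, get=http_get) -> bytes:
    """Try each node in turn and return the first successful response."""
    last_error = None
    for node in nodes:
        try:
            return get(f"http://{node}{object_path}")
        except OSError as exc:  # URLError, timeouts, and refused connections
            last_error = exc
    raise RuntimeError(f"all {len(nodes)} nodes failed") from last_error

# Hypothetical usage; these hosts and this path do not exist.
# data = fetch_object("/objects/photo-123", ["node1.example.com", "node2.example.com"])
```

A failed node costs one timed-out request rather than leaving a stale mount on every web server.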


"So Why Atmos?"

Web applications want to speak HTTP. Atmos is a net-native storage platform that speaks HTTP at its very core. Managing mission-critical web applications at scale is hard, and homegrown solutions to the back-end storage challenge will only last so long. Scale-out cloud storage platforms such as Atmos solve the technical complexities involved elegantly and gracefully.
