While virtualization makes it pretty easy to spawn up new VMs quickly (e.g. for load balancing purposes), I always felt that providing concurrent file-based access to the same data to these VMs has been somewhat cumbersome, even though it’s still a requirement for many applications that need to share data between parts of the application, or multiple instances thereof.
If you didn’t have some kind of SAN/NAS solution in your data center, it usually involved quirks creative solutions on the VM side (e.g. setting up a VM instance that acted as a central file service via NFS/SMB, or using a shared disk file system like GFS2 or OCFS2). But even if you did, the underlying virtualization technology did not provide any integration or API-based approach to this (at least that was my impression).
I recently stumbled over Amazon’s Elastic File System (EFS), which was announced on April 9th, 2015. EFS provides shared storage as a service (STaaS) via the NFSv4 protocol. This makes it pretty easy to mount the same share on multiple (Linux-based) VMs. Amazon only charges you for the storage that you actually use (billed monthly, based on the average used during the month), and the use of SSDs should make sure that latency (IOPS) does not suck too badly.
Interestingly, Microsoft has been offering something similar for almost a year now: Azure File Service was announced on May 12th, 2014 already. It provides shared access to files via the SMB protocol (which makes it suitable for both Windows and Linux-based VMs). In addition to that, Azure File Service also provides a REST API to access and manage objects stored on this service, which makes this service more versatile/flexible. Similar to Amazon, Microsoft only charges for the disk space you actually use.
Note that both EFS and Azure File Service are still labeled as “Preview” at the time of writing and have certain limitations you should be aware of (Unsupported NFSv4.0 Features in EFS, Amazon EFS Limits During Preview, Features Not Supported By the Azure File Service) – so make sure to have backups of any data you store on them 🙂
The Open Source community has noticed the requirement for shared file access, too – Red Hat recently announced their participation in OpenStack’s Manila project, which provides a shared file service for this emerging cloud technology. From what I can tell, Manila’s focus currently is more on providing shared storage for OpenStack compute nodes, it’s not entirely clear to me yet if there are any plans to establish this as a solution to provide shared file systems to virtual machines as well (in addition to the object and block storage capabilities they already offer).