Tagged: virtualization

  • lenz 12:22 on 2015-06-01 Permalink
    Tags: virtualization   

    Shared storage in the cloud 

    While virtualization makes it pretty easy to spin up new VMs quickly (e.g. for load balancing purposes), I always felt that giving these VMs concurrent file-based access to the same data has been somewhat cumbersome, even though it’s still a requirement for many applications that need to share data between parts of the application, or multiple instances thereof.

    If you didn’t have some kind of SAN/NAS solution in your data center, it usually involved creative solutions on the VM side (e.g. setting up a VM instance that acted as a central file service via NFS/SMB, or using a shared disk file system like GFS2 or OCFS2). But even if you did, the underlying virtualization technology did not provide any integration or API-based approach to this (at least that was my impression).

    I recently stumbled over Amazon’s Elastic File System (EFS), which was announced on April 9th, 2015. EFS provides shared storage as a service (STaaS) via the NFSv4 protocol. This makes it pretty easy to mount the same share on multiple (Linux-based) VMs. Amazon only charges you for the storage that you actually use (billed monthly, based on the average amount used during the month), and the use of SSDs should make sure that latency and IOPS do not suck too badly.
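
    To get a feeling for what "billed on the average used during the month" means in practice, here is a minimal sketch. It assumes one usage sample per day and a made-up price per GB-month; the function name, the sampling interval and the price are placeholders of mine, not Amazon's actual metering or pricing.

```python
def monthly_storage_charge(daily_usage_gb, price_per_gb_month):
    """Estimate one month's charge from one usage sample per day (in GB).

    Assumption: the bill is the plain arithmetic mean of the samples
    multiplied by a flat per-GB-month price. Real metering may sample
    more often or round differently.
    """
    if not daily_usage_gb:
        return 0.0
    average_gb = sum(daily_usage_gb) / len(daily_usage_gb)
    return average_gb * price_per_gb_month


if __name__ == "__main__":
    # Hypothetical month: 10 days at 100 GB, then 20 days at 400 GB.
    usage = [100.0] * 10 + [400.0] * 20
    # Placeholder price, not taken from any official price list.
    charge = monthly_storage_charge(usage, price_per_gb_month=0.30)
    average = sum(usage) / len(usage)
    print(f"average: {average:.1f} GB, charge: {charge:.2f}")
```

    The point is that a short-lived peak costs less than its size suggests: in this example you pay for the 300 GB average, not for the 400 GB you ended the month with.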

    Interestingly, Microsoft has been offering something similar for almost a year now: Azure File Service was already announced on May 12th, 2014. It provides shared access to files via the SMB protocol (which makes it suitable for both Windows and Linux-based VMs). In addition to that, Azure File Service also provides a REST API to access and manage objects stored on this service, which makes it more versatile. Similar to Amazon, Microsoft only charges for the disk space you actually use.

    Note that both EFS and Azure File Service are still labeled as “Preview” at the time of writing and have certain limitations you should be aware of (Unsupported NFSv4.0 Features in EFS, Amazon EFS Limits During Preview, Features Not Supported By the Azure File Service), so make sure to have backups of any data you store on them 🙂

    The Open Source community has noticed the requirement for shared file access, too: Red Hat recently announced their participation in OpenStack’s Manila project, which provides a shared file service for this emerging cloud technology. From what I can tell, Manila’s focus currently is more on providing shared storage for OpenStack compute nodes; it’s not entirely clear to me yet if there are any plans to establish this as a solution to provide shared file systems to virtual machines as well (in addition to the object and block storage capabilities they already offer).

  • lenz 10:56 on 2015-02-03 Permalink
    Tags: scalability, virtualization   

    Back in the good old days of physical servers, you basically had two choices to increase the performance of your application: you either “scaled up”, by migrating to a beefier server with more RAM, CPUs/MHz, or you “scaled out”, by distributing your application load across multiple individual servers.

    Interestingly, I still observe customers applying this way of thinking to virtual environments, using multiple VMs behind a virtual load balancer for scaling out application load.

    Does this approach really make sense anymore? I think that scheduling multiple VMs to handle the workload puts more load on a hypervisor than handling the same load with one single powerful VM instance (with more vCPUs, more vRAM).

    Does “Scale Out” still make any sense in a virtual environment? It probably also depends on the application and if it can effectively scale with more CPUs and memory, but in general I don’t think it is a valid approach.

    • Ingo 11:58 on 2015-02-03 Permalink | Reply

      I think it can make sense:

      think of scaling out beyond the limits of a physical server
      by combining this with a placement policy that spans more than one availability zone, you would even achieve HA with this
      performance-wise it could be beneficial on a NUMA host to use VMs that are bound to one NUMA zone. That could be more performant than crossing all NUMA zones with one VM.

      just my 0.02 €,
      greets, Ingo

      • lenz 12:07 on 2015-02-03 Permalink | Reply

        Hi Ingo, thanks for your comment!

        Good points, I agree that from an HA perspective there is a valid reason for this kind of setup, but you need to have more than one physical host/hypervisor for this. Also more on the HA side of things is using scale out to be able to perform maintenance tasks on one node, without having to take down the service.

        With regard to NUMA zones, I have no experience with the performance impact of this. My gut feeling is that it actually might be more performant to schedule two VMs in different NUMA zones than having one big VM that crosses NUMA boundaries. I need to do some research about this, to educate myself 🙂



        • Christian 18:09 on 2015-02-03 Permalink | Reply

          even if there isn’t any performance gain on NUMA (while I’m pretty confident that there IS, w/o having any numbers to back it up, either 😉), from a hypervisor perspective it’s still more efficient to handle e.g. two 1-vCPU guests than one 2-vCPU guest: the VM is only able to demand “I need my CPU resource”, and the hypervisor then has to find two available physical cores to give the 2-vCPU VM its resource, even if only a single-threaded task within the VM asked for it.
          So keep your VMs as small as possible and rather spread single tasks among multiple of those small VMs.
