Optimizing Amazon Elastic File System (EFS) for metadata-intensive workloads

In this article, I will show how to optimize EC2 Linux instances using FS-Cache and instance store block-device storage to cache Amazon Elastic File System (EFS) mounts to improve performance of metadata-intensive workloads like content management or web serving.

The solution architecture

The ratio of the time spent performing operations on a file's metadata at the file system layer (information about block placement in the file system structure) to the time spent performing operations on its data is what distinguishes working with many small files from working with a few large ones. Workloads dominated by these metadata operations are called metadata-intensive workloads.

Amazon EFS’s distributed nature enables high levels of availability, durability, and scalability. This distributed architecture results in a small latency overhead for each file operation. Due to this per-operation latency, overall throughput generally increases as the average I/O size increases, because the overhead is amortized over a larger amount of data.

Furthermore, Amazon EFS file systems can be mounted on up to thousands of EC2 instances concurrently. If you can parallelize your application across more instances, you can drive higher throughput levels on your file system in aggregate across instances. However, if you can't parallelize your application to optimize the average I/O size and improve the performance of metadata-intensive workloads, you can implement a persistent and transparent local cache using instance store block-level storage and FS-Cache. This permits data stored on Amazon EFS mounts to be cached on local disk, potentially speeding up future accesses to that data by avoiding the need to fetch the file over the network again. For more information, see Amazon EFS Performance.

The facility, known as FS-Cache, is designed to be as transparent as possible to users of the system. Applications can use files stored on Amazon EFS as normal, without any knowledge of there being a cache.

The following diagram is a high-level illustration of how FS-Cache will work for this solution:

                +-----------+
                |           |
                | EC2 Linux |
                |           |
                +-----------+
                     |               Amazon Virtual Private Cloud (VPC)
                ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     |
                     |           +----------+
                     V           |          |
                +---------+      |          |
                |         |      |          |
                |   EFS   |----->| FS-Cache |
                |         |      |          |--+
                +---------+      |          |  |   +--------------+   +--------------+
                     |           |          |  |   |              |   |              |
                     V           +----------+  +-->|  CacheFiles  |-->|  Ext4        |
                +---------+                        |  /var/cache  |   |  /dev/md0    |
                |         |                        +--------------+   +--------------+
                |   VFS   |                                ^                     ^
                |         |                                |                     |
                +---------+                                +--------------+      |
                     |                  KERNEL SPACE                      |      |
                ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~
                     |                  USER SPACE                        |      |
                     V                                                    |      |
                +---------+                                           +--------------+
                |         |                                           |              |
                | Process |                                           | cachefilesd  |
                |         |                                           |              |
                +---------+                                           +--------------+

FS-Cache

FS-Cache is an intermediary between Amazon EFS and the actual cache back-end (CacheFiles) that does the real work. If no cache is available, FS-Cache smooths over that fact with as little extra latency as possible. FS-Cache does not guarantee increased performance; however, it delivers more consistent performance by avoiding network congestion. Using a cache back-end incurs a performance penalty: for example, cached Amazon EFS mounts add local disk accesses to cross-network lookups. While FS-Cache tries to be as asynchronous as possible, there are synchronous paths (for example, reads) where this isn't possible.

CacheFiles is the only cache back-end currently available. It uses files in a directory to store the data given to it. The contents of the cache are persistent across reboots.

The cache back-end works by maintaining a certain amount of free space on the partition hosting the cache. It grows and shrinks the cache in response to other elements of the system using up free space, making it safe to use on the root file system. FS-Cache sets defaults on this behavior, which can be configured via cache cull limits. For more information, see Setting Cache Cull Limits.
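For reference, these limits are expressed as block (b*) and file (f*) percentages in /etc/cachefilesd.conf. The values below reflect typical package defaults and are shown only as an illustration; tune them for the size of your cache device:

brun  10%
bcull  7%
bstop  3%
frun  10%
fcull  7%
fstop  3%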

The use of FS-Cache, therefore, is a compromise between various factors. If FS-Cache is being used to cache Amazon EFS traffic, for instance, it may slow the client down a little, but massively reduce the network and server loading by satisfying read requests locally without consuming network bandwidth.

Also note that FS-Cache caches only regular NFS files. It will not cache directories, symlinks, device files, FIFOs, or sockets.

Bypassing the Page Cache with Direct I/O

In some cases we may want to do completely unbuffered I/O to a file. A direct I/O facility in most file systems allows a direct file read or write to completely bypass the file system page cache. NFS supports direct I/O. With direct I/O enabled, NFS bypasses client-side caching and passes all requests directly to the Amazon EFS mount target. Both reads and writes are uncached and become synchronous (they need to wait for the server to complete). Unlike disk-based direct I/O support, NFS’s support imposes no restrictions on I/O size or alignment; all requests are made directly to the server.

On Linux, you enable direct I/O by opening the file with the O_DIRECT flag, or by setting O_DIRECT on an already open file descriptor with fcntl(). Once the flag is set, every read and write through that file descriptor bypasses the page cache. For more information, see fcntl – manipulate file descriptor.

int flags = fcntl(fd, F_GETFL);          /* declared in fcntl.h */
fcntl(fd, F_SETFL, flags | O_DIRECT);

Direct I/O can provide extremely fast transfers when moving data with big block sizes (>64 kilobytes), but it can be a significant performance limitation for smaller sizes. If an application reads and writes in small sizes, then its performance may suffer since there is no read-ahead or write clustering and no caching.
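As a quick, rough comparison, GNU dd can request direct I/O on the read side with the iflag=direct option. The file path below is only a placeholder for a file on your Amazon EFS mount; the first command bypasses the page cache and FS-Cache entirely, while the second can be satisfied from the client-side caches:

[ec2-user ~]$ dd if=$HOME/efs-mount-point/testfile of=/dev/null bs=4k count=128 iflag=direct
[ec2-user ~]$ dd if=$HOME/efs-mount-point/testfile of=/dev/null bs=4k count=128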

Instance Store

Some EC2 instance types come with a form of directly attached, block-device storage known as the instance store. The instance store is ideal for temporary storage, such as cache, because the data stored in instance store volumes is not persistent through instance stops, terminations, or hardware failures.

The instance type determines the size of the instance store available and the type of hardware used for the instance store volumes. Instance store volumes are included as part of the instance’s usage cost. You must specify the instance store volumes that you’d like to use when you launch the instance (except for NVMe instance store volumes, which are available by default).

Raid

Creating a RAID 0 array of multiple instance store volumes allows you to achieve a higher level of performance for the cache file system than you can get from a single volume.

RAID 0 stripes multiple volumes together into a single logical device (such as /dev/md0), and I/O is distributed across the volumes in the stripe. Each volume you add contributes its throughput to the array, so you get the straight addition of throughput. However, performance of the stripe is limited by the worst-performing volume in the set, and loss of a single volume results in complete data loss for the array.

The resulting size of a RAID 0 array is the sum of the sizes of the volumes within it, and the bandwidth is the sum of the available bandwidth of the volumes within it.

Now that you understand the architecture of this solution, you can follow the instructions in the next section to implement it in your AWS account.

Implementing the solution

This article does not cover how to launch EC2 instances or how to create an Amazon EFS file system and authorize instances to access it. For more information, see Launching an Instance or Getting Started with Amazon Elastic File System.

Some instance types use NVMe or SATA-based solid state drives (SSD) to deliver high random I/O performance. Because of the way that Amazon EC2 virtualizes disks, the first write to any location on most instance store volumes performs more slowly than subsequent writes. It is recommended that you initialize your volumes by writing once to every volume before production use. For more information, see Optimizing Disk Performance for Instance Store Volumes.

Initializing the volumes

To initialize the instance store volumes, run the following commands on the m1.large, m1.xlarge, c1.xlarge, m2.xlarge, m2.2xlarge, and m2.4xlarge instance types. (Make sure to unmount each drive before running these commands.)

[ec2-user ~]$ sudo dd if=/dev/zero of=/dev/sdb bs=1M
[ec2-user ~]$ sudo dd if=/dev/zero of=/dev/sdc bs=1M
[ec2-user ~]$ sudo dd if=/dev/zero of=/dev/sdd bs=1M
[ec2-user ~]$ sudo dd if=/dev/zero of=/dev/sde bs=1M

Note: Initialization can take a long time (about 8 hours for an extra large instance).

Other instance type families like f1, i3, i4 and r4 with direct-attached solid state drives (SSD) and TRIM support provide maximum performance at launch time, without initialization.

Creating the RAID Array

Use the mdadm command to create a logical RAID device from the instance store volumes. Substitute the number of volumes in your array for number_of_volumes and the device names for each volume in the array (such as /dev/xvdf or /dev/nvme1n1) for device_name. You can also substitute MY_RAID with your own unique name for the array. For more information, see “RAID Configuration on Linux” [10].

Note: You can list the devices on your instance with the lsblk command to find the device names.

  1. To create a RAID 0 array, execute the following command (note the --level=0 option to stripe the array):
[ec2-user ~]$ sudo mdadm --create --verbose /dev/md0 --level=0 --name=MY_RAID --raid-devices=number_of_volumes device_name1 device_name2
  2. Create a file system on your RAID array, and give that file system a label to use when you mount it later. For example, to create an ext4 file system with the label MY_CACHE, execute the following command:
[ec2-user ~]$ sudo mkfs.ext4 -L MY_CACHE /dev/md0
  3. In general, you can display detailed information about your RAID array with the following command:
[ec2-user ~]$ sudo mdadm --detail /dev/md0

The following is example output:

       /dev/md0:
               Version : 1.2
         Creation Time : Sun Nov 26 22:37:43 2017
            Raid Level : raid0
            Array Size : 3710937088 (3539.03 GiB 3800.00 GB)
          Raid Devices : 2
         Total Devices : 2
           Persistence : Superblock is persistent

           Update Time : Sun Nov 26 22:37:43 2017
                 State : clean
        Active Devices : 2
       Working Devices : 2
        Failed Devices : 0
         Spare Devices : 0

            Chunk Size : 512K
...
...
...
           Number   Major   Minor   RaidDevice State
              0     259        0        0      active sync   /dev/nvme0n1
              1     259        1        1      active sync   /dev/nvme1n1

Enabling FS-Cache

Currently, Amazon Linux AMI 2017.09 only provides the cachefiles caching back-end. The cachefilesd daemon initiates and manages cachefiles. The /etc/cachefilesd.conf file controls how cachefiles provides caching services. To configure a cache back-end of this type, the cachefilesd package must be installed.

  1. To install the cache back-end, use the following command:
[ec2-user ~]$ sudo yum install cachefilesd
  2. To configure the cache back-end directory, set the following parameter in the /etc/cachefilesd.conf file (this is the default value):
dir /var/cache/fscache
  3. To mount the RAID device on the cache mount point, use the following command:
[ec2-user ~]$ sudo mount /dev/md0 /var/cache/fscache/
  4. After mounting the RAID device, start the cachefilesd daemon:
[ec2-user ~]$ sudo service cachefilesd start
  5. To configure cachefilesd to start at boot time, execute the following command:
[ec2-user ~]$ sudo chkconfig cachefilesd on
  6. (Optional) To mount this RAID device automatically on every system reboot, add the following entry for the device to the /etc/fstab file:
LABEL=MY_CACHE       /var/cache/fscache/   ext4    defaults,nofail        0       2
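If you add this /etc/fstab entry, you can verify it without rebooting by asking mount to process the file and then checking the mount point (a quick sanity check, not a required step):

[ec2-user ~]$ sudo mount -a
[ec2-user ~]$ df -h /var/cache/fscache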

Using the Cache With Amazon EFS

Amazon EFS will not use the cache unless explicitly instructed. To configure an Amazon EFS mount to use FS-Cache, pass the fsc option (-o fsc) to the mount command:

[ec2-user ~]$ sudo mount -t nfs -o fsc,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 mount-target-DNS:/   ~/efs-mount-point

All access to files under ~/efs-mount-point will go through the cache, unless the file is opened for direct I/O or writing.
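To confirm that the NFS client is using the cache for this mount, you can inspect the kernel's FS-Cache statistics; the counters in this file increase as objects are looked up and stored in the cache:

[ec2-user ~]$ cat /proc/fs/fscache/stats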

Testing the solution

  1. First, check how much data is currently in the cache, using the following commands:
[ec2-user ~]$ sudo du -sh /var/cache/fscache
28K /var/cache/fscache
[ec2-user ~]$ sudo du -sb /var/cache/fscache
28672   /var/cache/fscache
  2. Create a small file for testing in the Amazon EFS mount, using the following command:
[ec2-user ~]$ dd if=/dev/urandom of=$HOME/efs-mount-point/small_file bs=4k count=128
128+0 records in
128+0 records out
524288 bytes (524 kB) copied, 0.0767711 s, 6.8 MB/s
  3. Free the page cache, dentries, and inodes, using the following command:
[ec2-user ~]$ sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
  4. Read the file for the first time. The first read goes over the network to Amazon EFS and, in this example, took about 27 milliseconds at a speed of 29.0 MB/s:
[ec2-user ~]$ time dd if=$HOME/efs-mount-point/small_file of=$HOME/localfile.out bs=4k count=128
128+0 records in
128+0 records out
524288 bytes (524 kB) copied, 0.0181053 s, 29.0 MB/s

real    0m0.027s
user    0m0.000s
sys 0m0.000s
  5. Check that the data is now cached, using the following commands:
[ec2-user ~]$ sudo du -sh /var/cache/fscache
568K    /var/cache/fscache
[ec2-user ~]$ sudo du -sb /var/cache/fscache
581632  /var/cache/fscache
  6. Read the same file again. This time it was served from the local cache and took about 8 milliseconds at a speed of 402 MB/s:
[ec2-user ~]$ time dd if=$HOME/efs-mount-point/small_file of=$HOME/localfile.out bs=4k count=128
128+0 records in
128+0 records out
524288 bytes (524 kB) copied, 0.0013054 s, 402 MB/s

real    0m0.008s
user    0m0.000s
sys 0m0.000s

Conclusion

Results:
  . Average speed for the first read operation = 29.0 MB/s
  . Average speed for the second read operation = 402 MB/s
  . The read time for a 512 KB file drops from 0.027 to 0.008 seconds.

File systems make extensive use of caches to eliminate physical I/O where possible. In this article, I have shown a way to deploy a local cache on EC2 Linux instances running Amazon Linux to improve the performance of metadata-intensive workloads, like content management and web serving, on Amazon EFS.
