Clever Engineering Blog — Always a Student

Swappiness and Amazon ECS

By Alex Smolen

At Clever, one of our tenets is “Always a Student”, and in that spirit of learning we wanted to share the changes we made to fix memory allocation issues in AWS Elastic Container Service related to swappiness.

Swappiness is a Linux kernel setting that controls how readily the kernel writes pages in memory out to swap space on disk, or “swaps” them. More precisely, it sets the eviction preference between file cache pages (backed by a file on disk) and anonymous pages (without file backing, e.g. stack pages). A higher swappiness value means anonymous pages are swapped more aggressively. At 100, file cache and anonymous pages are equally likely to be evicted, whereas at 0 the kernel avoids evicting anonymous pages for as long as it can reclaim memory elsewhere. The default value is 60, and it can be tuned by setting vm.swappiness in /etc/sysctl.conf.
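To see what a host is actually using, the current value can be read from /proc/sys/vm/swappiness, which mirrors the vm.swappiness sysctl. The following is a minimal Go sketch; parseSwappiness is a name chosen here for illustration and is not part of any library.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseSwappiness converts the raw contents of /proc/sys/vm/swappiness
// (for example "60\n") into an integer.
// Hypothetical helper, named for this example only.
func parseSwappiness(raw string) (int, error) {
	v, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil {
		return 0, fmt.Errorf("unexpected swappiness value %q: %v", raw, err)
	}
	return v, nil
}

func main() {
	raw, err := os.ReadFile("/proc/sys/vm/swappiness")
	if err != nil {
		fmt.Println("could not read swappiness (is this a Linux host?):", err)
		return
	}
	v, err := parseSwappiness(string(raw))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("vm.swappiness =", v)
}
```

Keeping the parsing separate from the file read makes the helper easy to exercise on machines that have no /proc filesystem.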

Why is 60 a good default, and when should you change it? A low swappiness value works for databases that manage their own caching or desktop systems that need responsive process switching. For workloads that want to eke out every last bit of virtual memory at the expense of performance, you’d want to maximize the inactive pages being written to disk by setting a high swappiness value.

Our challenge at Clever was not tuning swappiness, but dealing with swappiness changing unexpectedly. In the past, Clever deployed services to EC2 instances directly, and these services tended to have spiky memory access patterns. This memory pressure led to some swapping to prevent out-of-memory conditions given our modestly-provisioned instances, which was fine given that we didn’t care about the performance of these asynchronous workloads. Effectively, we tuned our instance size and services to work with the default swappiness of 60.

Problem: ECS does not support swap

During our move to a container-based infrastructure in late 2013, we relied not only on the default OS swappiness value, but on Docker using this default swappiness value. For containers, there is a swappiness setting in the memory cgroup that allows you to control the swappiness of a container. If the value isn’t specified, the host value is used. As we transitioned container management engines from Mesos to Amazon ECS, we began interacting with Docker through the Amazon ECS Agent.

The ECS StartTask API exposes a subset of the Docker Start API, and notably it does not include Docker’s swappiness setting. The ecs-agent hard-coded the swappiness value to -1, which caused the host default of 60 to be used. Unfortunately, the Amazon ECS EC2 AMI had no swap space configured for pages to go to, so no pages were swapped.

Creating a swap file for ECS AMI-based instances

We found a discussion of Swap options for Amazon ECS that offered us a workaround, where we added the creation of a swapfile in the userdata for our EC2 container instances:

fallocate -l 5G /swapfile && chmod 0600 /swapfile && mkswap /swapfile && swapon /swapfile
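One way to confirm that the swap file is active, and later whether it is being used, is to look at /proc/swaps, which lists each swap area with its size and usage in KiB. The Go sketch below parses that file; swapEntry and parseSwaps are hypothetical names introduced for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// swapEntry is one row of /proc/swaps. Sizes are in KiB.
// Hypothetical type, named for this example only.
type swapEntry struct {
	Name   string
	SizeKB int64
	UsedKB int64
}

// parseSwaps parses the contents of /proc/swaps, skipping the header line
// and any row that does not have numeric size and used columns.
func parseSwaps(contents string) []swapEntry {
	var entries []swapEntry
	sc := bufio.NewScanner(strings.NewReader(contents))
	for line := 0; sc.Scan(); line++ {
		if line == 0 {
			continue // header: Filename Type Size Used Priority
		}
		fields := strings.Fields(sc.Text())
		if len(fields) < 4 {
			continue
		}
		size, errSize := strconv.ParseInt(fields[2], 10, 64)
		used, errUsed := strconv.ParseInt(fields[3], 10, 64)
		if errSize != nil || errUsed != nil {
			continue
		}
		entries = append(entries, swapEntry{Name: fields[0], SizeKB: size, UsedKB: used})
	}
	return entries
}

func main() {
	raw, err := os.ReadFile("/proc/swaps")
	if err != nil {
		fmt.Println("could not read /proc/swaps:", err)
		return
	}
	entries := parseSwaps(string(raw))
	if len(entries) == 0 {
		fmt.Println("no swap configured")
		return
	}
	for _, e := range entries {
		fmt.Printf("%s: %d KiB total, %d KiB used\n", e.Name, e.SizeKB, e.UsedKB)
	}
}
```

A swap file that is listed but shows zero used under memory pressure is a hint that something other than the swap file itself is preventing swapping.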

This resolved our issue until one day we canaried a new ECS AMI and started getting out-of-memory errors.

Problem: ECS started using a different swappiness value

We were able to determine that the ecs-agent had started sending a swappiness value of 0. This meant swappiness for our Docker containers was set explicitly to 0 by the AWS ECS Agent, and we couldn’t change it. With swappiness at 0, our processes got OOM-killed, and free showed that the swap file was not being used. We diagnosed this further by running docker inspect and comparing containers on the new AMI (running in canary mode) with containers on the old AMI.

A Docker MemorySwappiness value of -1 means the host OS value is used, whereas 0 explicitly sets swappiness to 0
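The same comparison can be scripted. The sketch below decodes docker inspect JSON and reports how each container's HostConfig.MemorySwappiness would be interpreted. The sample JSON is a trimmed, made-up stand-in for real output, and the type and function names are hypothetical; treating a missing value the same as -1 is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inspected holds only the fields of `docker inspect` output used here.
// Hypothetical type, named for this example only.
type inspected struct {
	ID         string `json:"Id"`
	HostConfig struct {
		// Pointer so that an absent or null value can be told apart from 0.
		MemorySwappiness *int64 `json:"MemorySwappiness"`
	} `json:"HostConfig"`
}

// parseInspect decodes the JSON array that `docker inspect` prints.
func parseInspect(data []byte) ([]inspected, error) {
	var out []inspected
	err := json.Unmarshal(data, &out)
	return out, err
}

// describeSwappiness explains how a MemorySwappiness value is interpreted.
// Assumption: an unset value behaves like -1 and inherits the host setting.
func describeSwappiness(v *int64) string {
	if v == nil || *v == -1 {
		return "inherits host value"
	}
	return fmt.Sprintf("explicitly set to %d", *v)
}

func main() {
	// Made-up, trimmed sample standing in for real `docker inspect` output.
	sample := []byte(`[
	  {"Id": "old-ami-container", "HostConfig": {"MemorySwappiness": -1}},
	  {"Id": "new-ami-container", "HostConfig": {"MemorySwappiness": 0}}
	]`)
	containers, err := parseInspect(sample)
	if err != nil {
		fmt.Println("could not parse inspect output:", err)
		return
	}
	for _, c := range containers {
		fmt.Printf("%s: %s\n", c.ID, describeSwappiness(c.HostConfig.MemorySwappiness))
	}
}
```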

How do we get back to a swappiness of 60?

You can modify cgroup settings while the process is running. We added a hook to our docker-events system container on each container instance that listens for container starts and updates the cgroup setting. The code in our docker-events system container looks like this:

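import (
  "io/ioutil"
  "os"
  "path"
  "time"

  // Assumption: docker is github.com/fsouza/go-dockerclient, whose
  // Client.InspectContainer matches the call below.
  docker "github.com/fsouza/go-dockerclient"
)

// handleStart resets memory.swappiness to 60 for a newly started container
// by writing directly to the container's memory cgroup.
func handleStart(client *docker.Client, id string) {
  swappiness := "60"
  for tries := 0; tries < 10; tries++ {
    if tries > 0 {
      // handle race condition between docker start and existence of cgroup
      time.Sleep(500 * time.Millisecond)
    }
    container, err := client.InspectContainer(id)
    if err != nil || container == nil || container.HostConfig == nil || container.HostConfig.CgroupParent == "" {
      continue
    }
    swappinessFilePath := path.Join("/cgroup/memory", container.HostConfig.CgroupParent, id, "memory.swappiness")
    err = ioutil.WriteFile(swappinessFilePath, []byte(swappiness), 0644)
    if os.IsNotExist(err) {
      // the cgroup file has not been created yet, so retry
      continue
    }
    // either the write succeeded or it failed in a way a retry won't fix
    return
  }
}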
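Because the loop above gives up silently, it can be useful to read the value back after writing it. The sketch below shows one way to do that. setAndVerify and demo are hypothetical helpers, and the demo uses a temporary directory as a stand-in for a container's cgroup directory so that it runs anywhere; on a real host the directory would be the container's path under the memory cgroup mount.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// setAndVerify writes value to the memory.swappiness file under cgroupDir
// and reads it back to confirm the value that is now in place.
// Hypothetical helper, named for this example only.
func setAndVerify(cgroupDir, value string) error {
	p := filepath.Join(cgroupDir, "memory.swappiness")
	if err := os.WriteFile(p, []byte(value), 0644); err != nil {
		return err
	}
	raw, err := os.ReadFile(p)
	if err != nil {
		return err
	}
	// Trim whitespace because the kernel returns values with a trailing newline.
	if got := strings.TrimSpace(string(raw)); got != value {
		return fmt.Errorf("memory.swappiness is %q, want %q", got, value)
	}
	return nil
}

// demo runs setAndVerify against a temporary directory that stands in for
// a container's cgroup directory (an assumption made so the example is
// runnable without Docker or root access).
func demo() error {
	dir, err := os.MkdirTemp("", "cgroup-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return setAndVerify(dir, "60")
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("verification failed:", err)
		return
	}
	fmt.Println("memory.swappiness verified")
}
```

Returning an error instead of swallowing it lets the caller log or alert when a container ends up running with an unexpected swappiness.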

While AWS’s ecs-agent may allow users to specify swappiness in the future, for now there seems to be no way to get a non-zero swappiness other than setting it manually, using an approach similar to the one above. We thought this problem and its fix were obscure enough to merit a blog post in case anyone finds themselves in a similar predicament.

Credits

This blog post is based on interviews with CTO Rafael Garcia and infrastructure engineer Xavi Ramirez. To learn more about swappiness, I recommend Swapping Behavior, Linux Swappiness, and this StackOverflow answer to the question “Why is swappiness set to 60 by default?”.