Securing Saved-password Applications

By Alex Smolen on January 10, 2017

The password is a security mechanism that is both ubiquitous and brittle. With the emergence of new security trends like post-quantum cryptography and IoT-botnet attacks, it’s easy to overlook attacks that exploit guessable, reused, or coerced passwords. Yet the wherewithal among users to choose strong passwords and keep them safe is rare. Despite decades of practice, managing passwords securely at scale is tough.

Clever saw this pain in schools and developed features like Instant Login and Custom SAML Connections so students only need a single password to access their educational apps. The OAuth and SAML protocols that drive these features don’t require passwords to be stored by or sent to third parties.

For these features to work, applications need to cooperate and build integrations. While a growing number of apps support Instant Login, some apps don’t yet. Building these integrations can be time-consuming, and they may not rise to the top of every prioritized feature list. But to deliver on the promise of one login for every app in the classroom, we needed a way to store and provide, for every user, passwords to apps that don’t support delegated authentication protocols.

This problem is called password vaulting, and it’s difficult to get right. For example, you need to find the right password for the site the user is visiting and the right field to enter the password. If you fail at either of these tasks, you may make it more difficult to log in, or worse, expose a password to an unauthorized service. For examples of practical attacks on password managers, see the “Password Managers: Attacks and Defenses” paper.

Many password managers work with any site on the internet, and so they need to solve the problem generically. What makes this problem more tractable for Clever is that we are only focused on schools. Districts use a set number of applications, and in many cases take care of password generation on behalf of students. Our design challenge was to help districts let students and teachers use these saved passwords, with the important constraint that the usability and security of the overall system match or exceed our existing Instant Login product.

Similar to the approach discussed in our previous blog post about Clever Badges, we used a technique called threat modeling, along with code reviews and other secure software development practices, to identify threats, reduce risk, and ensure the overall security of the resulting system. The following sections describe the significant security design decisions we made while following this process.

System overview

The Saved-passwords system is a combination of several components:

A Chrome, Firefox, or Edge extension to automatically fill passwords into websites
An OAuth 2.0-authenticated password-retrieval HTTPS API called by the extensions
An SFTP server to allow districts to programmatically upload passwords
A temporary file store for uploads
A database to durably store passwords and other metadata
A worker to asynchronously process uploads
Cryptographic keys to encrypt password data at rest

Figure 1: Communication between Saved-passwords components

The most visible aspect of the Saved-passwords feature is the Chrome extension (now available for Firefox and Edge), which is responsible for driving the browser interaction that authenticates students and teachers to a site using a saved password. Our battle-tested OAuth service does the heavy lifting of authenticating users and verifying connections between districts and applications. We also need to store and serve the passwords. To do that, we built a special service, called Lockbox. Lockbox acts as the holder and gatekeeper of saved passwords. It is a compact, easily reviewable HTTPS API with limited access to required AWS resources.

To allow district administrators to upload credentials, we created a temporary file store (an S3 bucket) that is processed regularly by the Lockbox worker, a batch processing tool. CSV files of credentials can be manually uploaded through the existing dashboard or programmatically uploaded using a new SFTP service, which is described later in this post.

These components are built on different technology stacks with concomitant attack surfaces, and the numerous connections between them are potentially exploitable trust boundaries. Our security team analyzed each component for susceptibility to attack, in part or in combination, to ensure that users’ passwords wouldn’t be exposed through accident or malicious action.

How passwords are encrypted

One of the most obvious security requirements for the project was to store the passwords in an encrypted format. We decided to use AWS Key Management Service (KMS) to store the master key. KMS is backed by Hardware Security Modules (HSMs), which reduce the risk of adversarial extraction of the root key, and it provides a simple interface for building cryptographic systems. When the passwords are uploaded to the temporary file store, we encrypt them with the master KMS key using S3 KMS encryption. An asynchronous worker then decrypts and processes the data.

Before the data is stored in the database, it is encrypted with a “data key,” and the data key is itself encrypted with the master KMS key. The encrypted data key is stored alongside the encrypted data, a process known as envelope encryption. This allows us to use different keys for each application-district connection. In addition to performance improvements, this increases our resilience to unauthorized password disclosure. For instance, if a SQL injection vulnerability allowed an attacker to read encrypted password data belonging to a different district or application, the attacker wouldn’t be able to recover the passwords, as long as the service only ever decrypts with the data key for the connection the request is authorized for.

The passwords for each application-district connection are encrypted with the data key using the AES-256-GCM implementation in Go’s crypto packages.

Figure 2: The encryption process for passwords, from upload to access

When an authorized user requests a password, the Lockbox service retrieves the associated encrypted data key and decrypts it with the KMS master key. The service can then use the decrypted data key to decrypt the password and return it to the user.
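The wrap-then-encrypt flow described in this section can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not Lockbox’s actual code: the master key here is a local byte slice standing in for the KMS master key (in a KMS-backed system, wrapping and unwrapping the data key would be calls to the service, and the master key would never leave it), and the function names are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealGCM encrypts plaintext under key and returns nonce || ciphertext || tag.
func sealGCM(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst prepends it to the sealed output.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openGCM reverses sealGCM. It fails if the key is wrong or the data was modified.
func openGCM(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, fmt.Errorf("sealed value too short: %d bytes", len(sealed))
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

// newWrappedDataKey creates a fresh data key for one application-district
// connection and wraps it under the master key. Hypothetical helper: with
// KMS, this step would be performed by the service instead.
func newWrappedDataKey(masterKey []byte) (dataKey, wrapped []byte, err error) {
	dataKey = make([]byte, 32)
	if _, err = rand.Read(dataKey); err != nil {
		return nil, nil, err
	}
	if wrapped, err = sealGCM(masterKey, dataKey); err != nil {
		return nil, nil, err
	}
	return dataKey, wrapped, nil
}

// decryptPassword unwraps the stored data key, then decrypts the password with it.
func decryptPassword(masterKey, wrapped, sealedPassword []byte) (string, error) {
	dataKey, err := openGCM(masterKey, wrapped)
	if err != nil {
		return "", err
	}
	plaintext, err := openGCM(dataKey, sealedPassword)
	if err != nil {
		return "", err
	}
	return string(plaintext), nil
}

func main() {
	masterKey := make([]byte, 32) // stand-in for the KMS master key
	if _, err := rand.Read(masterKey); err != nil {
		panic(err)
	}
	dataKey, wrapped, err := newWrappedDataKey(masterKey)
	if err != nil {
		panic(err)
	}
	sealed, err := sealGCM(dataKey, []byte("correct horse"))
	if err != nil {
		panic(err)
	}
	// Only wrapped and sealed would be persisted; dataKey is discarded.
	password, err := decryptPassword(masterKey, wrapped, sealed)
	fmt.Println(password, err)
}
```

Because GCM is an authenticated mode, decrypting with the wrong data key fails outright rather than returning garbage, which is what makes a per-connection key a meaningful boundary.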

By defining the KMS permissions using the principle of least privilege and auditing all calls to this key, we’ve increased our assurance that the passwords are only accessed by authorized users and services.

How the extensions fill passwords

In order to log users into sites, the extensions need instructions about where to place the username and password, what button to click to submit, and similar steps. These instructions vary from site to site. Many password managers use heuristics to evaluate these steps (e.g. are there two input boxes next to each other, and one has a type equal to password). This scales well, but it can lead to broken experiences when sites have unusual login pages. Furthermore, it may lead to the password being accidentally exposed, or if the attacker can control part of a page, maliciously exposed.

Because Saved-passwords operates with a finite number of sites, it’s feasible to manually create password entry instructions for each site. Our first instinct was to use JavaScript to write these instructions to traverse and manipulate the DOM, but JavaScript is a powerful language, and we needed only a subset of its functionality. Instead, we created a simple Domain Specific Language (DSL) we call SquidScript that is interpreted by the extensions and has limited capabilities.

Figure 3: Example SquidScript login instructions

Because login instructions are specifically designed for each site, it’s less likely that we enter passwords in the wrong field. And since we are using a DSL with limited capabilities, the scripts aren’t capable of doing much damage. Even if someone were able to control or change the script, they would have a very limited set of functions that they could perform to attempt to exfiltrate the password. And since the DSL is tailored for the task of logging a user in, it’s easy to use. We can write a “login connector” for a new site in about 15 minutes, although it takes longer to fully test and deploy.
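To make the “limited capabilities” argument concrete, here is a minimal sketch of a parser for a restricted login-instruction language. The syntax below is invented for illustration and is not SquidScript; the command names (fill, click, wait) and the Step type are assumptions. It is written in Go for consistency with the rest of this post, whereas a real extension would run in the browser. The point is only that anything outside a small whitelist is rejected before it can run.

```go
package main

import (
	"fmt"
	"strings"
)

// Step is one parsed instruction. Op is always one of the whitelisted commands.
type Step struct {
	Op       string // "fill", "click", or "wait"
	Selector string // CSS selector of the target element (no whitespace in this sketch)
	Field    string // for "fill" only: "username" or "password"
}

// allowedOps maps each hypothetical command to the number of arguments it takes.
var allowedOps = map[string]int{"fill": 2, "click": 1, "wait": 1}

// Parse turns a script into steps and rejects any command, argument count,
// or fill source that is not explicitly allowed.
func Parse(script string) ([]Step, error) {
	var steps []Step
	for i, line := range strings.Split(script, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		op, args := fields[0], fields[1:]
		want, ok := allowedOps[op]
		if !ok {
			return nil, fmt.Errorf("line %d: unknown command %q", i+1, op)
		}
		if len(args) != want {
			return nil, fmt.Errorf("line %d: %s takes %d argument(s), got %d", i+1, op, want, len(args))
		}
		step := Step{Op: op, Selector: args[0]}
		if op == "fill" {
			// A script can only name which credential to fill; it never sees the value.
			if args[1] != "username" && args[1] != "password" {
				return nil, fmt.Errorf("line %d: cannot fill from %q", i+1, args[1])
			}
			step.Field = args[1]
		}
		steps = append(steps, step)
	}
	return steps, nil
}

func main() {
	script := "wait #login-form\nfill #username username\nfill #password password\nclick #submit"
	steps, err := Parse(script)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	for _, s := range steps {
		fmt.Printf("%+v\n", s)
	}
}
```

In this sketch a script has no way to read the credential, call arbitrary functions, or make network requests; an interpreter would walk the parsed steps and perform only those three actions.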

How the SFTP service gets files into S3

Clever has a long relationship with SFTP, the de facto secure file transfer standard for many of our users. While we’re familiar with hosting an SFTP server and using a variety of homegrown tools to process data from it, we foresaw problems reusing that approach for Saved-passwords. One is that an SFTP server needed to be a “special snowflake” in our infrastructure, since it required OS-level interaction around user accounts and the file system, which don’t play well with Docker. Furthermore, since a user is fundamentally establishing an SSH session with the SFTP server, we felt the exposure to a vulnerability that could lead to cross-account data access was too high.

Instead, we decided to fork a Go sftp library to automatically read and write files from S3, rather than the local file system. This allowed us to take advantage of strict IAM policies for each authenticated session which restrict S3 folder access to only the district associated with the authenticated user. Furthermore, we can easily fire SNS events to kick off processing of files written to the bucket. We liked this approach so much that we recently switched over our more broadly-used legacy SFTP server to use it.
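As an illustration of what a per-session restriction might look like, the sketch below builds an IAM policy document that allows object reads and writes only under one district’s folder. The bucket name, the one-folder-per-district key layout, and the sessionPolicy function are assumptions made for this example; the post does not describe how the policies are actually constructed or attached. One plausible way to apply such a document would be as a session policy when assuming a role.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// statement and policy mirror the JSON shape of an IAM policy document.
type statement struct {
	Effect   string   `json:"Effect"`
	Action   []string `json:"Action"`
	Resource string   `json:"Resource"`
}

type policy struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// sessionPolicy returns a policy document limiting object access to the
// folder of a single district. Hypothetical layout: <bucket>/<districtID>/...
func sessionPolicy(bucket, districtID string) (string, error) {
	// Refuse IDs that could widen the resource pattern or escape the folder.
	if districtID == "" || strings.ContainsAny(districtID, "/*?") {
		return "", fmt.Errorf("invalid district id %q", districtID)
	}
	p := policy{
		Version: "2012-10-17",
		Statement: []statement{{
			Effect:   "Allow",
			Action:   []string{"s3:GetObject", "s3:PutObject"},
			Resource: "arn:aws:s3:::" + bucket + "/" + districtID + "/*",
		}},
	}
	out, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	doc, err := sessionPolicy("example-uploads", "district-123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(doc)
}
```

Directory listing is left out of this sketch; it would need an additional statement on the bucket itself, conditioned on the district’s prefix.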

Summary

At Clever, we aspire to be excellent guardians of our users’ data. We have to be careful when we build new functionality that bears the heavy responsibility of student data security. With Saved-passwords, we saw the opportunity to increase the number of applications that students and teachers could access without juggling another password. We knew the effort would not be trivial; many Clever employees had wisely voiced concerns about password vaulting in previous internal discussions. And while no system offers perfect security, we believe Saved-passwords is a marked improvement for the overall security of K-12 classrooms. We hope it makes students’ and teachers’ digital lives just a little bit easier.