Amazon Simple Storage Service (S3) is one of the most widely used cloud services. Most users of the service know it’s wise to encrypt sensitive data before storing it in S3. In this post we’ll look at how to do that securely using the AWS Java SDK, and how Cryptosense Analyzer will help you spot whether you’ve done it wrong.
Note that in this post we’re talking about client-side encryption, where the sensitive data is encrypted locally before it’s sent to AWS S3 servers. There are also options for server-side encryption, managed via the S3 console. These only protect the data while it is at rest: it will still be in the clear inside AWS servers (at least briefly) each time it’s accessed.
There are several different client-side encryption modes for S3 offered by the Java S3 SDK. First you need to decide whether you want to manage your master keys yourself, or have AWS manage your master keys in their key management service (KMS).
Master keys will not be used directly to encrypt the data you send to S3. Instead, they will be used to encrypt the data keys that will do the encryption. So, if you manage your master keys yourself, you have to take care of key management, but the key values never leave your perimeter. If you use master keys in AWS KMS, the master keys stay in AWS. Each time you need to encrypt or decrypt, you send a request to the KMS. For an encryption operation, the KMS creates a new data key and sends it back both in the clear and encrypted under the master key. For a decryption operation, the KMS decrypts the encrypted data key and sends the resulting key to the client. The diagram below illustrates the difference.
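To make the master key / data key split concrete, here is a minimal sketch of the client-managed case using only the standard Java crypto APIs. It is not the SDK's own code: the class name `EnvelopeSketch`, its `Envelope` holder and the method names are made up for illustration, and the choice of AES-GCM for the data and AES Keywrap for the data key is an assumption that mirrors the authenticated mode described later in this post.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class EnvelopeSketch {
    /** Everything stored alongside the object; the master key is NOT in here. */
    static final class Envelope {
        final byte[] wrappedDataKey;
        final byte[] iv;
        final byte[] ciphertext;

        Envelope(byte[] wrappedDataKey, byte[] iv, byte[] ciphertext) {
            this.wrappedDataKey = wrappedDataKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    /** Generates a fresh 256-bit AES key (used for both master and data keys here). */
    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    static Envelope encrypt(SecretKey masterKey, byte[] plaintext)
            throws GeneralSecurityException {
        // 1. A fresh data key and IV for every object.
        SecretKey dataKey = newAesKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        // 2. The data key, not the master key, encrypts the data.
        Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
        gcm.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
        byte[] ciphertext = gcm.doFinal(plaintext);

        // 3. The master key only ever encrypts (wraps) the data key.
        Cipher wrap = Cipher.getInstance("AESWrap");
        wrap.init(Cipher.WRAP_MODE, masterKey);
        return new Envelope(wrap.wrap(dataKey), iv, ciphertext);
    }

    static byte[] decrypt(SecretKey masterKey, Envelope envelope)
            throws GeneralSecurityException {
        // Recover the data key with the master key, then decrypt the data.
        Cipher unwrap = Cipher.getInstance("AESWrap");
        unwrap.init(Cipher.UNWRAP_MODE, masterKey);
        SecretKey dataKey = (SecretKey) unwrap.unwrap(
                envelope.wrappedDataKey, "AES", Cipher.SECRET_KEY);

        Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
        gcm.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(128, envelope.iv));
        return gcm.doFinal(envelope.ciphertext);
    }
}
```

In the KMS variant, steps 1 and 3 of `encrypt` and the unwrap in `decrypt` would be replaced by requests to the KMS, so the master key would never be present on the client.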
The next important detail is the encryption mode. Choosing the encryption mode affects two security-critical points: the algorithm used to protect the encrypted data key, and the algorithm used to encrypt the data itself.
Early versions of S3 encryption supported only an unauthenticated encryption mode for encrypting the data (AES-CBC with PKCS5 padding). This is dangerous for two reasons. First, an unauthenticated mode gives no guarantee that the ciphertext has not been tampered with. Second, it can often lead to padding oracle attacks that recover the underlying plaintext if the application is not extremely careful about how it handles decryption errors.
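The tampering risk is easy to reproduce with the standard Java crypto APIs. The sketch below is an illustration rather than anything taken from the SDK: the class `TamperDemo`, its method names and the all-zero key and IVs are made up for the demo (fixed values keep it reproducible and must never be used for real data). It flips a single bit in the IV of an AES-CBC message and in the ciphertext of an AES-GCM message, then tries to decrypt each.

```java
import java.security.GeneralSecurityException;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class TamperDemo {
    // Demo-only fixed key: never hard-code keys or reuse IVs in real code.
    static final SecretKeySpec KEY = new SecretKeySpec(new byte[16], "AES");

    /** Encrypts with AES-CBC, flips one IV bit, and returns what decryption yields. */
    static byte[] cbcAfterTampering(byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[16];
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, KEY, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext);

        // The attacker changes the IV stored with the ciphertext. In CBC this
        // flips the same bit in the first plaintext block, with no error raised.
        iv[4] ^= 0x08;

        cipher.init(Cipher.DECRYPT_MODE, KEY, new IvParameterSpec(iv));
        return cipher.doFinal(ciphertext);
    }

    /** Encrypts with AES-GCM, flips one ciphertext bit, and reports whether decryption fails. */
    static boolean gcmDetectsTampering(byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[12];
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, KEY, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);

        ciphertext[4] ^= 0x08;

        cipher.init(Cipher.DECRYPT_MODE, KEY, new GCMParameterSpec(128, iv));
        try {
            cipher.doFinal(ciphertext);
            return false; // tampering went unnoticed
        } catch (AEADBadTagException e) {
            return true; // authentication tag did not verify
        }
    }
}
```

Called with the 16-byte ASCII message `pay 100 to alice`, `cbcAfterTampering` returns `pay 900 to alice` without any exception, while `gcmDetectsTampering` returns `true` because the authentication tag no longer verifies.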
In 2014, authenticated encryption using AES-GCM was added. When authenticated encryption was introduced, the algorithms for encrypting the data key under the master key were also updated: if the master key is a symmetric key, AES Keywrap is used, and if it is an RSA key, RSA-OAEP is used. It’s hard to find documentation for what algorithm was being used before, but we can discover this easily by downloading the SDK, performing an encryption in the old mode, and using Cryptosense Analyzer to see what happens.
What this reveals is that the data key is encrypted using AES in ECB mode. This is not good practice: ECB mode has very few security properties, being both deterministic and unauthenticated. Using this wrap mode is now marked as “not recommended” on the AWS SDK crypto page.
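Both weaknesses can be seen in a few lines of standard Java. This is a sketch under stated assumptions, not a reproduction of the SDK's internals: the class `WrapComparison` and its methods are hypothetical, the all-zero master key is for the demo only, and `AES/ECB/NoPadding` is used because a 32-byte data key is already a whole number of AES blocks. For contrast, it also checks how the JDK's AES Keywrap implementation (`AESWrap`) reacts to a modified wrapped key.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

class WrapComparison {
    /** Runs raw AES-ECB over the input in the given Cipher mode. */
    static byte[] ecb(int mode, SecretKey masterKey, byte[] input)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/NoPadding");
        cipher.init(mode, masterKey);
        return cipher.doFinal(input);
    }

    /** True if AES Keywrap refuses to unwrap a wrapped key with one flipped bit. */
    static boolean keyWrapRejectsTampering(SecretKey masterKey, SecretKey dataKey)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.WRAP_MODE, masterKey);
        byte[] wrapped = cipher.wrap(dataKey);
        wrapped[0] ^= 0x01;

        cipher.init(Cipher.UNWRAP_MODE, masterKey);
        try {
            cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
            return false;
        } catch (GeneralSecurityException e) {
            return true; // the built-in integrity check failed
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey master = new SecretKeySpec(new byte[16], "AES"); // demo-only key
        byte[] dataKey = new byte[32]; // contrived: two identical 16-byte halves

        // Deterministic: equal plaintext blocks give equal ciphertext blocks.
        byte[] wrapped = ecb(Cipher.ENCRYPT_MODE, master, dataKey);
        boolean blocksMatch = Arrays.equals(
                Arrays.copyOfRange(wrapped, 0, 16), Arrays.copyOfRange(wrapped, 16, 32));
        System.out.println("ECB repeats ciphertext blocks: " + blocksMatch);

        // Unauthenticated: a modified wrapped key decrypts to a different key, silently.
        wrapped[0] ^= 0x01;
        byte[] recovered = ecb(Cipher.DECRYPT_MODE, master, wrapped);
        System.out.println("ECB returned a wrong key without error: "
                + !Arrays.equals(recovered, dataKey));

        System.out.println("Keywrap rejects tampering: "
                + keyWrapRejectsTampering(master, new SecretKeySpec(dataKey, "AES")));
    }
}
```

A real data key is random, so repeated blocks inside one key are unlikely. The determinism still means that wrapping the same data key twice under the same master key produces identical output.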
If you opted to manage your encryption keys on AWS KMS instead of client-side, then the encryption of the data key by the master key will be carried out using AES-GCM. This is because encrypting the data key is treated just the same as any other KMS encryption operation.
At a Later Date
If you select one of the newer authenticated encryption modes, you have another choice to make. You can enforce its use (so trying to decrypt any unauthenticated data will fail), or you can accept old unauthenticated data and decrypt it anyway. The third option is to just use plain unauthenticated AES-CBC for everything. Why might you want to use AES-CBC? Well, AES-GCM has a size limit based on fundamental limitations of its construction: you can’t encrypt more than 2^32 blocks of 16 bytes each before the counter wraps around, so you are limited to roughly 64 GB per object.
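The 64 GB figure follows directly from the 32-bit block counter. As a quick check, the arithmetic can be written out in Java. The class and method names here are illustrative only, and the "minus two" reflects the exact per-message bound in the GCM specification (NIST SP 800-38D), where two counter values are not available for data.

```java
class GcmLimit {
    static final long BLOCK_BYTES = 16; // AES block size

    /** Maximum plaintext bytes in one AES-GCM message: (2^32 - 2) blocks of 16 bytes. */
    static long maxPlaintextBytes() {
        long maxBlocks = (1L << 32) - 2;
        return maxBlocks * BLOCK_BYTES;
    }

    public static void main(String[] args) {
        long limit = maxPlaintextBytes();
        System.out.println(limit + " bytes");                         // 68719476704 bytes
        System.out.println((double) limit / (1L << 30) + " GiB");     // just under 64
    }
}
```

So the exact ceiling is 32 bytes short of 64 GiB (2^36 bytes), which is where the "roughly 64 GB" figure comes from.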
But you don’t have to settle for lousy unauthenticated encryption just because you have a lot of data to encrypt and want to put it on AWS S3. You could use the AWS Encryption SDK instead, for example – we’ll cover that in a future post.
Want to know if your Java applications are using crypto securely, in AWS SDKs and anywhere else, at scale? Try a free trial of Cryptosense Analyzer.