There are several vendors offering encryption for SaaS applications such as Salesforce.com. But most have "gotchas!"
Here are the basic requirements for a SaaS encryption solution:
- Strong encryption via a standard algorithm, e.g., AES-256
- Encryption of specific fields
- Encryption of files
- Search of encrypted data, including partial keywords and wildcard search
- Sort encrypted data
Of course, customers expect that a product will support all of the above simultaneously. Unfortunately, most vendors suffer from limitations. For example, one vendor claims support for all of the requirements, but with the caveat below.
"Data can be encrypted on a per-field, per-word, or per-character basis with configurable variables for random or predefined number of initialization vectors (IVs);" Source: CIPHERCLOUD INFORMATION PROTECTION OVERVIEW
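Hmm... constraining the space of initialization vectors means the scheme no longer delivers the security expected of AES-256! With a predefined IV, the same plaintext always encrypts to the same ciphertext, so repeated values are visible to anyone who can read the stored data. Deterministic substitution schemes of this kind (cyclic ciphers, in effect) have been around since the Roman Empire and are easily cracked.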
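The effect of a predefined IV can be illustrated with a minimal sketch. Python's standard library has no AES, so the `keystream`, `encrypt`, and `decrypt` helpers below are hypothetical stand-ins built from HMAC-SHA256 in a counter-style construction; they are not AES and not any vendor's implementation. The point is only that a fixed IV makes encryption deterministic, while a fresh random IV does not.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from HMAC-SHA256 (a stand-in, NOT AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream and prepend the 16-byte IV."""
    stream = keystream(key, iv, len(plaintext))
    return iv + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Split off the IV, regenerate the keystream, and XOR it back out."""
    iv, body = ciphertext[:16], ciphertext[16:]
    stream = keystream(key, iv, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

key = os.urandom(32)   # 256-bit key
fixed_iv = bytes(16)   # predefined IV: the same 16 zero bytes every time

# Predefined IV: equal plaintexts produce equal ciphertexts.
a = encrypt(key, fixed_iv, b"Smith")
b = encrypt(key, fixed_iv, b"Smith")
print("fixed IV, ciphertexts equal:", a == b)

# Fresh random IV per value: equal plaintexts produce unrelated ciphertexts.
c = encrypt(key, os.urandom(16), b"Smith")
d = encrypt(key, os.urandom(16), b"Smith")
print("random IV, ciphertexts equal:", c == d)
```

With the fixed IV, an observer who never learns the key can still tell that two stored fields hold the same value, which is exactly the property that makes such a scheme searchable and also what makes it weak.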
Another vendor claims support for all of the requirements with yet another caveat.
"When tokenization is used as an obfuscation technique within the Blue Coat Cloud Data Protection Server, a Token Vault is created based on a Relational Database (RDBMS) structure and stores the tokens, as well as protected data intercepted and tokenized before it is sent to the SaaS Cloud Application for processing and storage. It also provides index caching features to preserve search functionality within the cloud SaaS application. This allows users to perform complex searches without needing to use compromised security techniques such as weakened encryption algorithms (i.e. Strong Searchable Encryption) to preserve cloud application usability." Source: Page 62 of Blue Coat Cloud Data Protection Server Administration Guide
Hmm... in other words, the vendor supports searchable tokenization, or "obfuscation" as they call it, but not searchable encryption. Tokenization encodes each word of the input string independently, preserving word boundaries in the output string and thereby leaking information. To make matters worse, the vendor's implementation uses deterministic tokenization, which is essentially another cyclic cipher that harks back to the Roman Empire.
"A clear text value that occurs multiple times in the collection shares the same token." Source: Page 4 of Blue Coat Cloud Data Protection Server Administration Guide
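To make the leakage concrete, here is a minimal sketch of word-by-word deterministic tokenization. It is an illustration under stated assumptions, not the vendor's implementation: a real token vault would look tokens up in a database, whereas the hypothetical `tokenize_word` helper below derives them with HMAC-SHA256 and a made-up `SECRET`. What matters is the behavior the quote describes: the same clear-text value always maps to the same token.

```python
import hashlib
import hmac

SECRET = b"hypothetical-vault-key"  # assumption: stands in for the vault's secret state

def tokenize_word(word: str) -> str:
    """Map a word to a fixed token; the same word always yields the same token."""
    digest = hmac.new(SECRET, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]

def tokenize(text: str) -> str:
    """Tokenize each whitespace-separated word independently."""
    return " ".join(tokenize_word(w) for w in text.split())

record_1 = tokenize("John Smith called John Doe")
record_2 = tokenize("Smith account overdue")
print(record_1)
print(record_2)
```

Even without the secret, the output shows that the first record has five words, that its first and fourth words are identical, and that its second word reappears as the first word of the second record.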
In short, both of the above vendors offer a choice between true encryption that is not searchable and deterministic encoding schemes, in the form of encryption with a constrained IV or "tokenized obfuscation", that are searchable. Deterministic encoding is very easy to crack via statistical attacks and known-plaintext attacks. For example, given the plaintext and tokenized forms of the phone book, you can decode any name. Given the plaintext and tokenized forms of a novel, you can decode almost any word in the language. Given just the tokenized form of a very long novel, you can use statistical analysis to identify the most common word, the next most common word, and so on.
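Both attacks can be sketched in a few lines. The tokenizer below is a hypothetical stand-in (HMAC-SHA256 with a made-up `SECRET`), used only so the example runs; the attacker code never reads the secret and relies solely on the fact that equal words produce equal tokens. The sample sentences are invented for illustration.

```python
from collections import Counter
import hashlib
import hmac

SECRET = b"unknown-to-the-attacker"  # hypothetical key; the attack code never uses it

def tokenize(text: str) -> list:
    """Deterministic per-word tokenization: equal words give equal tokens."""
    return [
        hmac.new(SECRET, w.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        for w in text.lower().split()
    ]

# Known-plaintext attack: pair up a known text with its tokens to build a codebook.
known_plain = "the quick brown fox jumps over the lazy dog"
known_tokens = tokenize(known_plain)
codebook = dict(zip(known_tokens, known_plain.split()))

# Any later message made of words seen before can be read straight off the codebook.
intercepted = tokenize("the lazy fox jumps over the hidden dog")
decoded = " ".join(codebook.get(t, "?") for t in intercepted)
print(decoded)  # the lazy fox jumps over the ? dog

# Statistical attack: with tokens only, frequency ranking still reveals structure.
corpus_tokens = tokenize("the cat saw the dog and the dog saw the bird")
most_common_token, count = Counter(corpus_tokens).most_common(1)[0]
print(most_common_token, count)  # the token occurring 4 times is almost surely "the"
```

The larger the known text, the more complete the codebook becomes, which is why a phone book or a novel is enough to decode most of what follows.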
In contrast, true encryption homogenizes the input string completely, so that the output is a single continuous string with no word boundaries and no statistical relation to the input, leaking essentially nothing beyond the length of the data.
Data encryption is a deep and specialized topic. Encryption standards take years to evolve and must withstand public inspection and attack by experts. As a customer, if you are going to take the trouble to install a Cloud-Access-Security-Broker for encrypting data in SaaS applications, you want to make sure that the data is truly encrypted.
If you want searchable, sortable, true AES-256 SaaS encryption, with 256-bit Initialization Vectors, check out Bitglass's patented approach. It is endorsed by experts such as Professor Martin Hellman of Stanford, a co-inventor of public-key cryptography.