Bitcasa and convergent encryption
People have speculated about how the new startup Bitcasa can both encrypt files on the client side and deduplicate them on the server side. The CEO says they use a method called "convergent encryption." This method derives the encryption key from the file being encrypted, typically from a hash of its contents. That way only someone who has the unencrypted file can generate the key, yet identical unencrypted files always produce identical encrypted files (which can therefore be deduplicated). The security community believes it works as advertised, with two possible vulnerabilities:
1) "confirmation-of-a-file attack" - someone who gains access to your encrypted files can confirm whether you have a certain file, provided they already have a copy of that file. For example, someone could verify that you have a certain movie or music file, or a leaked document.
2) "learn-partial-information attack" - in certain cases (from what I've read, those cases haven't been strictly defined) an attacker could learn some information in a file if the attacker already knows the rest of the file. An example might be a government form where most of the text is known but some sensitive text (e.g. your social security number) isn't.

I'm a fan of client-side encryption, and even with these "vulnerabilities" it seems to me that what Bitcasa is doing is a good idea and should be adopted, at least as an option, by other storage companies.

One big limitation that comes with encryption is the inability to do operations like text search on the server side. This can potentially be addressed through a method called "homomorphic encryption."
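
To make the convergent encryption idea and the two attacks described above more concrete, here is a minimal Python sketch. It is not Bitcasa's implementation, which as far as I know hasn't been published. The function names are my own, and the cipher is a toy stand-in (SHA-256 used as a keystream generator) chosen only so the example runs with the standard library; a real system would use a vetted cipher such as AES. The one property that matters here is that the key, and therefore the ciphertext, depends only on the file contents.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def derive_key(plaintext: bytes) -> bytes:
    # Convergent encryption: the key is derived from the file itself.
    return hashlib.sha256(plaintext).digest()


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream (assumption, for illustration only): hash the key
    # together with a block counter until enough bytes are produced.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    # Returns (key, ciphertext). The client keeps the key; the server
    # only ever sees the ciphertext.
    key = derive_key(plaintext)
    stream = _keystream(key, len(plaintext))
    return key, bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    stream = _keystream(key, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


def confirm_file(candidate: bytes, stored_ciphertext: bytes) -> bool:
    # Confirmation-of-a-file: encrypt a file you already have and
    # check whether it matches what the victim stored.
    return encrypt(candidate)[1] == stored_ciphertext


def guess_missing_field(template: str, candidates: Iterable[str],
                        stored_ciphertext: bytes) -> Optional[str]:
    # Learn-partial-information: the attacker knows everything except
    # one field, so they try each possible value for that field.
    for value in candidates:
        if confirm_file(template.format(value).encode(), stored_ciphertext):
            return value
    return None


if __name__ == "__main__":
    key_a, ct_a = encrypt(b"the same movie file")
    key_b, ct_b = encrypt(b"the same movie file")
    print(ct_a == ct_b)                    # identical files dedupe
    print(decrypt(key_a, ct_a))            # the owner can still decrypt
    print(confirm_file(b"the same movie file", ct_a))
```

Two users who encrypt the same file independently end up with the same ciphertext, which is what lets the server deduplicate without seeing the plaintext. The same property is what makes both attacks possible: anyone who can guess the whole file, or the one missing piece of it, can check the guess against the stored ciphertext. The second attack is only practical when the unknown part has few enough possible values to try them all.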