Google Open-Sources Fully Homomorphic Encryption (FHE) Toolkit (therecord.media) 78
Google has open-sourced a collection of C++ libraries for implementing Fully Homomorphic Encryption (FHE) in modern applications. From a report: Fully homomorphic encryption, or simply homomorphic encryption, is a form of data encryption that allows users/applications to perform mathematical computations on encrypted data without decrypting it first, keeping the data's privacy intact. Although homomorphic encryption was first described at a theoretical level in 1978 and first implemented in practice in 2009, it has not been broadly adopted in software because of its complexity, its advanced cryptographic techniques, and a lack of open-source code and public documentation. Despite this, FHE is a hot technology in software design today.
FHE allows software vendors to work on encrypted data without sharing the encryption/decryption keys with untrustworthy systems such as client-side apps or publicly-hosted web servers, where the keys could be stolen or intercepted by malware or malicious human operators. FHE allows developers to keep data secure, encrypted, and private, all at the same time, and Google hopes that developers will use its FHE libraries as the first step into adopting this new type of encryption technology within their applications.
Re: (Score:1)
You can run those if you want to, but it's just more processing overhead. It's generally recommended to disable that option in your copy, but I realize everyone's situation is different.
Re:not quite (Score:5, Interesting)
Healthcare might be the killer app for this. Companies would love to have access to masses of healthcare data to do analysis on, but privacy is a barrier. If data could be supplied in an encrypted state, not just anonymous but actually encrypted, it might open up a lot of new opportunities for research and services.
Re: (Score:2)
Healthcare is one possible application for this type of encryption.
So it is an "app", perhaps just not in the way you were expecting.
Re:not quite (Score:4, Insightful)
The complication with FHE is that if you don't know what the data is you also don't know (for sure) what the operations you perform are, nor what the results are. FHE enables a company with "protected health information" (PHI) to outsource FHE processing of that data, but it wouldn't give them PHI in the first place, and it wouldn't let the contracted data processors know what they were doing.
What you're probably looking for is generally called "privacy-preserving queries" of a database. The simplest reliable way to do that involves adding noise to many or most aggregate results, depending on how many records go into that aggregate and the population statistics of the underlying records, which is still (a) hard to get truly right in a privacy sense and (b) annoying to the database users.
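One widely used formalization of the "add noise to aggregates" idea is differential privacy. The following is a minimal sketch under stated assumptions, not something taken from the comment above: the helper names `laplace_noise` and `noisy_count`, the `epsilon` parameter, and the sample records are all hypothetical. It adds Laplace noise to a count query, whose true result changes by at most 1 when a single record is added or removed.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(records, predicate, epsilon, rng=None):
    # Hypothetical helper: count the matching records, then add Laplace noise.
    # A count has sensitivity 1, so the noise scale is 1 / epsilon.
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Made-up records, for illustration only.
patients = [{"age": 34}, {"age": 51}, {"age": 47}, {"age": 29}, {"age": 62}]
print(noisy_count(patients, lambda p: p["age"] > 40, epsilon=0.5))
```

A smaller `epsilon` means more noise and stronger privacy, which is where the annoyance for database users comes from.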
And some examples (Score:5, Interesting)
The complication with FHE is that if you don't know what the data is you also don't know (for sure) what the operations you perform are, nor what the results are.
And there are numerous examples of this.
AI uses big batches of data everywhere, and you can get some interesting sources of data from Kaggle and online corpora. When you examine the data, you find many individual data fields are invalid for some reason or another - data is missing, data is typo'ed, fields truncated and so on.
Suppose you want to compute average patient age using FHE: if there is a single missing data element represented by 9999 (or -1, or 0), the average will be somewhat off. If your average is then part of the cross-correlation matrix calculation, then your matrix will be somewhat off. And if the CCM is used to calculate a multivariate regression, then the results of *that* will be off. Depending on the scope of the errors, it can be impossible to detect that the end result is invalid.
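To make the effect concrete, here is a small plaintext sketch with made-up ages (the list and the `MISSING` sentinel are assumptions for illustration). On plaintext the bad value is easy to filter out; the point above is that a processor working only on ciphertexts cannot look at the values to spot it.

```python
# Made-up ages; 9999 is a sentinel standing in for a missing value.
ages = [34, 51, 47, 29, 9999]
MISSING = 9999

# Averaging everything, sentinel included.
naive_average = sum(ages) / len(ages)

# Averaging only the valid entries, which requires seeing the values.
valid = [a for a in ages if a != MISSING]
clean_average = sum(valid) / len(valid)

print(naive_average)  # 2032.0
print(clean_average)  # 40.25
```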
If you make a mistake in a computer program, it crashes. If you make a mistake in a data mining operation, you get noise. And there's no way to distinguish noise in the data from noise in your process.
The first step in any data mining operation should be to check the validity of the data. I don't think this can be done with FHE.
Re: (Score:2)
The onus of data validation would have to be on the data sources. Those teams or organizations who ingest ciphertext will need to have "integrity" trust in the data sources, rather than data source having "confidentiality" trust in the processor.
Re: (Score:2)
Re: (Score:2)
It could BE the owner of the data running the query. Every time you decrypt the data there is a chance of exposure. Doing operations while the data is encrypted improves your own security.
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
FHE enables a company with "protected health information" (PHI) to outsource FHE processing of that data
This.
FHE is pretty much a pre-requisite to run calculations on actually sensitive(*) data in the cloud.
(*) I mean, when it's taken seriously. Lots of companies deal with sensitive data in a way that should land them in jail, but we all know that it won't happen because it's ordinary people whose data is going to be lost, and who cares about them?
Re: (Score:2)
This is too weird, how on earth do you perform mathematics on numbers when you don't know what those numbers are? I thought Par files were mad, this is total computer voodoo.
Re: (Score:2)
This is a gross oversimplification, but the idea is that, for at least certain defined operations, performing the operation (or some corresponding algorithmic transform) on two ciphertexts and then decrypting gives the same result as performing the operation on the plaintexts.
a + b = c
dec(SomeOperation(enc(a), enc(b))) = c
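One concrete scheme with this property for addition is Paillier encryption. The sketch below is a toy version with tiny hard-coded primes, purely to show `dec(SomeOperation(enc(a), enc(b))) = a + b` in action. It is only additively homomorphic (not fully homomorphic), it is not secure at this key size, and it is not a claim about how Google's libraries work; the names `enc`, `dec`, and `some_operation` simply mirror the notation above.

```python
import math
import random

# Toy parameters: these primes are far too small for real use.
P, Q = 61, 53
N = P * Q
N2 = N * N
G = N + 1                    # a common simple choice of generator
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)         # modular inverse; needs Python 3.8+

def enc(m, rng=random):
    # Pick a random r coprime to N so equal plaintexts encrypt differently.
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2

def dec(c):
    # (x - 1) // N recovers m * PHI mod N; multiplying by MU removes PHI.
    return ((pow(c, PHI, N2) - 1) // N * MU) % N

def some_operation(c1, c2):
    # Multiplying ciphertexts corresponds to adding the plaintexts (mod N).
    return (c1 * c2) % N2

a, b = 5, 10
print(dec(some_operation(enc(a), enc(b))))  # 15
```

Note that the operation on ciphertexts (multiplication mod N squared) is not the same as the operation it induces on plaintexts (addition mod N).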
Re: not quite (Score:5, Informative)
Re: (Score:2)
Even with the explanations by DarkOx and mcelrath I'm still quite confused as to how encrypted data can have operations performed on it and the result can be decrypted correctly without the data being decrypted for the actual operation and the result being re-encrypted.
Admittedly I'm very weak on math... but it seems to me that the types of encryption that can be used and the types of operations that can be performed would need to be severely limited OR the encryption isn't really encryption and is more li
Re: not quite (Score:5, Informative)
It's pretty easy to demonstrate with the venerable (and super secure) rot13 algorithm, or even with one time pads.
Suppose I've got the numbers 5 and 10 and I want to send them to you to add together. I generate the pads 12838 and 83850, giving me the encrypted values 12843 and 83860, which I send to you.
You add them together, getting 96703. I can then decrypt that value by subtracting 12838 + 83850, giving 15.
As DarkOx said, that's a gross simplification, but this toy example shows the concept that, by constructing your encrypting scheme carefully you can set things up so computations done on the encrypted values correspond to computations (usually not the same ones) on the plain text ones.
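The same toy example as a short sketch, using the numbers from above. The function names are made up for illustration, and this is not a secure scheme: a real one-time pad would at least work modulo a fixed bound so the ciphertext does not leak the size of the value.

```python
def encrypt(value, pad):
    # "Encrypt" by adding a secret pad to the value.
    return value + pad

def decrypt_sum(encrypted_sum, pads):
    # Only the holder of the pads can strip them from the sum.
    return encrypted_sum - sum(pads)

# Key holder: pads and plaintexts from the example above.
pads = [12838, 83850]
ciphertexts = [encrypt(5, pads[0]), encrypt(10, pads[1])]  # [12843, 83860]

# Untrusted party: adds the ciphertexts without knowing 5, 10, or the pads.
encrypted_sum = sum(ciphertexts)  # 96703

# Key holder again: subtract the combined pads to recover the real sum.
print(decrypt_sum(encrypted_sum, pads))  # 15
```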
Re: (Score:2)
A bit more technically. A homomorphism h from algebraic structure (A, +) to alg. struct. (B, +) must satisfy
for all x, y in A h(x + y) = h(x) + h(y) where the first + is from (A, +) and the second from (B, +)
Now think of h as being an encrypter, it has an inverse g. (I'll suppress the keys, they are parametric anyhow, think of h and g being equipped appropriately).
So I want g(h( x + y )) but you must do the addition. You get h(x) and h(y) and you return h(x) + h(y).
We know that g a
Re: (Score:3)
OK, that certainly clarifies it, though with the simple example it seems that this is fairly useless. Why offload the simple operation when you have to do the encryption/decryption and then the same basic operation to handle the results?
Is it possible to do calculations complex enough to counterbalance the overhead involved?
Again, math dummy but it seems anything complex enough to make it worthwhile would fail, eg "here's an array of 100001 numbers where the index is the X and the value is the Y of a poi
Re: not quite (Score:5, Informative)
The lure of homomorphic encryption is that you can hand over your encrypted data to someone else and they do something with your data and give you the result. This way they do not get to know your data and you do not get to know their algorithm. It should be blindingly obvious why a company like Google wants this.
Re: (Score:2)
Re: (Score:2)
Yes, that's the idea. My simple example is more to demonstrate that it's possible, rather than how it's actually done. The guys with the algebra are describing the actual operation. If you set up your encryption system appropriately you can enable more complicated operations and it's more akin to encrypting all the input with the same key, then decrypting the result with that same key (as opposed to the OTP I used). So theoretically I could give you a fairly small, encrypted input, you could do some very c
Re: not quite (Score:2)
As a parent poster said, I also am not so good at math. Do you lose strength of encryption with FHE data?
Re: (Score:2)
That depends on what you mean by strength of encryption. As far as I know there's no theoretical reason why an FHE encryption scheme can't protect your data from being decrypted just as well as another type of encryption. Non-homomorphic encryption does protect your data from modification though, and homomorphic of course doesn't.
If you send me the number 10 via regular symmetric or public key encryption, if somebody messes with it in transit the result will be incomprehensible to both of us. If that same n
Re: (Score:2)
This is too weird, how on earth do you perform mathematics on numbers when you don't know what those numbers are? I thought Par files were mad, this is total computer voodoo.
Not voodoo. Just massively slow with no benefit that can be simply expressed.
Try this for an example : https://web.mit.edu/sonka89/ww... [mit.edu]
Your algorithm is a circuit of gates (no feedback - it's combinational). Each gate is implemented with a bunch of encryption and KDF functions (hence the massive slowdown). Your data size on the wire also gets a lot larger.
Re: (Score:2)
IBM open sourced their toolkit already, maybe there is something to it.
Re: (Score:2)
This is too weird, how on earth do you perform mathematics on numbers when you don't know what those numbers are?
The number of pairs that add to a given number is infinite - given the number "10", you have 1+9, 2+8, 3+7, 4+6, 5+5, 6+4, and so on - plus negative numbers (which means even if we're limited to whole numbers, our potential set is now infinite), plus decimals on top of that.
Even zero, arguably the simplest result, has an infinite number of pairings - 1 + -1, 2 + -2, and so on.
Because of the transitive property of addition, it's trivial for me to come up with "encoded" number pairs, for instance
Re: (Score:2)
not really. this scheme is good for validation but not really for research. you may perform consistent operations on the data, which is a lot, but nothing more. for healthcare applications it is meaningless that e.g. the exponentiation of all the blood pressure values in a dataset is consistent with its encrypted counterpart, they would need the actual values.
this could be used to anonymize part of the dataset but probably overkill as there are far simpler methods to do that.
Re: (Score:1)
The word sounds to me like something that modifies someone's homosexuality (i.e. some kind of gay conversion therapy).
Re: (Score:1)
Re: (Score:3)
Somebody added 'sexuality to this conversation where it didn't exist
Is it tough to walk down the milk aisle?
Re: (Score:2)
The word sounds to me like something that modifies someone's homosexuality (i.e. some kind of gay conversion therapy).
I imagine a Red/Southern state will try to ban Homomorphic Encryption sooner or later ... :-)
Hey . . (Score:5, Funny)
Re: (Score:1)
You may be joking, but it would make sense to be homomorphic encryption phobic. This essentially means that malware can perform fully encrypted computation on your system using your CPU time. There is no way to know what data is being processed or what type of processing is being applied. This instantly brings to mind a bittorrent-esque malware network that is entirely impossible to identify or protect against; all blended in with an uncountable number of secret processes executed on encrypted data.
Such pr
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
OK. That seems fair.
As an aside regarding reverse engineering and why it's more necessary than it used to be:
I've been doing software development for a long time. In the past, we used to write, test, and deploy our own code, using libraries only when truly appropriate and only when fully understood, since we were ultimately responsible for the result. Lately, however, with the rise of Framework Oriented Programming, we basically glue other people's code together until it sort of works, then hope that bit
It is also dead slow (Score:5, Interesting)
The main problem remaining is that this is dead slow. That makes it more something of theoretical interest, because any data transformations protected this way will need to do more than just a few simple steps to be of any value and then things will take forever.
Don't get me wrong, FHE is a great and fundamental theoretical result. But its practical applications are rather limited and usually get vastly overstated.
Re: (Score:2)
Yes, but I imagine the idea is to use dedicated hardware to speed things up. There will always be a penalty, just a lesser one.
Re: (Score:2)
Yes, but I imagine the idea is to use dedicated hardware to speed things up. There will always be a penalty, just a lesser one.
I know that. The penalty will always be extreme here though.
Re: (Score:2)
Can someone explain this screenshot from the git repo's examples,
where the operation is a simple CapitalizeString(),
and the "Total Time" is 0.7 seconds (capitalizing just 30 characters or so!)
but the "CPU Time" is 46 seconds.
https://raw.githubusercontent.... [githubusercontent.com]
Re: (Score:3)
Explain what? Yes, it is slow, nobody is claiming otherwise. As for CPU time vs total time, they parallelized the operation over a bunch of CPUs. 64 CPUs working for .7 seconds would be about 45 seconds of CPU time.
Re: (Score:2)
This also gives a nice example of the performance. Capitalizing 30 chars on a modern CPU should be in the area of less than 1us. This takes around 50'000'000 times longer.
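The figures quoted in this subthread can be checked with a few lines of arithmetic. The inputs below are the numbers from the comments (46 s CPU time, 0.7 s total time, and the rough 1 us plaintext estimate); the variable names are just for illustration.

```python
import math

cpu_seconds = 46.0      # "CPU Time" from the screenshot
wall_seconds = 0.7      # "Total Time" from the screenshot
plain_seconds = 1e-6    # the "less than 1us" estimate above

# How many cores must have been busy to fit 46 CPU-seconds into 0.7 s.
implied_cores = cpu_seconds / wall_seconds

# How much more CPU time the encrypted version used than the estimate.
slowdown = cpu_seconds / plain_seconds

print(round(implied_cores))   # 66
print(f"{slowdown:.1e}")      # 4.6e+07
print(math.log2(slowdown))    # the same slowdown expressed in doublings
```

So "around 50'000'000 times" matches the quoted numbers, and roughly 64 cores is consistent with the two timings.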
Re: (Score:2)
There are some applications where you're dealing with highly sensitive data, and throwing more CPU power at the problem to do it like this is cheaper than setting up an entire dedicated system with hardening, strict auditing, secure interfaces etc.
And, you know, a few CPU generations down the line, what seems theoretical today is merely a bit expensive then.
Re: (Score:2)
There are some applications where you're dealing with highly sensitive data, and throwing more CPU power at the problem to do it like this is cheaper than setting up an entire dedicated system with hardening, strict auditing, secure interfaces etc.
Unlikely. Because the amount of data you can realistically process is tiny.
And, you know, a few CPU generations down the line, what seems theoretical today is merely a bit expensive then.
Nope. The mathematics does not allow that. Also, CPUs basically hit a wall performance-wise about 5-10 years ago.
Re: (Score:2)
I think you don't appreciate how slow this is.
It seems the computers would need to be around 2^30 times as fast as they are now to do the string capitalization task as fast as current computers can do locally. So, assuming Moore's Law holds (this task seems to be suitable for parallelization), that's around half a century.
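The "half a century" figure follows from counting doublings. This sketch takes the 2^30 factor from the comment as given; the 18 and 24 month doubling periods are assumptions, since the comment does not say which one it uses, and `years_to_speedup` is a made-up helper name.

```python
import math

def years_to_speedup(factor, months_per_doubling):
    # Doublings needed to reach `factor`, times the length of one doubling.
    doublings = math.log2(factor)
    return doublings * months_per_doubling / 12

speedup_needed = 2 ** 30   # figure from the comment above

for months in (18, 24):    # assumed doubling periods
    print(months, years_to_speedup(speedup_needed, months))
# 18 months per doubling gives 45.0 years; 24 months gives 60.0 years
```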
Code Link (Score:5, Informative)
Re:Not "encrypted" if you can glean information (Score:4, Informative)
You can't get any information about the contents, unless you have the encryption keys. You can execute computations, but you have no idea what the result means, unless you have the encryption keys.
Magical technology (Score:1)
TFA should have just linked to this:
https://github.com/google/full... [github.com]
Personally I don't have a need for this and don't see much of a point in running code on systems you don't trust. Add to that the fact that the security of these systems relies on the untrusted server faithfully executing the given code, and the value prop gets a little weird.
Nonetheless the idea and concepts are interesting themselves and transpiler makes it easier to use than I assumed it would be. Sum/calc examples in github literally just define functions that
Not how it's being sold (Score:2)
I've had several training classes from seemingly highly competent individuals claiming Homomorphic Encryption can be used to park your data in the cloud and allow you to use their tools to search, index, etc to optimize your data yet keep the cloud provider from being able to see the data.
I've argued (and from this description, rightfully so) that if their tools can read your data, they can read your data, extract it, provide it to governments, or anything else they want.
They are trying to sell this tech to
Re:Not how its being sold (Score:5, Informative)
SELECT MAX(salary) FROM employees
with your data and you get the result and decrypt it on your end.
Re: (Score:3)
It should be easy for you to prove your assertion, so do it. Otherwise, it just sounds like 'I don't understand it so it must be wrong'.
What you use it for (Score:3)
I think the first use for this will be password and other login information storage. I will keep your credit card info, username and password in the cloud, and my cloud provider won't know whose information I have stored, but I can query this database for, say, users whose accounts will expire next month.
Re: (Score:2)
Re: (Score:2)
No, the point of HME is to perform computations on database records without decrypting the data.
Let's say you have a database of employees. Naturally, it's encrypted because ransomware and
Re: (Score:2)
Google's is a late entry? (Score:4, Informative)
IBM released their FHE code first https://www.ibm.com/security/s... [ibm.com]
https://developer.ibm.com/solu... [ibm.com]
Reported here on Slashdot as well
https://developers.slashdot.or... [slashdot.org]
Microsoft too (Score:3)
https://github.com/Microsoft/S... [github.com]
Homomorphic encryption, you say? (Score:2)
Does that mean using it will turn me gay? Asking for a friend.
Really? Google thinks this is ok? (Score:2)
"google opensources" (Score:2)