I've had this thought for some time and now I think I can express it coherently. Suppose we have some computer code in an untrusted environment. Specifically, the caller wants to trust a library, but doesn't trust the processor or operating system. It's quite challenging to work in such a hostile host environment. We might try to compute checksums of the code we are about to call, but we can't trust that the host environment won't fool us into believing that everything is okay, when it isn't. Moreover, it is trivial for the host environment to lift sensitive data out of our program as it runs.
Here's my idea: we need to transform our library function in such a way that it cannot be tampered with. That is a tall order. Given a function f with a type α → β, we need to compute T[f] with the type pub(α) → priv(β), where pub and priv are public and private encrypted forms. Effectively, the function must take encrypted input and produced signed output using some agreed upon key.
The magic must be inside f though. Obviously, it would be simple to simple decrypt the input, compute the result, and then sign the output. However, this would be abusable by our untrusted host environment. What we need to do is write a program that is equivalent to that, but does not actually do the decryption. What is needed is a way to transform our computation to work in the transformed cryptographic space. That is, create a homomorphism between normal instructions and instructions in the encrypted space. This way, nothing private would be exposed to the host environment. If the host environment attempts to tamper with the code, the results will not be correctly signed.
I suspect that the transformation would sufficiently destroy the information in the computation, such that, without knowing the homomorphism, it would be impossible to even tell what the code did. It would be very expensive to determine the homomorphism, if it is at all possible. If discrete math wasn't so mind-destroying, I should like to have been a cryptographer.
|
2009-12-26T17:42:15-05:00 |
|
Eddie Ma says on 2009-12-26T20:33:50-05:00:
Andre, I've had two similar thoughts─First, for two artificial neural networks having the same topology and number of processing nodes in each layer: if their initial small random weight values are different, then training the two networks to convergence on the same suite of the same patterns yields a homomorphic pair.
Second, for a vector-to-vector mapping device such as a neural network; if a suite of input patterns and their responses are trained on a network AND a homomorphic suite of such data are trained on a different network, if the semantics encased in the second suite of vectors is lossy, then the second network must necessarily not be able to handle the same domain of input.
For the second thought above, I was actually curious to know if a neural grammar network would operate as well on InChI keys (hashed InChIs) as well as they do on InChIs; where an InChI is a formal cheminformatics string.
The lossiness prevented me from running a suite of trials in the limited time I had.
Your thoughts are interesting though: Be sure to actually do an equivalent operation on equivalent data... I am unsure that after doing such a thing actually yields privacy though... One argues that the information content (the vector semantics) are all present.
Andre Masella says on 2009-12-26T22:12:00-04:00:
In the case of the InChI data, the homomorpism can be public. In the case the cryptographic version, the homomorpism is not public. In the case of the lossiness of the homomorpishm, the neural network is not necessarily a true homomorpism since it is trained to be like it. This is true for any kind of machine learning. The homomorpism I would required would have to be mathematically constructred, rather than inferred from the data. Remember that I am talking about transforming functions rather than data though.