Running Deep Learning on EVM

kladkogex · April 20, 2018, 1:28pm

Does anyone here know how to run TensorFlow in a reproducible fashion on different NVIDIA cards?

We are adding TensorFlow as a set of pre-compiled smart contracts. For the software version of TensorFlow we can make it reproducible by fixing the seeds of the pseudo-random generator and then compiling with the -msse2 flag of gcc …

MaxC · April 20, 2018, 9:50pm

Have no idea if it’s possible to do with tensorflow, but might be possible with keras using the theano backend?

ssarkar · April 26, 2018, 8:50pm

“I am not saying EVM is a perfect place to run neural networks, on the other hand making it some kind of a simple extension to EVM/Solidity would draw many developers. Another possibility is to run a totally different thing and then feed the results into Ethereum somehow …”

Is it possible to enhance the EVM/Solidity APIs to make external calls to a deep NN and an RDBMS?
The data from one node in the blockchain would be transformed by the DNN / RDBMS and sent to other nodes.
Is this possible? How hard would it be to implement such a thing?

kladkogex · May 9, 2018, 9:10pm

That's what we do on our system. The NN needs to be executed in a decentralized way though - it can't be a centralized server.

You do not need to run the NN on all nodes of the chain though. A subset is enough if all agree, and then if there is a single disagreement you can run it on a larger set of nodes, and then punish the party that made an incorrect calculation.
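The subset-then-escalate idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` class, the `verify_with_escalation` function, the committee sizes, and the majority rule applied after escalation are all hypothetical and do not describe any deployed protocol.

```python
import random
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical network node; `compute` stands in for running the NN."""
    name: str
    compute: Callable[[int], int]


def verify_with_escalation(
    nodes: List[Node], x: int, small: int, large: int, rng: random.Random
) -> Tuple[int, List[str]]:
    """Return (accepted result, names of nodes to punish)."""
    committee = rng.sample(nodes, small)
    results = {n.name: n.compute(x) for n in committee}
    if len(set(results.values())) == 1:
        # Unanimous small committee: accept without further work.
        return next(iter(results.values())), []

    # Any disagreement: escalate to a larger random committee.
    for n in rng.sample(nodes, large):
        results.setdefault(n.name, n.compute(x))

    # Assumed rule: the majority answer wins, everyone else is punished.
    winner, _ = Counter(results.values()).most_common(1)[0]
    punished = sorted(name for name, r in results.items() if r != winner)
    return winner, punished


honest = lambda x: x * x
cheat = lambda x: x * x + 1
nodes = [Node(f"honest{i}", honest) for i in range(9)] + [Node("cheater", cheat)]
print(verify_with_escalation(nodes, 7, small=3, large=10, rng=random.Random(0)))
```

Note that a cheater who is not drawn into the small committee goes unnoticed in this sketch, which is fine for correctness of the accepted result but means punishment only happens when a wrong answer is actually submitted.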

For instance, in a network where 1/3 of the nodes are Byzantine, it is enough to run the NN on 48 randomly picked nodes, since the probability of all of them being Byzantine is (1/3)^{48}, roughly 10^{-23}, which is a very small number.
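The committee-size estimate can be checked directly. The sketch below assumes each pick is Byzantine independently with probability 1/3, and, as a variant, that 48 distinct nodes are drawn from a finite network. The function names and the network size of 300 are made up for illustration.

```python
from fractions import Fraction
from math import comb


def p_all_byzantine_iid(fraction_bad: Fraction, committee: int) -> Fraction:
    """Each pick is Byzantine independently with probability `fraction_bad`."""
    return fraction_bad ** committee


def p_all_byzantine_finite(total: int, bad: int, committee: int) -> Fraction:
    """Draw `committee` distinct nodes out of `total`, of which `bad` are Byzantine."""
    return Fraction(comb(bad, committee), comb(total, committee))


p_iid = p_all_byzantine_iid(Fraction(1, 3), 48)
p_finite = p_all_byzantine_finite(total=300, bad=100, committee=48)

print(float(p_iid))     # on the order of 1e-23
print(float(p_finite))  # smaller still, since picks are without replacement
```

Sampling without replacement only lowers the probability, so (1/3)^48 serves as an upper bound for any network in which at most a third of the nodes are Byzantine.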

fgadaleta · August 16, 2018, 9:05am

Trusting the prover who must know the weights (in order to generate a proof) is relatively acceptable. What I think is less realistic though is disclosing the inputs to feed the “private” neural network.
I don’t see that happening on a regular basis, except in the quite specific use case you mention.

kladkogex · August 16, 2018, 11:24am

As discussed in this thread, a neural network stored in a smart contract could potentially be fooled by purposefully constructed malicious data (although supplying such data may not be feasible: for instance, if the connection between a camera and the blockchain is secure, the image cannot be modified by the attacker).

A question is whether malicious data could be filtered out by applying a set of multiple networks with potentially different parameters.
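One way to picture the multi-network idea is a filter that rejects any input on which the networks disagree. The sketch below uses toy linear classifiers as stand-ins for trained networks; `make_linear_classifier`, `filtered_prediction`, and all weights are hypothetical. Agreement does not prove an input is benign, since adversarial inputs are known to sometimes transfer between models.

```python
from typing import Callable, List, Optional, Sequence

Classifier = Callable[[Sequence[float]], int]


def make_linear_classifier(weights: Sequence[float], bias: float) -> Classifier:
    """Toy stand-in for a trained network: thresholded weighted sum."""
    def classify(x: Sequence[float]) -> int:
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1 if score > 0 else 0
    return classify


def filtered_prediction(models: List[Classifier], x: Sequence[float]) -> Optional[int]:
    """Return the common label, or None if the models disagree (input rejected)."""
    labels = {m(x) for m in models}
    return labels.pop() if len(labels) == 1 else None


# Three models with deliberately different (made-up) parameters.
models = [
    make_linear_classifier([1.0, 1.0], -1.0),
    make_linear_classifier([1.2, 0.8], -1.0),
    make_linear_classifier([0.8, 1.2], -1.0),
]

print(filtered_prediction(models, [2.0, 2.0]))  # 1: all models agree
print(filtered_prediction(models, [1.1, 0.0]))  # None: models disagree near the boundary
```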

Another interesting possibility is to generate lots of malicious data samples and then to have a dedicated network that detects malicious data samples

I think what we know is that a human brain cannot be easily fooled. One cannot create a picture of a dog that looks like a cat. There is some “anti-fooling” mechanism in the human brain.

MaxC · August 16, 2018, 11:20pm

kladkogex wrote: “I think what we know that a human brain can not be easily fooled. One can not create a picture of a dog that looks like a cat. There is some “anti-fooling” mechanism in the human brain”

^^

kladkogex · August 17, 2018, 7:44am

I retract )))

mratsim · August 17, 2018, 9:37am

As a data scientist and someone who implemented a deep learning framework and a significant part of the Nimbus EVM from scratch, I think this should be done off-chain.

In addition to @chriseth's zkSNARKs proof-of-computation link, I like the runnable examples from Snarky, a verifiable computation library.

Now a few comments about the various ideas given here:

1. Dependencies

A dependency on Keras, Theano, TensorFlow, or Torch/PyTorch is undesirable.

Keras is Python which brings dependency/virtualenv hell
Theano is Python and not supported anymore
Tensorflow is a huge beast, completely unauditable

There are several inference-only frameworks for embedded that are more suitable as a lightweight dependency:

uTensor (by an ARM core dev): uTensor/uTensor on GitHub, a TinyML inference library for microcontrollers with 256k RAM
TinyDNN: tiny-dnn/tiny-dnn on GitHub, a header-only, dependency-free deep learning framework in C++14
ARM NN by ARM: ARM-software/armnn on GitHub (a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn)
NCNN by Tencent: Tencent/ncnn on GitHub, a high-performance neural network inference framework optimized for mobile platforms
Mobile Deep Learning by Baidu: PaddlePaddle/Paddle-Lite on GitHub, a high-performance deep learning inference engine for mobile and edge

And less lightweight but very often used in embedded:

Darknet (works on drones and Tegra): pjreddie/darknet on GitHub, convolutional neural networks
OpenCV

2. Determinism

Multithreaded floating-point math is not deterministic. A CPU or GPU thread can be paused for any reason (OS deprioritizing, hardware throttling due to high temperature, a neutrino …), and then instead of ((a + b) + c) you get ((a + c) + b). Floating-point addition is not associative, so the rounding is different and the result is different.

Fixing random seeds in NumPy and TensorFlow/Theano/CUDA is not enough.
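A small illustration of the ordering problem, independent of any framework: the same three numbers summed in two different orders give different double-precision results. The values and the `sum_in_order` helper are chosen only for demonstration. `math.fsum` is shown as one order-independent alternative, not as something the frameworks above use.

```python
import math
from typing import Iterable, Sequence


def sum_in_order(values: Sequence[float], order: Iterable[int]) -> float:
    """Add values one by one in the given index order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


values = [1e16, 1.0, -1e16]            # a, b, c

abc = sum_in_order(values, [0, 1, 2])  # (a + b) + c: the 1.0 is rounded away
acb = sum_in_order(values, [0, 2, 1])  # (a + c) + b: the 1.0 survives

print(abc, acb)           # 0.0 1.0
print(math.fsum(values))  # 1.0, regardless of the order of `values`
```

No random seed is involved anywhere here, which is why seeding alone cannot make a parallel reduction reproducible.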

kladkogex · August 17, 2018, 12:14pm

Interesting - maybe we need to run it single-threaded then. Prediction using a pre-trained network is not such a computationally hard operation …
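A minimal sketch of what single-threaded prediction with a fixed evaluation order could look like, with the output hashed so that nodes can compare results bit for bit. Everything here is hypothetical (the layer shapes, weights, and function names), and bit-identical results across machines are only an assumption that holds for IEEE 754 double arithmetic without fused multiply-add or extended-precision intermediates. Only ReLU is used, because functions such as exp or tanh may round differently across math libraries.

```python
import hashlib
import struct
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, List[float]]  # (weights, bias)


def dense_relu(x: Sequence[float], weights: Matrix, bias: Sequence[float]) -> List[float]:
    """One dense layer with ReLU, accumulating strictly left to right."""
    out = []
    for row, b in zip(weights, bias):
        acc = b
        for w, xi in zip(row, x):
            acc += w * xi  # fixed order, no parallel reduction
        out.append(acc if acc > 0.0 else 0.0)
    return out


def predict(x: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Run the input through every layer in sequence."""
    for weights, bias in layers:
        x = dense_relu(x, weights, bias)
    return list(x)


def output_digest(y: Sequence[float]) -> str:
    """Hash the exact 64-bit patterns of the outputs (little-endian doubles)."""
    raw = struct.pack(f"<{len(y)}d", *y)
    return hashlib.sha256(raw).hexdigest()


# Made-up pre-trained parameters for a 2-2-1 network.
layers = [
    ([[0.5, -1.0], [1.0, 1.0]], [0.0, 0.5]),
    ([[1.0, 1.0]], [0.25]),
]

y = predict([2.0, 1.0], layers)
print(y)  # [3.75]
print(output_digest(y) == output_digest(predict([2.0, 1.0], layers)))  # True
```

Comparing digests rather than raw floats means two nodes either agree exactly or visibly disagree, which fits the "rerun on a larger set on any disagreement" approach discussed earlier in the thread.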

kladkogex · September 25, 2018, 7:51am

I am trying to form an Ethereum Magicians ring to study AI on EVM.

If any of you folks are interested, please register here