
TensorFlow library for adding FPGA-based layers

Components

examples/ Library usage examples
hostLib/ Python wrapper module
- layers/ Layer definitions
c++/ TensorFlow custom operator library
- lib/mlfpga/ FPGA data transfer library
- 2D convolution of one channel
- 2D convolution with activation
- 2D convolution with activation and fixed output channels
- 2D convolution and MaxPooling
- multiple 2D convolutions with MaxPooling
vhdl-modules VHDL implementation in a separate repository

Usage

import tensorflow as tf
from tensorflow.keras import models
from hostLib.layers.conv2d import Conv2D as Conv2DFPGA

model = models.Sequential()
model.add(Conv2DFPGA(1))

Installation

clone repository and init submodules

git clone <this url>
cd ./tf-fpga
git submodule init
git submodule update

install dependencies (on Ubuntu Linux for example)

sudo apt update
sudo apt upgrade -y
sudo apt autoremove
sudo apt install python3 python3-pip
sudo python3 -m pip install --upgrade pip # update pip globally
python3 -m pip install tensorflow

install C++ compiler
```
sudo apt install g++
```

compile operator and FPGA libraries

cd ./c++
./configure
make

> /usr/bin/g++ ... -o build/dummyBigOp.o src/dummyBigOp.cpp
> ...
> /usr/bin/g++ ... -o build/op_lib.so ...

update config.json with your FPGA addresses defined in the VHDL design

{"fpgas": [
  {
    "ip":   "192.168.1.33",
    "port": 1234
  },
  {
    "ip":   "192.168.1.34",
    "port": 1234
  },
  {
    "ip":   "192.168.1.35",
    "port": 1234
  }
]}
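A malformed entry in config.json is easier to catch before any FPGA traffic is sent. The following is a minimal Python sketch, assuming the file has exactly the shape shown above (a top-level "fpgas" list whose entries carry "ip" and "port"). The helper name parse_fpga_config is hypothetical and not part of this repository.

```python
import json


def parse_fpga_config(text):
    """Parse config.json content and return a list of (ip, port) tuples.

    Hypothetical helper for illustration; the repository's own loader may
    behave differently.
    """
    config = json.loads(text)
    endpoints = []
    for index, entry in enumerate(config.get("fpgas", [])):
        ip = entry.get("ip")
        port = entry.get("port")
        if not isinstance(ip, str) or not ip:
            raise ValueError(f"fpgas[{index}]: 'ip' must be a non-empty string")
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(port, bool) or not isinstance(port, int) or not 0 < port < 65536:
            raise ValueError(f"fpgas[{index}]: 'port' must be an integer in 1..65535")
        endpoints.append((ip, port))
    if not endpoints:
        raise ValueError("config must list at least one FPGA under 'fpgas'")
    return endpoints


example = '{"fpgas": [{"ip": "192.168.1.33", "port": 1234}]}'
print(parse_fpga_config(example))
```

To check a real file, pass its content, for example `parse_fpga_config(open("config.json").read())`.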

Adding new custom layers

For more details on how to contribute to git projects see https://gist.github.com/MarcDiethelm/7303312.

create a computation module in the FPGA implementation

add your FPGA module to the list of modules c++/lib/mlfpga/include/modules.hpp

then the MOD_DEF macro creates these entries automagically:

moduleIds[Module::myNewModule];
moduleNames[Module::myNewModule];
moduleSendPayloadLength[Module::myNewModule];
moduleRecvPayloadLength[Module::myNewModule];

create a TF kernel implementation MyNewOp that inherits from AsyncOpKernel, inside these files:

c++/src/myNewOp.cpp and c++/include/myNewOp.hpp

define the constructor and override the ComputeAsync method:

class MyNewOp : public AsyncOpKernel {
  public:
    explicit MyNewOp(OpKernelConstruction* context);

    void ComputeAsync(OpKernelContext* context, DoneCallback done) override;
};

using your FPGA module

auto worker = connectionManager.createWorker(Module::myNewModule, count);

c++/src/entrypoint.cpp

REGISTER_OP("MyNewOp")
  .Input("input: float")
  .Output("output: float")
  .SetShapeFn([](InferenceContext* c) {
    c->set_output(0, c->input(0));
    return Status::OK();
  });

REGISTER_KERNEL_BUILDER(Name("MyNewOp").Device(DEVICE_CPU), MyNewOp);
//                                  the custom kernel class /\

c++/include/entrypoint.hpp

#include "myNewOp.hpp"

More information on creating custom TF kernels can be found in the TensorFlow guide on creating custom ops.

compile everything
```
cd ./c++
make clean
make
```

append a test for your operator

tests/op_test.py

def testMyNewOp(self):
  with self.session():
    input = [1,2,3]
    result = load_op.op_lib.MyNewOp(input=input)
    self.assertAllEqual(result, input)

add a custom layer that uses the operator

hostLib/layers/myNewLayer.py

class MyNewLayer(layers.Layer):
  ...
  def call(self, inputs):
    return load_op.op_lib.MyNewOp(input=inputs)

add that layer to the python module

hostLib/layers/__init__.py
```
__all__ = ["conv2d", "myNewLayer"]
```

Tests

There are tests for each complexity level of this project.

loopback test without connected FPGAs. This will only succeed for modules that have equal input and output lengths.

compile the UDP echo server and run it in a separate terminal:
```
cd ./c++
make echo
./build/echo
```
edit config.json:
```
{"fpgas": [
  {
    "ip":   "localhost",
    "port": 1234
  }
]}
```
then run any dummy module test:
```
python3 tests/op_test.py
```
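To see why the loopback test only works for modules with equal input and output lengths, it helps to look at what an echo server does: it returns each datagram unchanged, so the reply always has the length of the request. The sketch below is a Python stand-in written for illustration, not the repository's C++ echo server. It binds to an OS-chosen port instead of 1234 to avoid clashing with a running server, and the function names are hypothetical.

```python
import socket
import threading


def serve_echo(sock, count):
    """Echo `count` datagrams back to their senders, then return."""
    for _ in range(count):
        data, sender = sock.recvfrom(65535)
        sock.sendto(data, sender)


def loopback_roundtrip(payload, timeout=2.0):
    """Send `payload` to a local UDP echo server and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    thread = threading.Thread(target=serve_echo, args=(server, 1), daemon=True)
    thread.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    try:
        client.sendto(payload, server.getsockname())
        reply, _ = client.recvfrom(65535)
    finally:
        client.close()
        thread.join(timeout)
        server.close()
    return reply


request = b"\x01\x02\x03\x04"
reply = loopback_roundtrip(request)
# The reply is the request itself, so its length can never differ.
print(len(request), len(reply))
```

A module whose receive payload is shorter or longer than its send payload would therefore never get a reply of the expected size from the echo server.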
FPGA communication test c++/tests/main.cpp
```
cd ./c++
make test
./build/test
```
operator validation test, based on TensorFlow's test suite tests/op_test.py
```
python3 tests/op_test.py
```

Dependencies

C++

libstd
libtensorflow_framework
https://github.com/nlohmann/json
./config.json

Python3

tensorflow
c++/build/op_lib.so

Used in examples:

Pillow
CV2
mss
numpy
IPython
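A quick way to see which of the Python packages above are still missing is to look them up without importing them. This is a small sketch using only the standard library. The import names are assumptions about the listed packages: Pillow is imported as PIL, and "CV2" is taken to mean OpenCV's cv2 module.

```python
import importlib.util

# Import names assumed for the packages listed above.
PYTHON_DEPENDENCIES = ["tensorflow", "PIL", "cv2", "mss", "numpy", "IPython"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("missing:", missing_modules(PYTHON_DEPENDENCIES))
```

An empty list means every package can be located; it does not verify versions or that c++/build/op_lib.so has been built.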
