Convolutional neural networks (CNNs) are now widely used, so it is necessary to implement them efficiently.
To this end, CNNs are most commonly implemented on GPUs and, to a lesser extent, on FPGAs.
In this talk, without going into detail, we will
list some problems that arise when implementing CNN inference, especially on FPGAs.
We will also relate these problems to the CNN models themselves and
highlight a few general recommendations drawn from the following papers.