Accelerate OpenCL applications with Digilent Genesys ZU-3EG Zynq Ultrascale+ MPSoC Platform
Determinism, responsiveness, and performance are the requirements that drive the architecture of most embedded applications at the edge, from autonomous driving to robotics and advanced vision systems.
Xilinx Zynq® UltraScale+™ EG MPSoC devices combine a high-performance ARM®-based multicore, multiprocessing system with ASIC-class programmable logic. The processing system (PS) comes with a 64-bit quad-core ARM® Cortex®-A53 application processor, a 32-bit dual-core Cortex-R5F real-time processor, and a Mali™-400 MP2 graphics processing unit. The ASIC-class programmable logic is highly flexible and built on the Xilinx UltraScale architecture. The programmable logic communicates with the processing system through 6,000 interconnects. The architecture gives engineers the ability to implement solutions that achieve their determinism, responsiveness, and performance targets.
Zynq Ultrascale+ MPSoC EG Device Block Diagram (source: https://www.xilinx.com/content/dam/xilinx/imgs/products/zynq/zynq-eg-block.PNG)
The combination of processing system and programmable logic lets each algorithm and function be implemented in the most appropriate technology. Network communications can be implemented in the processing system, while neural network acceleration can leverage the highly parallel nature of programmable logic. With Xilinx heterogeneous SoC devices, engineers and developers can also move algorithms from the processor into programmable logic for acceleration using high-level programming languages.
The latest Xilinx Vitis unified software platform enables the development of embedded software and accelerated applications on heterogeneous Xilinx platforms including FPGAs, SoCs, and Versal ACAPs. It provides a unified programming model for accelerating Edge, Cloud, and Hybrid computing applications. Developers can leverage integration with high-level frameworks, develop in C, C++, OpenCL, or Python using accelerated libraries, or use RTL-based accelerators and low-level runtime APIs for more fine-grained control over implementation, choosing the level of abstraction they need. This opens the performance of programmable logic to developers without hardware description language (HDL) experience.
Digilent Genesys ZU-3EG is a standalone Zynq Ultrascale+ MPSoC board designed with optimized specs, multimedia, and network connectivity interfaces, with a robust documentation library to quickly get you started on AI, research, aerospace/defense, cloud computing, and embedded vision applications. The board has an excellent mix of on-board peripherals including upgrade-friendly DDR4, Mini PCIe, microSD slots, multi-camera, USB 3.0 and high-speed expansion. Adam Taylor, an embedded design expert, creates the Vitis acceleration platform for Digilent Genesys ZU-3EG Zynq Ultrascale+ MPSoC Platform and demonstrates different elements required to create an acceleration platform.
Genesys ZU-3EG Zynq Ultrascale+ MPSoC Platform
OpenCL is an open, royalty-free standard designed for programming heterogeneous systems. The host is typically an x86-based system, while kernels can run on a range of devices including CPUs, GPUs, FPGAs, and ASICs. The aim of OpenCL is to enable portability across platforms without changing the source code. Host applications are commonly written in C or C++ using the OpenCL application programming interface (API). Kernels are developed in OpenCL C, which is derived from ISO C99 with the necessary restrictions and changes. For example, standard headers such as stdlib.h and stdio.h are not allowed, and scalar types all have a defined size, unlike in C/C++ where sizes depend on the compiler and architecture. This allows developers to build the host code with standard compilers like GCC, while kernels are built with compilers supplied by the device manufacturer.
OpenCL can be used to develop for both the processing system (host) and the programmable logic (kernel) in Xilinx Zynq Ultrascale+ MPSoC. This approach is supported by Vitis, the Xilinx unified software development tool.
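To make the point about fixed-size scalar types concrete, here is a minimal host-side sketch in C++. OpenCL C fixes char, short, int, and long at 8, 16, 32, and 64 bits on every device, so host code that shares buffers with a kernel is safer using fixed-width types than plain `long`, whose size depends on the host compiler. The alias names (`ocl_int` and friends) and the helper `int_buffer_bytes` are hypothetical and used only for illustration; real host code would normally use the `cl_int`, `cl_long`, etc. typedefs from the OpenCL headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical aliases mirroring the sizes OpenCL C guarantees for kernel-side
// scalar types. Real host code would use cl_char, cl_short, cl_int, cl_long.
using ocl_char  = std::int8_t;   // OpenCL C char:  8 bits
using ocl_short = std::int16_t;  // OpenCL C short: 16 bits
using ocl_int   = std::int32_t;  // OpenCL C int:   32 bits
using ocl_long  = std::int64_t;  // OpenCL C long:  64 bits
using ocl_float = float;         // OpenCL C float: 32-bit IEEE 754

// Fail the build if the host 'float' would not match the kernel-side layout.
static_assert(sizeof(ocl_float) == 4, "host float must be 32 bits wide");

// Number of bytes the host must allocate for a buffer of n kernel-side ints.
constexpr std::size_t int_buffer_bytes(std::size_t n) {
    return n * sizeof(ocl_int);
}
```

With these aliases, a buffer of 4096 kernel-side integers always occupies 16384 bytes, regardless of which host compiler builds the application.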
Xilinx tools for Vitis OpenCL acceleration flow on Digilent Genesys ZU-3EG
- Xilinx Vivado – Create a base hardware platform whose configuration makes the necessary resources available to the Vitis compiler.
- Xilinx PetaLinux – Create the PetaLinux operating system, which contains the OpenCL APIs along with support for contiguous memory allocation and direct memory access drivers. PetaLinux is also used to create the sysroot used to support the Vitis acceleration platform.
- Xilinx Vitis – Create the Vitis acceleration platform and the resulting accelerated application.
What is included in the Vitis acceleration platform?
Hardware Base Platform
A base Vivado platform contains the interfaces and processing elements and makes resources available to the Vitis compiler. To use this base hardware design with downstream tools, the XSA, which defines the hardware configuration, needs to be exported before a bit file is created. The Vitis compiler can then generate the necessary bit files for the application.
As a bare minimum, this platform needs to define:
- Processor Configuration – The configuration of the processing system, clocks, available DDR and its configuration, and the multiplexed I/O configuration for the PS peripherals.
- Clocks – A clocking wizard provides several different clocks that can be used by the Vitis compiler. This platform has two clocks, at 150 MHz and 300 MHz.
- Interrupts – A single interrupt is provided to the processing system from an AXI Interrupt controller. The interrupts connected to the AXI Interrupt controller are then made available to the Vitis compiler. There are eight interrupts in this platform.
- Processor Reset blocks – One processor reset block needs to be provided to the Vitis compiler for each of the available clocks.
- PS / PL interface – At least one PS AXI master and one PS AXI slave need to be defined. This allows configuration and control of the created acceleration core, along with high-speed DMA transfer from the PL to the PS if required. This platform has one AXI master and three AXI slave interfaces.
- MIPI camera interface – Videos and images can be output over Mini DisplayPort on the Genesys ZU-3EG from the application software in PetaLinux.
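The minimum requirements listed above can be treated as a checklist. The sketch below expresses that checklist in C++ purely for illustration: the `PlatformSummary` struct, the `check_platform` function, and `genesys_zu_3eg_example` are hypothetical names, not part of Vivado or Vitis, and the count of two reset blocks is an assumption derived from the "one reset block per clock" rule applied to the two clocks.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of what a base platform exposes to the Vitis compiler.
struct PlatformSummary {
    std::vector<double> clocks_mhz;  // clocks offered by the clocking wizard
    std::size_t reset_blocks;        // processor reset blocks
    std::size_t interrupts;          // inputs on the AXI interrupt controller
    std::size_t axi_masters;         // PS AXI master interfaces
    std::size_t axi_slaves;          // PS AXI slave interfaces
};

// Returns a list of problems; an empty list means the minimum is met.
std::vector<std::string> check_platform(const PlatformSummary& p) {
    std::vector<std::string> problems;
    if (p.clocks_mhz.empty())
        problems.push_back("no clock is available to the compiler");
    if (p.reset_blocks != p.clocks_mhz.size())
        problems.push_back("one processor reset block is needed per clock");
    if (p.interrupts == 0)
        problems.push_back("no interrupt is available to the compiler");
    if (p.axi_masters < 1)
        problems.push_back("at least one PS AXI master is required");
    if (p.axi_slaves < 1)
        problems.push_back("at least one PS AXI slave is required");
    return problems;
}

// Values described in this article; reset_blocks = 2 is an assumption.
PlatformSummary genesys_zu_3eg_example() {
    return PlatformSummary{{150.0, 300.0}, 2, 8, 1, 3};
}
```

Calling `check_platform(genesys_zu_3eg_example())` returns an empty list, while removing one reset block would report a single problem.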
Genesys ZU-3EG Hardware Platform in Vitis
The exported XSA can be used to create and configure a new PetaLinux project targeting the Genesys ZU-3EG. Once the project is created and configured, a Vitis platform needs, along with the sysroot, the following elements created by PetaLinux:
- elf – First Stage Boot loader
- ub – The kernel image itself
- elf – The platform management unit firmware
- elf – ARM TrustZone firmware
- U-Boot.elf – Second Stage Bootloader which loads in the kernel image
With these elements available at the end of the build sequence, we are able to start working with Vitis to create an acceleration platform.
Create the First Vitis Acceleration Project
1. Create Vitis Platform
Vitis provides a multi-step wizard to help users create a platform. The wizard is listed on the Vitis Welcome Page. Choose “Create a Platform Project” to create the platform.
Both hardware and software elements are needed for an acceleration platform. The hardware element is defined by the XSA previously exported from Vivado. The software elements are produced during the PetaLinux build and can be identified in the wizard. Once this is completed, the platform can be used in a new acceleration project.
Xilinx Vitis IDE Wizard
Xilinx Vitis Software Elements Definition
2. Creating the First Vitis Acceleration Project
The easiest way to test the acceleration platform is to build one of the existing example applications. Create a new system application targeting the Genesys ZU-3EG platform. This platform should support both embedded development and acceleration.
Walking through the new application project creation wizard will allow you to create the vector addition example application. This vector addition example will contain one OpenCL kernel which is accelerated into the programmable logic.
Within the kernel, several pragmas are used to control loop unrolling and interfacing, optimizing performance for implementation in programmable logic. This accelerated block needs to be able to connect to the AXI interfaces that were made available in the Vivado platform.
Building the project will take a little while. Once the build is complete, Vitis will provide everything that needs to be copied onto the SD card. When the SD card is inserted into the Genesys ZU-3EG, the application can be tested from the command line. Running the application will show the kernel being loaded into the programmable logic and the steps associated with the execution of the program.
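As an illustration of how such pragmas look, here is a minimal sketch of a vector-add kernel written in the Vitis HLS C++ style. It is not the source of the example application, which may be written in OpenCL C and use attributes instead; the kernel name `krnl_vadd`, the bundle names, and the unroll factor of 4 are assumptions. A standard C++ compiler ignores the HLS pragmas (possibly with a warning), so the function can also be exercised on the host.

```cpp
// Sketch of a vector-add kernel in the Vitis HLS C++ style (assumed names).
// 'extern "C"' keeps the symbol name unmangled so the tools can find it.
extern "C" void krnl_vadd(const int* a, const int* b, int* out, int n_elements) {
// Pointer arguments become AXI master ports that reach memory through
// the AXI interfaces exposed by the hardware platform.
#pragma HLS INTERFACE m_axi port = a offset = slave bundle = gmem
#pragma HLS INTERFACE m_axi port = b offset = slave bundle = gmem
#pragma HLS INTERFACE m_axi port = out offset = slave bundle = gmem
// The scalar argument and kernel start/done control use an AXI4-Lite port.
#pragma HLS INTERFACE s_axilite port = n_elements bundle = control
#pragma HLS INTERFACE s_axilite port = return bundle = control

    for (int i = 0; i < n_elements; ++i) {
// Ask the tool to replicate the loop body four times (assumed factor).
#pragma HLS UNROLL factor = 4
        out[i] = a[i] + b[i];
    }
}
```

The interface pragmas decide how the block connects to the platform, while the unroll pragma trades programmable logic resources for parallelism inside the loop.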
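A common last step in an application like this is for the host to compare the accelerator's output with a software reference. The sketch below shows one way to do that in C++; the function names `vadd_reference` and `count_mismatches` are hypothetical, and the code that actually moves data to and from the kernel is deliberately left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Software reference for vector addition, computed on the processor.
// Assumes a and b have the same length.
std::vector<int> vadd_reference(const std::vector<int>& a,
                                const std::vector<int>& b) {
    std::vector<int> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}

// Counts elements that differ between the reference and the values read
// back from the accelerator. Missing or extra elements count as mismatches.
std::size_t count_mismatches(const std::vector<int>& expected,
                             const std::vector<int>& actual) {
    const std::size_t common = std::min(expected.size(), actual.size());
    std::size_t mismatches = std::max(expected.size(), actual.size()) - common;
    for (std::size_t i = 0; i < common; ++i) {
        if (expected[i] != actual[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A host application could report success when `count_mismatches` returns zero for the buffer read back from the programmable logic.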