Rosella       Machine Intelligence & Data Mining

Install OpenCL on Debian, Ubuntu and Armbian for Orange Pi 5 RK3588

This installation should also work for other systems that use Rockchip RK3588/RK3588S such as Radxa Rock 5, OKdo Rock 5, Mixtile Core 3588E and Khadas Edge 2!

For certain applications such as computer vision (CNN) and generative AI (LLM), GPUs can deliver performance that multicore CPUs cannot match. Single board computers are well suited to robotics, where computer vision is both important and very compute intensive. The Orange Pi 5 has a quite decent GPU with 64 shading cores. Object detection is a good example of a computer vision task that benefits from the GPU.

For OpenCL GPU programming guides, please read OpenCL Programming Guides with Example.

If you are using Joshua Riek's Ubuntu, OpenCL 3.0 may already be installed. In this case, you only need to create a symbolic link and copy the CL include files if you wish to compile OpenCL programs, so jump to Create Symbolic Link for libOpenCL.so. If that fails, continue with the steps below.

This page explains how to enable OpenCL on Debian (Bookworm), Armbian and Ubuntu (Jammy) for Orange Pi 5 single board computers. The OS images we tested are the Debian, Ubuntu and Armbian images from the Orange Pi 5 Home.

To enable OpenCL, take the following steps.
Note that part of this information was derived from Setting up OpenCL on RK3588 using libmali. Download the ARM Mali GPU blob from Rockchip's repository and put it into /usr/lib/ as follows. Also install the firmware for the GPU if it is not already installed.

cd /usr/lib
sudo wget https://github.com/JeffyCN/mirrors/raw/libmali/lib/aarch64-linux-gnu/libmali-valhall-g610-g6p0-x11-wayland-gbm.so

cd /lib/firmware
sudo wget https://github.com/JeffyCN/mirrors/raw/libmali/firmware/g610/mali_csffw.bin
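
Since both files are fetched over the network, it can be worth confirming that they arrived intact before going further. The following is a minimal sketch, assuming a shell with head, od and tr available. The name is_elf is a hypothetical helper, not part of any package. It only tests the 4-byte ELF magic at the start of the library; the firmware file is only checked for being non-empty, because its internal format is not described on this page.

```shell
#!/bin/sh
# is_elf: hypothetical helper; succeeds if FILE starts with the
# ELF magic bytes (0x7f 'E' 'L' 'F'), fails otherwise.
is_elf() {
    [ "$(head -c 4 "$1" 2>/dev/null | od -An -c | tr -d ' \n')" = '177ELF' ]
}

# Paths taken from the download commands above.
blob=/usr/lib/libmali-valhall-g610-g6p0-x11-wayland-gbm.so
fw=/lib/firmware/mali_csffw.bin

if is_elf "$blob"; then
    echo "blob looks like an ELF file: $blob"
else
    echo "blob missing or not an ELF file: $blob"
fi

if [ -s "$fw" ]; then
    echo "firmware present: $fw"
else
    echo "firmware missing or empty: $fw"
fi
```

If either check reports a problem, repeating the corresponding wget command is the first thing to try.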

Install the mesa-opencl-icd and clinfo packages, then add the Mali GPU blob to the OpenCL ICD config file as follows:

sudo apt install mesa-opencl-icd clinfo

On Ubuntu, apt may report "not found" errors, especially for huawei links. These can be ignored. Proceed as follows:

sudo mkdir -p /etc/OpenCL/vendors
echo "/usr/lib/libmali-valhall-g610-g6p0-x11-wayland-gbm.so" | sudo tee /etc/OpenCL/vendors/mali.icd
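
Each ".icd" file in /etc/OpenCL/vendors holds the name or path of a vendor library, so a typo in the path written above will silently hide the GPU. The following is a rough sketch of a check, not an official tool: check_icd is a hypothetical helper name, and it assumes the library is given on the first line of each file. Entries that are bare library names (no leading "/") are resolved by the dynamic loader and are only reported, not verified.

```shell
#!/bin/sh
# check_icd: hypothetical helper; prints one status line per .icd file
# found in DIR (default: /etc/OpenCL/vendors).
check_icd() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$icd" ] || { echo "no .icd files in $dir"; return 1; }
        lib=$(head -n 1 "$icd")
        case "$lib" in
            /*) if [ -e "$lib" ]; then status=OK; else status=MISSING; fi ;;
            *)  status=LOADER ;;  # bare name, looked up by the dynamic loader
        esac
        printf '%-8s %s -> %s\n' "$status" "$icd" "$lib"
    done
}

check_icd "$@" || true
```

A line starting with MISSING for mali.icd means the path in the file does not match where the blob was saved.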

Install the dependencies of the Mali OpenCL library as follows:

sudo apt install libxcb-dri2-0 libxcb-dri3-0 libwayland-client0 libwayland-server0 libx11-xcb1

Now you can run "clinfo" to check whether OpenCL is working. You will see output like the following if OpenCL is installed correctly:

...> clinfo
Number of platforms                       1
  Platform Name                           ARM Platform
  Platform Vendor                         ARM
  Platform Version                        OpenCL 2.1 v1.g6p0-01eac0.2819f9d4dbe0b5a2f89c835d8484f9cd
  Platform Profile                        FULL_PROFILE
  Platform Extensions                     cl_khr_global_int32_base_atomics ...
  Platform Extensions function suffix     ARM
  Platform Host timer resolution          1ns
  ...
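
The full clinfo output is long. As a convenience, the sketch below keeps only a few identifying lines. The name clinfo_summary is a hypothetical helper, and the field names are assumed to match the sample output above plus the "Device Name" field; adjust the pattern if your clinfo prints different labels.

```shell
#!/bin/sh
# clinfo_summary: hypothetical helper; filters clinfo output read from
# standard input down to a few identifying fields.
clinfo_summary() {
    grep -E 'Number of platforms|Platform Name|Platform Version|Device Name'
}

if command -v clinfo >/dev/null 2>&1; then
    clinfo 2>/dev/null | clinfo_summary || echo "clinfo printed no platform information"
else
    echo "clinfo is not installed"
fi
```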

If clinfo does not show the platform, check whether any dependencies are missing using the ldd command as follows:

ldd /usr/lib/libmali-valhall-g610-g6p0-x11-wayland-gbm.so
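
ldd marks each library it cannot resolve with "not found". The following sketch prints only those entries so they are easy to spot. The name unresolved_libs is a hypothetical helper, and the blob path is the one used earlier on this page.

```shell
#!/bin/sh
# unresolved_libs: hypothetical helper; reads ldd output on standard input
# and prints the names of libraries reported as "not found".
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

lib=/usr/lib/libmali-valhall-g610-g6p0-x11-wayland-gbm.so
if [ -e "$lib" ]; then
    missing=$(ldd "$lib" | unresolved_libs)
    if [ -n "$missing" ]; then
        echo "unresolved libraries:"
        echo "$missing"
    else
        echo "all dependencies resolved"
    fi
else
    echo "not found: $lib"
fi
```

Any library listed as unresolved needs its package installed before the Mali blob can be loaded.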

Create Symbolic Link for libOpenCL.so

The directory "/usr/lib/aarch64-linux-gnu/" will have "libOpenCL.so.1.0.0" but no "libOpenCL.so" file. In this case, create a symbolic link as follows. You need to be logged in as root to create it, for example with "su -":

cd /usr/lib/aarch64-linux-gnu/
ln -s libOpenCL.so.1.0.0 libOpenCL.so
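
Running ln a second time fails with a "File exists" error, so a guarded version can be handy in setup scripts. The following is a minimal sketch under that assumption. The name ensure_opencl_link is a hypothetical helper, and like the commands above it needs to be run as root for the system directory.

```shell
#!/bin/sh
# ensure_opencl_link: hypothetical helper, not part of any package.
# Creates DIR/libOpenCL.so -> libOpenCL.so.1.0.0 unless it already exists.
ensure_opencl_link() {
    dir="${1:-/usr/lib/aarch64-linux-gnu}"
    if [ -e "$dir/libOpenCL.so" ] || [ -L "$dir/libOpenCL.so" ]; then
        echo "already present: $dir/libOpenCL.so"
    elif [ ! -e "$dir/libOpenCL.so.1.0.0" ]; then
        echo "not found: $dir/libOpenCL.so.1.0.0"
        return 1
    elif [ -w "$dir" ]; then
        # Relative target, resolved against the directory holding the link.
        ln -s libOpenCL.so.1.0.0 "$dir/libOpenCL.so" &&
            echo "created: $dir/libOpenCL.so"
    else
        echo "no write permission on $dir; run this as root"
        return 1
    fi
}

ensure_opencl_link "$@" || true
```

The result can be confirmed with "ls -l /usr/lib/aarch64-linux-gnu/libOpenCL.so".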

Copying OpenCL "CL" Folder into "/usr/include"

This section is only needed if you want to compile OpenCL programs.

Your "/usr/include/" directory may not have a "CL" folder. In this case, you need to copy the "CL" folder from the CLv2.zip (version 2) file, or from the CLv3.zip file (version 3, for Joshua Riek's Ubuntu), into the "/usr/include/" directory. Extract the "CL" folder into any convenient folder. Log into the "root" account, for example with "su -", and "cd" to the folder that holds the extracted "CL" folder. From there, copy it to "/usr/include" as follows:

cp -r CL /usr/include

Restart your Orange Pi 5.

Now you are ready to compile and run. An example compile command is shown below; change it as needed. CMake files can be adjusted accordingly.

g++ CMSRModel.cpp OpenclModel.cpp Main.cpp -L "/usr/lib/aarch64-linux-gnu/" -lOpenCL -o app

RK3588/Orange Pi 5 OpenCL GPU Information

The following is Orange Pi 5 GPU information reported by CMSR Machine Learning Studio.

[Device 0] Mali-LODX r0p0
- Platform: ARM Platform
- Platform version: OpenCL 2.1 v1.g6p0-01eac0.2819f9d4dbe0b5a2f89c835d8484f9cd
- Vendor: ARM
- Driver version: 2.1
- Address bits: 64
- Compute units: 4
- Max work group size: 1024
- Clock frequency: 1000 MHz
- Global memory size: 3724 MB
- Local memory size: 32 KB
- Max allocation size: 3724 MB
- Max work item dimensions: 3
- Max work item sizes: 1024x1024x1024
- Max constant buffer size: 3905794048
- Global memory cache size: 1024 KB
- Max on device queues: 1
- Device profile: FULL_PROFILE
- C version: OpenCL C 2.0 v1.g6p0-01eac0.2819f9d4dbe0b5a2f89c835d8484f9cd

On Joshua Riek's Ubuntu, you will see OpenCL version 3.0 as follows:

[Device 1] Mali-G610 r0p0
- Platform: ARM Platform
- Platform version: OpenCL 3.0 v1.g13p0-01eac0.a8b6f0c7e1f83c654c60d1775112dbe4
- Vendor: ARM
- Driver version: 3.0
- Address bits: 64
- Compute units: 4
- Max work group size: 1024
- Clock frequency: 1000 MHz
- Global memory size: 3724 MB
- Local memory size: 32 KB
- Max allocation size: 3724 MB
- Max work item dimensions: 3
- Max work item sizes: 1024x1024x1024
- Max constant buffer size: 3905794048
- Global memory cache size: 1024 KB
- Max on device queues: 1
- Device profile: FULL_PROFILE
- C version: OpenCL C 3.0 v1.g13p0-01eac0.a8b6f0c7e1f83c654c60d1775112dbe4

RK3588/Orange Pi 5 OpenCL GPU Performance Comparison

The following shows the time (in minutes) taken to train a 104-layer computer vision deep neural network with 77 convolution layers on GPUs. This is the training time for one epoch with 3,400 images.

- OPi5 RK3588 4 eu 64 cores 1000MHz : (76m)
- Toshiba Intel i5 internal 24 cu 900MHz : (143m)
- MacMini Intel i5 internal 40 cu 1100MHz : (77m)
- Acer Intel i7 internal 24 cu 1150MHz : (61m)
- Acer Nvidia Quadro T1000 896 cores 1725MHz : (8m)

As you can see, the OPi5 GPU holds its own against the Intel internal GPUs: only the Acer Intel i7 (61m) beats it, and not by much. The external Nvidia GPU is outstanding. That's why everyone is rushing to buy Nvidia GPUs.
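
To make the comparison easier to read, the times can be expressed as multiples of the OPi5 time. The sketch below does this arithmetic with awk, using only the minutes listed above. The name relative_times is a hypothetical helper, and the short labels are abbreviations chosen for this example.

```shell
#!/bin/sh
# relative_times: hypothetical helper; reads "<minutes> <label>" lines and
# prints each time as a multiple of the first line's time (the baseline).
relative_times() {
    awk 'NR == 1 { base = $1 }
         {
             m = $1
             $1 = ""            # drop the minutes field, keep the label
             sub(/^ /, "")
             printf "%-20s %4dm  %.2fx\n", $0, m, m / base
         }'
}

relative_times <<'EOF'
76 OPi5 RK3588
143 Toshiba Intel i5
77 MacMini Intel i5
61 Acer Intel i7
8 Acer Nvidia T1000
EOF
```

A value below 1.00x means the device finished the epoch faster than the OPi5 GPU; a value above 1.00x means it was slower.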


Useful OpenCL Example Program

The following OpenCL program lists names of GPUs in your system. Create a file, say, "cldevices.c" and copy the following program into the file.

#ifndef CL_TARGET_OPENCL_VERSION
#define CL_TARGET_OPENCL_VERSION 120
#endif
#include <CL/cl.h>
#include <stdio.h>

#define MAX_PLATFORMS 16 // maximum number of platforms examined;
#define MAX_DEVICES 20   // maximum number of GPU devices per platform;

int main(void) {
	cl_uint i, j;
	int k = 0;
	cl_int ret;
	cl_uint numPlatforms;
	cl_uint numDevices;
	cl_platform_id platforms[MAX_PLATFORMS];
	cl_device_id devices[MAX_DEVICES];
	char local_dev_buf[250];

	// get the number of platforms, then the platform IDs;
	ret = clGetPlatformIDs(0, NULL, &numPlatforms);
	if (CL_SUCCESS != ret) {
		printf("Error clGetPlatformIDs() : %d\n", ret);
		return 1;
	}
	if (numPlatforms > MAX_PLATFORMS) numPlatforms = MAX_PLATFORMS;
	ret = clGetPlatformIDs(numPlatforms, platforms, NULL);
	if (CL_SUCCESS != ret) {
		printf("Error clGetPlatformIDs() : %d\n", ret);
		return 1;
	}

	// list the names of all GPU devices on every platform;
	for (i = 0; i < numPlatforms; i++) {
		// a platform without GPU devices returns an error; skip it;
		ret = clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_GPU, 0, NULL, &numDevices);
		if (CL_SUCCESS != ret) {
			continue;
		}
		// never ask for more devices than the array can hold;
		if (numDevices > MAX_DEVICES) numDevices = MAX_DEVICES;
		ret = clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_GPU,
			numDevices, devices, NULL);
		if (CL_SUCCESS != ret) {
			continue;
		}
		for (j = 0; j < numDevices; j++) {
			ret = clGetDeviceInfo(devices[j], CL_DEVICE_NAME,
				sizeof(local_dev_buf), local_dev_buf, NULL);
			if (CL_SUCCESS != ret) {
				printf("Error clGetDeviceInfo() : %d\n", ret);
				continue;
			}
			printf("%d : %s\n", k++, local_dev_buf);
		}
	}
	return 0;
}

To compile this, run the following command. Then run "./cldevices".

	gcc cldevices.c -L "/usr/lib/aarch64-linux-gnu/" -lOpenCL -o cldevices