ailia Tech BLOG

Released ailia SDK 1.5.0

Speeding up Transformer models

We optimized the speed of the Attention mechanism used in Transformer models. When converted to ONNX, Attention is decomposed into multiple operators, but ailia SDK merges these operators at runtime to speed up the processing.
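To make the idea of merging operators concrete, here is a minimal pure-Python sketch. It is not ailia SDK's implementation: the operator chain named in the comments (Transpose, MatMul, Mul, Softmax, MatMul) is an assumption about what a typical ONNX export of Attention looks like, and all function names are hypothetical. It only shows that a single fused pass can return the same result as the separate steps.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(value - mx) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention_decomposed(q, k, v):
    """Attention as a chain of separate operators (assumed ONNX-style export)."""
    scores = matmul(q, transpose(k))                       # Transpose + MatMul
    scale = 1.0 / math.sqrt(len(q[0]))
    scaled = [[s * scale for s in row] for row in scores]  # Mul
    weights = softmax_rows(scaled)                         # Softmax
    return matmul(weights, v)                              # MatMul

def attention_fused(q, k, v):
    """Same math computed row by row, without building the full score or weight matrices."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        mx = max(scores)
        exps = [math.exp(s - mx) for s in scores]
        total = sum(exps)
        out.append([sum(e * v_row[j] for e, v_row in zip(exps, v)) / total
                    for j in range(len(v[0]))])
    return out

# Tiny example: 2 queries, 3 keys/values, head dimension 2.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decomposed = attention_decomposed(q, k, v)
fused = attention_fused(q, k, v)
```

In a real runtime, the benefit of this kind of fusion typically comes from avoiding intermediate tensors and the memory traffic they cause; the Python version illustrates the equivalence only, not the speedup.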

ailia SDK 1.5.0 achieves faster Attention performance for both CPU (AVX, NEON) and GPU (cuDNN). We are also planning to extend this optimization to other environments in the future.

Below is a speed comparison for the voice synthesis model GPT-SoVITS. ailia SDK runs inference faster than ONNX Runtime on both CPU and CUDA. The CUDA backend of ONNX Runtime struggles with models whose tensor shapes change dynamically, and it often ends up slower than running on the CPU. With the ailia SDK CUDA backend, even models with dynamically changing shapes achieve high-speed inference.

Support for VK_KHR_cooperative_matrix

To accelerate the MatMul operation used in Transformer models, we have implemented support for the Vulkan extension VK_KHR_cooperative_matrix.

VK_KHR_cooperative_matrix is an API designed to leverage matrix computation hardware, such as Intel's XMX and NVIDIA's Tensor Cores. This specialized hardware enables faster inference than computing matrix multiplication in ordinary shader code.

This feature can be enabled by selecting the Vulkan (FP16) backend.
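The computation pattern that cooperative matrices expose is a multiply-accumulate on small matrix tiles. The following pure-Python sketch shows that pattern under stated assumptions: the tile size of 2 and all function names are hypothetical and chosen for readability, and the code is neither ailia SDK's shader nor Vulkan code. Actual tile sizes depend on the hardware.

```python
def tile_mma(a_tile, b_tile, c_tile):
    """One multiply-accumulate on small tiles: returns a_tile @ b_tile + c_tile."""
    return [[c_tile[i][j] + sum(a_tile[i][p] * b_tile[p][j] for p in range(len(b_tile)))
             for j in range(len(b_tile[0]))]
            for i in range(len(a_tile))]

def get_tile(m, row0, col0, t):
    """Extract the t x t tile of m whose top-left corner is (row0, col0)."""
    return [row[col0:col0 + t] for row in m[row0:row0 + t]]

def tiled_matmul(a, b, t=2):
    """Multiply a (M x K) by b (K x N) tile by tile; M, K and N must be multiples of t."""
    m, k, n = len(a), len(b), len(b[0])
    if m % t or k % t or n % t:
        raise ValueError("matrix dimensions must be multiples of the tile size")
    out = [[0.0] * n for _ in range(m)]
    for i in range(0, m, t):
        for j in range(0, n, t):
            # Accumulator tile: stays local while we walk along the K dimension.
            acc = [[0.0] * t for _ in range(t)]
            for p in range(0, k, t):
                acc = tile_mma(get_tile(a, i, p, t), get_tile(b, p, j, t), acc)
            # Write the finished tile back to the output matrix.
            for di in range(t):
                out[i + di][j:j + t] = acc[di]
    return out

a = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
b = [[1, 0],
     [0, 1],
     [1, 0],
     [0, 1]]
result = tiled_matmul(a, b)
```

On matrix hardware, each call corresponding to `tile_mma` would be executed as one hardware operation on a whole tile, which is where the speedup over element-wise shader arithmetic comes from.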

Support for opset 20

The supported opset range has been extended up to opset 20. This makes it possible to run models that can only be exported with opset 20, such as LivePortrait.

Support for JetPack 6.0 and 6.1

We have added support for JetPack 6.0 + cuDNN 8.9.4 and JetPack 6.1 + cuDNN 9.0.0 on Jetson devices. This ensures that ailia SDK can be used with the latest JetPack versions.

Support for NumPy 2.0

The Python binding now supports NumPy 2.0. A single common binary works with both NumPy 1.x and NumPy 2.x, ensuring stable operation across versions.

How to update from ailia SDK 1.4 to 1.5

Python

You can update ailia SDK using the following pip command. Up to ailia SDK 1.4, a single common wheel was used for all platforms; starting with ailia SDK 1.5, a platform-specific wheel is downloaded.

pip3 install -U ailia

If you installed from the SDK package, update it with the following commands:

python3 bootstrap.py
pip3 install .
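After updating, it can be useful to confirm which versions are actually installed. The sketch below uses only the Python standard library (`importlib.metadata`). The helper names are hypothetical, and it assumes the distribution name is `ailia`, as in the pip command above.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def major_minor(version):
    """Parse '1.5.0' into (1, 5); return None if the string has no numeric major.minor."""
    parts = version.split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return None

def is_at_least(version, minimum):
    """True if version (a string or None) is at least the (major, minor) tuple minimum."""
    parsed = major_minor(version) if version else None
    return parsed is not None and parsed >= minimum

ailia_version = installed_version("ailia")
print("ailia:", ailia_version, "| 1.5 or later:", is_at_least(ailia_version, (1, 5)))
print("numpy:", installed_version("numpy"))
```

If `ailia` is reported as `None`, the package is not installed in the Python environment that ran the script, which usually means pip and the interpreter point at different environments.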

Unity

You can update ailia SDK via the Package Manager.

Flutter

The easiest way to update ailia SDK is using the following command:

flutter pub upgrade

For the official release, replace the libraries in the package with the license-locked versions of ailia.dll, libailia.so, or libailia.dylib.

Newly supported models

LivePortrait (animation of still images)

Bring still portrait images to life

ailia-models/generative_adversarial_networks/live_portrait at master · ailia-ai/ailia-models (github.com)

Source: https://github.com/KwaiVGI/LivePortrait

Qwen2VL (Vision language model)

A high-performance VLM that also supports Japanese.

ailia-models/vision_language_model/qwen2_vl at master · ailia-ai/ailia-models (github.com)

Source: https://github.com/QwenLM/Qwen2-VL


ailia Inc. has developed ailia SDK, which enables cross-platform, GPU-based rapid inference.

ailia Inc. provides a wide range of services from consulting and model creation, to the development of AI-based applications and SDKs. Feel free to contact us for any inquiry.