Search

Search results

Showing results 1 to 2 of 2
  • Authors: Guillermo, Alaejos; Adrián, Castelló; Héctor, Martínez;  Advisor: -;  Co-authors: - (2023)

    Our work exposes the structure of the template-based micro-kernels for ARM Neon (128-bit SIMD), ARM SVE (variable-length SIMD) and Intel AVX512 (512-bit SIMD), showing considerable performance on an NVIDIA Carmel processor (ARM Neon), a Fujitsu A64FX processor (ARM SVE) and an AMD EPYC 7282 processor (256-bit SIMD).

  • Authors: Manuel F., Dolz; Sergio, Barrachina; Héctor, Martínez;  Advisor: -;  Co-authors: - (2023)

    In this work, we assess the performance and energy efficiency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) inference on a series of ARM-based processor architectures. Specifically, we evaluate the NVIDIA Denver2 and Carmel processors, as well as the ARM Cortex-A57 and Cortex-A78AE CPUs as part of a recent set of NVIDIA Jetson platforms. The performance–energy evaluation is carried out using the ResNet-50 v1.5 convolutional neural network (CNN) on varying configurations of convolution algorithms, number of threads/cores, and operating frequencies on the tested processor cores. The results demonstrate that the best throughput is obtained on all platforms with the Winograd con...