[PATCH v1 4/4] common/ml: add Arm NEON type conversion routines
Ruifeng Wang
Ruifeng.Wang at arm.com
Mon Dec 12 08:16:03 CET 2022
> -----Original Message-----
> From: Srikanth Yalavarthi <syalavarthi at marvell.com>
> Sent: Friday, December 9, 2022 3:36 AM
> To: Srikanth Yalavarthi <syalavarthi at marvell.com>; Ruifeng Wang <Ruifeng.Wang at arm.com>
> Cc: dev at dpdk.org; sshankarnara at marvell.com; jerinj at marvell.com; aprabhu at marvell.com
> Subject: [PATCH v1 4/4] common/ml: add Arm NEON type conversion routines
>
> Added ARM NEON intrinsic based implementations to support conversion of data types.
> Support is enabled to handle int8, uint8, int16, uint16, float16, float32 and bfloat16
> types.
>
> Signed-off-by: Srikanth Yalavarthi <syalavarthi at marvell.com>
> ---
> drivers/common/ml/meson.build | 5 +
> drivers/common/ml/ml_utils.c | 48 ++
> drivers/common/ml/ml_utils_neon.c | 950 ++++++++++++++++++++++++++++++
> drivers/common/ml/ml_utils_neon.h | 23 +
> 4 files changed, 1026 insertions(+)
> create mode 100644 drivers/common/ml/ml_utils_neon.c
> create mode 100644 drivers/common/ml/ml_utils_neon.h
>
> diff --git a/drivers/common/ml/meson.build b/drivers/common/ml/meson.build
> index 84ae84ee4e..f7ce19b4b4 100644
> --- a/drivers/common/ml/meson.build
> +++ b/drivers/common/ml/meson.build
> @@ -17,6 +17,11 @@ sources = files(
> 'ml_utils_generic.c',
> )
>
> +if arch_subdir == 'arm'
> + headers += files('ml_utils_neon.h')
> + sources += files('ml_utils_neon.c')
> +endif
> +
> deps += ['mldev']
>
> pmd_supports_disable_iova_as_pa = true
> diff --git a/drivers/common/ml/ml_utils.c b/drivers/common/ml/ml_utils.c
> index e2edef0904..3edcf09fde 100644
> --- a/drivers/common/ml/ml_utils.c
> +++ b/drivers/common/ml/ml_utils.c
> @@ -120,71 +120,119 @@ ml_io_format_to_str(enum rte_ml_io_format format, char *str, int len)
> int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
> {
> +#if defined(__ARM_NEON__)
> + return ml_float32_to_int8_neon(scale, nb_elements, input, output);
> +#else
> return ml_float32_to_int8_generic(scale, nb_elements, input, output);
> +#endif
> }
>
Maybe __rte_weak can be used to remove the #ifdef clutter: keep the generic
version as a weak default in ml_utils.c and let ml_utils_neon.c provide the
strong definition, which takes over when that file is built.
Something like:

ml_utils.c:
__rte_weak int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return ml_float32_to_int8_generic(scale, nb_elements, input, output);
}

ml_utils_neon.c:
int ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return ml_float32_to_int8_neon(scale, nb_elements, input, output);
}
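
To make the weak-default half of this concrete, here is a minimal single-file sketch. It is an illustration under stated assumptions, not the patch's code: the body of ml_float32_to_int8_generic is a simplified stand-in (scale, clamp, round half away from zero), and __rte_weak is defined locally as the GCC/Clang weak attribute so the snippet builds without DPDK headers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: local stand-in for DPDK's __rte_weak so this builds standalone. */
#ifndef __rte_weak
#define __rte_weak __attribute__((__weak__))
#endif

/* Hypothetical simplified generic routine; the real one lives in
 * ml_utils_generic.c and may differ in rounding and error handling.
 */
static int
ml_float32_to_int8_generic(float scale, uint64_t nb_elements, void *input, void *output)
{
	const float *in = input;
	int8_t *out = output;
	uint64_t i;

	if (nb_elements == 0 || input == NULL || output == NULL)
		return -EINVAL;

	for (i = 0; i < nb_elements; i++) {
		float v = scale * in[i];

		/* Saturate to the int8 range before converting. */
		if (v > (float)INT8_MAX)
			v = (float)INT8_MAX;
		if (v < (float)INT8_MIN)
			v = (float)INT8_MIN;

		/* Round half away from zero, then truncate. */
		v += (v >= 0.0f) ? 0.5f : -0.5f;
		out[i] = (int8_t)(int32_t)v;
	}

	return 0;
}

/* Weak default: used only when no other object file in the link
 * provides a strong (non-weak) definition of the same symbol.
 */
__rte_weak int
ml_float32_to_int8(float scale, uint64_t nb_elements, void *input, void *output)
{
	return ml_float32_to_int8_generic(scale, nb_elements, input, output);
}
```

Built on its own, callers of ml_float32_to_int8() reach the generic path. The NEON override has to live in a separate translation unit (ml_utils_neon.c in the suggestion above), since one file cannot define the same function twice; because meson only adds that file when arch_subdir is 'arm', the strong definition would replace the weak one only on Arm builds.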
<snip>
> diff --git a/drivers/common/ml/ml_utils_neon.c b/drivers/common/ml/ml_utils_neon.c
> new file mode 100644
> index 0000000000..b660de07ec
> --- /dev/null
> +++ b/drivers/common/ml/ml_utils_neon.c
> @@ -0,0 +1,950 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2022 Marvell.
> + */
> +
> +#include <errno.h>
> +#include <math.h>
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +#include <rte_vect.h>
> +
> +#include "ml_utils.h"
> +#include "ml_utils_neon.h"
> +
> +#include <arm_neon.h>
This line can be removed; arm_neon.h is already included by rte_vect.h.
Thanks.
<snip>