[PATCH v3 0/9] riscv: implement accelerated crc using zbc

Stephen Hemminger stephen at networkplumber.org
Tue Jan 13 02:07:31 CET 2026


On Tue, 27 Aug 2024 16:32:20 +0100
Daniel Gregory <daniel.gregory at bytedance.com> wrote:

> The RISC-V Zbc extension adds instructions for carry-less multiplication
> we can use to implement CRC in hardware. This patch set contains two new
> implementations:
> 
> - one in lib/hash/rte_crc_riscv64.h that uses a Barrett reduction to
>   implement the four rte_hash_crc_* functions
> - one in lib/net/net_crc_zbc.c that uses repeated single-folds to reduce
>   the buffer until it is small enough for a Barrett reduction to
>   implement rte_crc16_ccitt_zbc_handler and rte_crc32_eth_zbc_handler
> 
> My approach is largely based on the Intel's "Fast CRC Computation Using
> PCLMULQDQ Instruction" white paper
> https://www.researchgate.net/publication/263424619_Fast_CRC_computation
> and a post about "Optimizing CRC32 for small payload sizes on x86"
> https://mary.rs/lab/crc32/
> 
> Whether these new implementations are enabled is controlled by new
> build-time and run-time detection of the RISC-V extensions present in
> the compiler and on the target system.
> 
> I have carried out some performance comparisons between the generic
> table implementations and the new hardware implementations. Listed below
> is the number of cycles it takes to compute the CRC hash for buffers of
> various sizes (as reported by rte_get_timer_cycles()). These results
> were collected on a Kendryte K230 and averaged over 20 samples:
> 
> |Buffer    | CRC32-ETH (lib/net) | CRC32C (lib/hash)   |
> |Size (MB) | Table    | Hardware | Table    | Hardware |
> |----------|----------|----------|----------|----------|
> |        1 |   155168 |    11610 |    73026 |    18385 |
> |        2 |   311203 |    22998 |   145586 |    35886 |
> |        3 |   466744 |    34370 |   218536 |    53939 |
> |        4 |   621843 |    45536 |   291574 |    71944 |
> |        5 |   777908 |    56989 |   364152 |    89706 |
> |        6 |   932736 |    68023 |   437016 |   107726 |
> |        7 |  1088756 |    79236 |   510197 |   125426 |
> |        8 |  1243794 |    90467 |   583231 |   143614 |
> 
> These results suggest a speed-up of lib/net by thirteen times, and of
> lib/hash by four times.
> 
> I have also run the hash_functions_autotest benchmark in dpdk_test,
> which measures the performance of the lib/hash implementation on small
> buffers, getting the following times:
> 
> | Key Length | Time (ticks/op)     |
> | (bytes)    | Table    | Hardware |
> |------------|----------|----------|
> |          1 |     0.47 |     0.85 |
> |          2 |     0.57 |     0.87 |
> |          4 |     0.99 |     0.88 |
> |          8 |     1.35 |     0.88 |
> |          9 |     1.20 |     1.09 |
> |         13 |     1.76 |     1.35 |
> |         16 |     1.87 |     1.02 |
> |         32 |     2.96 |     0.98 |
> |         37 |     3.35 |     1.45 |
> |         40 |     3.49 |     1.12 |
> |         48 |     4.02 |     1.25 |
> |         64 |     5.08 |     1.54 |
> 
> v3:
> - rebase on 24.07
> - replace crc with CRC in commits (check-git-log.sh)
> v2:
> - replace compile flag with build-time (riscv extension macros) and
>   run-time detection (linux hwprobe syscall) (Stephen Hemminger)
> - add qemu target that supports zbc (Stanislaw Kardach)
> - fix spelling error in commit message
> - fix a bug in the net/ implementation that would cause segfaults on
>   small unaligned buffers
> - refactor net/ implementation to move variable declarations to top of
>   functions
> - enable the optimisation in a couple other places optimised crc is
>   preferred to jhash
>   - l3fwd-power
>   - cuckoo-hash
> 
> Daniel Gregory (9):
>   config/riscv: detect presence of Zbc extension
>   hash: implement CRC using riscv carryless multiply
>   net: implement CRC using riscv carryless multiply
>   config/riscv: add qemu crossbuild target
>   examples/l3fwd: use accelerated CRC on riscv
>   ipfrag: use accelerated CRC on riscv
>   examples/l3fwd-power: use accelerated CRC on riscv
>   hash/cuckoo: use accelerated CRC on riscv
>   member: use accelerated CRC on riscv
> 
>  MAINTAINERS                                   |   2 +
>  app/test/test_crc.c                           |   9 +
>  app/test/test_hash.c                          |   7 +
>  config/riscv/meson.build                      |  44 +++-
>  config/riscv/riscv64_qemu_linux_gcc           |  17 ++
>  .../linux_gsg/cross_build_dpdk_for_riscv.rst  |   5 +
>  examples/l3fwd-power/main.c                   |   2 +-
>  examples/l3fwd/l3fwd_em.c                     |   2 +-
>  lib/eal/riscv/include/rte_cpuflags.h          |   2 +
>  lib/eal/riscv/rte_cpuflags.c                  | 112 +++++++---
>  lib/hash/meson.build                          |   1 +
>  lib/hash/rte_crc_riscv64.h                    |  89 ++++++++
>  lib/hash/rte_cuckoo_hash.c                    |   3 +
>  lib/hash/rte_hash_crc.c                       |  13 +-
>  lib/hash/rte_hash_crc.h                       |   6 +-
>  lib/ip_frag/ip_frag_internal.c                |   6 +-
>  lib/member/rte_member.h                       |   2 +-
>  lib/net/meson.build                           |   4 +
>  lib/net/net_crc.h                             |  11 +
>  lib/net/net_crc_zbc.c                         | 191 ++++++++++++++++++
>  lib/net/rte_net_crc.c                         |  40 ++++
>  lib/net/rte_net_crc.h                         |   2 +
>  22 files changed, 529 insertions(+), 41 deletions(-)
>  create mode 100644 config/riscv/riscv64_qemu_linux_gcc
>  create mode 100644 lib/hash/rte_crc_riscv64.h
>  create mode 100644 lib/net/net_crc_zbc.c
> 

Not sure all the reasons this never got merged but it won't apply to current
DPDK version and AI review has lots of feedback.  If still interested,
please fix up and resubmit.

# DPDK Patch Review: RISC-V Zbc Extension Series

**Patch Series:** [PATCH v3 1-9/9] RISC-V Zbc extension for hardware CRC  
**Submitter:** Daniel Gregory <daniel.gregory at bytedance.com>  
**Date:** August 27, 2024  
**Patchwork IDs:** 143405-143413

---

## Overview

This 9-patch series adds support for RISC-V's Zbc (carry-less multiplication) extension to implement hardware-accelerated CRC-32C, CRC-32, and CRC-16 calculations in DPDK. The series includes build-time and runtime detection, library implementations, and integration into various DPDK components.

---

## ERROR Level Issues (Must Fix)

### 1. **SPDX License Identifier Typo** — `lib/hash/rte_crc_riscv64.h`

```c
/* SPDX-License_Identifier: BSD-3-Clause   ← WRONG: underscore
 * Copyright(c) ByteDance 2024
 */
```

**Required:**
```c
/* SPDX-License-Identifier: BSD-3-Clause   ← hyphen required
 * Copyright(c) 2024 ByteDance
 */
```

**AGENTS.md Reference:** "Every source file must begin with an SPDX license identifier"

---

### 2. **Header Guard After Includes** — `lib/hash/rte_crc_riscv64.h`

The header guard is placed AFTER the `#include` statements, which defeats its purpose:

```c
/* SPDX-License_Identifier: BSD-3-Clause
 * Copyright(c) ByteDance 2024
 */

#include <assert.h>          ← includes BEFORE guard
#include <stdint.h>

#include <riscv_bitmanip.h>

#ifndef _RTE_CRC_RISCV64_H_  ← guard comes too late
#define _RTE_CRC_RISCV64_H_
```

**Required:** Header guard immediately after blank line following copyright:

```c
/* SPDX-License-Identifier: BSD-3-Clause
 * Copyright(c) 2024 ByteDance
 */

#ifndef _RTE_CRC_RISCV64_H_
#define _RTE_CRC_RISCV64_H_

#include <assert.h>
#include <stdint.h>

#include <riscv_bitmanip.h>
```

---

### 3. **Missing NULL Check / Logic Bug** — `lib/net/rte_net_crc.c`

In `rte_crc32_eth_default_handler()`, after adding zbc support:

```c
handlers = neon_pmull_get_handlers();
if (handlers != NULL)
    return handlers[RTE_NET_CRC32_ETH](data, data_len);
handlers = zbc_clmul_get_handlers();
    return handlers[RTE_NET_CRC32_ETH](data, data_len);  ← Missing NULL check!
handlers = handlers_scalar;
```

This will crash if `zbc_clmul_get_handlers()` returns NULL. **Required:**

```c
handlers = zbc_clmul_get_handlers();
if (handlers != NULL)
    return handlers[RTE_NET_CRC32_ETH](data, data_len);
handlers = handlers_scalar;
```

---

### 4. **Test Output Typo** — `app/test/test_hash.c`

```c
printf("Failed checking CRC32_SW against CRC32_RISC64\n");
                                              ^^^^ Missing 'V'
```

**Should be:** `CRC32_RISCV64`

---

## WARNING Level Issues (Should Fix)

### 5. **Use of Standard `assert()` Instead of RTE_ASSERT**

Multiple files use standard library `assert()`:

- `lib/hash/rte_crc_riscv64.h` line 539
- `lib/net/net_crc_zbc.c` line 896

**AGENTS.md Guidance:** DPDK prefers `RTE_ASSERT()` or `RTE_VERIFY()` for consistency. Standard `assert()` may be compiled out in release builds unexpectedly.

---

### 6. **Implicit Pointer Comparison in `likely()` Macro**

In `lib/hash/rte_crc_riscv64.h`:

```c
if (likely(rte_hash_crc32_alg & CRC32_RISCV64))
```

This is a bitmask check (integer), which is acceptable. However, for clarity and consistency with DPDK style, explicit comparison is preferred:

```c
if (likely((rte_hash_crc32_alg & CRC32_RISCV64) != 0))
```

---

### 7. **Fall-through Comment Missing** — `lib/net/rte_net_crc.c`

```c
case RTE_NET_CRC_ZBC:
    handlers = zbc_clmul_get_handlers();
    /* fall-through */          ← MISSING
case RTE_NET_CRC_SCALAR:
```

DPDK requires explicit `/* fall-through */` or `/* FALLTHROUGH */` comments (per AGENTS.md Section on switch statements).

---

### 8. **Commit Body Line Length** — Patch 1/9

Some lines in the commit message body may exceed 75 characters:

```
These let us detect whether the Zbc extension is supported on the
compiler and -march we're building with. The C api also defines Zbc
```

**AGENTS.md:** "Body wrapped at 75 characters"

---

### 9. **Meson Build File Comment at End of Conditionals**

In `lib/net/meson.build`, the endif cascade doesn't clearly indicate which condition is being closed. Consider adding comments for maintainability.

---

### 10. **Copyright Format Inconsistency**

The patches use:
```c
Copyright(c) ByteDance 2024
```

DPDK convention (per AGENTS.md examples) typically uses:
```c
Copyright(c) 2024 ByteDance
```

With the year first.

---

## INFO Level Issues (Consider)

### 11. **Static Variables Without `static` at File Scope** — `lib/net/net_crc_zbc.c`

```c
struct crc_clmul_ctx crc32_eth_clmul;
struct crc_clmul_ctx crc16_ccitt_clmul;
```

These should be `static` to limit scope to the translation unit:

```c
static struct crc_clmul_ctx crc32_eth_clmul;
static struct crc_clmul_ctx crc16_ccitt_clmul;
```

---

### 12. **Magic Numbers in Constants**

In `net_crc_zbc.c`, the polynomial constants are well-commented, which is good. Consider also documenting the mathematical derivation source more explicitly (the Intel whitepaper reference is good).

---

### 13. **Documentation Gap**

The series doesn't include updates to release notes (`doc/guides/rel_notes/`). For a feature addition like hardware CRC acceleration on a new architecture, this should be documented.

**AGENTS.md:** "Update release notes in `doc/guides/rel_notes/` for important changes"

---

## Commit Message Analysis

| Patch | Subject Length | Format Check |
|-------|---------------|--------------|
| 1/9 | `config/riscv: detect presence of Zbc extension` | 47 chars ✓ |
| 2/9 | `hash: implement CRC using riscv carryless multiply` | 50 chars ✓ |
| 3/9 | `net: implement CRC using riscv carryless multiply` | 49 chars ✓ |
| 4/9 | `config/riscv: add qemu crossbuild target` | 41 chars ✓ |
| 5/9 | `examples/l3fwd: use accelerated CRC on riscv` | 45 chars ✓ |
| 6/9 | `ipfrag: use accelerated CRC on riscv` | 38 chars ✓ |
| 7/9 | `examples/l3fwd-power: use accelerated CRC on riscv` | 51 chars ✓ |
| 8/9 | `hash/cuckoo: use accelerated CRC on riscv` | 42 chars ✓ |
| 9/9 | `member: use accelerated CRC on riscv` | 37 chars ✓ |

**All subject lines:**
- ✓ Under 60 character limit
- ✓ Lowercase after colon
- ✓ Imperative mood
- ✓ No trailing period
- ✓ Correct prefixes (config/riscv, hash, net, examples/*, lib names)
- ✓ Signed-off-by present on all patches

---

## Positive Observations

1. **Well-structured series** — The patches are logically ordered: detection first, then core implementations, then consumers.

2. **Good documentation** — Comments explain the Barrett reduction algorithm and reference the Intel whitepaper.

3. **Proper runtime detection** — Uses hwprobe syscall for runtime CPU feature detection, with appropriate fallbacks.

4. **Test coverage** — Adds test cases to existing test suites (test_hash.c, test_crc.c).

5. **MAINTAINERS updates** — Properly adds new files to the RISC-V maintainer's scope.

6. **Cross-compilation support** — Includes QEMU target for development/testing.

---

## Summary

| Severity | Count | Action Required |
|----------|-------|-----------------|
| **ERROR** | 4 | Must fix before merge |
| **WARNING** | 6 | Should fix |
| **INFO** | 3 | Consider fixing |

### Required Actions Before Merge

1. Fix SPDX identifier typo (`License_Identifier` → `License-Identifier`)
2. Move header guard before includes in `rte_crc_riscv64.h`
3. Add NULL check after `zbc_clmul_get_handlers()` in `rte_crc32_eth_default_handler()`
4. Fix typo `CRC32_RISC64` → `CRC32_RISCV64` in test output

### Recommended Actions

5. Add fall-through comment in switch statement
6. Make file-scope variables `static`
7. Add release notes entry
8. Consider using `RTE_ASSERT()` instead of `assert()`


More information about the dev mailing list