[dpdk-dev] [RFC] Compression API in DPDK

Trahe, Fiona fiona.trahe at intel.com
Tue Oct 17 16:40:54 CEST 2017


Hi Shally,

Thanks for your feedback. Comments below.

Regards,
Fiona

> -----Original Message-----
> From: shally verma [mailto:shallyvermacavium at gmail.com]
> Sent: Tuesday, October 17, 2017 7:56 AM
> To: Trahe, Fiona <fiona.trahe at intel.com>; dev at dpdk.org; pathreya at cavium.com; Mahipal Challa
> <mchalla at cavium.com>; Shally Verma <sverma at cavium.com>
> Subject: Re: [dpdk-dev] [RFC] Compression API in DPDK
> 
> While I am still going through RFC in details, inline are my inputs to
> some of the points.
> 
> On Thu, Oct 12, 2017 at 10:34 PM, Trahe, Fiona <fiona.trahe at intel.com> wrote:
> >
> > With the vast amounts of data being transported around networks and stored in storage systems,
> > reducing data size is becoming ever more important. There are both software libraries and hardware
> > devices available that provide compression, but no common API. This RFC proposes a compression API
> > for DPDK to address this need.
> >
> > Features:
> > . Deflate Algorithm (https://tools.ietf.org/html/rfc1951), other algorithms can be added later.
> > . Static and Dynamic Huffman encoding.
> > . Compression levels
> > . Checksum generation
> > . Asynchronous burst API
> > . Session-based. (for stateless a session may be usable across devices, for stateful it wouldn't be.)
> >
> >
> > Note 1: Split of functionality above/below API
> > When considering whether features should be supported on the API or not, the decision was based on
> > the following:
> > The purpose of the API is to decouple the application from the compute-intensive functions needed
> > for compression by abstracting them under a common API. These can then be implemented by either
> > hardware accelerators or optimised software libraries. Where features are not compute-intensive and
> > unlikely to be offloaded or optimised, there's nothing to be gained by each PMD having to separately
> > implement them, and it makes more sense for them to be done above the API. So the following are not
> > handled on the API and can be done above.
> > . Prepending/appending protocol headers (gzip, zlib)
> > . File-handling, breaking files up into packets, reassembling.
> > . Synchronous API
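> > 
> > E.g. a gzip wrapper above the API could be as simple as the sketch below (illustrative
> > only, not part of the proposal; header/trailer layout per RFC 1952):
> > 
> > #include <stdint.h>
> > #include <string.h>
> > 
> > /* Wrap one deflate-compressed block in a gzip frame. dst must have room for
> >  * deflate_len + 18 bytes. crc32 and isize would come from the completed op.
> >  */
> > static void
> > gzip_wrap(uint8_t *dst, const uint8_t *deflate_data, uint32_t deflate_len,
> >           uint32_t crc32, uint32_t isize)
> > {
> >       static const uint8_t hdr[10] = {
> >             0x1f, 0x8b, /* gzip magic */
> >             0x08,       /* CM = deflate */
> >             0x00,       /* no flags */
> >             0, 0, 0, 0, /* mtime unknown */
> >             0x00,       /* XFL */
> >             0xff        /* OS unknown */
> >       };
> >       int i;
> > 
> >       memcpy(dst, hdr, sizeof(hdr));
> >       memcpy(dst + sizeof(hdr), deflate_data, deflate_len);
> >       dst += sizeof(hdr) + deflate_len;
> >       /* trailer: CRC32 of the uncompressed data, then its length, little-endian */
> >       for (i = 0; i < 4; i++) {
> >             dst[i] = (crc32 >> (8 * i)) & 0xff;
> >             dst[4 + i] = (isize >> (8 * i)) & 0xff;
> >       }
> > }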
> >
> >
> > Note 2: Stateful compression.
> > A better compression ratio can be achieved by using stateful compression, allowing pattern matching
> > across multiple sequential packets. This will be supported on the API, but in the interests of getting an
> > initial RFC out for feedback, there are placeholder parameters on the API. Feedback welcome on what
> > parameters are needed for this feature.
> >
> 
> We would like to see this as part of the specification. We will provide
> more details on what we (Cavium) require.

[Fiona] I look forward to it.

> 
> >
> >
> > Note 3: The tricky question of where the API belongs
> > We considered
> > 1. Extending cryptodev
> > 2. New acceldev APIs for device handling + compressdev APIs for data path
> > 3. New acceldev for all APIs
> > 4. New compressdev API
> >
> > We're proposing option 4, here's why:
> > 1. Extending cryptodev:
> >     .  + Nice code re-use.
> >     .  + Consistent with Linux kernel and BSD approaches.
> >     .  - But not a good name.
> >     .  - Would inevitably lead to some cryptodev API breakage - can be minimised, but then looks hacky. E.g.
> >             rte_cryptodev_info currently has max_nb_queue_pairs. This would need to be extended to:
> >             rte_cryptodev_info.max_nb_queue_pairs (ideally renamed to max_nb_sym_.... but that's an API breakage)
> >             rte_cryptodev_info.max_nb_comp_queue_pairs
> >             rte_cryptodev_info.max_nb_asym_queue_pairs
> >           Similarly for function names, e.g. rte_cryptodev_queue_pair_setup.
> >            Rename it to rte_cryptodev_sym_queue_pair_setup and add _comp and _asym fns? Or add a parameter?
> > 2. New acceldev APIs for device handling + compressdev APIs for data path:
> >     ~70% of cryptodev code is device-handling which is common to all accelerators.
> >     Rather than providing new compressdev APIs, provide APIs which can be used for compression
> >     and also other accelerators, e.g. bbdev, FPGAs. Would look like rte_acceldev_info_get(),
> >     rte_acceldev_configure(), rte_compress_enqueue_burst(), etc
> >     . + Nice future code re-use.
> >     . + No cryptodev API breakage, though would be nice over time to move to acceldev APIs for
> >           cryptodev and deprecate cryptodev APIs.
> >     . - acceldev either has to understand structs for all services with resulting dependencies OR
> >             it's independent with minimal type-checking, lots of void* and casting.
> >     . - potential API churn if services have different needs
> >     . - usability - application has to sometimes use acceldev, sometimes compressdev
> > 3. New acceldev for all APIs:
> >     . As 2 but with only the xforms and ops different for each service. All APIs are rte_acceldev_....
> >     . Similar pros/cons.
> >     . - No flexibility for accelerator services to have service-specific APIs. Services are more tightly coupled.
> > 4. New compressdev API: (possibly with an acceldev under the hood)
> >     . + Better name
> >     . + Usability - rte_compdev_... for everything
> >     . + No cryptodev API breakage
> >     . + Flexibility to add any compression-specific APIs if needed.
> >     . + Simpler, more decoupled code
> >     . - More code duplication.
> >           May in a later iteration implement an acceldev under the hood, for code re-use, but
> >           this is hidden from applications. Service APIs could pick and choose whatever makes sense
> >           from acceldev. POC on this in-progress, also awaiting Hemant Agrawal's proposed
> >           rte_raw_device RFC, which may fit the acceldev role.
> >
> > Note 4: Chaining of compression with other services.
> >     Option 1 would make it easy to chain compression with sym crypto, though not with other
> >     non-crypto operations if this is a future use-case. However it can be achieved with any option,
> >     following the pattern of the rte_security API, so is not a factor in the above decision.
> >
> 
> We are in favor of solution #4. Apart from the reasons you stated, the
> nature of the two operations is very different, and making them co-exist
> with different behaviours adds a lot of complexity at both ends
> (crypto + compression).
> However, I would like to understand more about the acceldev proposal. Is
> there any document I can look at to understand its purpose?

[Fiona] Not at the moment.
We'd like to wait for Hemant's rte_raw_devices RFC before making a proposal on this.

> 
> >
> >
> > Code extracts below:
> >
> >
> >
> > diff --git a/lib/librte_compressdev/rte_comp.h b/lib/librte_compressdev/rte_comp.h
> > +/**
> > + * @file rte_comp.h
> > + *
> > + * RTE definitions for Data Compression Service
> > + *
> > + */
> > +
> > +/** Compression Algorithms */
> > +enum rte_comp_algorithm {
> > +      RTE_COMP_NULL = 1,
> > +      /**< NULL compression algorithm. */
> > +      RTE_COMP_DEFLATE,
> > +      /**< DEFLATE compression algorithm
> > +      * https://tools.ietf.org/html/rfc1951 */
> > +      RTE_COMP_LIST_END
> > +};
> > +
> 
> We support LZS offload as well, so this needs to be extended with RTE_COMP_LZS.

[Fiona] Agreed. Do you see any need for LZS-specific parameters? i.e. is there a
need for a struct rte_comp_lzs_params and, if so, what should it contain?
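If anything were needed I'd expect it to be minimal, along the lines of the
hypothetical sketch below (field name invented purely for discussion):

struct rte_comp_lzs_params {
      uint16_t history_size;
      /**< LZS history buffer size in bytes - hypothetical, the exact
       * params should be proposed by those with LZS hardware
       */
};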

> 
> > +
> > +/** Compression checksum types */
> > +enum rte_comp_checksum_type {
> > +      RTE_COMP_NONE,
> > +      /**< No checksum generated */
> > +      RTE_COMP_CRC32,
> > +      /**< Generates a CRC32 checksum, as used by gzip */
> > +      RTE_COMP_ADLER32,
> > +      /**< Generates an Adler-32 checksum, as used by zlib */
> > +      RTE_COMP_CRC32_ADLER32,
> > +      /**< Generates both Adler-32 and CRC32 checksums, concatenated.
> > +      * CRC32 is in the lower 32 bits, Adler-32 in the upper 32 bits.
> > +      */
> > +};
> > +
> 
> In addition to these we support SHA1 and SHA256.
> To incorporate these, we suggest the following:
> 
> enum rte_comp_integrity_algo {
>    RTE_COMP_NONE,
>    RTE_COMP_CRC32,
>    RTE_COMP_ADLER32,
>    RTE_COMP_CRC32_ADLER32,
>    RTE_COMP_SHA1,
>    RTE_COMP_SHA256,
> };

[Fiona] Can you elaborate on why these are needed in the compression context?
With lengths of 20 and 32 bytes they add considerable overhead to smaller packets.
How do you propose to return the checksum? The others proposed all fit into an 8-byte
field in the op. 
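For example, if they were added, I'd expect the op to need something bigger than
the current 8-byte field - a hypothetical sketch for discussion:

/* Hypothetical replacement for the output_chksum field in rte_comp_op */
union {
      uint64_t chksum64;
      /**< CRC32, Adler-32 or both concatenated, as currently proposed */
      uint8_t digest[32];
      /**< Room for a SHA1 (20 bytes) or SHA256 (32 bytes) digest */
} output_chksum;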

> 
> > +
> > +/** Compression Deflate Huffman Type */
> > +enum rte_comp_huffman {
> > +      RTE_COMP_STATIC,
> > +      /**< Static encoding */
> > +      RTE_COMP_DYNAMIC,
> > +      /**< Dynamic encoding */
> > +};
> > +
> 
> We would like to see a RTE_COMP_DEFAULT for situations where the
> application cannot make a choice. Also, to be more consistent with the
> naming in the deflate spec (RFC 1951), RTE_COMP_STATIC should be
> renamed RTE_COMP_FIXED.

[Fiona] Agreed
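So the enum would become something like:

enum rte_comp_huffman {
      RTE_COMP_DEFAULT,
      /**< PMD chooses the encoding */
      RTE_COMP_FIXED,
      /**< Fixed encoding, as per RFC 1951 */
      RTE_COMP_DYNAMIC,
      /**< Dynamic encoding */
};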
> 
> > +/** Compression integrity check */
> > +enum rte_comp_integrity_check {
> > +      RTE_COMP_NO_VERIFY,
> > +      /**< No integrity check on output data */
> > +      RTE_COMP_VERIFY,
> > +      /**< Integrity check on output data. Only applies to compress operation.
> > +      * After compress operation done, decompress and verify it matches original data.
> > +      */
> 
> This looks to me more like a diagnostic tool / test app. It should be
> part of an application rather than the library.
[Fiona]
If it were only a diagnostic tool then yes, it needn't be on the API.
However, data integrity is a critical requirement for lossless compression. It is extremely difficult to
fully validate any compression implementation due to the infinite variety of input data patterns and
the fact that there are many possible outputs for any given input.
So applications sometimes build verification into the system.
This could be done by the application, above the API, but as it is compute-intensive it can also
benefit from offloading to hardware. It also saves PCI bandwidth if done in the same operation as the
compression. So we would like to have this option on the API.
As with any option we should also expose it on the capabilities API, and PMDs can signal to the
application whether or not they support it.

> 
> > +};
> > +
> > +enum rte_comp_flush_flag {
> > +      RTE_COMP_FLUSH_NONE,
> > +      /**< TODO */
> > +      RTE_COMP_FLUSH_FULL,
> > +      /**< TODO */
> > +      RTE_COMP_FLUSH_FINAL
> > +      /**< TODO */
> > +};
> > +
> 
> We need to add RTE_COMP_FLUSH_SYNC.
> Also, there could be some algorithms for which these flush types may not
> be relevant, or for which not all are supported, so what are the PMD
> expectations in such a case?
[Fiona] Let's hold the discussion of flush flag values until we talk about stateful compression.
If a PMD doesn't support a flush flag value we should provide for this to be signalled
on the capabilities API.

> 
> > +
> > +/** Status of compression operation */
> > +enum rte_comp_op_status {
> > +      RTE_COMP_OP_STATUS_SUCCESS,
> > +      /**< Operation completed successfully */
> > +      RTE_COMP_OP_STATUS_NOT_PROCESSED,
> > +      /**< Operation has not yet been processed by the device */
> > +      RTE_COMP_OP_STATUS_INVALID_SESSION,
> > +      /**< operation failed due to invalid session arguments */
> > +      RTE_COMP_OP_STATUS_INVALID_ARGS,
> > +      /**< Operation failed due to invalid arguments in request */
> > +      RTE_COMP_OP_STATUS_ERROR,
> > +      /**< Error handling operation */
> > +      RTE_COMP_OP_STATUS_OVERFLOW,
> 
> If I understand the purpose correctly then this indicates a
> situation where the output buffer is exhausted during
> compression/decompression,
> so it should be renamed RTE_COMP_OP_STATUS_OUT_OF_SPACE.
[Fiona] Agreed.

> 
> > +      /**< Output buffer filled up before all data in input buffer was consumed */
> 
> This could also arise where the input buffer is fully consumed during
> decompression, so the comment could be more generic, such as
> "Output buffer exhausted during operation."
[Fiona] In this case I would expect the status to be RTE_COMP_OP_STATUS_SUCCESS.
The consumed parameter can be used by the application to detect that it has
come to the end of the output buffer.
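I.e. on the application side I'd expect handling roughly like the sketch below
(using the op fields proposed in this RFC; the resume logic is application-defined):

struct rte_comp_op *op = deq_ops[i];

switch (op->status) {
case RTE_COMP_OP_STATUS_SUCCESS:
      /* all op->src.length bytes were consumed */
      break;
case RTE_COMP_OP_STATUS_OVERFLOW: /* _OUT_OF_SPACE after the rename above */
      /* op->consumed tells how far the device got: resubmit from
       * src.offset + op->consumed with a larger destination buffer
       */
      resubmit_remainder(op); /* application-defined */
      break;
}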

 
> > +      RTE_COMP_OP_STATUS_VERIFY_FAILED
> > +      /**< Decompression after compression failed to regenerate original data */
> 
> I believe this doesn't need to be part of the spec.
> 
> > +
> > +      //Note:
> > +      //QAT API has 19 error types.
> > +      //ISA-l has 5 inflate and 6 deflate errors.
> > +      //zlib has 6 errors
> > +      //Propose only to include the common subset in status - only values where the appl should behave differently.
> > +      //Add a separate error field on the op return which a PMD could populate with PMD-specific debug info.
> 
> We should probably add RTE_COMP_OP_STATUS_INVALID_STATE in case an
> API is invoked in an invalid state.
[Fiona] Agreed

> The rest of the subset looks self-contained.
> 
> > +};
> > +
> > +
> > +/** Compression transform types */
> > +enum rte_comp_xform_type {
> > +      RTE_COMP_COMPRESS,
> > +      /**< Compression service - compress */
> > +      RTE_COMP_DECOMPRESS,
> > +      /**< Compression service - decompress */
> > +      RTE_COMP_CHECKSUM
> > +      /**< Compression service - generate checksum */
> > +};
> > +
> 
> This enumeration could be renamed rte_comp_op_type.
[Fiona] See answer re xforms below.

> 
> > +/** Parameters specific to the deflate algorithm */
> > +struct rte_comp_deflate_params {
> > +      enum rte_comp_huffman huffman;
> > +      /**< Compression huffman encoding type */
> > +};
> > +
> > +/**
> > + * Session Setup Data common to all compress transforms.
> > + * Includes parameters common to stateless and stateful
> > + */
> > +struct rte_comp_compress_common_params {
> > +      enum rte_comp_algorithm algo;
> > +      /**< Algorithm to use for compress operation */
> > +      union {
> > +            struct rte_comp_deflate_params deflate;
> > +            /**< Parameters specific to the deflate algorithm */
> > +      }; /**< Algorithm specific parameters */
> > +      uint8_t level;
> > +      /**< Compression level
> > +      * Range is 1-9, where 1 gives the fastest compression
> > +      * but the worst compression ratio.
> > +      * Higher numbers may give better compression ratios and are likely slower.
> > +      * The number is interpreted by each PMD differently.
> > +       */
> 
> Different implementations of the same algo may support different levels.
> So we could add an enum alongside this parameter, such as:
> 
> enum rte_comp_compression_level {
>       RTE_COMP_COMPRESSION_LEVEL_DEFAULT = 0,
>       /** Use PMD default */
>       RTE_COMP_COMPRESSION_LEVEL_MIN = 1,
>       /** Fastest compression */
>       RTE_COMP_COMPRESSION_LEVEL_MAX = 9
>       /** Maximum level supported by the compression PMD */
> };
[Fiona] Agreed.

> 
> > +};
> > +
> > +/**
> > + * Session Setup Data for stateful compress transform.
> > + * Extra params for stateful transform
> > + */
> > +struct rte_comp_compress_stateful_params {
> > +      //TODO : add extra params just needed for stateful, e.g.
> > +      // history buffer size, window size, state, state buffers, etc...?
> 
> I will provide my comments on it in another email.
> 
> > +};
> > +/* Session Setup Data for compress transform. */
> > +struct rte_comp_compress_xform {
> > +      struct rte_comp_compress_common_params cmn;
> > +      struct rte_comp_compress_stateful_params stateful;
> > +};
> > +
> > +/**
> > + * Session Setup Data common to all decompress transforms.
> > + * Includes parameters common to stateless and stateful
> > + */
> > +struct rte_comp_decompress_common_params {
> > +      enum rte_comp_algorithm algo;
> > +      /**< Algorithm to use for decompression */
> > +};
> > +/**
> > + *  Session Setup Data for decompress transform.
> > + * Extra params for stateful transform
> > + */
> > +struct rte_comp_decompress_stateful_params {
> > +      //TODO : add extra params just needed for stateful, e.g.
> > +      // history buffer size, window size, state, state buffers, etc...?
> > +};
> > +/* Session Setup Data for decompress transform. */
> > +struct rte_comp_decompress_xform {
> > +      struct rte_comp_decompress_common_params cmn;
> > +      struct rte_comp_decompress_stateful_params stateful;
> > +};
> > +
> > +/**
> > + *  Generate checksum on uncompressed data.
> > + * Applies to both compress and decompress operations. */
> > +/* Session Setup Data for checksum transform. */
> > +struct rte_comp_checksum_xform {
> > +      enum rte_comp_checksum_type chksum_type;
> > +      /**< This parameter determines the checksum type */
> > +};
> > +
> > +
> > +/**
> > + * Compression transform structure.
> > + *
> > + * This is used to specify the compression transforms required, multiple transforms
> > + * can be chained together to specify a chain of transforms such as
> > + * generate checksum, then compress. Each transform structure can
> > + * hold a single transform, the type field is used to specify which transform
> > + * is contained within the union.
> > + * The supported combinations are:
> > + *  - compress-only
> > + *  - decompress-only
> > + *  - generate checksum and compress input data
> > + *  - decompress input data, generate checksum on output data
> > + *
> > + */
> > +struct rte_comp_xform {
> > +      struct rte_comp_xform *next;
> > +      /**< next xform in chain */
> > +      enum rte_comp_xform_type type;
> > +      /**< xform type */
> > +      union {
> > +            struct rte_comp_compress_xform compress;
> > +            /**< xform for compress operation */
> > +            struct rte_comp_decompress_xform decompress;
> > +            /**< xform for decompress operation */
> > +            struct rte_comp_checksum_xform chksum;
> > +            /**< xform to generate checksums */
> 
> We need compress+checksum (and decompress+checksum) as a single xform
> instead of requiring 2 xforms. The use of an xform chain would introduce
> needless overhead for the most typical use-cases of comp+checksum (and
> decomp+checksum).
> 
[Fiona] OK.
I can merge them, so we just have compress & decompress xforms, each with an integrity (checksum) param.
So now I see why you want to rename rte_comp_xform_type to rte_comp_op_type.
Calling it ..op_type though implies a 1:1 mapping between xform and operation,
and so assumes there will never be any chained xforms.
If this is the case we should also remove the *next parameter.
I'd followed the crypto xform chain pattern in case it proves more extensible in future,
but if we can't foresee any case for it, then we should simplify.
If we choose to keep the *next then we should keep the name rte_comp_xform_type.
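For clarity, the merged compress xform could then look something like this (sketch
only, reusing the structs already in the RFC):

struct rte_comp_compress_xform {
      struct rte_comp_compress_common_params cmn;
      enum rte_comp_checksum_type chksum_type;
      /**< Checksum to generate over the uncompressed data,
       * RTE_COMP_NONE if not required
       */
      struct rte_comp_compress_stateful_params stateful;
};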


> > +      };
> > +};
> > +
> > +
> > +struct rte_comp_session;
> > +
> > +/**
> > + * Compression Operation.
> > + *
> > + * This structure contains data relating to performing a compression
> > + * operation on the referenced mbuf data buffers.
> > + *
> > + * All compression operations are Out-of-place (OOP) operations,
> > + * as the size of the output data is different to the size of the input data.
> > + *
> > + * Comp operations are enqueued and dequeued in comp PMDs using the
> > + * rte_compdev_enqueue_burst() / rte_compdev_dequeue_burst() APIs
> > + */
> > +struct rte_comp_op {
> > +      uint8_t status;
> > +      /**<
> > +      * operation status - use values from enum rte_comp_op_status.
> > +      * This is reset to
> > +      * RTE_COMP_OP_STATUS_NOT_PROCESSED on allocation from mempool and
> > +      * will be set to RTE_COMP_OP_STATUS_SUCCESS after operation
> > +      * is successfully processed by a PMD
> > +      */
> > +      uint16_t debug_status;
> > +      /**<
> > +      * The status of the operation is returned in the status field above.
> > +      * This field allows the PMD to pass back extra
> > +      * PMD-specific debug information. Values are not defined on the API.
> > +      */
> > +      struct rte_comp_session *session;
> > +      /**< Handle for the initialised session context */
> > +      struct rte_mempool *mempool;
> > +      /**< mempool from which operation is allocated */
> > +      phys_addr_t phys_addr;
> > +      /**< physical address of this operation */
> > +
> > +      struct rte_mbuf *m_src;   /**< source mbuf */
> > +      struct rte_mbuf *m_dst;   /**< destination mbuf */
> > +
> > +      struct {
> > +            uint32_t offset;
> > +            /**< Starting point for compression or decompression,
> > +            * specified as number of bytes from start of packet in
> > +            * source buffer.
> > +            * Starting point for checksum generation in compress direction.
> > +            */
> > +            uint32_t length;
> > +            /**< The length, in bytes, of the data in source buffer
> > +            * to be compressed or decompressed.
> > +            * Also the length of the data over which the checksum
> > +            * should be generated in compress direction
> > +            */
> > +      } src;
> > +      struct {
> > +            uint32_t offset;
> > +            /**< Starting point for writing output data, specified as
> > +            * number of bytes from start of packet in m_dst
> > +            * buffer. Starting point for checksum generation in
> > +            * decompress direction.
> > +            */
> > +      } dst;
> > +      enum rte_comp_flush_flag flush_flag;
> > +      /**< defines flush characteristics for the output data.
> > +      * Only applicable in compress direction.
> > +      */
> > +      enum rte_comp_integrity_check verify;
> > +      /**< Specifies if the output data should be verified.
> > +      * Only applicable in compress direction.
> > +      */
> 
> Again, believe we can get away with this by moving to application.
> 
> Thanks
> Shally
> 
> > +      uint64_t input_chksum;
> > +      /**< An input checksum can be provided to generate a
> > +      * cumulative checksum across sequential blocks.
> > +      * Checksum type is as specified in chksum_type in xform.
> > +      */
> > +      uint64_t output_chksum;
> > +      /**< If a checksum is generated it will be written in here.
> > +      * Checksum type is as specified in chksum_type in xform.
> > +      */
> > +      uint32_t consumed;
> > +      /**< The number of bytes from the source buffer
> > +      * which were compressed/decompressed.
> > +      */
> > +      uint32_t produced;
> > +      /**< The number of bytes of compressed/decompressed
> > +      * data written to the destination buffer.
> > +      */
> > +      /*
> > +      TODO - Are any extra params needed on stateful op or are all in xform?
> > +      rte_comp_op_common_params/_stateful_params?
> > +      extra values apply to flush flag
> > +      */
> > +};
> > +
> > +
> > + int
> > + rte_compdev_configure(uint8_t dev_id, struct rte_compdev_config *config);
> > + int
> > + rte_compdev_start(uint8_t dev_id);
> > + void
> > + rte_compdev_stop(uint8_t dev_id);
> > /* etc...Rest of APIs follow same pattern, similar to cryptodev, stripped from RFC for simplicity. */
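> >
> > A typical bring-up sequence might then look like this (illustrative only; the
> > queue pair setup signature is assumed to mirror its cryptodev equivalent):
> >
> > struct rte_compdev_config conf = {
> >       .socket_id = rte_socket_id(),
> >       .nb_queue_pairs = 1,
> >       .comp_intermediate_buf_size = 64 * 1024,
> > };
> > struct rte_compdev_qp_conf qp_conf = { .nb_descriptors = 512 };
> >
> > if (rte_compdev_configure(dev_id, &conf) < 0)
> >       rte_exit(EXIT_FAILURE, "device configure failed\n");
> > /* assumed signature, mirroring rte_cryptodev_queue_pair_setup() */
> > if (rte_compdev_queue_pair_setup(dev_id, 0, &qp_conf, rte_socket_id()) < 0)
> >       rte_exit(EXIT_FAILURE, "queue pair setup failed\n");
> > if (rte_compdev_start(dev_id) < 0)
> >       rte_exit(EXIT_FAILURE, "device start failed\n");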
> >
> > +/**
> > + * Compression supported feature flags
> > + *
> > + * Note:
> > + * New features flags should be added to the end of the list
> > + *
> > + * Keep these flags synchronised with rte_compdev_get_comp_feature_name()
> > + * Note: bit numbers here are same as on cryptodev, hence the gaps.
> > + * This is an implementation detail which may change.
> > + */
> > + #define RTE_COMPDEV_FF_CPU_SSE            (1ULL << 3)
> > + /**< Utilises CPU SIMD SSE instructions */
> > + #define RTE_COMPDEV_FF_CPU_AVX            (1ULL << 4)
> > + /**< Utilises CPU SIMD AVX instructions */
> > + #define RTE_COMPDEV_FF_CPU_AVX2           (1ULL << 5)
> > + /**< Utilises CPU SIMD AVX2 instructions */
> > + #define RTE_COMPDEV_FF_HW_ACCELERATED     (1ULL << 7)
> > + /**< Operations are off-loaded to an external hardware accelerator */
> > + #define RTE_COMPDEV_FF_CPU_AVX512         (1ULL << 8)
> > + /**< Utilises CPU SIMD AVX512 instructions */
> > + #define RTE_COMPDEV_FF_CPU_NEON           (1ULL << 10)
> > + /**< Utilises CPU NEON instructions */
> > + #define RTE_COMPDEV_FF_CPU_ARM_CE         (1ULL << 11)
> > + /**< Utilises ARM CPU Cryptographic Extensions */
> > +
> > + #define RTE_COMP_FF_MBUF_SCATTER_GATHER   (1ULL << 12)
> > + /**< Scatter-gather mbufs are supported */
> > + #define RTE_COMP_FF_STATEFUL              (1ULL << 13)
> > + /**< Stateful compression is supported */
> > + #define RTE_COMPDEV_FF_MULTI_PKT_CHECKSUM (1ULL << 14)
> > + /**< Generation of checksum across multiple packets is supported */
> > +
> > +
> > +
> > +/**  Compression device information */
> > +struct rte_compdev_info {
> > +      const char *driver_name;  /**< Driver name. */
> > +      uint8_t driver_id;               /**< Driver identifier */
> > +      struct rte_device *dev;          /**< Device information. */
> > +
> > +      uint64_t feature_flags;
> > +      /**< device feature flags */
> > +      const struct rte_comp_capabilities *capabilities;
> > +      /**< Array of devices capabilities */
> >
> > +      unsigned max_nb_queue_pairs;
> > +      /**< Maximum number of queues pairs supported by device. */
> > +      unsigned max_nb_sessions;
> > +      /**< Maximum number of sessions supported by device. */
> > +      unsigned int max_nb_sessions_per_qp;
> > +      /**< Maximum number of sessions per queue pair.
> > +      * Default 0 for infinite sessions
> > +      */
> > +      /* TODO uint8_t needs_intermediate_buffer OR should this be a feature flag? Or not necessary? */
> > +};
> > +
> > +
> > +/** Compression device statistics */
> > +struct rte_compdev_stats {
> > +      uint64_t enqueued_count;
> > +      /**< Count of all operations enqueued */
> > +      uint64_t dequeued_count;
> > +      /**< Count of all operations dequeued */
> > +
> > +      uint64_t enqueue_err_count;
> > +      /**< Total error count on operations enqueued */
> > +      uint64_t dequeue_err_count;
> > +      /**< Total error count on operations dequeued */
> > +};
> > +
> > +
> > +/** Compression device configuration structure */
> > +struct rte_compdev_config {
> > +      int socket_id;
> > +      /**< Socket to allocate resources on */
> > +      uint16_t nb_queue_pairs;
> > +      /**< Total number of queue pairs to configure on a device. */
> > +      uint32_t comp_intermediate_buf_size;
> > +      /**< For the deflate algorithm, HW accelerators usually need to allocate
> > +      * a pool of intermediate buffers for dynamic compression.
> > +      * This indicates the buffer size and should be
> > +      * set a little larger than the expected max source buffer size.
> > +      * If the output of static compression doesn't fit in the
> > +      * intermediate buffer, dynamic compression may not be possible;
> > +      * in this case the accelerator may revert to static compression.
> > +      */
> > +};
> > +
> > +
> > +/** Compression device queue pair configuration structure. */
> > +struct rte_compdev_qp_conf {
> > +      uint32_t nb_descriptors; /**< Number of descriptors per queue pair */
> > +};
> > +
> > +
> > +typedef uint16_t (*dequeue_pkt_burst_t)(void *qp,
> > +            struct rte_comp_op **ops, uint16_t nb_ops);
> > +/**< Dequeue processed packets from queue pair of a device. */
> > +
> > +typedef uint16_t (*enqueue_pkt_burst_t)(void *qp,
> > +            struct rte_comp_op **ops, uint16_t nb_ops);
> > +/**< Enqueue packets for processing on queue pair of a device. */
> > +
> > +
> > +
> > +/** The data structure associated with each compression device. */
> > +struct rte_compdev {
> > +
> > +      dequeue_pkt_burst_t dequeue_burst;
> > +      /**< Pointer to PMD dequeue function. */
> > +      enqueue_pkt_burst_t enqueue_burst;
> > +      /**< Pointer to PMD enqueue function. */
> > +
> > +      struct rte_compdev_data *data;
> > +      /**< Pointer to device data */
> > +      struct rte_compdev_ops *dev_ops;
> > +      /**< Functions exported by PMD */
> > +      uint64_t feature_flags;
> > +      /**< Supported features */
> > +
> > +      struct rte_device *device;
> > +      /**< Backing device */
> > +
> > +      uint8_t driver_id;
> > +      /**< driver identifier*/
> > +
> > +      uint8_t attached : 1;
> > +      /**< Flag indicating the device is attached */
> > +
> > +};
> > +
> > +/**
> > + *
> > + * The data part, with no function pointers, associated with each device.
> > + *
> > + * This structure is safe to place in shared memory to be common among
> > + * different processes in a multi-process configuration.
> > + */
> > +struct rte_compdev_data {
> > +      uint8_t dev_id;
> > +      /**< Device ID for this instance */
> > +      uint8_t socket_id;
> > +      /**< Socket ID where memory is allocated */
> > +      char name[RTE_COMPDEV_NAME_MAX_LEN];
> > +      /**< Unique identifier name */
> > +
> > +      uint8_t dev_started : 1;
> > +      /**< Device state: STARTED(1)/STOPPED(0) */
> > +
> > +      void **queue_pairs;
> > +      /**< Array of pointers to queue pairs. */
> > +      uint16_t nb_queue_pairs;
> > +      /**< Number of device queue pairs. */
> > +
> > +      void *dev_private;
> > +      /**< PMD-specific private data */
> > +};
> > +
> > +
> > +/**
> > + *
> > + * Dequeue a burst of processed compression operations from a queue on the
> > + * device. The dequeued operations are stored in *rte_comp_op* structures
> > + * whose pointers are supplied in the *ops* array.
> > + *
> > + * The rte_compdev_dequeue_burst() function returns the number of ops
> > + * actually dequeued, which is the number of *rte_comp_op* data structures
> > + * effectively supplied into the *ops* array.
> > + *
> > + * A return value equal to *nb_ops* indicates that the queue contained
> > + * at least *nb_ops* operations, and this is likely to signify that other
> > + * processed operations remain in the device's output queue. Applications
> > + * implementing a "retrieve as many processed operations as possible" policy
> > + * can check this specific case and keep invoking the
> > + * rte_compdev_dequeue_burst() function until a value less than
> > + * *nb_ops* is returned.
> > + *
> > + * The rte_compdev_dequeue_burst() function does not provide any error
> > + * notification to avoid the corresponding overhead.
> > + *
> > + * @param   dev_id       The compression device identifier
> > + * @param   qp_id        The index of the queue pair from which to
> > + *                       retrieve processed packets. The value must be
> > + *                       in the range [0, nb_queue_pairs - 1] previously
> > + *                       supplied to rte_compdev_configure().
> > + * @param   ops          The address of an array of pointers to
> > + *                       *rte_comp_op* structures that must be
> > + *                       large enough to store *nb_ops* pointers in it.
> > + * @param   nb_ops       The maximum number of operations to dequeue.
> > + *
> > + * @return
> > + *   - The number of operations actually dequeued, which is the number
> > + *   of pointers to *rte_comp_op* structures effectively supplied to the
> > + *   *ops* array.
> > + */
> > +static inline uint16_t
> > +rte_compdev_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> > +            struct rte_comp_op **ops, uint16_t nb_ops)
> > +{
> > +      struct rte_compdev *dev = &rte_compdevs[dev_id];
> > +
> > +      nb_ops = (*dev->dequeue_burst)
> > +                   (dev->data->queue_pairs[qp_id], ops, nb_ops);
> > +
> > +      return nb_ops;
> > +}
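> >
> > E.g. the "retrieve as many processed operations as possible" policy described
> > above (BURST_SIZE and handle_completed_op() are application-defined):
> >
> > uint16_t i, nb;
> > struct rte_comp_op *deq[BURST_SIZE];
> >
> > do {
> >       nb = rte_compdev_dequeue_burst(dev_id, qp_id, deq, BURST_SIZE);
> >       for (i = 0; i < nb; i++)
> >             handle_completed_op(deq[i]);
> > } while (nb == BURST_SIZE);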
> > +
> > +/**
> > + * Enqueue a burst of operations for processing on a compression device.
> > + *
> > + * The rte_compdev_enqueue_burst() function is invoked to place
> > + * comp operations on the queue *qp_id* of the device designated by
> > + * its *dev_id*.
> > + *
> > + * The *nb_ops* parameter is the number of operations to process which are
> > + * supplied in the *ops* array of *rte_comp_op* structures.
> > + *
> > + * The rte_compdev_enqueue_burst() function returns the number of
> > + * operations it actually enqueued for processing. A return value equal to
> > + * *nb_ops* means that all operations have been enqueued.
> > + *
> > + * @param   dev_id       The identifier of the device.
> > + * @param   qp_id        The index of the queue pair on which operations are
> > + *                       to be enqueued for processing. The value
> > + *                       must be in the range [0, nb_queue_pairs - 1]
> > + *                       previously supplied to
> > + *                       *rte_compdev_configure*.
> > + * @param   ops          The address of an array of *nb_ops* pointers
> > + *                       to *rte_comp_op* structures which contain
> > + *                       the operations to be processed.
> > + * @param   nb_ops       The number of operations to process.
> > + *
> > + * @return
> > + * The number of operations actually enqueued on the device. The return
> > + * value can be less than the value of the *nb_ops* parameter when the
> > + * device's queue is full or if invalid parameters are specified in
> > + * a *rte_comp_op*.
> > + */
> > +static inline uint16_t
> > +rte_compdev_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> > +            struct rte_comp_op **ops, uint16_t nb_ops)
> > +{
> > +      struct rte_compdev *dev = &rte_compdevs[dev_id];
> > +
> > +      return (*dev->enqueue_burst)(
> > +                   dev->data->queue_pairs[qp_id], ops, nb_ops);
> > +}
> > +
> >
> > /* All session APIs stripped from RFC for simplicity. */
> >
> > +/** compression device operations function pointer table */
> > +struct rte_compdev_ops {
> > +      compdev_configure_t dev_configure;    /**< Configure device. */
> > +      compdev_start_t dev_start;            /**< Start device. */
> > +      compdev_stop_t dev_stop;        /**< Stop device. */
> > +      compdev_close_t dev_close;            /**< Close device. */
> > +
> > +      compdev_info_get_t dev_infos_get;     /**< Get device info. */
> > +
> > +      compdev_stats_get_t stats_get;
> > +      /**< Get device statistics. */
> > +      compdev_stats_reset_t stats_reset;
> > +      /**< Reset device statistics. */
> > +
> > +      compdev_queue_pair_setup_t queue_pair_setup;
> > +      /**< Set up a device queue pair. */
> > +      compdev_queue_pair_release_t queue_pair_release;
> > +      /**< Release a queue pair. */
> > +      compdev_queue_pair_start_t queue_pair_start;
> > +      /**< Start a queue pair. */
> > +      compdev_queue_pair_stop_t queue_pair_stop;
> > +      /**< Stop a queue pair. */
> > +      compdev_queue_pair_count_t queue_pair_count;
> > +      /**< Get count of the queue pairs. */
> > +
> > +      comp_get_session_private_size_t session_get_size;
> > +      /**< Return private session data size. */
> > +      comp_configure_session_t session_configure;
> > +      /**< Configure a comp session. */
> > +      comp_free_session_t session_clear;
> > +      /**< Clear a comp sessions private data. */
> > +      comp_queue_pair_attach_session_t qp_attach_session;
> > +      /**< Attach session to queue pair. */
> > +      comp_queue_pair_detach_session_t qp_detach_session;
> > +      /**< Detach session from queue pair. */
> > +};