> -----Original Message----- > From: Trahe, Fiona [mailto:fiona.tr...@intel.com] > Sent: 17 October 2017 20:11 > To: shally verma <shallyvermacav...@gmail.com>; dev@dpdk.org; Athreya, > Narayana Prasad <narayanaprasad.athr...@cavium.com>; Challa, Mahipal > <mahipal.cha...@cavium.com>; Verma, Shally <shally.ve...@cavium.com> > Cc: Trahe, Fiona <fiona.tr...@intel.com> > Subject: RE: [dpdk-dev] [RFC] Compression API in DPDK > > Hi Shally, > > Thanks for your feedback. Comments below. > > Regards, > Fiona > > > -----Original Message----- > > From: shally verma [mailto:shallyvermacav...@gmail.com] > > Sent: Tuesday, October 17, 2017 7:56 AM > > To: Trahe, Fiona <fiona.tr...@intel.com>; dev@dpdk.org; > > pathr...@cavium.com; Mahipal Challa <mcha...@cavium.com>; Shally > Verma > > <sve...@cavium.com> > > Subject: Re: [dpdk-dev] [RFC] Compression API in DPDK > > > > While I am still going through RFC in details, inline are my inputs to > > some of the points. > > > > On Thu, Oct 12, 2017 at 10:34 PM, Trahe, Fiona <fiona.tr...@intel.com> > wrote: > > > > > > With the vast amounts of data being transported around networks and > > > stored in storage systems, > > reducing data size is becoming ever more important. There are both > > software libraries and hardware devices available that provide > > compression, but no common API. This RFC proposes a compression API for > DPDK to address this need. > > > > > > Features: > > > . Deflate Algorithm (https://tools.ietf.org/html/rfc1951), other > > > algorithms > can be added later. > > > . Static and Dynamic Huffman encoding. > > > . Compression levels > > > . Checksum generation > > > . Asynchronous burst API > > > . Session-based. (for stateless a session may be useable across > > > devices, for stateful it wouldn't be.) 
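To make the session-based, asynchronous burst model listed above concrete, here is a minimal runnable sketch of the intended application flow. All names here are local stand-ins, not the proposed API: `comp_op`, the stub ring, `enqueue_burst()` and `dequeue_burst()` only mirror the *shape* of the `rte_compdev_enqueue_burst()`/`rte_compdev_dequeue_burst()` calls that appear later in this RFC, which may still change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed rte_comp_op (illustrative only). */
enum op_status { STATUS_NOT_PROCESSED, STATUS_SUCCESS };

struct comp_op {
	uint8_t status;
};

/* A stub ring standing in for a PMD queue pair. */
#define QUEUE_DEPTH 8
static struct comp_op *queue[QUEUE_DEPTH];
static unsigned int q_head, q_tail, q_count;

/* Shape of the proposed enqueue burst call: returns how many ops were
 * actually queued, which may be fewer than requested. */
static uint16_t enqueue_burst(struct comp_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops && q_count < QUEUE_DEPTH; i++) {
		queue[q_tail] = ops[i];
		q_tail = (q_tail + 1) % QUEUE_DEPTH;
		q_count++;
	}
	return i;
}

/* Shape of the proposed dequeue burst call: this stub "device" marks
 * each op complete as it is dequeued. */
static uint16_t dequeue_burst(struct comp_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops && q_count > 0; i++) {
		ops[i] = queue[q_head];
		ops[i]->status = STATUS_SUCCESS;
		q_head = (q_head + 1) % QUEUE_DEPTH;
		q_count--;
	}
	return i;
}

/* Typical application flow: submit a burst, then poll until every op
 * has been retrieved and its status can be checked. */
static unsigned int run_burst(struct comp_op *ops, unsigned int n)
{
	struct comp_op *in[QUEUE_DEPTH], *out[QUEUE_DEPTH];
	unsigned int i, done = 0;

	for (i = 0; i < n; i++) {
		ops[i].status = STATUS_NOT_PROCESSED;
		in[i] = &ops[i];
	}
	enqueue_burst(in, (uint16_t)n);
	while (done < n)
		done += dequeue_burst(out + done, (uint16_t)(n - done));
	return done;
}
```

A real application must additionally handle an enqueue return value smaller than `nb_ops` (queue full) by retrying the remainder later; the stub omits that for brevity.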
> > > > > > > > > Note 1: Split of functionality above/below API When considering > > > whether features should be supported on the API or not, the decision > > > was based on > > the following: > > > The purpose of the API is to decouple the application from the > > > compute-intensive functions needed > > for compression by abstracting them under a common API. These can then > > be implemented by either hardware accelerators or optimised software > > libraries. Where features are not compute-intensive and unlikely to be > > offloaded or optimised, there's nothing to be gained by each PMD > > having to separately implement them, and it makes more sense for them > to be done above the API. So the following are not handled on the API and > can be done above. > > > . Prepending/appending protocol headers (gzip, zlib) . > > > File-handling, breaking files up into packets, reassembling. > > > . Synchronous API > > > > > > > > > Note 2: Stateful compression. > > > A better compression ratio can be achieved by using stateful > > > compression, allowing pattern matching > > across multiple sequential packets. This will be supported on the API, > > but in the interests of getting an initial RFC out for feedback, there > > are placeholder parameters on the API. Feedback welcome on what > parameters are needed for this feature. > > > > > > > We would like to see this as part of Specification. We will provide > > more details on what we(cavium) require. > > [Fiona] I look forward to it. > > > > > > > > > > > > Note 3: The tricky question of where the API belongs We considered > > > 1. Extending cryptodev 2. New acceldev APIs for device handling + > > > compressdev APIs for data path 3. New acceldev for all APIs 4. New > > > compressdev API > > > > > > We're proposing option 4, here's why: > > > 1. Extending cryptodev: > > > . + Nice code re-use. > > > . + Consistent with Linux kernel and BSD approaches. > > > . - But not a good name. > > > . 
- Would inevitably lead to some cryptodev API breakage - can be > minimised, but then looks hacky. > > E.g. > > > rte_cryptodev_info.max_nb_queue_pairs currently. Would need > to extend to > > > rte_cryptodev_info.max_nb_queue_pairs (Ideally rename > > > this to max_nb_sym_.... but that's an > > API breakage) > > > rte_cryptodev_info.max_nb_comp_queue_pairs > > > rte_cryptodev_info.max_nb_asym_queue_pairs > > > Similarly fn names, e.g. rte_cryptodev_queue_pair_setup. > > > Rename to rte_cryptodev_sym_queue_pair_setup, add _comp > > > and _asym fns? Or add > > parameter? > > > 2. New acceldev APIs for device handling + compressdev APIs for data > path: > > > ~70% of cryptodev code is device-handling which is common to all > accelerators. > > > Rather than providing new compressdev APIs, provide APIs which can > be used for compression > > > and also other accelerators, e.g. bbdev, FPGAs. Would look like > rte_acceldev_info_get(), > > > rte_acceldev_configure(), rte_compress_enqueue_burst(), etc > > > . + Nice future code re-use. > > > . + No cryptodev API breakage, though would be nice over time to > move to acceldev APIs for > > > cryptodev and deprecate cryptodev APIs. > > > . - acceldev either has to understand structs for all services with > resulting dependencies OR > > > it's independent with minimal type-checking, lots of void* and > casting. > > > . - potential API churn if services have different needs > > > . - usability - application has to sometimes use acceldev, > > > sometimes compressdev 3. New acceldev for all APIs: > > > . As 2 but with only the xforms and ops different for each service. > > > All > APIs are rte_acceldev_.... > > > . Similar pros/cons. > > > . - No flexibility for accelerators services to have > > > service-specific APIs. Services are more tightly > > coupled. > > > 4. New compressdev API: (possibly with an acceldev under the hood) > > > . + Better name > > > . + Usability - rte_compdev_... for everything > > > . 
+ No cryptodev API breakage > > > . + Flexibility to add any compression-specific APIs if needed. > > > . + Simpler, more decoupled code > > > . - More code duplication. > > > May in a later iteration implement an acceldev under the hood, > > > for > code re-use, but > > > this is hidden from applications. Service APIs could pick and > > > choose > whatever makes sense > > > from acceldev. POC on this in-progress, also awaiting Hemant > Agrawal's proposed > > > rte_raw_device RFC, which may fit the acceldev role. > > > > > > Note 4: Chaining of compression with other services. > > > Option 1 would make it easy to chain compression with sym > > > crypto. Though not with other non- > > crypto > > > operations if this is a future use-case. However it can be > > > achieved with any option, following the > > pattern > > > of rte_security API, so is not a factor in above decision. > > > > > > > We are in favor of solution #4. Apart from reasons as you stated, > > nature of both operations are very different and it add lots of > > complexity to both ends (crypto + compression) to co-exist with > > different behavior. > > However I would like to understand more about acceldev proposal? Is > > there any document that I can see to understands its purpose? > > [Fiona] Not at the moment. > We'd like to wait for Hemant's rte_raw_devices RFC before making a > proposal on this. >
[Shally] Ok. > > > > > > > > > > > Code extracts below: > > > > > > > > > > > > diff --git a/lib/librte_compressdev/rte_comp.h > > > b/lib/librte_compressdev/rte_comp.h > > > +/** > > > + * @file rte_comp.h > > > + * > > > + * RTE definitions for Data Compression Service > > > + * > > > + */ > > > + > > > +/** Compression Algorithms */ > > > +enum rte_comp_algorithm { > > > + RTE_COMP_NULL = 1, [Shally] What is an API expectation on seeing algorithm as NULL? > > > + /**< NULL compression algorithm. */ > > > + RTE_COMP_DEFLATE, > > > + /**< DEFLATE compression algorithm > > > + * https://tools.ietf.org/html/rfc1951 */ > > > + RTE_COMP_LIST_END > > > +}; > > > + > > > > We support LZS offload as well, need to extend with RTE_COMP_LZS. > > [Fiona] Agreed. Do you see any need for lzs-specific parameters? i.e. is there > a need for a struct rte_comp_lzs_params and, if so, what should it contain? > [Shally] No. Current exported enum/param are sufficient. > > > > > + > > > +/** Compression checksum types */ > > > +enum rte_comp_checksum_type { > > > + RTE_COMP_NONE, > > > + /**< No checksum generated */ > > > + RTE_COMP_CRC32, > > > + /**< Generates a CRC32 checksum, as used by gzip */ > > > + RTE_COMP_ADLER32, > > > + /**< Generates an Adler-32 checksum, as used by zlib */ > > > + RTE_COMP_CRC32_ADLER32, > > > + /**< Generates both Adler-32 and CRC32 checksums, concatenated. > > > + * CRC32 is in the lower 32bits, Adler-32 in the upper 32 bits. > > > + */ > > > +}; > > > + > > > > In addition to these we support SHA1 and SHA256. > > To incorporate these, we suggest the following: > > > > enum rte_comp_integrity_algo { > > RTE_COMP_NONE, > > RTE_COMP_CRC32, > > RTE_COMP_ADLER32, > > RTE_COMP_CRC32_ADLER32, > > RTE_COMP_SHA1, > > RTE_COMP_SHA256, > > }; > > [Fiona] Can you elaborate on why these are needed in the compression > context? 
[Shally] One typical use case is in data centre applications, where a hash is used as a unique key for data fingerprinting/identification and de-duplication. > With lengths of 20 and 32 bytes they add considerable overhead to smaller > packets. [Shally] Typical usage is with larger buffers (2KB and above). > How do you propose to return the checksum? The others proposed all fit into > an 8-byte field in the op. > [Shally] User can request both checksum and hash at output. I will propose additional associated changes after we freeze the current RFC version. > > > > > + > > > +/** Compression Deflate Huffman Type */ enum rte_comp_huffman { > > > + RTE_COMP_STATIC, > > > + /**< Static encoding */ > > > + RTE_COMP_DYNAMIC, > > > + /**< Dynamic encoding */ > > > +}; > > > + > > > > We would like to see a RTE_COMP_DEFAULT for situations where > > application cannot make a choice. Also, to be more consistent with the naming in the deflate RFC 1951, these should be renamed RTE_COMP_FIXED and > > RTE_COMP_DYNAMIC. > > [Fiona] Agreed > > > > > +/** Compression integrity check */ > > > +enum rte_comp_integrity_check { > > > + RTE_COMP_NO_VERIFY, > > > + /**< No integrity check on output data */ > > > + RTE_COMP_VERIFY, > > > + /**< Integrity check on output data. Only applies to compress operation. > > > + * After the compress operation is done, decompress and verify the result matches the original data. > > > + */ > > > > This looks more to me like a diagnostic tool / test app. It should be > > part of an application rather than a library. > [Fiona] > If only a diagnostic tool then yes, it needn't be on the API. > However, data integrity is a critical requirement for lossless compression. It is > extremely difficult to fully validate any compression implementation due to > the infinite variety of input data patterns and the fact that there are many > possible outputs for any given input. > So applications sometimes build verification into the system.
> This could be done by the application, above the API, but as it is compute-intensive it can also benefit from offloading to hardware. And it saves PCI > bandwidth if done in the same operation as compression. So we would like to > have this option on the API. > As with any option we should also expose it on the capabilities API, and PMDs > can signal to the application whether or not they support it. > [Shally] The capability idea looks good to me, making it optional. Here are a few more things to consider: - Memory requirement to store decompression results - should it come from the user, or be allocated internally if verification is turned on? - Its usage if burst size > 1. - Different implementations may have their own version, so the following comment could be more generic, such as /**< Integrity check on output data. Only applies to compress operation. * Verify correctness of compression operation on data. */ Just out of curiosity, does your hardware support verification offload? > > > > > > +}; > > > + > > > +enum rte_comp_flush_flag { > > > + RTE_COMP_FLUSH_NONE, > > > + /**< TODO */ > > > + RTE_COMP_FLUSH_FULL, > > > + /**< TODO */ > > > + RTE_COMP_FLUSH_FINAL > > > + /**< TODO */ > > > +}; > > > + > > > > We need to add RTE_COMP_FLUSH_SYNC. > > Also, there could be some algorithms for which these flush types may not > > be relevant, or for which not all may be supported, so what is the PMD > > expectation in such a case? > [Fiona] Let's hold the discussion of flush flag values until we talk about > stateful compression. > If a PMD doesn't support a flush flag value we should provide for this to be > signalled on the capabilities API. > [Shally] Ok. > > > > > + > > > +/** Status of compression operation */ enum rte_comp_op_status { > > > + RTE_COMP_OP_STATUS_SUCCESS, > > > + /**< Operation completed successfully */ > > > + RTE_COMP_OP_STATUS_NOT_PROCESSED, [Shally] I have some questions around this, please see my comments on the rte_comp_op structure.
> > > + /**< Operation has not yet been processed by the device */ > > > + RTE_COMP_OP_STATUS_INVALID_SESSION, > > > + /**< operation failed due to invalid session arguments */ > > > + RTE_COMP_OP_STATUS_INVALID_ARGS, > > > + /**< Operation failed due to invalid arguments in request */ > > > + RTE_COMP_OP_STATUS_ERROR, > > > + /**< Error handling operation */ > > > + RTE_COMP_OP_STATUS_OVERFLOW, > > > > If I understand the purpose correctly, this indicates a situation > > where the output buffer is exhausted during compression/decompression. > > So it should be renamed RTE_COMP_OP_STATUS_OUT_OF_SPACE > [Fiona] Agreed. > > > > > > + /**< Output buffer filled up before all data in input buffer was consumed */ > > > > This could also arise where the input buffer is fully consumed during > > decompression. So the comment could be more generic, such as > > “Output buffer exhausted during operation.” > [Fiona] In this case I would expect the status to be > RTE_COMP_OP_STATUS_SUCCESS. > The consumed parameter can be used by the application to detect that it has > come to the end of the output buffer. > [Shally] - I have a question here. Consider a scenario where: - input is fully read, i.e. consumed = length of src data, - output is written, i.e. produced = length of data written in dst buffer, and - status = OP_STATUS_SUCCESS. Then how would an application know there is more data to be flushed, and that it needs to re-enqueue the request with a larger output buffer? > > > > + RTE_COMP_OP_STATUS_VERIFY_FAILED > > > + /**< Decompression after compression failed to regenerate original data */ [Shally] It can simply be "data verification failed". > > > > I believe it isn't needed as part of the spec. > > > > > + > > > + //Note: > > > + //QAT API has 19 error types. > > > + //ISA-l has 5 inflate and 6 deflate errors. > > > + //zlib has 6 errors > > > + //Propose to only include a common subset in status - only values where the appl should have different > > behaviour.
> > > + //Add separate error field on op return which a PMD could populate > with PMD-specific debug > > info. > > > > Probably should add RTE_COMP_OP_STATUS_INVALID_STATE in case an > > API is invoked in invalid state. > [Fiona] Agreed > > > Rest Subset looks self-contained. > > > > > +}; > > > + > > > + > > > +/** Compression transform types */ > > > +enum rte_comp_xform_type { > > > + RTE_COMP_COMPRESS, > > > + /**< Compression service - compress */ > > > + RTE_COMP_DECOMPRESS, > > > + /**< Compression service - decompress */ > > > + RTE_COMP_CHECKSUM > > > + /** Compression service - generate checksum */ > > > +}; > > > + > > > > Enumeration may be renamed as rte_comp_op_type > [Fiona] See answer re xforms below. > > > > > > +/** Parameters specific to the deflate algorithm */ > > > +struct rte_comp_deflate_params { > > > + enum rte_comp_huffman huffman; > > > + /**< Compression huffman encoding type */ > > > +}; > > > + > > > +/** > > > + * Session Setup Data common to all compress transforms. > > > + * Includes parameters common to stateless and stateful > > > + */ > > > +struct rte_comp_compress_common_params { > > > + enum rte_comp_algorithm algo; > > > + /**< Algorithm to use for compress operation */ > > > + union { > > > + struct rte_comp_deflate_params deflate; > > > + /**< Parameters specific to the deflate algorithm */ > > > + }; /**< Algorithm specific parameters */ > > > + uint8_t level; > > > + /**< Compression level > > > + * Range is 1-9, where 1 is fastest compression > > > + * but worst, i.e. highest, compression ratio. > > > + * Higher numbers may give better compression ratios and are likely > slower. > > > + * The number is interpreted by each PMD differently. > > > + */ > > > > Different implementations of same algo may have different levels > > supported. 
So, we can add enum alongside this parameter such as: > > > > enum rte_comp_compression_level { > > RTE_COMP_COMPRESSION_LEVEL_DEFAULT=0, > > /** Use PMD Default */ > > RTE_COMP_COMPRESSION_LEVEL_MIN=1, > > /** Fastest Compression */ > > RTE_COMP_COMPRESSION_LEVEL_MAX=9 > > /** Maximum level supported by compression PMD */ > > } > [Fiona] Agreed. > > > > > > +}; > > > + > > > +/** > > > + * Session Setup Data for stateful compress transform. > > > + * Extra params for stateful transform > > > + */ > > > +struct rte_comp_compress_stateful_params { > > > + //TODO : add extra params just needed for stateful, e.g. > > > + // history buffer size, window size, state, state buffers, etc...? > > > > I will provide my comments on it in another email. > > > > > +}; > > > +/* Session Setup Data for compress transform. */ > > > +struct rte_comp_compress_xform { > > > + struct rte_comp_compress_common_params cmn; > > > + struct rte_comp_compress_stateful_params stateful; > > > +}; > > > + > > > +/** > > > + * Session Setup Data common to all decompress transforms. > > > + * Includes parameters common to stateless and stateful > > > + */ > > > +struct rte_comp_decompress_common_params { > > > + enum rte_comp_algorithm algo; > > > + /**< Algorithm to use for decompression */ > > > +}; > > > +/** > > > + * Session Setup Data for decompress transform. > > > + * Extra params for stateful transform > > > + */ > > > +struct rte_comp_decompress_stateful_params { > > > + //TODO : add extra params just needed for stateful, e.g. > > > + // history buffer size, window size, state, state buffers, etc...? > > > +}; > > > +/* Session Setup Data for decompress transform. */ > > > +struct rte_comp_decompress_xform { > > > + struct rte_comp_decompress_common_params cmn; > > > + struct rte_comp_decompress_stateful_params stateful; > > > +}; > > > + > > > +/** > > > + * Generate checksum on uncompressed data. > > > + * Applies to both compress and decompress operations. 
*/ > > > +/* Session Setup Data for checksum transform. */ > > > +struct rte_comp_checksum_xform { > > > + enum rte_comp_checksum_type chksum_type; > > > + /**< This parameter determines the checksum type */ > > > +}; > > > + > > > + > > > +/** > > > + * Compression transform structure. > > > + * > > > + * This is used to specify the compression transforms required, multiple > transforms > > > + * can be chained together to specify a chain of transforms such as > > > + * generate checksum, then compress. Each transform structure can > > > + * hold a single transform, the type field is used to specify which > transform > > > + * is contained within the union. > > > + * The supported combinations are: > > > + * - compress-only > > > + * - decompress-only > > > + * - generate checksum and compress input data > > > + * - decompress input data, generate checksum on output data > > > + * > > > + */ > > > +struct rte_comp_xform { > > > + struct rte_comp_xform *next; > > > + /**< next xform in chain */ > > > + enum rte_comp_xform_type type; > > > + /**< xform type */ > > > + union { > > > + struct rte_comp_compress_xform compress; > > > + /**< xform for compress operation */ > > > + struct rte_comp_decompress_xform decompress; > > > + /**< xform for decompress operation */ > > > + struct rte_comp_checksum_xform chksum; > > > + /**< xform to generate checksums */ > > > > We need compress+checksum(and decompress+checksum) as a single > xform > > instead of requiring 2 xforms. The use of xform chain will introduce > > needless overhead for the most typical use-cases of comp+checksum(and > > decomp+checksum). > > > [Fiona] ok. > I can merge so we just have compress & decompress xforms with an integrity > (checksum) param. > So now I see why you want to rename rte_comp_xform_type as > rte_comp_op_type. > Calling it ..op_type though implies a 1:1 mapping between xform and > operation, > so assumes there will never be any chained xforms. 
> If this is the case we should also remove the *next parameter. > I'd followed the crypto xform chain pattern in case it proves more > extendable in future, > but if we can't foresee any case for it, then we should simplify. > If we choose to keep the *next then we should keep the name as > rte_comp_xform_type. > [Shally] I would prefer to keep "next" for extensibility; if that requires the naming to be retained as "xform", I am fine with that. Just a note: the same proposal applies to hash as well, i.e. compress+hash or compress+checksum+hash will be one xform. To support compress+checksum+hash, the data structure would need to be modified. > > > > + }; > > > +}; > > > + > > > + > > > +struct rte_comp_session; [Shally] It is named rte_cryptodev_session in the crypto API, unlike here. Is that intentional? > > > + > > > +/** > > > + * Compression Operation. > > > + * > > > + * This structure contains data relating to performing a compression > > > + * operation on the referenced mbuf data buffers. > > > + * > > > + * All compression operations are Out-of-place (OOP) operations, > > > + * as the size of the output data is different to the size of the input > > > data. > > > + * > > > + * Comp operations are enqueued and dequeued in comp PMDs using the > > > + * rte_compdev_enqueue_burst() / rte_compdev_dequeue_burst() APIs > > > + */ > > > +struct rte_comp_op { > > > + uint8_t status; > > > + /**< > > > + * operation status - use values from enum rte_comp_status. > > > + * This is reset to > > > + * RTE_COMP_OP_STATUS_NOT_PROCESSED on allocation from mempool and [Shally] Per my understanding, RTE_COMP_OP_STATUS_NOT_PROCESSED will be set only at the return of rte_comp_op_alloc() and never after. If that understanding is correct, shouldn't it be renamed RTE_COMP_OP_STATUS_ALLOCATED?
> > > + * will be set to RTE_COMP_OP_STATUS_SUCCESS after operation > > > + * is successfully processed by a PMD > > > + */ > > > + uint16_t debug_status; > > > + /**< > > > + * Status of the operation is returned in the rte_crypto_op. [Shally] Typo correction to rte_comp_op > > > + * This field allows the PMD to pass back extra > > > + * pmd-specific debug information. Value is not defined on the API. > > > + */ > > > + struct rte_comp_session *session; > > > + /**< Handle for the initialised session context */ > > > + struct rte_mempool *mempool; > > > + /**< mempool from which operation is allocated */ > > > + phys_addr_t phys_addr; > > > + /**< physical address of this operation */ > > > + > > > + struct rte_mbuf *m_src; /**< source mbuf */ > > > + struct rte_mbuf *m_dst; /**< destination mbuf */ > > > + > > > + struct { > > > + uint32_t offset; > > > + /**< Starting point for compression or decompression, > > > + * specified as number of bytes from start of packet in > > > + * source buffer. > > > + * Starting point for checksum generation in compress > > > direction. > > > + */ > > > + uint32_t length; > > > + /**< The length, in bytes, of the data in source buffer > > > + * to be compressed or decompressed. > > > + * Also the length of the data over which the checksum > > > + * should be generated in compress direction > > > + */ > > > + } src; > > > + struct { > > > + uint32_t offset; > > > + /**< Starting point for writing output data, specified as > > > + * number of bytes from start of packet in m_dst > > > + * buffer. Starting point for checksum generation in > > > + * decompress direction. > > > + */ [Shally] Don't we need length here? indicating length (in bytes) of available dst mbuf at input? > > > + } dst; > > > + enum rte_comp_flush_flag flush_flag; > > > + /**< defines flush characteristics for the output data. > > > + * Only applicable in compress direction. 
> > > + */ > > > + enum rte_comp_integrity_check verify; > > > + /**< Specifies if the output data should be verified. > > > + Only applicable in compress direction. > > > + */ > > > > Again, believe we can get away with this by moving to application. > > > > Thanks > > Shally > > > > > + uint64_t input_chksum; > > > + /**< An input checksum can be provided to generate a > > > + * cumulative checksum across sequential blocks. > > > + * Checksum type is as specified in chksum_type in xform. > > > + */ > > > + uint64_t output_chksum; > > > + /**< If a checksum is generated it will be written in here. > > > + * Checksum type is as specified in chksum_type in xform. > > > + */ > > > + uint32_t consumed; > > > + /**< The number of bytes from the source buffer > > > + * which were compressed/decompressed. > > > + */ > > > + uint32_t produced; > > > + /**< The number of bytes written to the destination buffer > > > + * which were compressed/decompressed. > > > + */ > > > + /* > > > + TODO - Are any extra params needed on stateful op or are all in > xform? > > > + rte_comp_op_common_params/_stateful_params? > > > + extra values apply to flush flag > > > + */ > > > +}; > > > + > > > + > > > + int > > > + rte_compdev_configure(uint8_t dev_id, struct rte_compdev_config > *config); > > > + int > > > + rte_compdev_start(uint8_t dev_id); > > > + void > > > + rte_compdev_stop(uint8_t dev_id); > > > /* etc...Rest of APIs follow same pattern, similar to cryptodev, stripped > from RFC for simplicity. */ > > > > > > +/** > > > + * Compression supported feature flags > > > + * > > > + * Note: > > > + * New features flags should be added to the end of the list > > > + * > > > + * Keep these flags synchronised with > rte_compdev_get_comp_feature_name() > > > + * Note: bit numbers here are same as on cryptodev, hence the gaps. > > > + * This is an implementation detail which may change. 
> > > + */ > > > + #define RTE_COMPDEV_FF_CPU_SSE (1ULL << 3) > > > + /**< Utilises CPU SIMD SSE instructions */ > > > + #define RTE_COMPDEV_FF_CPU_AVX (1ULL << 4) > > > + /**< Utilises CPU SIMD AVX instructions */ > > > + #define RTE_COMPDEV_FF_CPU_AVX2 (1ULL << 5) > > > + /**< Utilises CPU SIMD AVX2 instructions */ > > > + #define RTE_COMPDEV_FF_HW_ACCELERATED (1ULL << 7) > > > + /**< Operations are off-loaded to an external hardware accelerator */ > > > + #define RTE_COMPDEV_FF_CPU_AVX512 (1ULL << 8) > > > + /**< Utilises CPU SIMD AVX512 instructions */ > > > + #define RTE_COMPDEV_FF_CPU_NEON (1ULL << > > > 10) > > > + /**< Utilises CPU NEON instructions */ > > > + #define RTE_COMPDEV_FF_CPU_ARM_CE (1ULL << 11) > > > + /**< Utilises ARM CPU Cryptographic Extensions */ > > > + > > > + #define RTE_COMP_FF_MBUF_SCATTER_GATHER (1ULL << > 12) > > > + /**< Scatter-gather mbufs are supported */ > > > + #define RTE_COMP_FF_STATEFUL > > > (1ULL << > 13) > > > + /**< Stateful compression is supported */ > > > + #define RTE_COMPDEV_FF_MULTI_PKT_CHECKSUM (1ULL << > 14) > > > + /**< Generation of checksum across multiple packets is supported */ > > > + [Shally] What is meaning of this feature? how it is supposed to impact burst operations. > > > + > > > + > > > +/** Compression device information */ > > > +struct rte_compdev_info { > > > + const char *driver_name; /**< Driver name. */ > > > + uint8_t driver_id; /**< Driver identifier */ > > > + struct rte_device *dev; /**< Device information. */ > > > + > > > + uint64_t feature_flags; > > > + /**< device feature flags */ > > > + const struct rte_comp_capabilities *capabilities; > > > + /**< Array of devices capabilities */ > > > > > > + unsigned max_nb_queue_pairs; > > > + /**< Maximum number of queues pairs supported by device. */ > > > + unsigned max_nb_sessions; > > > + /**< Maximum number of sessions supported by device. 
*/ > > > + unsigned int max_nb_sessions_per_qp; > > > + /**< Maximum number of sessions per queue pair. > > > + * Default 0 for infinite sessions > > > + */ > > > + /* TODO uint8_t needs_intermediate_buffer OR should this be a > feature flag? Or not necessary? > > */ > > > +}; > > > + > > > + > > > +/** Compression device statistics */ > > > +struct rte_compdev_stats { > > > + uint64_t enqueued_count; > > > + /**< Count of all operations enqueued */ > > > + uint64_t dequeued_count; > > > + /**< Count of all operations dequeued */ > > > + > > > + uint64_t enqueue_err_count; > > > + /**< Total error count on operations enqueued */ > > > + uint64_t dequeue_err_count; > > > + /**< Total error count on operations dequeued */ > > > +}; > > > + > > > + > > > +/** Compression device configuration structure */ > > > +struct rte_compdev_config { > > > + int socket_id; > > > + /**< Socket to allocate resources on */ > > > + uint16_t nb_queue_pairs; > > > + /**< Total number of queue pairs to configure on a device. */ > > > + uint32_t comp_intermediate_buf_size; > > > + /**< For the deflate algorithm, HW accelerators usually need to allocate > > > + * a pool of intermediate buffers for dynamic compression. [Shally] Do you mean dynamic Huffman coding based compression? > > > + * This indicates the buffer size and should be > > > + * set a little larger than the expected max source buffer size. > > > + * If the output of static compression doesn't fit in the > > > + * intermediate buffer, dynamic compression may not be possible; > > > + * in this case the accelerator may revert back to static > > > compression. > > > + */ > > > +}; > > > + > > > + > > > +/** Compression device queue pair configuration structure. */ > > > +struct rte_compdev_qp_conf { > > > + uint32_t nb_descriptors; /**< Number of descriptors per queue pair */ > > > +}; > > > + > > > + [Shally] Subsequent APIs and parts of the above reference cryptodev, and I have some questions on their one-to-one mapping.
I would not list them right now; I will take them up once we freeze feedback on the current RFC version so that we can have a focused discussion around the rest. Thanks Shally > > > +typedef uint16_t (*dequeue_pkt_burst_t)(void *qp, > > > + struct rte_comp_op **ops, uint16_t nb_ops); > > > +/**< Dequeue processed packets from queue pair of a device. */ > > > + > > > +typedef uint16_t (*enqueue_pkt_burst_t)(void *qp, > > > + struct rte_comp_op **ops, uint16_t nb_ops); > > > +/**< Enqueue packets for processing on queue pair of a device. */ > > > + > > > + > > > + > > > +/** The data structure associated with each compression device. */ > > > +struct rte_compdev { > > > + > > > + dequeue_pkt_burst_t dequeue_burst; > > > + /**< Pointer to PMD receive function. */ > > > + enqueue_pkt_burst_t enqueue_burst; > > > + /**< Pointer to PMD transmit function. */ > > > + > > > + struct rte_compdev_data *data; > > > + /**< Pointer to device data */ > > > + struct rte_compdev_ops *dev_ops; > > > + /**< Functions exported by PMD */ > > > + uint64_t feature_flags; > > > + /**< Supported features */ > > > + > > > + struct rte_device *device; > > > + /**< Backing device */ > > > + > > > + uint8_t driver_id; > > > + /**< driver identifier */ > > > + > > > + uint8_t attached : 1; > > > + /**< Flag indicating the device is attached */ > > > + > > > +}; > > > + > > > +/** > > > + * > > > + * The data part, with no function pointers, associated with each device. > > > + * > > > + * This structure is safe to place in shared memory to be common among > > > + * different processes in a multi-process configuration.
> > > + */ > > > +struct rte_compdev_data { > > > + uint8_t dev_id; > > > + /**< Device ID for this instance */ > > > + uint8_t socket_id; > > > + /**< Socket ID where memory is allocated */ > > > + char name[RTE_COMPDEV_NAME_MAX_LEN]; > > > + /**< Unique identifier name */ > > > + > > > + uint8_t dev_started : 1; > > > + /**< Device state: STARTED(1)/STOPPED(0) */ > > > + > > > + void **queue_pairs; > > > + /**< Array of pointers to queue pairs. */ > > > + uint16_t nb_queue_pairs; > > > + /**< Number of device queue pairs. */ > > > + > > > + void *dev_private; > > > + /**< PMD-specific private data */ > > > +}; > > > + > > > + > > > +/** > > > + * > > > + * Dequeue a burst of processed compression operations from a queue on the > > > + * device. The dequeued operations are stored in *rte_comp_op* structures > > > + * whose pointers are supplied in the *ops* array. > > > + * > > > + * The rte_compdev_dequeue_burst() function returns the number of ops > > > + * actually dequeued, which is the number of *rte_comp_op* data structures > > > + * effectively supplied into the *ops* array. > > > + * > > > + * A return value equal to *nb_ops* indicates that the queue contained > > > + * at least *nb_ops* operations, and this is likely to signify that other > > > + * processed operations remain in the device's output queue. Applications > > > + * implementing a "retrieve as many processed operations as possible" policy > > > + * can check this specific case and keep invoking the > > > + * rte_compdev_dequeue_burst() function until a value less than > > > + * *nb_ops* is returned. > > > + * > > > + * The rte_compdev_dequeue_burst() function does not provide any error > > > + * notification to avoid the corresponding overhead. > > > + * > > > + * @param dev_id The compression device identifier > > > + * @param qp_id The index of the queue pair from which to > > > + * retrieve processed packets.
The value must be > > > + * in the range [0, nb_queue_pair - 1] previously > > > + * supplied to rte_compdev_configure(). > > > + * @param ops The address of an array of pointers to > > > + * *rte_comp_op* structures that must be > > > + * large enough to store *nb_ops* pointers in it. > > > + * @param nb_ops The maximum number of operations to > dequeue. > > > + * > > > + * @return > > > + * - The number of operations actually dequeued, which is the number > > > + * of pointers to *rte_comp_op* structures effectively supplied to the > > > + * *ops* array. > > > + */ > > > +static inline uint16_t > > > +rte_compdev_dequeue_burst(uint8_t dev_id, uint16_t qp_id, > > > + struct rte_comp_op **ops, uint16_t nb_ops) > > > +{ > > > + struct rte_compdev *dev = &rte_compdevs[dev_id]; > > > + > > > + nb_ops = (*dev->dequeue_burst) > > > + (dev->data->queue_pairs[qp_id], ops, nb_ops); > > > + > > > + return nb_ops; > > > +} > > > + > > > +/** > > > + * Enqueue a burst of operations for processing on a compression > device. > > > + * > > > + * The rte_compdev_enqueue_burst() function is invoked to place > > > + * comp operations on the queue *qp_id* of the device designated by > > > + * its *dev_id*. > > > + * > > > + * The *nb_ops* parameter is the number of operations to process > which are > > > + * supplied in the *ops* array of *rte_comp_op* structures. > > > + * > > > + * The rte_compdev_enqueue_burst() function returns the number of > > > + * operations it actually enqueued for processing. A return value equal > to > > > + * *nb_ops* means that all packets have been enqueued. > > > + * > > > + * @param dev_id The identifier of the device. > > > + * @param qp_id The index of the queue pair which packets are > > > + * to be enqueued for processing. The value > > > + * must be in the range [0, nb_queue_pairs - 1] > > > + * previously supplied to > > > + * *rte_compdev_configure*. 
> > > + * @param ops The address of an array of *nb_ops* pointers > > > + * to *rte_comp_op* structures which contain > > > + * the operations to be processed. > > > + * @param nb_ops The number of operations to process. > > > + * > > > + * @return > > > + * The number of operations actually enqueued on the device. The > return > > > + * value can be less than the value of the *nb_ops* parameter when > the > > > + * device's queue is full or if invalid parameters are specified in > > > + * a *rte_comp_op*. > > > + */ > > > +static inline uint16_t > > > +rte_compdev_enqueue_burst(uint8_t dev_id, uint16_t qp_id, > > > + struct rte_comp_op **ops, uint16_t nb_ops) > > > +{ > > > + struct rte_compdev *dev = &rte_cryptodevs[dev_id]; > > > + > > > + return (*dev->enqueue_burst)( > > > + dev->data->queue_pairs[qp_id], ops, nb_ops); > > > +} > > > + > > > > > > /* All session APIs stripped from RFC for simplicity. */ > > > > > > +/** compression device operations function pointer table */ > > > +struct rte_compdev_ops { > > > + compdev_configure_t dev_configure; /**< Configure device. */ > > > + compdev_start_t dev_start; /**< Start device. */ > > > + compdev_stop_t dev_stop; /**< Stop device. */ > > > + compdev_close_t dev_close; /**< Close device. */ > > > + > > > + compdev_info_get_t dev_infos_get; /**< Get device info. */ > > > + > > > + compdev_stats_get_t stats_get; > > > + /**< Get device statistics. */ > > > + compdev_stats_reset_t stats_reset; > > > + /**< Reset device statistics. */ > > > + > > > + compdev_queue_pair_setup_t queue_pair_setup; > > > + /**< Set up a device queue pair. */ > > > + compdev_queue_pair_release_t queue_pair_release; > > > + /**< Release a queue pair. */ > > > + compdev_queue_pair_start_t queue_pair_start; > > > + /**< Start a queue pair. */ > > > + compdev_queue_pair_stop_t queue_pair_stop; > > > + /**< Stop a queue pair. */ > > > + compdev_queue_pair_count_t queue_pair_count; > > > + /**< Get count of the queue pairs. 
*/ > > > + > > > + comp_get_session_private_size_t session_get_size; > > > + /**< Return private session. */ > > > + comp_configure_session_t session_configure; > > > + /**< Configure a comp session. */ > > > + comp_free_session_t session_clear; > > > + /**< Clear a comp sessions private data. */ > > > + comp_queue_pair_attach_session_t qp_attach_session; > > > + /**< Attach session to queue pair. */ > > > + comp_queue_pair_attach_session_t qp_detach_session; > > > + /**< Detach session from queue pair. */ > > > +}; > > > > > > > > > > > >