Hi Shally,

Thanks for the summary. It is very helpful. Please see my comments below.


On 1/4/2018 6:45 AM, Verma, Shally wrote:
> This is an RFC v2 document to brief understanding and requirements on 
> compression API proposal in DPDK. It is based on "[RFC v3] Compression API in 
> DPDK 
> http://dpdk.org/dev/patchwork/patch/32331/
>  ".
> Intention of this document is to align on concepts built into compression 
> API, its usage and identify further requirements. 
>
> Going further it could be a base to Compression Module Programmer Guide.
>
> Current scope is limited to
> - definition of the terminology which makes up foundation of compression API
> - typical API flow expected to use by applications
> - Stateless and Stateful operation definition and usage after RFC v1 doc 
> review 
> http://dev.dpdk.narkive.com/CHS5l01B/dpdk-dev-rfc-v1-doc-compression-api-for-dpdk
>  
> 1. Overview
> ~~~~~~~~~~~
>
> A. Compression Methodologies in compression API
> ===========================================
> DPDK compression supports two types of compression methodologies:
> - Stateless - each data object is compressed individually without any 
> reference to previous data, 
> - Stateful -  each data object is compressed with reference to previous data 
> object i.e. history of data is needed for compression / decompression
> For more explanation, please refer RFC 
> https://www.ietf.org/rfc/rfc1951.txt
>
> To support both methodologies, DPDK compression introduces two key concepts: 
> Session and Stream.
>
> B. Notion of a session in compression API
> ================================== 
> A Session in DPDK compression is a logical entity which is setup one-time 
> with immutable parameters i.e. parameters that don't change across operations 
> and devices.
> A session can be shared across multiple devices and multiple operations 
> simultaneously. 
> A typical Session parameters includes info such as:
> - compress / decompress
> - compression algorithm and associated configuration parameters
>
> Application can create different sessions on a device initialized with 
> same/different xforms. Once a session is initialized with one xform it cannot 
> be re-initialized.
>  
> C. Notion of stream in compression API
>  =======================================
> Unlike session which carry common set of information across operations, a 
> stream in DPDK compression is a logical entity which identify related set of 
> operations and carry operation specific information as needed by device 
> during its processing.
> It is device specific data structure which is opaque to application, setup 
> and maintained by device. 
>
> A stream can be used with *only* one op at a time i.e. no two operations can 
> share same stream simultaneously.
> A stream is *must* for stateful ops processing and optional for stateless 
> (Please see respective sections for more details).
>
> This enables sharing of a session by multiple threads handling different data 
> sets, as each op carries its own context (internal states, history buffers, 
> etc.) in its attached stream. 
> Application should call rte_comp_stream_create() and attach the stream to the 
> op before operation processing begins, and free it via rte_comp_stream_free() 
> after it is complete.
>
> C. Notion of burst operations in compression API
>  =======================================
> A burst in DPDK compression is an array of operations where each op carry 
> independent set of data. i.e. a burst can look like:
>
> ----------------------------------------------------------------------
> enque_burst (|op1.no_flush | op2.no_flush | op3.flush_final |
>               op4.no_flush | op5.no_flush |)
> ----------------------------------------------------------------------
>
> Where, op1 .. op5 are all independent of each other and carry entirely 
> different set of data. 
> Each op can be attached to same/different session but *must* be attached to 
> different stream.
>
> Each op (struct rte_comp_op) carry compression/decompression operational 
> parameter and is both an input/output parameter. 
> PMD gets source, destination and checksum information at input and update it 
> with bytes consumed and produced and checksum at output.
>
> Since each operation in a burst is independent and thus can complete 
> out-of-order,  applications which need ordering, should setup per-op user 
> data area with reordering information so that it can determine enqueue order 
> at deque.
>
> Also, if multiple threads call enqueue_burst() on the same queue pair, then it 
> is the application's responsibility to use a proper locking mechanism to 
> ensure exclusive enqueuing of operations.
>
> D. Stateless Vs Stateful
> ===================
> Compression API provide RTE_COMP_FF_STATEFUL feature flag for PMD to reflect 
> its support for Stateful operation. Each op carry an op type indicating if 
> it's to be processed stateful or stateless.
>  
> D.1 Compression API Stateless operation
> ------------------------------------------------------ 
> An op is processed stateless if it has
> - flush value set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on 
>   compression side),
> - op_type set to RTE_COMP_OP_STATELESS,
> - all of the required input and a sufficiently large output buffer to store 
>   the output, i.e. OUT_OF_SPACE can never occur.
>  
> When all of the above conditions are met, PMD initiates stateless processing 
> and releases acquired resources after processing of current operation is 
> complete i.e. full input consumed and full output written.
> Application can optionally attach a stream to such ops. In such case, 
> application must attach different stream to each op.
>
> Application can enqueue stateless burst via making consecutive enque_burst() 
> calls i.e. Following is relevant usage:
>  
> enqueued = rte_comp_enque_burst (dev_id, qp_id, ops1, nb_ops); 
> enqueued = rte_comp_enque_burst(dev_id, qp_id, ops2, nb_ops);  
>  
> *Note – Every call has different ops array i.e.  same rte_comp_op array 
> *cannot be re-enqueued* to process next batch of data until previous ones are 
> completely processed.
>
> D.1.1 Stateless and OUT_OF_SPACE 
> ------------------------------------------------
> OUT_OF_SPACE is a condition when output buffer runs out of space and where 
> PMD still has more data to produce. If PMD run into such condition, then it's 
> an error condition in stateless processing.
> In such case, PMD resets itself and return with status 
> RTE_COMP_OP_STATUS_OUT_OF_SPACE with produced=consumed=0 i.e. no input read, 
> no output written.
> Application can resubmit the full input with a larger output buffer size.

[Ahmed] Can we add an option that lets the user read the data that was 
produced while still reporting OUT_OF_SPACE? This is mainly useful for 
decompression applications doing search.
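
To make the request concrete, here is a rough sketch of the behavior I have in mind. Everything in it is hypothetical: the oos_ types, the status names, and the plain byte copy that stands in for a real decompressor are my own placeholders, not part of the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; they only mirror the naming used in the RFC. */
enum oos_status { OOS_SUCCESS, OOS_OUT_OF_SPACE };

/* Hypothetical, stripped-down op; not the proposed struct rte_comp_op. */
struct oos_op {
    const unsigned char *src;
    size_t src_len;
    unsigned char *dst;
    size_t dst_cap;
    size_t consumed;          /* input bytes read */
    size_t produced;          /* output bytes written */
    enum oos_status status;
};

/*
 * Stand-in for a stateless decompress. A plain copy plays the role of the
 * decompressor so that the example stays self-contained. On overflow the op
 * reports OUT_OF_SPACE but leaves the partial output readable, instead of
 * resetting produced and consumed to 0.
 */
static void oos_process(struct oos_op *op)
{
    assert(op != NULL && op->src != NULL && op->dst != NULL);

    size_t n = op->src_len < op->dst_cap ? op->src_len : op->dst_cap;

    memcpy(op->dst, op->src, n);
    op->consumed = n;
    op->produced = n;
    op->status = (n < op->src_len) ? OOS_OUT_OF_SPACE : OOS_SUCCESS;
}

/* A search application can scan whatever was produced, even on overflow. */
static int oos_output_contains(const struct oos_op *op, const char *needle)
{
    size_t len = strlen(needle);

    if (len == 0)
        return 1;
    for (size_t i = 0; i + len <= op->produced; i++) {
        if (memcmp(op->dst + i, needle, len) == 0)
            return 1;
    }
    return 0;
}
```

With this, a search application that finds its pattern in the partial output can stop early and never needs to resubmit with a larger buffer.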

> D.2 Compression API Stateful operation
> ----------------------------------------------------------
>  A Stateful operation in DPDK compression means application invokes 
> enqueue_burst() multiple times to process related chunks of data, either because 
> - Application broke data into several ops, and/or
> - PMD ran into out_of_space situation during input processing
>
> In case of either one or all of the above conditions, PMD is required to 
> maintain state of op across enque_burst() calls and
> ops are setup with op_type RTE_COMP_OP_STATEFUL, and begin with flush value = 
> RTE_COMP_NO/SYNC_FLUSH and end at flush value RTE_COMP_FULL/FINAL_FLUSH.
>
> D.2.1 Stateful operation state maintenance
> ---------------------------------------------------------------
> It is always an ideal expectation from application that it should parse 
> through all related chunk of source data making its mbuf-chain and enqueue it 
> for stateless processing.
> However, if it need to break it into several enqueue_burst() calls, then an 
> expected call flow would be something like:
>
> enqueue_burst( |op.no_flush |)

[Ahmed] The work is now in flight to the PMD. The user will call dequeue burst 
in a loop until all ops are received. Is this correct?

> deque_burst(op) // should dequeue before we enqueue next
> enqueue_burst( |op.no_flush |)
> deque_burst(op) // should dequeue before we enqueue next
> enqueue_burst( |op.full_flush |)

[Ahmed] Why not allow multiple work items in flight? I understand that 
occasionally there will be an OUT_OF_SPACE exception. Can we just distinguish 
the response in exception cases?
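
As a rough illustration of what "distinguish the response" could look like, the sketch below lets several ops of one stream be in flight and marks the ops queued behind an overflow with a separate status. All names here are hypothetical; in particular INFLIGHT_NOT_PROCESSED does not exist in the RFC, and output size is assumed equal to input size purely to keep the model simple.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statuses; INFLIGHT_NOT_PROCESSED is not in the RFC. */
enum inflight_status {
    INFLIGHT_SUCCESS,
    INFLIGHT_OUT_OF_SPACE,
    INFLIGHT_NOT_PROCESSED   /* skipped: an earlier op on the stream stalled */
};

/* Hypothetical op: one chunk of a stateful stream. */
struct inflight_op {
    size_t src_len;   /* input bytes in this chunk */
    size_t dst_cap;   /* output space offered for this chunk */
    size_t consumed;
    size_t produced;
    enum inflight_status status;
};

/*
 * Stand-in for a PMD working through several ops of ONE stream in order.
 * Once an op overflows, every later op of the stream is returned untouched
 * with INFLIGHT_NOT_PROCESSED, so the application knows what to resubmit.
 * Returns the number of ops that completed with INFLIGHT_SUCCESS.
 */
static size_t inflight_process_stream(struct inflight_op *ops, size_t nb_ops)
{
    assert(ops != NULL || nb_ops == 0);

    size_t done = 0;
    int stalled = 0;

    for (size_t i = 0; i < nb_ops; i++) {
        struct inflight_op *op = &ops[i];

        if (stalled) {
            op->consumed = 0;
            op->produced = 0;
            op->status = INFLIGHT_NOT_PROCESSED;
            continue;
        }
        if (op->src_len > op->dst_cap) {
            /* Partial progress is reported, as described in D.2.2. */
            op->consumed = op->dst_cap;
            op->produced = op->dst_cap;
            op->status = INFLIGHT_OUT_OF_SPACE;
            stalled = 1;
            continue;
        }
        op->consumed = op->src_len;
        op->produced = op->src_len;
        op->status = INFLIGHT_SUCCESS;
        done++;
    }
    return done;
}
```

The common case then pays no dequeue-before-enqueue round trip, and only the overflow case costs a resubmission.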

>
> Here an op *must* be attached to a stream and every subsequent 
> enqueue_burst() call should carry *same* stream. Since PMD maintain ops state 
> in stream, thus it is mandatory for application to attach stream to such ops.
>
> D.2.2 Stateful and Out_of_Space
> --------------------------------------------
> If PMD support stateful and run into OUT_OF_SPACE situation, then it is not 
> an error condition for PMD. In such case, PMD return with status 
> RTE_COMP_OP_STATUS_OUT_OF_SPACE with consumed = number of input bytes read 
> and produced = length of complete output buffer.
> Application should enqueue op with source starting at consumed+1 and output 
> buffer with available space.

[Ahmed] Related to OUT_OF_SPACE: what status does the user receive in a 
decompression case when the end block is encountered before the end of the 
input? Does the PMD continue decompressing? Or does it stop there and return 
the stop index?
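
Separately, to check my reading of the resubmission flow in D.2.2, here is a minimal sketch of the application side. It assumes that consumed is a byte count, so the next chunk starts at zero-based offset consumed. The resub_ names are hypothetical and a plain copy again stands in for real (de)compression.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes and op layout, for illustration only. */
enum resub_status { RESUB_SUCCESS, RESUB_OUT_OF_SPACE };

struct resub_op {
    const unsigned char *src;
    size_t src_len;
    unsigned char *dst;
    size_t dst_cap;
    size_t consumed;
    size_t produced;
    enum resub_status status;
};

/* Stand-in for one stateful PMD call; a copy replaces real processing. */
static void resub_process(struct resub_op *op)
{
    size_t n = op->src_len < op->dst_cap ? op->src_len : op->dst_cap;

    memcpy(op->dst, op->src, n);
    op->consumed = n;
    op->produced = n;
    op->status = (n < op->src_len) ? RESUB_OUT_OF_SPACE : RESUB_SUCCESS;
}

/*
 * Application side: push the whole input through a small scratch buffer,
 * resubmitting from offset `consumed` after every OUT_OF_SPACE.
 * `sink` receives the drained output; returns the total bytes produced.
 */
static size_t resub_run(const unsigned char *src, size_t src_len,
                        unsigned char *scratch, size_t scratch_cap,
                        unsigned char *sink, size_t sink_cap)
{
    assert(scratch_cap > 0);

    size_t in_off = 0, total = 0;
    struct resub_op op;

    do {
        op.src = src + in_off;
        op.src_len = src_len - in_off;
        op.dst = scratch;
        op.dst_cap = scratch_cap;
        resub_process(&op);

        /* Drain the scratch buffer before offering it to the PMD again. */
        assert(total + op.produced <= sink_cap);
        memcpy(sink + total, scratch, op.produced);
        total += op.produced;
        in_off += op.consumed;
    } while (op.status == RESUB_OUT_OF_SPACE);

    return total;
}
```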

>            
> D.2.3 Sliding Window Size
> ------------------------------------
> Every PMD will reflect in its algorithm capability structure maximum length 
> of Sliding Window in bytes which would indicate maximum history buffer length 
> used by algo.
>
> 2. Example API illustration
> ~~~~~~~~~~~~~~~~~~~~~~~
>
> Following is an illustration on API usage  (This is just one flow, other 
> variants are also possible):
> 1. rte_comp_session *sess = rte_compressdev_session_create (rte_mempool 
> *pool);  
> 2. rte_compressdev_session_init (int dev_id, rte_comp_session *sess, 
> rte_comp_xform *xform, rte_mempool *sess_pool);  
> 3. rte_comp_op_pool_create(rte_mempool ..)  
> 4. rte_comp_op_bulk_alloc (struct rte_mempool *mempool, struct rte_comp_op 
> **ops, uint16_t nb_ops);  
> 5. for every rte_comp_op in ops[],
>     5.1 rte_comp_op_attach_session (rte_comp_op *op, rte_comp_session *sess); 
>     5.2 op.op_type = RTE_COMP_OP_STATELESS
>     5.3 op.flush = RTE_FLUSH_FINAL
> 6. [Optional] for every rte_comp_op in ops[],
>     6.1 rte_comp_stream_create(int dev_id, rte_comp_session *sess, void 
> **stream); 
>     6.2 rte_comp_op_attach_stream(rte_comp_op *op, rte_comp_session *stream);

[Ahmed] What is the semantic effect of attaching a stream to every op? Will 
this application benefit from it, given that it is set up with op_type 
STATELESS?

> 7.for every rte_comp_op in ops[],
>      7.1 set up with src/dst buffer
> 8. enq = rte_compressdev_enqueue_burst (dev_id, qp_id, &ops, nb_ops); 
> 9. do while (dqu < enq) // Wait till all of enqueued are dequeued 
>     9.1 dqu = rte_compressdev_dequeue_burst (dev_id, qp_id, &ops, enq);

[Ahmed] I am assuming that waiting for all enqueued ops to be dequeued is not 
strictly necessary, but is just the chosen example in this case.
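
For instance, I would expect a loop like the one sketched below to be equally valid, where the application takes results as they trickle in. The dq_ queue pair is a hypothetical model that only counts ops; it is not the proposed API. Note that the running count is accumulated with +=, which I assume is what step 9.1 intends.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical queue pair: it counts ops in flight and hands back at most
 * `max_per_call` of them per dequeue, like a device that finishes gradually.
 */
struct dq_qp {
    size_t in_flight;
    size_t max_per_call;
};

static size_t dq_enqueue(struct dq_qp *qp, size_t nb_ops)
{
    qp->in_flight += nb_ops;
    return nb_ops;
}

static size_t dq_dequeue(struct dq_qp *qp, size_t nb_ops)
{
    size_t n = nb_ops < qp->in_flight ? nb_ops : qp->in_flight;

    if (n > qp->max_per_call)
        n = qp->max_per_call;
    qp->in_flight -= n;
    return n;
}

/*
 * Drain everything that was enqueued. The total is accumulated rather than
 * overwritten, and each call only asks for what is still outstanding.
 * Returns the number of dequeue calls that were needed.
 */
static size_t dq_drain(struct dq_qp *qp, size_t enq)
{
    /* Guard against a loop that could never terminate. */
    assert(qp->max_per_call > 0 && qp->in_flight >= enq);

    size_t dqu = 0, calls = 0;

    while (dqu < enq) {
        dqu += dq_dequeue(qp, enq - dqu);
        calls++;   /* results from this call could be handled right here */
    }
    return calls;
}
```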

> 10. Repeat 7 for next batch of data  
> 11. for every ops in ops[]
>       11.1 rte_comp_stream_free(op->stream);
> 12. rte_comp_session_clear (sess) ;
> 13. rte_comp_session_terminate(ret_comp_sess *session)
>
> Thanks
> Shally
>
>
