> -----Original Message----- > From: Wang Xiang <xiang.w.w...@intel.com> > Sent: Monday, October 14, 2019 7:29 PM > To: Jerin Jacob Kollanukkaran <jer...@marvell.com> > Cc: Thomas Monjalon <tho...@monjalon.net>; dev@dpdk.org; Pavan > Nikhilesh Bhagavatula <pbhagavat...@marvell.com>; Shahaf Shuler > <shah...@mellanox.com>; Hemant Agrawal <hemant.agra...@nxp.com>; > Opher Reviv <op...@mellanox.com>; Alex Rosenbaum > <al...@mellanox.com>; Dovrat Zifroni <dov...@marvell.com>; Prasun Kapoor > <pkap...@marvell.com>; Nipun Gupta <nipun.gu...@nxp.com>; Richardson, > Bruce <bruce.richard...@intel.com>; Hong, Yang A <yang.a.h...@intel.com>; > Chang, Harry <harry.ch...@intel.com>; gu.ji...@zte.com.cn; > shanjia...@chinatelecom.cn; zhangy....@chinatelecom.cn; > lixin...@huachentel.com; wush...@inspur.com; yuying...@yxlink.com; > fanchengg...@sunyainfo.com; davidf...@tencent.com; > liuzho...@chinaunicom.cn; zhaoyon...@huawei.com; o...@yunify.com; > j...@netgate.com; Ni, Hongjun <hongjun...@intel.com>; j.bromhead@titan- > ic.com; d...@ntop.org; f...@napatech.com; arthur...@lionic.com; Guy Kaneti > <g...@marvell.com>; Smadar Fuks <smad...@marvell.com>; Liron Himi > <lir...@marvell.com>; edwin.verpla...@intel.com; keith.wi...@intel.com > Subject: [EXT] Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev > subsystem > > External Email > > ---------------------------------------------------------------------- > On Fri, Sep 27, 2019 at 02:35:00PM +0000, Jerin Jacob Kollanukkaran wrote: > > > -----Original Message----- > > > From: Wang Xiang <xiang.w.w...@intel.com> > > > > > > Hi Jerin, > > > > > > Thanks for your response. More comments below and inline. > > > > > > 1) I think the size of some varaibles (e.g. nb_matches, scan_size, > > > matching offset, etc) should be increased based on what Hyperscan > supports. > > > > > > a) struct rte_regex_ops: > > > > > > uint16_t scan_size => uint32_t scan_size > > > > I think, packet buffers will not be > 64K and getting more than > > contiguous 64K DMAable memory will be difficult in DPDK. > > Other than that, rte_regex_match is 64bit now, increasing width of Len > > could increase the size of "rte_regex_match". i.e Need more Bandwidth > > for response. > > Could other HW implementations share the views on max length is > > supported on their implementation? Based on that we can decide. > > > OK, let's gather ideas from HW implementation.
Any inputs from Mellanox or other vendors on the "width" of the type and size of "rte_regex_match" considering the performance implications. > > > > > uint8_t nb_actual_matches => uint64 nb_actual_matches > > > uint8_t nb_matches => uint64 nb__matches > > > > 2^64 matches will be never possible in practical system. How about 2^16. > > > I think the number of matches depends on the number of total rules and scan > size. Based on the definitions (16-bit nb_rules_per_group, 16-bit nb_groups > and > 16-bit scan size), the maximum possible matches could exceed 2^16. Users may > get partial matches in this case while Hyperscan doesn't make compromises. > It'll also be good to check other HW implementation. See above. > > > > > > > > b) struct rte_regex_match: > > > uint16_t offset => uint32_t offset > > > uint16_t len => uint32_t len > > > > See above. > > > > > > > > c) uint16_t > > > rte_regex_rule_db_update(uint8_t dev_id, const struct > > > rte_regex_rule *rules, > > > uint16_t nb_rules); > > > => > > > uint32_t > > > rte_regex_rule_db_update(uint8_t dev_id, const struct > > > rte_regex_rule *rules, > > > uint32_t nb_rules); > > > > OK. I will change it next version. > > > > > > > > d) int > > > rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id, > > > const struct rte_regex_qp_conf *qp_conf); > > > => > > > int > > > rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id, > > > const struct rte_regex_qp_conf *qp_conf); > > > > OK. I will change it next version. > > > > > > > > e) struct rte_regex_dev_config: > > > uint8_t nb_max_matches => uint64_t nb_max_matches > > > > 2^64 matches will be never possible in practical system. How about 2^16. > > > See above. > > > > > > > > f) struct rte_regex_dev_info: > > > uint8_t max_matches => uint64_t max_matches > > > > 2^64 matches will be never possible in practical system. How about 2^16. > > > See above. > > > > > > > > 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set() > defined. > > > Are all the attributes below could be set by users? Is any of them > > > read-only? > > > > See below, > > > > > /** Enumerates RegEx device attribute identifier */ enum > > > rte_regex_dev_attr_id { > > > RTE_REGEX_DEV_ATTR_SOCKET_ID, > > > /**< The NUMA socket id to which the device is connected or > > > * a default of zero if the socket could not be determined. > > > * datatype: *int* > > > * operation: *get* > > > > *get* means read only. *get* and *set* means it support both > > operation > > > > > */ > > > RTE_REGEX_DEV_ATTR_MAX_MATCHES, > > > /**< Maximum number of matches per scan. > > > * datatype: *uint8_t* > > > * operation: *get* and *set* > > > * > > > * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F > > > */ > > > RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT, > > > /**< Upper bound scan time in ns. > > > * datatype: *uint16_t* > > > * operation: *get* and *set* > > > * > > > * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F > > > */ > > > RTE_REGEX_DEV_ATTR_MAX_PREFIX, > > > /**< Maximum number of prefix detected per scan. > > > * This would be useful for denial of service detection. > > > * datatype: *uint16_t* > > > * operation: *get* and *set* > > > * > > > * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F > > > */ > > > }; > > > > > > 3) Both RTE_REGEX_PCRE_RULE_* and > > > RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can > > > we merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F > and have > > > a unified regex_dev_capa in struct rte_regex_dev_info. > > > > Sure. I will fix it next version. > > > > > > > > > > > 4) It'll be good if we can also define synchronous matching API for > > > users who want to have a one-off scan and wait for the results. > > > > Makes sense. I will add synchronous matching API in next version(I > > understand, it will be useful for SW Implementations). Probably expose as > INFO flag to expose the it as preference. > > > > > > > > On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran > wrote: > > > > Hi Xiang, > > > > > > > > Sorry for delay in response(Was busy with 19.11 proposal > > > > deadline). Please > > > see inline. > > > > > > > > > > > > > > Reply to Xiang's queries in main thread: > > > > > > > > > > Hi all, > > > > > > > > > > Some questions regarding APIs. Could you please give more insights? > > > > > > > > > > 1) rte_regex_ops > > > > > a) rsp_flags > > > > > These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and > > > > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan. > > > > > RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a > > > > > partial match at the end of current buffer after scan. > > > > > What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F? > > > > > > > > > > [Jerin] Since we need three states to represent partial match > > > > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the > > > > > buffer, intermediate buffers with no flag, and end of the buffer > > > > > with RTE_REGEX_OPS_RSP_PMI_EOJ > > > > > > > > > [Xiang] How could a user leverage these flags for matching? > > > > > Suppose a large buffer is divided into multiple chunks. Will > > > > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit once it isn't > > > > > set after scan the first chunk. Similarly, > > > > > RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching > > > > > future buffers after finish the last > > > chunk? > > > > > > > > Let me describe with an example, > > > > > > > > Assume, > > > > 1) struct rte_regex_dev_info:: max_payload_size set to 1024 > > > > 2) rte_regex_dev_config:: dev_cfg_flags configured with > > > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F > > > > 3) Device programmed with matching "hello\s+world" pattern > > > > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data" > > > > and struct rte_regex_op:: scan_size = 1024 > > > > > > > > data[0..1021] = data don???t have hello world pattern data[1022] = 'h' > > > > data[1023] = 'e' > > > > > > > > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data" > > > > and struct rte_regex_op:: scan_size = 9 > > > > > > > > data[0] = 'l' > > > > data[1] = 'l' > > > > data[2] = 'o' > > > > data[3] = ' ' > > > > data[4] = 'w' > > > > data[5] = 'o' > > > > data[6] = 'r' > > > > data[7] = 'l' > > > > data[8] = 'd' > > > > > > > > If so, > > > > > > > > Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in > rte_regex_ops:: > > > > rsp_flags on dequeue Where rte_regex_match:: offset is 1022 and > > > > len 2 > > > > > > > > Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in > rte_regex_ops:: > > > > rsp_flags on dequeue Where rte_regex_match:: offset is 0 and len 9 > > > > > > > If the defined pattern is "hello.*world" instead of "hello\s+world", > > > and we enqueue following struct rte_regex_ops: > > > > > > 1) rte_regex_op:: scan_size = 1024 > > > > > > data[0..1021] = data don???t have hello world pattern > > > data[1022] = 'h' > > > data[1023] = 'e' > > > > > > 2) rte_regex_op:: scan_size = 9 > > > data[0] = 'l' > > > data[1] = 'l' > > > data[2] = 'o' > > > data[3] = ' ' > > > data[4] = 'w' > > > data[5] = 'o' > > > data[6] = 'r' > > > data[7] = 'l' > > > data[8] = 'd' > > > > > > 3) rte_regex_op:: scan_size = 5 > > > data[0] = 'w' > > > data[1] = 'o' > > > data[2] = 'r' > > > data[3] = 'l' > > > data[4] = 'd' > > > > > > Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in > rte_regex_ops:: > > > rsp_flags on dequeue > > > Where rte_regex_match:: offset is 0 and len 4? > > > > Yes. > > > > > > > > I am wondering what's your expected behavior for .* or similar > > > syntax and if there are syntax compatability issues. We report all > > > matches in > Hyperscan, e.g. > > > report end match offsets 11 and 16 for pattern "hello.*world" and > > > corpus "hello worldworld". > > > > > > BTW, not sure how other hardware devices handle cross buffer scan. > > > Hyperscan doesn't reports matches for start and intermediate buffers > > > but only reports end offset if a full match is found. > > > > > > > > > > > > > > > > > RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a > > > > > definition for a specific hardware implementation. I am > > > > > wondering what this PREFIX refers to:)? > > > > > > > > > > [Jerin] Yes. Looks like it is for hardware specific implementation. > > > > > Introduced rte_regex_dev_attr_set/get functions to make it > > > > > portable and To add new implementation specific fields. > > > > > For example, if a rule is > > > > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is > > > > > considered the factor. The prefix is a literal string, while the > > > > > factor can contain complex regular expression constructs. As a > > > > > result, rule matching occurs in two stages: prefix matching and > > > > > factor matching. > > > > > > > > > > b) user_id or user_ptr > > > > > Under what kind of circumstances should an application > > > > > pass value into these variables for enqueue and dequeuer operations? > > > > > > > > > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also > > > > > allocated using mempool normally, on enqueue, user can specify > > > > > user_id If needed to in order identify the op on dequeue if > > > > > required. The use case could be to store the sequence number > > > > > from application POV or storing the mbuf ptr in which pattern is > requested etc. > > > > > > > > > > > > > > > 2) rte_regex_match > > > > > a) offset; /**< Starting Byte Position for matched rule. > > > > > */ and uint16_t len; /**< Length of match in bytes */ > > > > > Looks like the matching offset is defined as *starting > > > > > matching offset* instead of *end matching offset*, e.g. report > > > > > the offset of > > > "a" instead of "c" > > > > > for pattern "abc". > > > > > If so, this makes it hard to integrate software regex > > > > > libraries such as Hyperscan and RE2 as they only report *end > > > > > matching offset* without length of match. > > > > > Although Hyperscan has API for *starting matching offset*, > > > > > it only delivers partial syntax support. So I think we have to > > > > > define *end of matching offset* for software solutions. > > > > > > > > > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST > tradeoffs. > > > > > I thought application would need always the length of the match. > > > > > Probably we will see how other HW implementation (from Mellanox) > > > > > etc. We will try to abstract it, probably we can make it as > > > > > function of "user requested". > > > > > [Xiang] Yes, it will be good to make it per user request. At > > > > > least from Hyperscan user's point of view, start of match and > > > > > match length are not mandatory. > > > > > > > > OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START In > > > > device configure. > > > > > > > > Since offset+len == end, we can introduce following generic inline > function. > > > > > > > > static inline > > > > rte_regex_match_end(truct rte_regex_match *match) { > > > > match->offset + match->len; > > > > } > > > > > > > > Example: pattern to match is "hello\s+world" and data is > > > > following data[4] = 'h' > > > > data[5] = 'e' > > > > data[6] = 'l' > > > > data[7] = 'l' > > > > data[8] = 'o' > > > > data[9] = ' ' > > > > data[10] = 'w' > > > > data[11] = 'o' > > > > data[12] = 'r' > > > > data[13] = 'l' > > > > data[14] = 'd' > > > > > > > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START > > > > match->offset returns 4 > > > > match->len returns 11 > > > > > > > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START > > > > driver MAY return the following(in hyperscan case) > > > > match->offset returns 0 > > > > match->len returns 11 + 4 > > > > > > > > In both case(irrespective of flags, to make application life easy) > > > rte_regex_match_end() would return 15. > > > > If application demands for MATCH_AS_START then driver can return > > > > match->offset returns 4 and match->len returns 11 Aka set > > > > HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should > > > > use > > > rte_regex_match_end() for finding the end of the match. To make, > > > work in all cases. > > > > > > > > Is it OK? > > > > > > > Can we replace len with end offset? So we can change "offset" to > "start_offset" > > > and len to "end_ offset" in struct rte_regex_match. Users interested > > > in len could take "end_offset - start_offset". > > > We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to > > > RTE_REGEX_DEV_CFG_MATCH_START > > > > > > In your example, > > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_START > > > match->start_offset returns 4 > > > match->end_offset returns 15 > > > > > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START > > > match->start_offset returns 0 > > > match->end_offset returns 15 > > > > > > This part is little tricky as HW descriptions need to be rewritten on > > response. > > This is a one issue, I foresee earlier, to come up with > > rte_regex_match That's works for all implementation without performance > issue. > > > > We have two HW implementations, both returns start_off and len. > > Lets get input from other HW implementation on the semantics of > > rte_regex_match. Based on that, we can decide how to go about it? > > Thoughts from Mellanox or other vendors? > > > Sure. Let's get more inputs on this. > > > > > > > > > > > > > > > > > 3) rte_regex_rule_db_update() > > > > > Does this mean we can dynamically add or delete rules for an > > > > > already generated database without recompile from scratch for > > > > > hardware Regex implementation? > > > > > If so, this isn't possible for software solutions as they > > > > > don't support dynamic database update and require recompile. > > > > > > > > > > [Jerin] rte_regex_rule_db_update() internally it would call > > > > > recompile function for both HW and SW. > > > > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() > > > > > for precompiled rule database case. > > > > > [Xiang] OK, sounds like we have to save the original rule-set > > > > > for the device in order to do recompile. I see both ADD and > > > > > REMOVE operators from rte_regex_rule. > > > > > For rules with REMOVE operator, what's the expected behavior to > > > > > handle them for the old rule-set? Do we need to go through the > > > > > old rule-set and remove corresponding rules before doing recompile? > > > > > > > > Yes. > > > > > > > I think it'll be better to change rte_regex_rule_db_update() to > > > rte_regex_rule_compile() and have users to provide a full rule-set. > > > So we don't have to maintain old rule-set and decide which one to > > > keep and remove. We can simply recompile new rule-set and get rid of > > > rte_regex_rule_op in this case. > > > > > > On virtualized, HW implementations, The RULE database is maintained by > > single body. So the above scheme, works with SW and HW implementations. > > And It make user life easy as they don't need to maintain the rules. > > > > I don't have preference on the rte_regex_rule_db_update() name, I can > > change to > > rte_regex_rule_compile() if required keeping above functionality. Let me > know. > > > > > OK, I'm good if your are willing to maintain it for users. Then both > rte_regex_rule_db_update() and rte_regex_rule_compile() work for me. > > > > > > > > > > > >