[dpdk-dev] [PATCH v3 2/2] ethdev: add hierarchical scheduler API

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Fri Apr 7 18:51:58 CEST 2017


Hi Hemant,

Thanks for your input!

...<snip>

> > +/**
> > + * Traffic manager dynamic updates
> > + */
> > +enum rte_tm_dynamic_update_type {
> > +	/**< Dynamic parent node update. The new parent node is located
> on same
> > +	 * hierarchy level as the former parent node. Consequently, the
> node
> > +	 * whose parent is changed preserves its hierarchy level.
> > +	 */
> > +	RTE_TM_UPDATE_NODE_PARENT_KEEP_LEVEL = 1 << 0,
> > +
> > +	/**< Dynamic parent node update. The new parent node is located
> on
> > +	 * different hierarchy level than the former parent node.
> Consequently,
> > +	 * the node whose parent is changed also changes its hierarchy level.
> > +	 */
> > +	RTE_TM_UPDATE_NODE_PARENT_CHANGE_LEVEL = 1 << 1,
> > +
> > +	/**< Dynamic node add/delete. */
> > +	RTE_TM_UPDATE_NODE_ADD_DELETE = 1 << 2,
> > +
> > +	/**< Suspend/resume nodes. */
> > +	RTE_TM_UPDATE_NODE_SUSPEND_RESUME = 1 << 3,
> 
> what is the expectation from suspend/resumes?
> 1. the Fan-out will stop.
> 2. the Fan-In will stop, any enqueue request will result into error?  or
> it will be implementation dependent.
> During suspend/hold state the buffers should not hold in the node.

Requirement is that fan-out should stop while node is in suspended state. This operation is similar to inhibiting dequeue operation from the associated queues, or to having the output rate of the node set to zero while in suspended state. I am taking the action to document the suspend operation.

I agree that the following items should be implementation specific. They are related to enqueue side and in my mind they are not related to the suspend operation, which only affects the dequeue side:
-Whether the fan-in should stop while the node is in suspended state
-What should happen with the packets stored in the queues (owned by this node and its descendants)

...<snip>

> > +/**
> > + * Traffic manager level capabilities
> > + */
> > +struct rte_tm_level_capabilities {
> > +	/**< Maximum number of nodes for the current hierarchy level. */
> > +	uint32_t n_nodes_max;
> > +
> > +	/**< Maximum number of non-leaf nodes for the current hierarchy
> level.
> > +	 * The value of 0 indicates that current level only supports leaf
> > +	 * nodes. The maximum value is *n_nodes_max*.
> > +	 */
> > +	uint32_t n_nodes_nonleaf_max;
> > +
> > +	/**< Maximum number of leaf nodes for the current hierarchy level.
> The
> > +	 * value of 0 indicates that current level only supports non-leaf
> > +	 * nodes. The maximum value is *n_nodes_max*.
> > +	 */
> > +	uint32_t n_nodes_leaf_max;
> > +
> > +	/**< Summary of node-level capabilities across all the non-leaf
> nodes
> > +	 * of the current hierarchy level. Valid only when
> > +	 * *n_nodes_nonleaf_max* is greater than 0.
> > +	 */
> > +	struct rte_tm_node_capabilities nonleaf;
> > +
> > +	/**< Summary of node-level capabilities across all the leaf nodes of
> > +	 * the current hierarchy level. Valid only when *n_nodes_leaf_max*
> is
> > +	 * greater than 0.
> > +	 */
> > +	struct rte_tm_node_capabilities leaf;
> 
> you also need to provide a flag that all the nodes in this level has
> equal capability of not.

Yes, good idea, will do.

> 
> In case, all nodes in this level has equal priority, the implementation
> need to to inquire any further for node level capability. Generally in
> case of HWs, they have equal capabilities at a particular level.
> 
> If all nodes are not having equal priority, this information may not be
> correct and it may only provide a very rough overview of node
> capabilities. the applications should not assume anything about a
> particular node capability within this level.
> 

Agree.

> 
> > +};
> > +
> > +/**
> > + * Traffic manager capabilities
> > + */
> > +struct rte_tm_capabilities {
> > +	/**< Maximum number of nodes. */
> > +	uint32_t n_nodes_max;
> > +
> > +	/**< Maximum number of levels (i.e. number of nodes connecting
> the root
> > +	 * node with any leaf node, including the root and the leaf).
> > +	 */
> > +	uint32_t n_levels_max;
> > +
> > +	/**< Maximum number of shapers, either private or shared. In case
> the
> > +	 * implementation does not share any resource between private and
> > +	 * shared shapers, it is typically equal to the sum between
> > +	 * *shaper_private_n_max* and *shaper_shared_n_max*.
> > +	 */
> > +	uint32_t shaper_n_max;
> > +
> > +	/**< Maximum number of private shapers. Indicates the maximum
> number of
> > +	 * nodes that can concurrently have the private shaper enabled.
> > +	 */
> > +	uint32_t shaper_private_n_max;
> > +
> > +	/**< Maximum number of shared shapers. The value of zero
> indicates that
> > +	 * shared shapers are not supported.
> > +	 */
> > +	uint32_t shaper_shared_n_max;
> > +
> > +	/**< Maximum number of nodes that can share the same shared
> shaper.
> > +	 * Only valid when shared shapers are supported.
> > +	 */
> > +	uint32_t shaper_shared_n_nodes_max;
> > +
> > +	/**< Maximum number of shared shapers that can be configured
> with dual
> > +	 * rate shaping. The value of zero indicates that dual rate shaping
> > +	 * support is not available for shared shapers.
> > +	 */
> > +	uint32_t shaper_shared_dual_rate_n_max;
> > +
> > +	/**< Minimum committed/peak rate (bytes per second) for shared
> shapers.
> > +	 * Only valid when shared shapers are supported.
> > +	 */
> > +	uint64_t shaper_shared_rate_min;
> > +
> > +	/**< Maximum committed/peak rate (bytes per second) for shared
> shaper.
> > +	 * Only valid when shared shapers are supported.
> > +	 */
> > +	uint64_t shaper_shared_rate_max;
> > +
> > +	/**< Minimum value allowed for packet length adjustment for
> > +	 * private/shared shapers.
> > +	 */
> > +	int shaper_pkt_length_adjust_min;
> > +
> > +	/**< Maximum value allowed for packet length adjustment for
> > +	 * private/shared shapers.
> > +	 */
> > +	int shaper_pkt_length_adjust_max;
> > +
> > +	/**< Maximum number of WRED contexts. */
> > +	uint32_t cman_wred_context_n_max;
> > +
> > +	/**< Maximum number of private WRED contexts. Indicates the
> maximum
> > +	 * number of leaf nodes that can concurrently have the private WRED
> > +	 * context enabled.
> > +	 */
> > +	uint32_t cman_wred_context_private_n_max;
> > +
> > +	/**< Maximum number of shared WRED contexts. The value of zero
> > +	 * indicates that shared WRED contexts are not supported.
> > +	 */
> > +	uint32_t cman_wred_context_shared_n_max;
> > +
> > +	/**< Maximum number of leaf nodes that can share the same WRED
> context.
> > +	 * Only valid when shared WRED contexts are supported.
> > +	 */
> > +	uint32_t cman_wred_context_shared_n_nodes_max;
> > +
> > +	/**< Support for VLAN DEI packet marking (per color). */
> > +	int mark_vlan_dei_supported[RTE_TM_COLORS];
> > +
> > +	/**< Support for IPv4/IPv6 ECN marking of TCP packets (per color).
> */
> > +	int mark_ip_ecn_tcp_supported[RTE_TM_COLORS];
> > +
> > +	/**< Support for IPv4/IPv6 ECN marking of SCTP packets (per color).
> */
> > +	int mark_ip_ecn_sctp_supported[RTE_TM_COLORS];
> > +
> > +	/**< Support for IPv4/IPv6 DSCP packet marking (per color). */
> > +	int mark_ip_dscp_supported[RTE_TM_COLORS];
> > +
> > +	/**< Set of supported dynamic update operations
> > +	 * (see enum rte_tm_dynamic_update_type).
> > +	 */
> > +	uint64_t dynamic_update_mask;
> > +
> > +	/**< Summary of node-level capabilities across all non-leaf nodes. */
> > +	struct rte_tm_node_capabilities nonleaf;
> > +
> > +	/**< Summary of node-level capabilities across all leaf nodes. */
> > +	struct rte_tm_node_capabilities leaf;
> 
> This is not right, When you are having per level capabilities, why to
> return a node level capability with TM.
> 
> In software, it is easy to maintain all nodes of same capabilities. But
> in case of hardware, the capabilities of different levels is going to be
> different.
> 
> This will result into non-portable implementation, where the application
> will find the easy way to build the TM tree on this basis of these values.
> 
> You already have most of the capability indications in the tm level
> parameters. the only information missing is w.r.t WRR/WFQ and SP
> capability of tm. you can add some flags to get that.
> 
> 

Yes, we have a private conversation and I agree with your concern that "summary of capabilities across a set of nodes" is potentially confusing, as it is easy to confuse between "at least one node supports X" and "all nodes support X". The only reason I went this way is to save on some big data structure by trying to reuse some of them for slightly different purpose than initially stated. I am taking the action to rework this part to eliminate any confusion.

... <snip>

Regards,
Cristian


More information about the dev mailing list