[EXT] Re: [PATCH v6 2/6] ethdev: add trace points for ethdev (part one)

Ankur Dwivedi adwivedi at marvell.com
Thu Feb 2 11:20:06 CET 2023


>-----Original Message-----
>From: Ferruh Yigit <ferruh.yigit at amd.com>
>Sent: Thursday, February 2, 2023 2:27 PM
>To: Ankur Dwivedi <adwivedi at marvell.com>; dev at dpdk.org; David
>Marchand <david.marchand at redhat.com>; Jerin Jacob Kollanukkaran
><jerinj at marvell.com>; Andrew Rybchenko
><andrew.rybchenko at oktetlabs.ru>
>Cc: thomas at monjalon.net; mdr at ashroe.eu; orika at nvidia.com;
>chas3 at att.com; humin29 at huawei.com; linville at tuxdriver.com;
>ciara.loftus at intel.com; qi.z.zhang at intel.com; mw at semihalf.com;
>mk at semihalf.com; shaibran at amazon.com; evgenys at amazon.com;
>igorch at amazon.com; chandu at amd.com; Igor Russkikh
><irusskikh at marvell.com>; shepard.siegel at atomicrules.com;
>ed.czeck at atomicrules.com; john.miller at atomicrules.com;
>ajit.khaparde at broadcom.com; somnath.kotur at broadcom.com; Maciej
>Czekaj [C] <mczekaj at marvell.com>; Shijith Thotton <sthotton at marvell.com>;
>Srisivasubramanian Srinivasan <srinivasan at marvell.com>; Harman Kalra
><hkalra at marvell.com>; rahul.lakkireddy at chelsio.com; johndale at cisco.com;
>hyonkim at cisco.com; liudongdong3 at huawei.com;
>yisen.zhuang at huawei.com; xuanziyang2 at huawei.com;
>cloud.wangxiaoyun at huawei.com; zhouguoyang at huawei.com;
>simei.su at intel.com; wenjun1.wu at intel.com; qiming.yang at intel.com;
>Yuying.Zhang at intel.com; beilei.xing at intel.com; xiao.w.wang at intel.com;
>jingjing.wu at intel.com; junfeng.guo at intel.com; rosen.xu at intel.com; Nithin
>Kumar Dabilpuram <ndabilpuram at marvell.com>; Kiran Kumar Kokkilagadda
><kirankumark at marvell.com>; Sunil Kumar Kori <skori at marvell.com>; Satha
>Koteswara Rao Kottidi <skoteshwar at marvell.com>; Liron Himi
><lironh at marvell.com>; zr at semihalf.com; Radha Chintakuntla
><radhac at marvell.com>; Veerasenareddy Burru <vburru at marvell.com>;
>Sathesh B Edara <sedara at marvell.com>; matan at nvidia.com;
>viacheslavo at nvidia.com; longli at microsoft.com; spinler at cesnet.cz;
>chaoyong.he at corigine.com; niklas.soderlund at corigine.com;
>hemant.agrawal at nxp.com; sachin.saxena at oss.nxp.com; g.singh at nxp.com;
>apeksha.gupta at nxp.com; sachin.saxena at nxp.com; aboyer at pensando.io;
>Rasesh Mody <rmody at marvell.com>; Shahed Shaikh
><shshaikh at marvell.com>; Devendra Singh Rawat
><dsinghrawat at marvell.com>; jiawenwu at trustnetic.com;
>jianwang at trustnetic.com; jbehrens at vmware.com;
>maxime.coquelin at redhat.com; chenbo.xia at intel.com;
>steven.webster at windriver.com; matt.peters at windriver.com;
>bruce.richardson at intel.com; mtetsuyah at gmail.com; grive at u256.net;
>jasvinder.singh at intel.com; cristian.dumitrescu at intel.com;
>jgrajcia at cisco.com; mb at smartsharesystems.com
>Subject: Re: [EXT] Re: [PATCH v6 2/6] ethdev: add trace points for ethdev (part
>one)
>
>On 2/1/2023 3:42 PM, Ankur Dwivedi wrote:
>>
>>
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yigit at amd.com>
>>> Sent: Wednesday, February 1, 2023 12:08 AM
>>> To: Ankur Dwivedi <adwivedi at marvell.com>; dev at dpdk.org; David
>>> Marchand <david.marchand at redhat.com>; Jerin Jacob Kollanukkaran
>>> <jerinj at marvell.com>; Andrew Rybchenko
>>> <andrew.rybchenko at oktetlabs.ru>
>>> Cc: thomas at monjalon.net; mdr at ashroe.eu; orika at nvidia.com;
>>> chas3 at att.com; humin29 at huawei.com; linville at tuxdriver.com;
>>> ciara.loftus at intel.com; qi.z.zhang at intel.com; mw at semihalf.com;
>>> mk at semihalf.com; shaibran at amazon.com; evgenys at amazon.com;
>>> igorch at amazon.com; chandu at amd.com; Igor Russkikh
>>> <irusskikh at marvell.com>; shepard.siegel at atomicrules.com;
>>> ed.czeck at atomicrules.com; john.miller at atomicrules.com;
>>> ajit.khaparde at broadcom.com; somnath.kotur at broadcom.com; Jerin
>Jacob
>>> Kollanukkaran <jerinj at marvell.com>; Maciej Czekaj [C]
>>> <mczekaj at marvell.com>; Shijith Thotton <sthotton at marvell.com>;
>>> Srisivasubramanian Srinivasan <srinivasan at marvell.com>; Harman Kalra
>>> <hkalra at marvell.com>; rahul.lakkireddy at chelsio.com;
>>> johndale at cisco.com; hyonkim at cisco.com; liudongdong3 at huawei.com;
>>> yisen.zhuang at huawei.com; xuanziyang2 at huawei.com;
>>> cloud.wangxiaoyun at huawei.com; zhouguoyang at huawei.com;
>>> simei.su at intel.com; wenjun1.wu at intel.com; qiming.yang at intel.com;
>>> Yuying.Zhang at intel.com; beilei.xing at intel.com; xiao.w.wang at intel.com;
>>> jingjing.wu at intel.com; junfeng.guo at intel.com; rosen.xu at intel.com;
>>> Nithin Kumar Dabilpuram <ndabilpuram at marvell.com>; Kiran Kumar
>>> Kokkilagadda <kirankumark at marvell.com>; Sunil Kumar Kori
>>> <skori at marvell.com>; Satha Koteswara Rao Kottidi
>>> <skoteshwar at marvell.com>; Liron Himi <lironh at marvell.com>;
>>> zr at semihalf.com; Radha Chintakuntla <radhac at marvell.com>;
>>> Veerasenareddy Burru <vburru at marvell.com>; Sathesh B Edara
>>> <sedara at marvell.com>; matan at nvidia.com; viacheslavo at nvidia.com;
>>> longli at microsoft.com; spinler at cesnet.cz; chaoyong.he at corigine.com;
>>> niklas.soderlund at corigine.com; hemant.agrawal at nxp.com;
>>> sachin.saxena at oss.nxp.com; g.singh at nxp.com; apeksha.gupta at nxp.com;
>>> sachin.saxena at nxp.com; aboyer at pensando.io; Rasesh Mody
>>> <rmody at marvell.com>; Shahed Shaikh <shshaikh at marvell.com>;
>Devendra
>>> Singh Rawat <dsinghrawat at marvell.com>;
>andrew.rybchenko at oktetlabs.ru;
>>> jiawenwu at trustnetic.com; jianwang at trustnetic.com;
>>> jbehrens at vmware.com; maxime.coquelin at redhat.com;
>>> chenbo.xia at intel.com; steven.webster at windriver.com;
>>> matt.peters at windriver.com; bruce.richardson at intel.com;
>>> mtetsuyah at gmail.com; grive at u256.net; jasvinder.singh at intel.com;
>>> cristian.dumitrescu at intel.com; jgrajcia at cisco.com;
>>> mb at smartsharesystems.com
>>> Subject: Re: [EXT] Re: [PATCH v6 2/6] ethdev: add trace points for
>>> ethdev (part
>>> one)
>>>
>>> On 1/30/2023 4:01 PM, Ankur Dwivedi wrote:
>>>
>>> <...>
>>>
>>>>>> diff --git a/lib/ethdev/meson.build b/lib/ethdev/meson.build index
>>>>>> 39250b5da1..f5c0865023 100644
>>>>>> --- a/lib/ethdev/meson.build
>>>>>> +++ b/lib/ethdev/meson.build
>>>>>> @@ -24,6 +24,7 @@ headers = files(
>>>>>>          'rte_ethdev.h',
>>>>>>          'rte_ethdev_trace.h',
>>>>>>          'rte_ethdev_trace_fp.h',
>>>>>> +        'rte_ethdev_trace_fp_burst.h',
>>>>>
>>>>> Why these trace headers are public?
>>>>> Aren't trace points only used by the APIs, so I expect them to be
>>>>> internal, so applications shouldn't need them. Why they are expsed
>>>>> to
>>> user.
>>>> 'rte_ethdev_trace.h' can be made as internal. Not sure about
>>> 'rte_ethdev_trace_fp.h' and 'rte_ethdev_trace_fp_burst.h' as the
>>> tracepoints in fast path may be called from public inline functions.
>>>
>>> Trace calls used by inline functions needs to be public, in this case
>>> at least 'rte_ethdev_trace_fp_burst.h' needs to be public.
>>>
>>> Can you please at least move all trace points that are called by
>>> inline functions to the same file, 'rte_ethdev_trace_fp_burst.h', to
>>> reduce number of the header files to make public? Feel free to rename
>header if required.
>>>
>>> Meanwhile not sure about adding new header as dependency to end user.
>>> @Jerin, @Andrew, what do you think
>>> 1) to move these trace points to 'rte_ethdev_core.h'
>>> OR
>>> 2) disable trace calls in inline functions on compile time, possibly
>>> with existing 'RTE_ETHDEV_DEBUG_*' macro
>>>
>>> <...>
>>>
>>>>>> -	if (port_id >= RTE_MAX_ETHPORTS)
>>>>>> +	if (port_id >= RTE_MAX_ETHPORTS) {
>>>>>> +		rte_eth_trace_find_next(RTE_MAX_ETHPORTS);
>>>>>
>>>>> Is there a specific reason to trace all paths, why not just keep the last
>one?
>>>> This can be kept as the function returns with RTE_MAX_ETHPORTS here.
>>>
>>> 'RTE_MAX_ETHPORTS' more like error path, that is returned when no
>>> more valid port left, since most of the trace doesn't record error
>>> return values, I thought this can be dropped as well unless there is an
>explicit need for it.
>> Ok.
>>>
>>> <...>
>>>
>>>>>> +RTE_TRACE_POINT(
>>>>>> +	rte_eth_trace_rx_hairpin_queue_setup,
>>>>>> +	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t
>rx_queue_id,
>>>>>> +		uint16_t nb_rx_desc, const struct
>rte_eth_hairpin_conf *conf,
>>>>>> +		int ret),
>>>>>> +	uint32_t peer_count = conf->peer_count;
>>>>>> +	uint32_t tx_explicit = conf->tx_explicit;
>>>>>> +	uint32_t manual_bind = conf->manual_bind;
>>>>>> +	uint32_t use_locked_device_memory = conf-
>>>>>> use_locked_device_memory;
>>>>>> +	uint32_t use_rte_memory = conf->use_rte_memory;
>>>>>> +	uint32_t force_memory = conf->force_memory;
>>>>>> +
>>>>>> +	rte_trace_point_emit_u16(port_id);
>>>>>> +	rte_trace_point_emit_u16(rx_queue_id);
>>>>>> +	rte_trace_point_emit_u16(nb_rx_desc);
>>>>>> +	rte_trace_point_emit_ptr(conf);
>>>>>> +	rte_trace_point_emit_u32(peer_count);
>>>>>> +	rte_trace_point_emit_u32(tx_explicit);
>>>>>> +	rte_trace_point_emit_u32(manual_bind);
>>>>>> +	rte_trace_point_emit_u32(use_locked_device_memory);
>>>>>> +	rte_trace_point_emit_u32(use_rte_memory);
>>>>>> +	rte_trace_point_emit_u32(force_memory);
>>>>>> +	rte_trace_point_emit_int(ret);
>>>>>
>>>>> Do we need temporary variables like 'peer_count', why not directly
>>>>> use
>>> them:
>>>> Temporary variables are needed where the actual variable is a bitfield.
>>>> For example in struct rte_eth_hairpin_conf:
>>>> struct rte_eth_hairpin_conf {
>>>> uint32_t peer_count:16;
>>>> ....
>>>> uint32_t tx_explicit:1
>>>> ....
>>>> };
>>>>
>>>> Otherwise there is a compilation error.
>>>> ../lib/ethdev/rte_ethdev_trace.h:253:9: note: in expansion of macro
>>> ‘rte_trace_point_emit_u16’
>>>>   253 |         rte_trace_point_emit_u16(conf->peer_count);
>>>>       |         ^~~~~~~~~~~~~~~~~~~~~~~~
>>>> ../lib/eal/include/rte_trace_point.h:369:38: error: ‘sizeof’ applied
>>>> to a bit-
>>> field
>>>>   369 |         mem = RTE_PTR_ADD(mem, sizeof(in)); \
>>>>
>>>
>>> Got it, that is because of bitfields. But as I look the
>'rte_trace_point_emit_u16'
>>> macro, it is like:
>>>
>>> ```
>>> #define rte_trace_point_emit_u16(in) __rte_trace_point_emit(in,
>>> uint16_t)
>>>
>>> #define __rte_trace_point_emit(in, type) \ do { \
>>>        memcpy(mem, &(in), sizeof(in)); \
>>>        mem = RTE_PTR_ADD(mem, sizeof(in)); \ } while (0) ```
>>>
>>> Here, in '__rte_trace_point_emit' macro 'type' is not used at all, if
>>> so what is the point of passing type?
>>>
>>> If 'sizeof(type)' used, instead of 'sizeof(in)', it would be possible
>>> to use bitfields directly, what do you think?
>>
>> After using 'sizeof(type)', the following compile time error is observed.
>> In file included from ../lib/ethdev/rte_ethdev_trace_fp_burst.h:18,
>>                  from ../lib/ethdev/rte_ethdev.h:175,
>>                  from ../lib/ethdev/rte_tm.c:8:
>> ../lib/ethdev/rte_ethdev_trace.h: In function
>‘rte_eth_trace_rx_hairpin_queue_setup’:
>> ../lib/eal/include/rte_trace_point.h:378:21: error: cannot take address of
>bit-field ‘peer_count’
>>   378 |         memcpy(mem, &(in), sizeof(type)); \
>>           |                                     ^
>
>Right, you can't memcpy to bitfiled, so temporary variables are required, OK.

Yes.
>
>Still, is it OK to ignore 'type' in the '__rte_trace_point_emit()'?
>If variable (in) is uint8_t, it won't matter if 'rte_trace_point_emit_u8',
>'rte_trace_point_emit_u16' or 'rte_trace_point_emit_u32' used, they will all
>give same result, right?
Calling (in) which is uint8_t is not allowed in 'rte_trace_point_emit_u16' because there is other __rte_trace_point_emit
in lib/eal/include/rte_trace_point_register.h called during trace point register which prevents this. 

There would be compilation error because of this check in '__rte_trace_point_emit()'
RTE_BUILD_BUG_ON(sizeof(type) != sizeof(typeof(in))); 

So type can be ignored in '__rte_trace_point_emit()' in lib/eal/include/rte_trace_point.h.
>
>
>>>
>>>
>>>
>>> Also, as for the "struct rte_eth_hairpin_conf" sample, some of the
>>> bitfields are
>>> 1 bit length, does 'uint32_t' really needed for them?
>>
>> uint8_t should suffice.
>
>ack
>
>>>
>>> <...>
>>>
>>>>>> +RTE_TRACE_POINT_FP(
>>>>>> +	rte_eth_trace_find_next,
>>>>>> +	RTE_TRACE_POINT_ARGS(uint16_t port_id),
>>>>>> +	rte_trace_point_emit_u16(port_id);
>>>>>> +)
>>>>>> +
>>>>>
>>>>> Why 'rte_eth_trace_find_next' added as fast path?
>>>>> Can you please add comment for all why it is added as fast path,
>>>>> this help to evaluate/review this later.
>>>>
>>>> There were many functions for which I was not sure about whether
>>>> they
>>> should be slow path or fast path. I made the following assumption:
>>>>
>>>> For slow path I have chosen the function which do some setup,
>>>> configure or
>>> write some configuration. For an example
>>> rte_eth_trace_tx_hairpin_queue_setup,
>>> rte_eth_trace_tx_buffer_set_err_callback,
>>> rte_eth_trace_promiscuous_enable are slow path functions.
>>>>
>>>> The functions which read data are made as fastpath functions. Also
>>>> for
>>> functions for which I was not sure I made it as fastpath.
>>>>
>>>> For an example rte_ethdev_trace_owner_get,
>>> rte_eth_trace_hairpin_get_peer_port, rte_eth_trace_macaddr_get are
>>> made as fastpath.
>>>>
>>>> But there are few exceptions. Function like *_get_capability are
>>>> made as
>>> slowpath. Also rte_ethdev_trace_info_get is slowpath.
>>>>
>>>> The slowpath and fastpath functions are in separate files.
>>> rte_ethdev_trace.h (slowpath) and rte_ethdev_trace_fp.h (fastpath).
>>>>
>>>> Please let me  know if any function needs to be swapped. I will make
>>>> that
>>> change.
>>>>
>>>
>>> Got it, I think most of the trace points in the 'rte_ethdev_trace_fp.h'
>>> are for control/helper functions like:
>>> 'rte_ethdev_trace_count_avail', 'rte_ethdev_trace_get_port_by_name',
>'rte_eth_trace_promiscuous_get' ...
>>>
>>> I thought you did based on some analysis, that is why I asked to add
>>> that reasoning as code comment.
>>>
>>>
>>> I think we can generalize as:
>>>
>>> 1) Anything called by ethdev static inline functions are datapath,
>>> and must be 'RTE_TRACE_POINT_FP', like
>>> 'rte_eth_trace_call_[rt]x_callbacks',
>>> 'rte_ethdev_trace_[rt]x_burst',
>>
>> In this category the following functions come:
>> rte_eth_rx_burst
>> rte_eth_tx_burst
>> rte_eth_call_rx_callbacks (called from rte_eth_rx_burst)
>> rte_eth_call_tx_callbacks (called from rte_eth_tx_burst)
>> rte_eth_tx_buffer_count_callback (registered as error callback, called
>> from rte_eth_tx_buffer_flush) rte_eth_tx_buffer_drop_callback
>> (registered as error callback)
>
>ack
>
>>>
>>> 2) Anything that is called in endless loop in application/sample that
>>> has potential impact although it may not really be datapath
>>
>> Apart from functions in category [1], I have observed the following functions
>in ethdev library, called in some while loop in app/examples.
>> rte_eth_stats_get (called in while loop in examples/qos_sched and
>> examples/distributor) rte_eth_macaddr_get (called in while loop in
>> examples/bond and examples/ethtool) rte_eth_link_get (called in for
>> loop in examples/ip_pipeline) rte_eth_dev_get_mtu (called in for loop
>> in examples/ip_pipeline) rte_eth_link_speed_to_str (called in for loop
>> in examples/ip_pipeline) rte_eth_dev_rx_intr_enable ( called in loop
>> in examples/l3fwd-power) rte_eth_dev_rx_intr_disable ( called in loop
>> in examples/l3fwd-power) rte_eth_timesync_read_rx_timestamp (called in
>> loop in examples/ptpclient) rte_eth_timesync_read_tx_timestamp (called
>> in loop in examples/ptpclient) rte_eth_timesync_adjust_time (called in
>> loop in examples/ptpclient) rte_eth_timesync_read_time (called in loop
>> in examples/ptpclient) rte_flow_classifier_query (called in
>> examples/flow_classify) rte_mtr_create (in app/test-flow-perf loop)
>> rte_mtr_destroy (in app/test-flow-perf loop)
>> rte_mtr_meter_policy_delete ((in app/test-flow-perf loop)
>> rte_flow_create (in app/test-flow-perf) rte_flow_destroy (in
>> app/test-flow-perf)
>>
>
>Ack, and can you please add the note within the parenthesis as a comment,
>whoever visits these later knows why there trace points added as fast path
>trace point?
>
>> Apart from the above can all other functions be moved to slowpath
>tracepoints?
>
>I think yes, we can start with this.
>At least this gives us a logic to follow.
>
>And does trace point and fast path trace points needs to be in separate
>header files?

I do not think separate header files is a requirement, but it is easy  to segregate 
slowpath/fastpath if they are in separate files. What do you think ?
>Can you please put first group into a header an expose it to the user, rest (as a
>mixture) to another header and keep it internal to ethdev?
Ok.
>
>Thanks.
>
>> I think it was discussed in tech board meeting on 2022-11-30 that the
>> functions for which the slowpath or fastpath distinction is not clear, they
>should be fastpath tracepoints.
>>
>>>
>>> 3) Rest are regular trace points, RTE_TRACE_POINT.
>>>
>>>
>>> Item (2) requires some analysis and whenever you found one of them
>>> please comment the reasoning in the code where trace point is defined.
>>



More information about the dev mailing list