[dpdk-dev] Couple of PMD questions

Bruce Richardson bruce.richardson at intel.com
Thu Apr 21 12:54:04 CEST 2016


On Wed, Apr 20, 2016 at 10:47:59AM -0500, Jay Rolette wrote:
> On Wed, Apr 20, 2016 at 9:05 AM, Bruce Richardson <
> bruce.richardson at intel.com> wrote:
> 
> > On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
> > > On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
> > > bruce.richardson at intel.com> wrote:
> > >
> > > > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > > > > In ixgbe_dev_rx_init(), there is this bit of code:
> > > > >
> > > > >       /*
> > > > >        * Configure the RX buffer size in the BSIZEPACKET field of
> > > > >        * the SRRCTL register of the queue.
> > > > >        * The value is in 1 KB resolution. Valid values can be from
> > > > >        * 1 KB to 16 KB.
> > > > >        */
> > > > >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool)
> > -
> > > > >               RTE_PKTMBUF_HEADROOM);
> > > > >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> > > > >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
> > > > >
> > > > >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> > > > >
> > > > >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> > > > >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> > > > >
> > > > >       /* It adds dual VLAN length for supporting dual VLAN */
> > > > >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> > > > >                                   2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> > > > >               dev->data->scattered_rx = 1;
> > > > >
> > > > >
> > > > > Couple of questions I have about it:
> > > > >
> > > > > 1) If the caller configured the MBUF memory pool data room size to be
> > > > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then
> > the
> > > > > driver ends up silently programming the NIC to use a smaller max RX
> > size
> > > > > than the caller specified.
> > > > >
> > > > > Should the driver error out in that case instead of only "sort of"
> > > > working?
> > > > > If not, it seems like it should be logging an error message.
> > > >
> > > > Well, it does log at the end of the whole rx init process what RX
> > function
> > > > is
> > > > being used, so there is that.
> > > > However, looking at it now, given that there is an explicit setting in
> > the
> > > > config
> > > > to request scattered mode, I think you may be right and that we should
> > > > error out
> > > > here if scattered mode is needed and not set. It could help prevent
> > > > unexpected
> > > > problems with these drivers.
> > > >
> > >
> > > +1. A hard fail at init if I've misconfigured things is much preferred to
> > > something that mostly works for typical test cases and only fails on
> > > corner/limit testing.
> > >
> > >
> > > > > 2) Second question is about the "/* It adds dual VLAN length for
> > > > supporting
> > > > > dual VLAN */" bit...
> > > > >
> > > > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be
> > set
> > > > to
> > > > > the max frame size you support (although it says it's only used if
> > jumbo
> > > > > frame is enabled). That same value is generally used when
> > calculating the
> > > > > size that mbuf elements should be created at.
> > > > >
> > > > > The description for the data_room_size parameter of
> > > > > rte_pktmbuf_pool_create():
> > > > >
> > > > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> > > > >
> > > > >
> > > > > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> > > > make
> > > > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> > > > data_room_size
> > > > > will be 9216 + RTE_PKTMBUF_HEADROOM.
> > > > >
> > > > > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> > > > will
> > > > > fail the dual VLAN length check. The really nasty part about that is
> > it
> > > > has
> > > > > a side-effect of enabling scattered RX regardless of the fact that I
> > > > didn't
> > > > > enable scattered RX in dev_conf.rxmode.
> > > > >
> > > > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> > > > >
> > > > > Is that check correct? It seems wrong to be adding space for q-in-q
> > on
> > > > top
> > > > > of your specified max frame size...
> > > >
> > > > The issue here is what the correct behaviour needs to be. If we have
> > the
> > > > user
> > > > specify the maximum frame size including all vlan tags, then we hit the
> > > > problem
> > > > where we have to subtract the VLAN tag sizes when writing the value to
> > the
> > > > NIC.
> > > > In that case, we will hit a problem where we get a e.g. 9210 byte
> > frame -
> > > > to
> > > > reuse your example - without any VLAN tags, which will be rejected by
> > the
> > > > hardware as being oversized. If we don't do the subtraction, and we
> > get the
> > > > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept
> > it
> > > > and
> > > > then split it across multiple descriptors because the actual DMA size
> > is
> > > > 9218 bytes.
> > > >
> > >
> > > As an app developer, I didn't realize the max frame size didn't include
> > > VLAN tags. I expected max frame size to be the size of the ethernet frame
> > > on the wire, which I would expect to include space used by any VLAN or
> > MPLS
> > > tags.
> > >
> > > Is there anything in the docs or example apps about that? I did some
> > > digging as I was debugging this and didn't notice it, but entirely
> > possible
> > > I just missed it.
> > >
> > >
> > > > I'm not sure there is a works-in-all-cases solution here.
> > > >
> > >
> > > Andriy's suggestion seems like it points in the right direction.
> > >
> > > From an app developer point of view, I'd expect to have a single max
> > frame
> > > size value to track and the APIs should take care of any adjustments
> > > required internally. Maybe have rte_pktmbuf_pool_create() add the
> > > additional bytes when it calls rte_mempool_create() under the covers?
> > Then
> > > it's nice and clean for the API without unexpected side-effects.
> > >
> >
> > It will still have unintended side-effects I think, depending on the
> > resolution
> > of the NIC buffer length paramters. For drivers like ixgbe or e1000, the
> > mempool
> > create call could potentially have to add an additional 1k to each buffer
> > just
> > to be able to store the extra eight bytes.
> 
> 
> The comments in the ixgbe driver say that the value programmed into SRRCTL
> must be on a 1K boundary. Based on your previous response, it sounded like
> the NIC ignores that limit for VLAN tags, hence the check for the extra 8
> bytes on the mbuf element size. Are you worried about the size resolution
> on mempool elements?
> 
> Sounds like I've got to go spend some quality time in the NIC data
> sheets... Maybe I should back up and just ask the higher level question:
> 
> What's the right incantation in both the dev_conf structure and in creating
> the mbuf pool to support jumbo frames of some particular size on the wire,
> with or without VLAN tags, without requiring scattered_rx support in an app?
> 
I think that the answer to that may be that it's hard to do, or impossible,
depending on the NIC. From what I can see from the datasheet for some of the
ixgbe NICs, the max-frame-size (MAXFRS) field that you specify always excludes
VLAN tags. Therefore you can't tell the NIC to accept a max size of X irrespective
of VLAN tags. The max size will always vary depending on the tags.

/Bruce


More information about the dev mailing list