[dpdk-dev] Admin Queue ENA
kumaraparameshwaran rathinavel
kumaraparamesh92 at gmail.com
Fri Nov 8 07:02:26 CET 2019
Hi Michał,
Please look at the function below:
static int
ena_com_wait_and_process_admin_cq_polling(
    struct ena_comp_ctx *comp_ctx,
    struct ena_com_admin_queue *admin_queue)
{
    unsigned long flags = 0;
    u64 start_time;
    int ret;

    start_time = ENA_GET_SYSTEM_USECS();

    while (comp_ctx->status == ENA_CMD_SUBMITTED) {
        if ((ENA_GET_SYSTEM_USECS() - start_time) >
            ADMIN_CMD_TIMEOUT_US) {
            ena_trc_err("Wait for completion (polling) timeout\n");
            /* ENA didn't have any completion */
            ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
            admin_queue->stats.no_completion++;
            admin_queue->running_state = false;
            ENA_SPINLOCK_UNLOCK(admin_queue->q_lock, flags);

            ret = ENA_COM_TIMER_EXPIRED;
            goto err;
        }

        /* NOTE (emphasis mine): completion handling runs under q_lock */
        ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
        ena_com_handle_admin_completion(admin_queue);
        ENA_SPINLOCK_UNLOCK(admin_queue->q_lock, flags);
    }

    if (unlikely(comp_ctx->status == ENA_CMD_ABORTED)) {
        ena_trc_err("Command was aborted\n");
        ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
        admin_queue->stats.aborted_cmd++;
        ENA_SPINLOCK_UNLOCK(admin_queue->q_lock, flags);
        ret = ENA_COM_NO_DEVICE;
        goto err;
    }

    ENA_ASSERT(comp_ctx->status == ENA_CMD_COMPLETED,
               "Invalid comp status %d\n", comp_ctx->status);

    ret = ena_com_comp_status_to_errno(comp_ctx->comp_status);
err:
    /* NOTE (emphasis mine): the release is called without q_lock */
    comp_ctxt_release(admin_queue, comp_ctx);
    return ret;
}
Consider the case where two threads are executing admin commands.
The occupied flag is set to false in the function comp_ctxt_release. Say
there are two consumers of completion contexts, C1 and C2. C1 holds a
completion context, and that same completion context can be handed to C2
before C1 has reset the occupied flag.
This is possible because ena_com_handle_admin_completion runs under the
spin lock, whereas comp_ctxt_release is called without it.
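
To illustrate the kind of change I have in mind, here is a minimal sketch. It does not use the real ena_com code: struct comp_ctx, struct admin_queue, ctx_get, ctx_release_locked and Q_DEPTH are all simplified, hypothetical stand-ins, and a pthread mutex stands in for the ENA spinlock. The only point is that the occupied flag is cleared under the same lock that protects the claim, so a second consumer cannot observe the context half-released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define Q_DEPTH 4 /* hypothetical queue depth, for illustration only */

/* Simplified stand-in for the per-command completion context. */
struct comp_ctx {
    bool occupied;
};

/* Simplified stand-in for the admin queue. */
struct admin_queue {
    pthread_mutex_t q_lock;        /* stands in for the ENA spinlock */
    struct comp_ctx ctx[Q_DEPTH];  /* preallocated contexts */
    unsigned int outstanding;      /* number of claimed contexts */
};

static void admin_queue_init(struct admin_queue *q)
{
    pthread_mutex_init(&q->q_lock, NULL);
    for (size_t i = 0; i < Q_DEPTH; i++)
        q->ctx[i].occupied = false;
    q->outstanding = 0;
}

/*
 * Claim the context for command `id`. Returns NULL if the id is out of
 * range or the context is still occupied by a previous consumer.
 */
static struct comp_ctx *ctx_get(struct admin_queue *q, unsigned int id)
{
    struct comp_ctx *ctx = NULL;

    pthread_mutex_lock(&q->q_lock);
    if (id < Q_DEPTH && !q->ctx[id].occupied) {
        q->ctx[id].occupied = true;
        q->outstanding++;
        ctx = &q->ctx[id];
    }
    pthread_mutex_unlock(&q->q_lock);
    return ctx;
}

/*
 * Release the context under the same lock used by ctx_get(), so the
 * flag update and the claim check are serialized against each other.
 */
static void ctx_release_locked(struct admin_queue *q, struct comp_ctx *ctx)
{
    pthread_mutex_lock(&q->q_lock);
    assert(ctx->occupied); /* releasing a free context would be a bug */
    ctx->occupied = false;
    q->outstanding--;
    pthread_mutex_unlock(&q->q_lock);
}
```

With this shape, a second ctx_get() for the same id fails until ctx_release_locked() has returned, and succeeds afterwards. Whether the real driver should take the lock inside comp_ctxt_release or around the call site is a separate question.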
Thanks,
Param
On Thu, Oct 24, 2019 at 2:09 PM Michał Krawczyk <mk at semihalf.com> wrote:
> On Sat, Oct 19, 2019 at 20:26, kumaraparameshwaran rathinavel
> <kumaraparamesh92 at gmail.com> wrote:
> >
> > Hi All,
> >
> > In the ENA poll mode driver I see that every request in the admin queue is
> > associated with a completion context, which is preallocated during device
> > initialisation. When a completion context is taken for use, we check
> > whether its occupied flag is already true: in the 16.X version this
> > triggers an assert, and in the latest version I see that it is an error
> > log. But there is a time window in which the completion context is already
> > available to another consumer while the old consumer has not yet set
> > occupied to false. The new consumer holds the admin queue lock to get the
> > completion context, but the old consumer's update that sets the occupied
> > flag to false is not done under the lock. So should we make sure that the
> > new consumer gets the completion context only once the occupied flag has
> > been set to false? Any thoughts on this?
>
> Hi Param,
>
> Both the producer and the consumer are holding the spinlock while
> getting the completion context. If you see any situation where that
> isn't the case (besides the release function), please let me know.
> As it is protected by the lock, returning an error when the completion
> context is occupied (and it shouldn't be) is fine, as it will stop the
> admin queue and allow the DPDK user application to execute the reset
> of the device.
>
> Thanks,
> Michal
>
> > If required I can try to make a patch where the completion context would be
> > available only after setting the occupied flag to false.
> >
> > Thanks,
> > Param.
>