[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_manage and improve unit tests

rsanford2 at gmail.com rsanford2 at gmail.com
Tue Jul 28 00:46:03 CEST 2015


From: Robert Sanford <rsanford at akamai.com>

This patchset fixes a synchronization bug in timer stress test 2, adds
a new stress test that exposes a race condition in rte_timer_manage(),
and then fixes that race condition.

Description of the rte_timer_manage() race condition: Code inspection
reveals a potential problem in rte_timer_manage() that can corrupt the
per-lcore pending-lists (implemented as skip-lists). The race occurs
when rte_timer_manage() expires multiple timers on lcore A while lcore B
simultaneously invokes rte_timer_reset() on one of the expiring timers
(other than the first one).

Lcore A splits its pending-list, creating a local list of expired timers
linked through their sl_next[0] pointers, and sets the first expired
timer to the RUNNING state, all within a single acquisition of the
list-lock. Lcore A then releases the list-lock to run the first
callback. From that point on, A and B can disagree about the true state
of the remaining expired timers. Lcore B sees an expired timer still in
the PENDING state, atomically changes it to the CONFIG state, takes
lcore A's list-lock, and reinserts the timer into A's pending-list.
Both lcores now use the same next-pointers to maintain two different
lists!
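
To make the interleaving concrete, here is a single-threaded toy model
of it. This is a sketch, not DPDK code: struct toy_timer and the toy_*
helpers are hypothetical names, a plain sorted singly linked list stands
in for the skip-list, the lock and the PENDING/CONFIG/RUNNING states are
omitted, and the two lcores are replayed as sequential steps so the
outcome is deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Toy timer: a single next pointer stands in for sl_next[0]. */
struct toy_timer {
	unsigned long expire;
	struct toy_timer *next;
};

/* Insert t into the list at *head, keeping ascending expire order. */
static void
toy_insert(struct toy_timer **head, struct toy_timer *t)
{
	struct toy_timer **pp = head;

	while (*pp != NULL && (*pp)->expire <= t->expire)
		pp = &(*pp)->next;
	t->next = *pp;
	*pp = t;
}

/* Detach every timer with expire <= now; return the detached run list. */
static struct toy_timer *
toy_split(struct toy_timer **head, unsigned long now)
{
	struct toy_timer *expired = *head;
	struct toy_timer **pp = head;

	while (*pp != NULL && (*pp)->expire <= now)
		pp = &(*pp)->next;
	if (pp == head)
		return NULL;	/* nothing has expired */
	*head = *pp;		/* pending list restarts at first live timer */
	*pp = NULL;		/* terminate the run list */
	return expired;
}

/*
 * Replay the interleaving in one thread. Returns how many timers lcore A
 * reaches on its run list after lcore B re-arms the second expired timer.
 */
static int
toy_race_visits(unsigned long new_expire)
{
	struct toy_timer t[4] = {
		{10, NULL}, {20, NULL}, {30, NULL}, {100, NULL}
	};
	struct toy_timer *pending = NULL, *run, *p;
	int i, visited = 0;

	for (i = 0; i < 4; i++)
		toy_insert(&pending, &t[i]);

	/* Lcore A, lock held: t[0..2] have expired at now = 50. */
	run = toy_split(&pending, 50);
	assert(run == &t[0] && pending == &t[3]);

	/*
	 * Lock dropped while A runs t[0]'s callback. Lcore B re-arms t[1],
	 * which is still linked into A's run list.
	 */
	t[1].expire = new_expire;
	toy_insert(&pending, &t[1]);

	/* Lcore A resumes walking the run list through the same pointers. */
	for (p = run; p != NULL; p = p->next)
		visited++;
	return visited;
}
```

In this model, toy_race_visits(200) returns 2 rather than 3: B's insert
sets t[1].next to NULL, so A never reaches t[2], and t[2] is on neither
list. toy_race_visits(60) returns 3, but the third node A reaches is
t[3], a timer that has not expired, because t[1].next now points into
the pending-list.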

v2 changes:
Move patch descriptions to their respective patches.
Correct checkpatch warnings.

Robert Sanford (3):
  fix stress test 2 sync bug
  add timer manage race condition test
  fix race condition in rte_timer_manage

 app/test/Makefile              |    1 +
 app/test/test_timer.c          |  154 +++++++++++++++++++++++-------
 app/test/test_timer_racecond.c |  209 ++++++++++++++++++++++++++++++++++++++++
 lib/librte_timer/rte_timer.c   |   56 +++++++----
 4 files changed, 366 insertions(+), 54 deletions(-)
 create mode 100644 app/test/test_timer_racecond.c


