SVNews r332537


2018-04-16 03:47:53 - r332537 by mav (Alexander Motin)

Complete list of files affected by revision r332537:


MODIFY   /stable/11
MODIFY   /stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files
MODIFY   /stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
MODIFY   /stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h
MODIFY   /stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h
ADD      /stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zthr.h
MODIFY   /stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c
ADD      /stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zthr.c
MODIFY   /stable/11/sys/conf/files

Commit message:

MFC r329802: MFV r329799, r329800:
9079 race condition in starting and ending condensing thread for indirect vdevs

illumos/illumos-gate@667ec66f1b4f491d5e839644e0912cad1c9e7122

The timeline of the race condition is the following:
[1] Thread A is about to finish condensing the first vdev in
spa_condense_indirect_thread(), so it calls the
spa_condense_indirect_complete_sync() sync task, which sets the
spa_condensing_indirect field to NULL. Waiting for the sync task to finish,
thread A sleeps until the txg is done. When this happens, thread A will
acquire spa_async_lock and set spa_condense_thread to NULL.
[2] While thread A waits for the txg to finish, thread B, which is running
spa_sync(), checks whether it should condense the second vdev in
vdev_indirect_should_condense() by checking the spa_condensing_indirect
field, which was set to NULL by spa_condense_indirect_thread() from thread A.
So it goes on and tries to spawn a new condensing thread in
spa_condense_indirect_start_sync(), and the aforementioned assertion fails
because thread A has not yet set spa_condense_thread to NULL (which is
basically the last thing it does before returning).
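
The interleaving above can be replayed deterministically in a small C model.
This is only an illustrative sketch, not the code in spa.c or
vdev_indirect.c: struct toy_spa and every toy_* function are hypothetical
stand-ins, and the assumption that the start path requires
spa_condense_thread to be NULL is inferred from the description of the
failing assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two spa_t fields named in the commit message. */
struct toy_spa {
	void *condensing_indirect;	/* models spa_condensing_indirect */
	void *condense_thread;		/* models spa_condense_thread */
};

/* Step [1], first half: the completion sync task clears the first field. */
static void
toy_complete_sync(struct toy_spa *spa)
{
	spa->condensing_indirect = NULL;
}

/* Step [1], second half: thread A clears the second field just before it exits. */
static void
toy_thread_exit(struct toy_spa *spa)
{
	spa->condense_thread = NULL;
}

/* Step [2]: the check made from thread B consults only the first field. */
static bool
toy_should_condense(const struct toy_spa *spa)
{
	return (spa->condensing_indirect == NULL);
}

/*
 * Assumed precondition of the start path: no thread is registered.
 * Returns false at the point where the real code would trip an assertion.
 */
static bool
toy_start_precondition_holds(const struct toy_spa *spa)
{
	return (spa->condense_thread == NULL);
}
```

Calling toy_complete_sync() without toy_thread_exit() reproduces the window
in which toy_should_condense() says "go" while
toy_start_precondition_holds() still says "no".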

The main issue here is that we rely on both spa_condensing_indirect and
spa_condense_thread to signify whether a condensing thread is running.
Ideally we would only use one throughout the codebase. In addition, for
managing spa_condense_thread we currently use spa_async_lock, which
basically ties condensing to scrubbing when it comes to pausing and resuming
those actions during spa export.
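
As a rough illustration of the "only use one" idea, the sketch below keeps a
single state word that both the spawner and the worker consult, and that does
not depend on a lock shared with unrelated work. It is not the zthr code this
revision adds in zthr.c and zthr.h; the toy_condenser type, the toy_* names,
and the use of C11 atomics are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single source of truth for "is a condensing worker active?". */
enum toy_condense_state { TOY_IDLE, TOY_RUNNING };

struct toy_condenser {
	_Atomic int state;	/* the only field either side looks at */
};

/* Spawner side: succeeds only on the IDLE -> RUNNING transition. */
static bool
toy_condense_try_start(struct toy_condenser *c)
{
	int expected = TOY_IDLE;

	return (atomic_compare_exchange_strong(&c->state, &expected,
	    TOY_RUNNING));
}

/* Worker side: called as the worker's final action. */
static void
toy_condense_finish(struct toy_condenser *c)
{
	atomic_store(&c->state, TOY_IDLE);
}

/* Query used by anything that needs to know whether a worker is active. */
static bool
toy_condense_is_running(struct toy_condenser *c)
{
	return (atomic_load(&c->state) == TOY_RUNNING);
}
```

Because the decision to spawn and the record that a worker exists are the
same compare-and-swap, a second start attempt simply fails until the worker
has called toy_condense_finish(), instead of tripping an assertion.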

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

 

