The canonical repair procedure (find /client/mount -print0 | xargs -0 stat > /dev/null) verifies the integrity of every file in the whole volume. That’s great, but what if you know that only a subset of the files needs to be repaired, say only the files on one brick, or only files modified in the last half-hour? Here is a strategy for “targeted self-heal” that can save a lot of time compared to healing the whole volume.
The general strategy is to run find on a good replica brick, then stat the resulting files through a client mount. Let’s go through a couple of examples involving an Nx2 volume with two servers, each having N bricks, so that each server holds a complete copy of the whole volume. In the first example, one brick disk has died and needs to be replaced with a new, empty disk. In the second, an entire server needs to be shut down temporarily, say for a kernel upgrade.
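Before the examples, here is a minimal sketch of the general strategy as a shell function. The function name targeted_heal and the way it takes its arguments are made up for illustration, and it assumes GNU find and xargs. Pruning .glusterfs is also an assumption: it only matters on Gluster versions that keep that internal directory at the brick root, and the rule is harmless where the directory does not exist.

```shell
#!/bin/sh
# targeted_heal BRICK MOUNT [find tests...]
# Hypothetical helper: select files on a good replica brick with find,
# then stat the same relative paths through a client mount.
targeted_heal() {
    brick=$1
    mount=$2
    shift 2
    # find runs from the brick root, so it prints relative paths (./dir/file)
    # that are equally valid under the client mount.
    # .glusterfs (assumed internal metadata directory) is skipped.
    # Any extra arguments are passed to find as tests, e.g. -type f -mmin -30.
    (cd "$brick" && find . -path ./.glusterfs -prune -o "$@" -print0) |
        # -r: do not run stat at all if find selected nothing.
        (cd "$mount" && xargs -0 -r stat > /dev/null)
}
```

For instance, assuming a good brick at the hypothetical path /export/brick1, `targeted_heal /export/brick1 /client/mount -type f -mmin -30` would stat only the regular files on that brick that were modified in the last half-hour. The function's exit status is that of xargs, so it is nonzero if any stat fails.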