Tuesday, 12 September 2017

Recovering single files on Lustre after OST corruption

Assume that OST number 4 is corrupted. To find the OST index on a client, use lfs df:
lfs df /archive
UUID                   1K-blocks        Used   Available Use% Mounted on
arch-MDT0000_UUID       99428812    44432680    48286884  48% /archive[MDT:0]
arch-OST0000_UUID    63838042896 46275072544 14344891120  76% /archive[OST:0]
arch-OST0001_UUID    63838042896 46036859640 14583104024  76% /archive[OST:1]
arch-OST0002_UUID    63838042896 34406650692 26213311960  57% /archive[OST:2]
arch-OST0003_UUID    63838042896 39355270936 21264676344  65% /archive[OST:3]
arch-OST0004_UUID    63838042896  7102256308 53517690972  12% /archive[OST:4]
If an OST is corrupted, the file attributes are still stored on the MDS, so we can list all files that have objects on that OST and keep those that no longer pass a regular-file test:

lfs find /archive --ost 4  -type f | xargs -I% sh -c "[ ! -f % ]&& echo %" | tee -a recover/todo/OST04-corrupted.txt
This assumes that the file names contain no spaces or other special characters. For the general case, use a Python script that handles the file names properly.
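A minimal sketch of such a Python script follows. It is not the script used for the original recovery. It assumes that the installed lfs find supports -print0 (NUL-separated output), and the script name find_unreadable.py and the helper names are made up for illustration. It applies the same test as [ ! -f path ], but never passes a file name through a shell.

```python
#!/usr/bin/env python3
"""List paths that no longer pass a regular-file test (illustrative sketch)."""
import os
import sys


def split_nul(data):
    """Split NUL-separated path bytes, dropping empty entries."""
    return [p for p in data.split(b"\0") if p]


def unreadable(paths):
    """Yield every path that fails os.path.isfile, like `[ ! -f path ]`."""
    for path in paths:
        if not os.path.isfile(path):
            yield path


# Assumed usage: the first argument is a file holding NUL-separated paths.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        for path in unreadable(split_nul(handle.read())):
            sys.stdout.buffer.write(path + b"\n")
```

One possible invocation, under the same assumptions: write the candidates with "lfs find /archive --ost 4 -type f -print0 > ost4.lst", then run "python3 find_unreadable.py ost4.lst | tee -a recover/todo/OST04-corrupted.txt". The output is still one path per line, so file names that contain a newline would need a NUL-separated output format instead.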

Once the file list is complete, the following script can be used to replace the corrupted files with their backup copies. Here we assume that the backup is mounted at /backup and the target path is /archive:

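cat recover.sh
#!/bin/bash
# Usage: ./recover.sh LIST_FILE   (one absolute path under /archive per line)
file="$1"
# For every listed file that has a backup copy, emit an "unlink; cp -a" command
# and let GNU parallel run the generated commands.
xargs -I{} sh -c "[ -f \"/backup{}\" ] && echo 'unlink \"{}\"; cp -a \"/backup{}\" \"{}\"'" < "$file" | parallel --progress --eta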

./recover.sh recover/todo/OST04-corrupted.txt
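For file names that the shell pipeline cannot handle safely, the same unlink-then-copy step could be sketched in Python. This is an illustration under assumptions, not the script used above: the function names and the dry_run flag are hypothetical, the list is expected to hold one absolute path per line, and shutil.copy2 preserves mode and timestamps but not ownership, so it is not a full equivalent of cp -a. It also runs sequentially rather than in parallel.

```python
#!/usr/bin/env python3
"""Restore listed files from a backup tree (illustrative sketch)."""
import os
import shutil
import sys


def backup_path(path, backup_root="/backup"):
    """Map /archive/x to /backup/archive/x, as the shell script does."""
    return os.path.join(backup_root, path.lstrip("/"))


def read_list(list_file):
    """Read one path per line; os.fsdecode keeps undecodable bytes intact."""
    with open(list_file, "rb") as handle:
        return [os.fsdecode(line) for line in handle.read().split(b"\n") if line]


def restore(paths, backup_root="/backup", dry_run=False):
    """Replace each listed file with its backup copy; return skipped paths."""
    skipped = []
    for path in paths:
        src = backup_path(path, backup_root)
        if not os.path.isfile(src):
            skipped.append(path)  # no backup copy available
            continue
        if dry_run:
            print(f"would restore {path} from {src}")
            continue
        try:
            os.unlink(path)  # remove the damaged entry first
        except FileNotFoundError:
            pass
        shutil.copy2(src, path)  # keeps mode and timestamps, not ownership
    return skipped


# Assumed usage: the first argument is the list file written earlier.
if __name__ == "__main__" and len(sys.argv) > 1:
    for missing in restore(read_list(sys.argv[1])):
        print(f"no backup for {missing}", file=sys.stderr)
```

Calling restore(..., dry_run=True) first prints what would be copied without touching /archive, which is a cheap check before the real run.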
