//jerrywalsh.org

coding, hacking, startups, computer security, technology and more

Resurrecting Harddisks With Damaged Filesystems in FreeBSD

I found two old, dead Maxtor hard disks lying about the place, a 160GB and a 200GB. Both had been used as storage on a FreeBSD server (4.x to be exact, so these were UFS1 partitions).

I plugged the 160GB into my PC. It would mount, but drive performance was awful (transfers in the region of 400kB/s) and after roughly 10 minutes it would fail completely, hanging the PC. From the contents of the drive it looked like I had already salvaged most of its data previously, so I moved on to the 200GB disk instead.

I plugged in the drive and the kernel spewed some GEOM-related 'uncorrectable' errors, so I figured the drive must have some bad sectors. fsck confirmed this, and I discovered that the bad sectors had corrupted the primary superblock on the partition. Fortunately, fsck managed to locate an alternate superblock which was still intact. Fiddling a bit with ``dd'', I discovered that only 3-4 sectors were bad, right at the start of the partition. Because those sectors were unreadable there was no way for me to mount the partition, even though the alternate superblock was good.
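The kind of ``dd'' fiddling described above can be sketched as a loop that reads one 512-byte sector at a time and reports which reads fail. This is only a sketch: probe_sectors is a made-up helper name, and the device name in the example comment is an assumption based on the partition names that show up later in this post.

```shell
#!/bin/sh
# probe_sectors DEVICE FIRST COUNT
# Try to read COUNT 512-byte sectors one at a time, starting at sector
# FIRST, and report the ones that fail. Read-only: the data goes to
# /dev/null. (probe_sectors is a hypothetical helper, not a FreeBSD tool.)
probe_sectors() {
    dev=$1
    sector=$2
    last=$(($2 + $3))
    bad=0
    while [ "$sector" -lt "$last" ]; do
        # dd exits non-zero when the read fails
        if ! dd if="$dev" of=/dev/null bs=512 skip="$sector" count=1 2>/dev/null; then
            echo "sector $sector: read error"
            bad=$((bad + 1))
        fi
        sector=$((sector + 1))
    done
    echo "$bad bad sector(s) in range"
}

# Example (device name assumed): probe_sectors /dev/ad1s1e 0 64
```

Reading sector by sector is slow, so it only makes sense for a small range such as the start of a partition.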

I used dd_rescue (from /usr/ports/sysutils/dd_rescue) to take an image of the bad drive and copy it onto a larger 500GB drive. The exact size of the target drive does NOT matter, as long as it is the same size as or larger than the original. The downside of a larger target is that you won't be able to make use of all of its space, but I'd imagine that's a job for some partition resizing utility to solve.

dd_rescue works much like dd but is designed for dealing with bad sectors. It starts with a large block size and drops to a smaller one when it hits a bad sector. Once it then gets a number of consecutive successful reads, it increases the block size again to speed the imaging back up. Apart from the first 4 bad sectors there were no further problems with copying until about 80% of the way through the imaging process:

dd_rescue: (info): ipos: 158322022.5k, opos: 158322022.5k, xferd:    115117.5k
                *  errs:      9, errxfer:         4.5k, succxfer:    115113.0k
             +curr.rate:        0kB/s, avg.rate:     6730kB/s, avg.load:  0.1%
dd_rescue: (warning): /dev/ad1 (158322022.5k): Input/output error!

dd_rescue: (info): ipos: 158322023.0k, opos: 158322023.0k, xferd:    115118.0k
                *  errs:     10, errxfer:         5.0k, succxfer:    115113.0k
             +curr.rate:        0kB/s, avg.rate:     6247kB/s, avg.load:  0.1%
dd_rescue: (warning): /dev/ad1 (158322023.0k): Input/output error!

dd_rescue: (info): ipos: 159550393.0k, opos: 159550393.0k, xferd:   1343488.0k
                   errs:     11, errxfer:         5.5k, succxfer:   1343482.5k
             +curr.rate:    45310kB/s, avg.rate:    29621kB/s, avg.load:  0.7%
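The real run was presumably something along the lines of "dd_rescue /dev/ad1 /dev/ad6" (source and target names taken from the output in this post). To make the block-size fallback concrete, here is a rough imitation using plain dd: read in large chunks, and only drop to single sectors inside a chunk that fails. It is a simplified sketch, not how dd_rescue is actually implemented, and copy_adaptive and its arguments are hypothetical names.

```shell
#!/bin/sh
# copy_adaptive SRC DST TOTAL_SECTORS [SECTORS_PER_CHUNK]
# Copy TOTAL_SECTORS 512-byte sectors from SRC to DST. Each chunk is first
# read in one large block; if that fails (or the chunk is a partial one at
# the end), it is retried sector by sector and unreadable sectors are
# zero-filled in DST. Hypothetical helper, for illustration only.
copy_adaptive() {
    src=$1
    dst=$2
    total=$3
    chunk=${4:-128}
    pos=0
    bad=0
    while [ "$pos" -lt "$total" ]; do
        # Fast path: one large read covering the whole chunk
        if [ $((pos + chunk)) -le "$total" ] &&
           dd if="$src" of="$dst" bs=$((chunk * 512)) \
              skip=$((pos / chunk)) seek=$((pos / chunk)) \
              count=1 conv=notrunc 2>/dev/null
        then
            pos=$((pos + chunk))
            continue
        fi
        # Slow path: walk the chunk one sector at a time
        end=$((pos + chunk))
        [ "$end" -gt "$total" ] && end=$total
        while [ "$pos" -lt "$end" ]; do
            if ! dd if="$src" of="$dst" bs=512 skip="$pos" seek="$pos" \
                    count=1 conv=notrunc 2>/dev/null; then
                # Unreadable sector: write zeros so offsets stay aligned
                dd if=/dev/zero of="$dst" bs=512 seek="$pos" \
                   count=1 conv=notrunc 2>/dev/null
                bad=$((bad + 1))
            fi
            pos=$((pos + 1))
        done
    done
    echo "unreadable sectors: $bad"
}
```

Unlike dd_rescue, this sketch returns to the large block size straight after each bad chunk rather than waiting for a run of successful reads, and it does no logging or retrying. For a real recovery the dedicated tool is the better choice.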

At this point in the imaging run the copying slowed right down because dd_rescue had started working with tiny block sizes. I eventually got tired of waiting, cancelled the process and launched it again with a start position a few thousand increments past the problematic position. It started off very fast and then hit more errors. These gradually became less frequent, and it seemed I had eventually got past the dodgy area (presumably an area of the disk which got a lot of writes). For a 200GB drive the dd_rescue imaging process was surprisingly fast (it must have taken less than an hour!). Once the process finished I ran fdisk on the target disk:

 [root@orion] (/dev): fdisk /dev/ad6
******* Working on device /dev/ad6 *******
parameters extracted from in-core disklabel are:
cylinders=969021 heads=16 sectors/track=63 (1008 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=969021 heads=16 sectors/track=63 (1008 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 398283417 (194474 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 1023/ head 254/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>

This looks good. The target drive is actually a 500GB drive, and fdisk now reports it as a 200GB one :) since the slice table was copied across along with everything else. Next I decided to see if a reboot was required in order for devfs to notice the new slices.

 [root@orion] (/dev): ls ad6*
ad6     ad6s1   ad6s1c  ad6s1e

Even better. Next it was time to fsck the s1e partition:

 [root@orion] (/dev): fsck /dev/ad6s1e
** /dev/ad6s1e
Cannot find file system superblock

LOOK FOR ALTERNATE SUPERBLOCKS? [yn] y

USING ALTERNATE SUPERBLOCK AT 32
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=38402498 (71232 should be 65792)
CORRECT? [yn] y

INCORRECT BLOCK COUNT I=38402499 (81648 should be 32992)
CORRECT? [yn] y

INCORRECT BLOCK COUNT I=38413449 (6768 should be 208)
CORRECT? [yn] y

INCORRECT BLOCK COUNT I=38413450 (7424 should be 208)
CORRECT? [yn] y

INCORRECT BLOCK COUNT I=38413451 (6000 should be 208)
CORRECT? [yn] y

INCORRECT BLOCK COUNT I=38413452 (6768 should be 208)
CORRECT? [yn] y

etc.

At this point I figured I should really have passed the -y flag to fsck to automatically answer YES to every prompt, but thankfully fsck didn't ask any more questions after that initial bunch...

 
183419 files, 123017087 used, 70007255 free (66399 frags, 8742607 blocks, 0.0% fragmentation)

***** FILE SYSTEM MARKED CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****
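For a less interactive run, fsck_ffs accepts -y to answer yes to every prompt and -b to name an alternate superblock directly (32 is the one fsck found above). The helper below only prints the command line so it can be checked before being run against a real device. fsck_cmd is a hypothetical name, not something that ships with FreeBSD.

```shell
#!/bin/sh
# fsck_cmd DEVICE [ALT_SUPERBLOCK]
# Print (not run) an fsck_ffs command line that answers yes to every
# prompt and, optionally, starts from the given alternate superblock.
# fsck_cmd is a hypothetical helper; -y and -b are fsck_ffs flags.
fsck_cmd() {
    if [ -n "${2:-}" ]; then
        echo "fsck_ffs -y -b $2 $1"
    else
        echo "fsck_ffs -y $1"
    fi
}

# Example: fsck_cmd /dev/ad6s1e 32
# prints:  fsck_ffs -y -b 32 /dev/ad6s1e
```

Bear in mind that -y lets fsck make every repair it proposes without asking, so it is safest on a copy like this one rather than on the original failing disk.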

Success!  200GB of data which I had written off as completely lost was completely recovered.
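A footnote on restarting dd_rescue past a bad region during imaging: the start positions are set with -s (input) and -S (output), and both should point at the same offset so the image stays aligned. The sketch below builds such a command from the ipos value in the last status line. resume_cmd is a hypothetical helper, the 4096k skip in the example is an assumed figure rather than the one actually used, and the "k" size suffix is worth confirming against dd_rescue -h on your version.

```shell
#!/bin/sh
# resume_cmd SRC DST FAILED_IPOS_K SKIP_K
# Print (not run) a dd_rescue command that restarts SKIP_K kilobytes past
# the position where the previous run got stuck. FAILED_IPOS_K is the ipos
# value from dd_rescue's status line without the trailing "k".
# resume_cmd is a hypothetical helper, for illustration only.
resume_cmd() {
    failed_k=${3%.*}                 # drop the ".0" / ".5" fraction
    start_k=$((failed_k + $4))
    echo "dd_rescue -s ${start_k}k -S ${start_k}k $1 $2"
}

# Example, using the last failing position from the log in this post:
#   resume_cmd /dev/ad1 /dev/ad6 158322023.0 4096
# prints:
#   dd_rescue -s 158326119k -S 158326119k /dev/ad1 /dev/ad6
```

Whatever lies in the skipped range is never copied, so any files living there will still be damaged on the target.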