java.lang.Object org.blinkenbyte.pcopier.ProgressiveCopier
public class ProgressiveCopier
The ProgressiveCopier is a utility to aid in imaging drives that have errors on them. A common problem with damaged drives is that copying them may take an extraordinarily long time from start to finish, sector by sector. It may never finish, and with damaged drives, getting as much data off as quickly as possible is important. Once the image has been created, it can be mounted or opened directly by a data recovery program without having to worry about bad sectors preventing the program from scanning the drive.
This utility aids this process in two ways. First, it maintains a map of sectors and whether they have been read or not. Second (and more importantly), it uses a hierarchical strategy to copy the block device. The drive is divided up into blocks, and when copying a block of data, any error will cause it to mark the block as incomplete and move to the next block. Each of these blocks is in turn divided into smaller blocks, forming a large hierarchy of blocks. The topmost (first) level, level 0, is a single block large enough to encompass the entire device. The bottom (last) level is composed of blocks the size of the input's sector size. The number of sub-blocks each block is broken into defaults to 4, but can be specified.
The advantage of this approach is that larger contiguous blocks of error-free data can be read quickly, and the areas with errors are left for later. As the copy progresses, the bad areas are retried in smaller and smaller blocks.
As an example, imagine starting with a 128GB hard drive. The utility will start reading the whole 128GB drive as a single large block. If an error occurs 79GB into the drive, the 128GB block is marked as partial, and since it's the only block in the top level, the utility progresses to the next level. At this level, the drive is divided into 32GB blocks. The first two blocks are already complete and the third block is already partially read, so they are skipped. It then resumes at the fourth block. Suppose it completes fully, and moves on to the next level. At this level, the blocks are 8GB in size, and the only region that needs to be dealt with is from 64-96GB. The block from 64-72GB is complete, and the block from 72-80GB, which contains the error at 79GB, is marked as partial, so neither is read. The blocks from 80-88GB and 88-96GB are marked as unread, so they are read. After reading, it then continues to the next level, and so on, until execution is stopped or the last level is completed.
In cases where this utility is useful, the sector level will generally not be reached. If it is reached, it indicates that a straight start-to-end, sector-by-sector copy would have been faster. In that case, the only remaining benefit of this utility is the ability to resume a copy.
The map file has a header followed by the map data, which uses 2 bits per block. The number of blocks depends on the number of sub-blocks (childrenPerNode) each block is divided into and on the virtual size of the input device. The virtual size of the input device is the smallest block size that is equal to or larger than the input device. It can be determined by starting with the sector size and repeatedly multiplying by childrenPerNode until a number greater than or equal to the input device size is reached. For example, if the sector size is 512 bytes, childrenPerNode is 4, and the device's size is 48K, the virtual size will be 512 * 4 * 4 * 4 * 4 = 128K. This also determines the number of levels and their block sizes. In this case, the levels are numbered 0-4. Level 0 is a single 128K block, level 1 is four 32K blocks, level 2 is sixteen 8K blocks, level 3 is 64 2K blocks, and level 4 is 256 512-byte blocks (256 sectors). The status for the virtual blocks past the end of the input is maintained as if they were real (and always successful), but read requests for these virtual blocks will never be passed to the input device.
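The virtual size calculation described above can be sketched as follows. This is only an illustration of the arithmetic, not the class's actual code; the class and method names (VirtualSizeSketch, virtualSize, maxLevel, blockSize) are made up for this example, and overflow for very large inputs is not handled.

```java
class VirtualSizeSketch {
    /** Smallest value of sectorSize * childrenPerNode^k that is >= deviceSize. */
    static long virtualSize(long sectorSize, int childrenPerNode, long deviceSize) {
        if (sectorSize < 1 || childrenPerNode < 2) {
            throw new IllegalArgumentException("sectorSize must be >= 1 and childrenPerNode >= 2");
        }
        long size = sectorSize;
        while (size < deviceSize) {
            size *= childrenPerNode;
        }
        return size;
    }

    /** Number of the last (sector) level; level 0 is the single block covering the virtual device. */
    static int maxLevel(long sectorSize, int childrenPerNode, long deviceSize) {
        long target = virtualSize(sectorSize, childrenPerNode, deviceSize);
        int level = 0;
        for (long size = sectorSize; size < target; size *= childrenPerNode) {
            level++;
        }
        return level;
    }

    /** Block size in bytes at the given level, dividing by childrenPerNode once per level. */
    static long blockSize(long virtualSize, int childrenPerNode, int level) {
        long size = virtualSize;
        for (int i = 0; i < level; i++) {
            size /= childrenPerNode;
        }
        return size;
    }

    public static void main(String[] args) {
        long vs = virtualSize(512, 4, 48 * 1024);
        int last = maxLevel(512, 4, 48 * 1024);
        System.out.println("virtual size = " + vs + ", levels 0-" + last);
        for (int level = 0; level <= last; level++) {
            System.out.println("level " + level + ": block size " + blockSize(vs, 4, level));
        }
    }
}
```

For the 48K example, this yields a virtual size of 131072 bytes (128K) and levels 0-4, matching the figures in the text.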
The map file stores the status of every block in a large array of bits. The statuses of child blocks are not always updated; if a parent block is marked as complete, the status of the children is irrelevant. This saves a considerable amount of I/O to the map file. The storage is byte-oriented, so endianness is not a concern when reading the map. The low-order bits of a byte are considered to come before the high-order bits of a byte. Because each status is 2 bits, a single byte can hold four statuses. At the last level, one byte of the map is responsible for 2048 bytes of input (assuming 512-byte sectors). Depending on childrenPerNode, the ratio between the virtual size of the input device and the size of the map will be roughly in the range of 1024-2048 to 1. If childrenPerNode is 4, the ratio is roughly 1536:1; assuming a worst-case scenario of the virtual size being 4 times as large as the actual input size, the ratio works out to 384:1, or roughly 0.26% of the input size, so the overhead compared to the resulting output is minimal.
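The size ratios quoted above follow from counting one 2-bit status per block across all levels. A rough sketch of that count is shown below; it covers the map data only and ignores the header, and the names (MapSizeSketch, statusCount, mapDataBytes) are hypothetical, not part of the class.

```java
class MapSizeSketch {
    /** Total number of block statuses over levels 0..maxLevel (1 + c + c^2 + ...). */
    static long statusCount(int childrenPerNode, int maxLevel) {
        long total = 0;
        long blocksInLevel = 1; // level 0 holds a single block
        for (int level = 0; level <= maxLevel; level++) {
            total += blocksInLevel;
            blocksInLevel *= childrenPerNode;
        }
        return total;
    }

    /** Bytes of map data needed at 2 bits (four statuses) per byte, rounded up. */
    static long mapDataBytes(int childrenPerNode, int maxLevel) {
        return (statusCount(childrenPerNode, maxLevel) + 3) / 4;
    }

    public static void main(String[] args) {
        // 512-byte sectors, childrenPerNode = 4, 15 levels (0-14): a 128GB virtual device.
        long virtualSize = 512L << (2 * 14);
        long mapBytes = mapDataBytes(4, 14);
        System.out.println("map data bytes: " + mapBytes);
        System.out.println("ratio: " + (double) virtualSize / mapBytes);
    }
}
```

With childrenPerNode set to 4, the leaves account for about three quarters of all statuses, which is where the ratio of roughly 1536:1 (rather than 2048:1) comes from.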
As an example of how the map file stores statuses, bits 1-0 of byte 0 of the map contain the status for level 0 block 0, bits 3-2 contain the status for level 1 block 0, bits 5-4 for level 1 block 1, bits 7-6 for level 1 block 2, bits 1-0 of byte 1 for level 1 block 3, bits 3-2 of byte 1 for level 2 block 0, etc.
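The layout in this example can be expressed as a status index (counting all blocks in earlier levels, then the block number within the level), which is then split into a byte offset and a bit shift. The sketch below assumes offsets are relative to the start of the map data, after the header; the names (MapIndexSketch, statusIndex, and so on) are invented for illustration.

```java
class MapIndexSketch {
    /** Position of a block's status in the flat status array, levels stored one after another. */
    static long statusIndex(int childrenPerNode, int level, long block) {
        long offset = 0;
        long blocksInLevel = 1;
        for (int i = 0; i < level; i++) {
            offset += blocksInLevel;
            blocksInLevel *= childrenPerNode;
        }
        return offset + block;
    }

    /** Byte within the map data that holds the status (four statuses per byte). */
    static long byteOffset(long statusIndex) {
        return statusIndex / 4;
    }

    /** Shift of the status within its byte; low-order bits come first. */
    static int bitShift(long statusIndex) {
        return (int) (statusIndex % 4) * 2;
    }

    /** Reads a 2-bit status from an in-memory copy of the map data. */
    static int readStatus(byte[] mapData, long statusIndex) {
        return (mapData[(int) byteOffset(statusIndex)] >> bitShift(statusIndex)) & 0x3;
    }

    /** Writes a 2-bit status into an in-memory copy of the map data. */
    static void writeStatus(byte[] mapData, long statusIndex, int status) {
        int i = (int) byteOffset(statusIndex);
        int shift = bitShift(statusIndex);
        mapData[i] = (byte) ((mapData[i] & ~(0x3 << shift)) | ((status & 0x3) << shift));
    }
}
```

For childrenPerNode 4, level 1 block 3 has status index 4, which lands in bits 1-0 of byte 1, and level 2 block 0 has index 5, which lands in bits 3-2 of byte 1, as in the example.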
To determine the status of a block, all of its parents must be checked. If a parent has a completed status, then all sectors that fall within that block are considered complete regardless of their status. Note that complete does not necessarily mean successfully read, just that every sector in the block has either been read or attempted to be read individually. A completed or partial block with a parent that has an empty status is the result of a corrupt map and should never happen.
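The parent check can be sketched as a walk from level 0 down to the sector level. This example keeps statuses in a plain int[level][block] array instead of the packed map, and the numeric status values are placeholders chosen for the sketch (the actual values of STATUS_UNREAD, STATUS_ERROR, STATUS_PARTIAL, and STATUS_COMPLETE are not given here).

```java
class EffectiveStatusSketch {
    // Placeholder values for illustration only; not the real constants.
    static final int UNREAD = 0, ERROR = 1, PARTIAL = 2, COMPLETE = 3;

    /**
     * Returns COMPLETE if any ancestor block of the sector is complete,
     * otherwise the sector's own stored status.
     * status[level][block] holds one status per block; the last level is the sector level.
     */
    static int effectiveSectorStatus(int[][] status, int childrenPerNode, long sector) {
        int maxLevel = status.length - 1;

        // Block size in sectors at level 0 is childrenPerNode^maxLevel.
        long blockSizeInSectors = 1;
        for (int i = 0; i < maxLevel; i++) {
            blockSizeInSectors *= childrenPerNode;
        }

        // Check every parent level; a complete parent overrides stale child statuses.
        for (int level = 0; level < maxLevel; level++) {
            int block = (int) (sector / blockSizeInSectors);
            if (status[level][block] == COMPLETE) {
                return COMPLETE;
            }
            blockSizeInSectors /= childrenPerNode;
        }
        return status[maxLevel][(int) sector];
    }
}
```

In this sketch a sector whose stored status is still unread is reported as complete when its parent is complete, which reflects the point that child statuses are not always updated.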
If a specific set of sectors should be skipped, a bad sector list can be used. A bad block emulator device will then be wrapped around the input to simulate errors if those sectors are read from, and the read request will not be passed to the input device. Overlapping ranges are allowed, and reads spanning good and bad areas are errored out or truncated as needed so that the read of the bad area is never seen by the input device.
The format of a sector list is a series of entries on separate lines. Each line contains the starting sector number and an optional sector count. If the sector count is not specified, the default is 1. The numbers can be in either decimal or hexadecimal. Hexadecimal numbers should be prefixed with "0x"; for example, "0x64" is sector 100. Spaces, commas, and tabs can be used as the separator between the two values. Both the input bad sector list and the output incomplete sector list use this format.
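A minimal parser for this format might look like the sketch below. It is not the class's own parser: the names (SectorListSketch, parse, parseNumber) are hypothetical, each entry is returned as a two-element long array of {start, count} rather than a BlockRange, and skipping blank lines is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class SectorListSketch {
    /** Parses a decimal number, or a hexadecimal one if prefixed with "0x". */
    static long parseNumber(String s) {
        if (s.startsWith("0x") || s.startsWith("0X")) {
            return Long.parseLong(s.substring(2), 16);
        }
        return Long.parseLong(s);
    }

    /** Parses a sector list into {startSector, sectorCount} pairs, one per non-blank line. */
    static List<long[]> parse(String text) {
        List<long[]> runs = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // Spaces, commas, and tabs separate the start from the optional count.
            String[] parts = trimmed.split("[ ,\\t]+");
            long start = parseNumber(parts[0]);
            long count = parts.length > 1 ? parseNumber(parts[1]) : 1;
            runs.add(new long[] {start, count});
        }
        return runs;
    }

    public static void main(String[] args) {
        for (long[] run : parse("0x64\n200, 8\n300\t0x10")) {
            System.out.println("start=" + run[0] + " count=" + run[1]);
        }
    }
}
```

The hexadecimal case is handled explicitly rather than with Long.decode, because Long.decode would also treat a leading zero as octal, which the format description does not mention.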
Basic operation: Specify an input, output, and map file or device, then call open(), performCopy(), and close().
Nested Class Summary | | |
---|---|---|
static class | ProgressiveCopier.DisplayStatus | This class is used by the command-line program as a callback which displays basic information about the copy progress. |
static class | ProgressiveCopier.ParamException | An exception class for bad parameters |
static class | ProgressiveCopier.ParamInfo | Miscellaneous class for holding some command line parameters and parsing status |
Field Summary | | |
---|---|---|
static int | CURRENT_MAP_VERSION | Current map file version number |
static int | MAPFILE_HEADER_SIZE | Map file header size |
static int | STATUS_COMPLETE | Status indicating a block that has been completed (all sectors are either read or unreadable). |
static int | STATUS_ERROR | Status indicating a block that returned an error (only valid at the sector level). |
static int | STATUS_PARTIAL | Status indicating a block that has only been read partially either due to an error in a bulk transfer or an interrupted or ongoing transfer. |
static int | STATUS_UNREAD | Status indicating an unread block. |
Constructor Summary | |
---|---|
ProgressiveCopier() | Constructs a new ProgressiveCopier |
Method Summary | | |
---|---|---|
void | addEmulatedBadSector(BlockRange position) | Appends the given block to the bad sector list. |
long | calcBlockNumber(int levelNumber, long sectorNumber) | Returns the block number that contains the specified sector at the specified level. |
long | calcBlockSizeInSectors(int levelNumber) | Returns the block size in sectors for the specified level. |
long | calcNumBlocksForLevel(int levelNumber) | Returns the number of blocks in the specified level. |
long | calcSectorNumber(int levelNumber, long blockNumber) | Returns the sector number corresponding to the start of the specified block at the specified level. |
static int | checkPower2(long n) | Checks whether a number is a power of 2 |
void | clearEmulateBadSectorList() | Clears the list of emulated bad sectors. |
void | clearMapBlocks(int mapClearStartingLevel, long mapClearBlockStart, long mapClearBlockCount) | Clears entries from the map file. |
void | clearMapSectors(long clearSectorStart, long clearSectorCount) | Clears entries from the map file. |
void | close() | Closes the input, output, and map files/devices if they were opened by this class. |
void | emulateBadSectors(java.util.List<BlockRange> listToAppend) | Appends the given list of blocks to the bad sector list. |
void | emulateBadSectors(java.io.Reader r) | Reads and appends a list of bad sector runs from the given Reader. |
void | forceInputIsDevice(boolean isDevice) | Force the input to be treated as a device or not instead of attempting to determine this from an input filename. |
void | forceOutputIsDevice(boolean isDevice) | Force the output to be treated as a device or not instead of attempting to determine this from an output filename. |
long | getBadReadCount() | Returns the number of bad reads performed (statistic). |
int | getBufferSize() | Returns the current size of the read buffer |
java.lang.Runnable | getCallback() | Returns the Runnable object to be called. |
long | getCallbackInterval() | Returns the callback interval in milliseconds. |
long | getCurBlock() | Returns the block number in the current level that is being copied. |
int | getCurLevel() | Returns the current level that is being copied. |
long | getEmptyReadCount() | Returns the number of empty reads performed (statistic). |
long | getEndBlockSize() | Returns the smallest block size in bytes that the copier will attempt to read. |
int | getEndLevel() | Returns the level at which the copy will end |
java.lang.Boolean | getForceInputIsDevice() | Return whether the input device has been forced and if so, to what status. |
java.lang.Boolean | getForceOutputIsDevice() | Return whether the output device has been forced and if so, to what status. |
long | getFullReadCount() | Returns the number of full reads performed (statistic). |
java.util.ArrayList<BlockRange> | getIncompleteSectorsList() | Returns a list of incomplete sectors. |
long | getInitialBlockSize() | Returns the block size in bytes that the copy will start from. |
BlockDevice | getInputDevice() | Returns the input device (default null). |
java.lang.String | getInputFilename() | Returns the input filename (default null). |
long | getInputOffset() | Get the offset into the input at which the data starts. |
int | getInputOverrideSectorSize() | Returns the overridden sector size of the input. |
long | getInputOverrideSize() | Returns the overridden size of the input in bytes. |
int | getInputSectorSizeLog() | Returns the base 2 logarithm of the sector size of the input (may not be known until the copy starts). |
long | getInputSize() | Returns the size of the input (may not be known until the copy starts). |
long | getInputSizeSectors() | Returns the size of the input in sectors (may not be known until the copy starts). |
ReadWriteDevice | getMapFile() | Returns the map file (default null). |
java.lang.String | getMapFilename() | Returns the map filename (default null). |
boolean | getMapToMemory() | Returns whether the map should be constructed in memory instead of a file. |
int | getMaxLevelNumber() | Returns the largest level number (the level at which the block size is equal to 1 sector). |
static BlockDevice | getNewBlockDevice(boolean forceNormal) | Returns an uninitialized BlockDevice that may be architecture/OS dependent. |
int | getNumChildrenPerNode() | Returns the number of children per node. |
long | getNumLeaves() | Returns the number of virtual sectors in the input device. |
int | getNumLeavesLog() | Returns the base 2 logarithm of the number of leaves in the map. |
BlockDevice | getOutputDevice() | Returns the output device (default null). |
java.lang.String | getOutputFilename() | Returns the output filename (default null). |
long | getOutputOffset() | Get the offset into the output at which writing should begin. |
long | getPartialReadCount() | Returns the number of partial reads performed (statistic). |
boolean | getPreallocateOutput() | Returns whether the copier should preallocate the output before starting the copy. |
int | getStartLevel() | Returns the level at which the copy will start |
boolean | getWriteToSystemOut() | Returns whether status messages will be written to System.out. |
boolean | isInputDevice() | Returns whether this object considers the input to be a device. |
boolean | isOutputDevice() | Returns whether this object considers the output to be a device. |
static void | main(java.lang.String[] argv) | Command-line interface for the program. |
void | open() | Opens the input, output, and map files/devices as needed and gathers basic information from them. |
void | openMapOnly() | Opens the map and prepares it to be read or cleared. |
ProgressiveCopier.ParamInfo | parseArgs(java.lang.String[] argv) | Parses command-line arguments. |
void | performCopy() | Performs the progressive copy. |
void | setBufferSize(int _bufferSize) | Sets the size of the read buffer (maximum size of read request performed). |
void | setCallback(java.lang.Runnable _callback) | Sets the Runnable object to be called. |
void | setCallbackInterval(long _callbackInterval) | Sets the callback interval in milliseconds. |
void | setChildrenPerNode(int _childrenPerNode) | Sets the number of children per node. |
void | setChildrenPerNodeLog(int _childrenPerNodeLog) | Similar to setChildrenPerNode, but specifying the base 2 log of childrenPerNode instead. |
void | setEndBlockSize(long _endBlockSize) | Sets the smallest block size in bytes that the copier will attempt to read. |
void | setInitialBlockSize(long _initialBlockSize) | Sets the initial block size the copier will start at. |
void | setInputDevice(BlockDevice _inputDevice) | Sets the input device. |
void | setInputFilename(java.lang.String _inputFilename) | Sets the input filename. |
void | setInputOffset(long _inputOffset) | Set the offset into the input at which the data starts. |
void | setInputOverrideSectorSize(int _inputOverrideSectorSize) | Overrides the input's sector size. |
void | setInputOverrideSize(long _inputOverrideSize) | Sets the size of the input in bytes. |
void | setMapFile(ReadWriteDevice _mapFile) | Sets the map file. |
void | setMapFilename(java.lang.String _mapFilename) | Sets the map filename. |
void | setMapToMemory(boolean _mapToMemory) | Sets whether the map should be constructed in memory instead of a file. |
void | setOutputDevice(BlockDevice _outputDevice) | Sets the output device. |
void | setOutputFilename(java.lang.String _outputFilename) | Sets the output filename. |
void | setOutputOffset(long _outputOffset) | Set the offset into the output at which writing should begin. |
void | setPreallocateOutput(boolean _preallocateOutput) | Sets whether the copier should preallocate the output before starting the copy. |
void | setWriteToSystemOut(boolean _writeToSystemOut) | Sets whether status messages will be written to System.out. |
void | unforceInputIsDevice() | Lets this object determine if the input is a device (default setting). |
void | unforceOutputIsDevice() | Lets this object determine if the output is a device (default setting). |
void | writeIncompleteSectorsList(java.lang.String outputIncompleteSectorsFilename) | Writes a list of incomplete sectors to the console or specified file. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int STATUS_UNREAD
public static final int STATUS_ERROR
public static final int STATUS_PARTIAL
public static final int STATUS_COMPLETE
public static final int CURRENT_MAP_VERSION
public static final int MAPFILE_HEADER_SIZE
Constructor Detail |
---|
public ProgressiveCopier()
Method Detail |
---|
public boolean getWriteToSystemOut()
public void setWriteToSystemOut(boolean _writeToSystemOut)
public java.lang.String getInputFilename()
public void setInputFilename(java.lang.String _inputFilename)
public BlockDevice getInputDevice()
public void setInputDevice(BlockDevice _inputDevice)
public void forceInputIsDevice(boolean isDevice)
Parameters: isDevice - If true, then treat the input as a device.
public java.lang.Boolean getForceInputIsDevice()
public void unforceInputIsDevice()
public boolean isInputDevice()
public int getInputOverrideSectorSize()
public void setInputOverrideSectorSize(int _inputOverrideSectorSize)
public long getInputOverrideSize()
public void setInputOverrideSize(long _inputOverrideSize)
public long getInitialBlockSize()
public void setInitialBlockSize(long _initialBlockSize)
public long getEndBlockSize()
public void setEndBlockSize(long _endBlockSize)
public java.lang.String getOutputFilename()
public void setOutputFilename(java.lang.String _outputFilename)
public BlockDevice getOutputDevice()
public void setOutputDevice(BlockDevice _outputDevice)
public void forceOutputIsDevice(boolean isDevice)
Parameters: isDevice - If true, then treat the output as a device.
public java.lang.Boolean getForceOutputIsDevice()
public void unforceOutputIsDevice()
public boolean isOutputDevice()
public boolean getPreallocateOutput()
public void setPreallocateOutput(boolean _preallocateOutput)
public java.lang.String getMapFilename()
public void setMapFilename(java.lang.String _mapFilename)
public ReadWriteDevice getMapFile()
public void setMapFile(ReadWriteDevice _mapFile)
public boolean getMapToMemory()
public void setMapToMemory(boolean _mapToMemory)
public long getCallbackInterval()
public void setCallbackInterval(long _callbackInterval)
public java.lang.Runnable getCallback()
public void setCallback(java.lang.Runnable _callback)
public int getNumChildrenPerNode()
public void setChildrenPerNode(int _childrenPerNode) throws ProgressiveCopier.ParamException
Sets the number of children per node.
For example, if the sector size of a 1MB device is 512 bytes and the childrenPerNode is 4, the smallest block that covers the entire input device is 512 * 4 * 4 * 4 * 4 * 4 * 4 = 2MB. If the childrenPerNode is set to 32, the smallest block that covers the entire input device is 512 * 32 * 32 * 32 = 16MB, which would result in a much larger map.
A small value increases the number of levels and thus increases the number of times a read attempt may be performed on a bad block, potentially to excessive levels.
Throws: ProgressiveCopier.ParamException
public long getInputOffset()
public void setInputOffset(long _inputOffset)
public long getOutputOffset()
public void setOutputOffset(long _outputOffset)
public void setChildrenPerNodeLog(int _childrenPerNodeLog)
Similar to setChildrenPerNode, but specifying the base 2 log of childrenPerNode instead.
public void setBufferSize(int _bufferSize) throws ProgressiveCopier.ParamException
Throws: ProgressiveCopier.ParamException
public int getBufferSize()
public int getCurLevel()
public long getCurBlock()
public long getFullReadCount()
public long getPartialReadCount()
public long getEmptyReadCount()
public long getBadReadCount()
public long getInputSize()
public long getInputSizeSectors()
public int getInputSectorSizeLog()
public int getStartLevel()
public int getEndLevel()
public long getNumLeaves()
public int getNumLeavesLog()
public int getMaxLevelNumber()
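public void emulateBadSectors(java.io.Reader r) throws java.lang.Exception
Reads and appends a list of bad sector runs from the given Reader. The input is wrapped in a BadBlockEmulator to prevent the sectors from being read from.
The format of sector lists is specified in the class description.
Throws: java.lang.Exception
public void emulateBadSectors(java.util.List<BlockRange> listToAppend)
Appends the given list of blocks to the bad sector list. The input is wrapped in a BadBlockEmulator to prevent the sectors from being read from.
public void addEmulatedBadSector(BlockRange position)
Appends the given block to the bad sector list. The input is wrapped in a BadBlockEmulator to prevent the sectors from being read from.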
public void clearEmulateBadSectorList()
public void open() throws java.lang.Exception
Throws: java.lang.Exception
public void openMapOnly() throws java.lang.Exception
Throws: java.lang.Exception
public void performCopy() throws java.io.IOException
Throws: java.io.IOException
public void close()
public void writeIncompleteSectorsList(java.lang.String outputIncompleteSectorsFilename) throws java.io.IOException
Parameters: outputIncompleteSectorsFilename - Name of the file to write the list of incomplete sectors. If the name is null, the list is written to System.out.
Throws: java.io.IOException
public java.util.ArrayList<BlockRange> getIncompleteSectorsList() throws java.io.IOException
Throws: java.io.IOException
public void clearMapBlocks(int mapClearStartingLevel, long mapClearBlockStart, long mapClearBlockCount) throws java.io.IOException
Parameters:
mapClearStartingLevel - The level at which to start the clear operation. Level 0 is the top level, containing the single block that encompasses the entire drive.
mapClearBlockStart - The block at which to start clearing in the specified level.
mapClearBlockCount - The number of blocks to clear in the specified level. If less than zero, all remaining blocks in the level beginning at mapClearBlockStart are cleared.
Throws: java.io.IOException
public void clearMapSectors(long clearSectorStart, long clearSectorCount) throws java.io.IOException
Parameters:
clearSectorStart - The sector at which to start clearing map entries.
clearSectorCount - The number of sector map entries to clear. If the count is negative, it will be computed as all remaining sectors in the map.
Throws: java.io.IOException
public long calcNumBlocksForLevel(int levelNumber)
public long calcBlockSizeInSectors(int levelNumber)
public long calcSectorNumber(int levelNumber, long blockNumber)
public long calcBlockNumber(int levelNumber, long sectorNumber)
public static BlockDevice getNewBlockDevice(boolean forceNormal) throws java.lang.Exception
Returns an uninitialized BlockDevice that may be architecture/OS dependent.
Throws: java.lang.Exception
public static int checkPower2(long n)
Parameters: n - The number to be checked.
public ProgressiveCopier.ParamInfo parseArgs(java.lang.String[] argv)
public static void main(java.lang.String[] argv)