Perfectly Cloning an Ext4 Linux Partition

The built-in commands in Linux are simply awesome. Advanced partition maneuvers in Windows are difficult and often require expensive third-party software to simplify the process. That is not the case in Linux.

The dd command in Linux can clone one drive to another or produce an image file of a drive, all with a few simple commands from the command line.

Step Zero: Unmount the Involved Drive(s)

Anyone who has done anything like this in Windows knows that they often have to reboot their computer just to start playing around with partitions, because the work usually needs an environment where the drives are simply not mounted. One of the pitfalls on Linux, especially when you only have the command line, is that the line between a fully functional environment with mounted drives and a rescue/recovery mode with unmounted drives gets fuzzy, so you really need to check each device by hand to see whether it is mounted. Starting in recovery mode is best for anything like this, especially if you are cloning the drive your OS itself runs from.
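
Checking by hand can be as simple as looking the device up in /proc/mounts. Here is a minimal sketch of that check; the function name is_mounted is my own invention, and the match is a plain prefix match, so asking about /dev/sdb also catches /dev/sdb1. That errs on the cautious side, since /dev/sda would also match a hypothetical /dev/sdaa.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE, or anything whose name starts with DEVICE
# (such as a partition of it), appears in the mounts table.
# MOUNTS_FILE defaults to /proc/mounts; the second argument exists
# only so the function can be tried against a sample file.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # The first field of every line in /proc/mounts is the device node.
    awk -v d="$dev" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"
}
```

Used as "is_mounted /dev/sdb && echo still mounted", it gives a quick yes or no before you let dd loose on the device.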

Cloning a Drive to an Image on a File System

dd if=/dev/sdb bs=4M | pv -s 30G | dd of=/mnt/sda/mydrive.img bs=4M

In this case I am cloning the drive found at /dev/sdb into an image located at /mnt/sda/mydrive.img. One thing worth noting is that the destination drive is mounted, because we are storing the image inside its file system. The drive at /dev/sdb has been unmounted. Be aware that dd will not stop you from reading a mounted drive; it copies whatever it finds, and if the file system changes during the copy the image can come out inconsistent. Unmount the origin drive first.

"What's that stuff in the middle?" This is a good question. On its own, the dd command gives no feedback while it runs... at all. The command starts, the cursor blinks seemingly endlessly, and only after what feels like forever does dd finally output something when it is finished. (Recent versions of GNU dd accept status=progress, but that shows no ETA.) Wouldn't it be nice to have an ETA and the rate at which the data is being copied? This is where pv comes into play. Piping the data through pv produces a nice display that shows the rate and an estimated time to completion. You do need to specify the total amount of data expected to pass through the pipe to get a proper ETA, hence the 30G in this example. If you leave out the -s 30G, pv will still show the amount transferred and the rate; it just won't be able to estimate when it should be finished.

"Ok... And what is that other stuff?" Also a good question. The 4M is the block size: how much data dd reads and writes in a single operation. It is not a rate, and a higher number does not automatically mean a faster copy. It also does not have to match the block size of the file system on the volume, which is normally 4096 B (that is 4 KiB, not 4 MB). If you do not specify it, dd falls back to its default of 512 bytes, and if you have a lot of data the copy could take hours or even days. When you specify 4M it isn't uncommon to see the data transfer rate go upwards of 80MB/s.

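To convince yourself that the block size changes only how the data is moved and not what ends up in the copy, you can try it on a scratch file instead of a real drive. This is just a sketch: the temporary files and variable names are made up for the demonstration, and it assumes a dd that understands the M suffix, as GNU dd does.

```shell
# Make a 1 MiB scratch file and copy it with two different block sizes.
src=$(mktemp); small=$(mktemp); large=$(mktemp)
dd if=/dev/urandom of="$src" bs=1024 count=1024 2>/dev/null

dd if="$src" of="$small" bs=512 2>/dev/null   # dd's default block size
dd if="$src" of="$large" bs=4M  2>/dev/null   # the block size used in this article

# The two copies should be byte-for-byte identical.
if cmp -s "$small" "$large"; then result=identical; else result=different; fi
echo "copies are $result"

# What differs is how many blocks dd has to shuffle to move the data.
size=$(wc -c < "$src")
echo "512-byte blocks needed: $(( size / 512 ))"
echo "4 MiB blocks needed:    $(( (size + 4194303) / 4194304 ))"

rm -f "$src" "$small" "$large"
```

For the 1 MiB file that is 2048 small blocks against a single large one. Scale that up to a 30GB drive and the gap in the number of read and write calls is where the time goes.
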
Cloning an Image to a Drive

dd if=/mnt/sda/mydrive.img bs=4M | pv -s 30G | dd of=/dev/sdb bs=4M

This command clones the contents of an image with an expected size of 30GB to the drive at /dev/sdb, overwriting whatever the drive held before.
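
The expected size does not have to be typed in by hand. As a rough sketch, a small helper can read it from the image file, or from the drive when the source is a block device. The name source_size is hypothetical, and the block-device branch assumes blockdev from util-linux is installed and that you are root.

```shell
# source_size PATH: print the size of PATH in bytes.
source_size() {
    if [ -b "$1" ]; then
        # Block device: ask the kernel for its size (usually needs root).
        blockdev --getsize64 "$1"
    else
        # Regular file such as an image: count its bytes.
        wc -c < "$1" | tr -d ' '
    fi
}

# Intended use with the command above (not run here):
#   dd if=/mnt/sda/mydrive.img bs=4M | pv -s "$(source_size /mnt/sda/mydrive.img)" | dd of=/dev/sdb bs=4M
```

pv takes a plain byte count for -s, so the output can be passed straight through.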

Cloning a Drive to a Drive

dd if=/dev/sda bs=4M | pv -s 30G | dd of=/dev/sdb bs=4M

This command clones the contents of the drive at /dev/sda, with an expected size of 30GB, to the drive at /dev/sdb. This command would likely need to be run from recovery mode.
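
Once the copy is done it is worth checking that the target really starts with an exact copy of the source. The target drive may be larger than the source, so the sketch below compares only as many bytes as the source holds. The function name verify_clone is my own, and as before the block-device branch assumes blockdev is available and that you are root.

```shell
# verify_clone SRC DST: succeed if DST begins with an exact copy of SRC.
verify_clone() {
    if [ -b "$1" ]; then
        # Source is a drive: ask the kernel how many bytes it holds.
        bytes=$(blockdev --getsize64 "$1")
    else
        # Source is an image file: count its bytes.
        bytes=$(wc -c < "$1" | tr -d ' ')
    fi
    # Compare the first $bytes bytes of DST against all of SRC.
    head -c "$bytes" "$2" | cmp -s - "$1"
}

# Intended use after the clone above (not run here):
#   verify_clone /dev/sda /dev/sdb && echo "clone verified"
```

This reads both drives from start to finish, so expect it to take about as long as the clone itself.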

Fun with SSH

You can run the dd command in concert with SSH to copy the contents of a remote server's drive directly into an image on the local machine.

ssh root@ "dd if=/dev/sda " | dd of=/home/archive/linode.img

This code snippet is from: