Splitting files with dd

We have an ESXi box hosted with Rackspace. It took a bit of pushing to get them to install ESXi in the first place, as they tried to get us to use their cloud offering. But this is a staging environment and we want something dedicated on hardware we control so we can get an idea of performance without other people’s workloads muddying the water.

Anyway, I’ve been having a bit of fun getting our server template uploaded to it, which is only 11GB compressed – not exactly large, but apparently large enough to be inconvenient.

In my experience the datastore upload tool in the vSphere client frequently fails on large files. In this case I was getting the “Failed to log into NFC server” error, which is probably due to a requisite port not being open. I didn’t like that tool anyway, so I moved on.

The trusty-but-slow scp method was also failing, however. Uploads would start but consistently stall at about the 1GB mark. I'm not sure if it’s a buffer or something getting filled in dropbear (which is designed to be a lightweight ssh server and really shouldn’t need to deal with files this large), but Googling didn’t turn up much.

So I went down the track of splitting up the file into smaller chunks, easily done using the split tool. Except I didn’t know about the split tool and used dd.

So if you’re reading this you probably want to use split, but if for some reason you need to do this in an environment that doesn’t have split and does have dd (like ESXi), this could help:


#!/bin/sh
# Nothing here is bash-specific, so plain sh is enough

FILE="$1"

# How big we want the chunks to be, in bytes
CHUNKSIZE=$(( 512 * 1024 * 1024 ))

# Block size for dd, in bytes
BS=$(( 8 * 1024 ))

# Convert CHUNKSIZE to blocks
CHUNKSIZE=$(( $CHUNKSIZE / $BS ))

# Skip value for dd, we start at 0
SKIP=0

# Calculate total size of file in blocks, rounding up so that a
# trailing partial block still counts
FSIZE=$(stat -c%s "$FILE")
SIZE=$(( ($FSIZE + $BS - 1) / $BS ))

# Loop counter for file name
i=0

echo "Using chunks of $CHUNKSIZE blocks"
echo "Size is $FSIZE bytes = $SIZE blocks"

# -lt rather than -le: stop once every block has been copied, so a file
# that is an exact multiple of the chunk size gets no empty final part
while [ "$SKIP" -lt "$SIZE" ]
do
    NEWFILE="$FILE.part$i"
    i=$(( $i + 1 ))

    echo "Creating file $NEWFILE starting after block $SKIP"
    dd if="$FILE" of="$NEWFILE" bs="$BS" count="$CHUNKSIZE" skip="$SKIP"

    SKIP=$(( $SKIP + $CHUNKSIZE ))
done
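
For comparison, here is roughly what the same job looks like on a machine that does have split. This is only a sketch: the file name demo.bin and the tiny sizes are made up so it runs in a moment, and for the real template the size would be something more like 512m.

```shell
#!/bin/sh
set -e
WORK=$(mktemp -d)
cd "$WORK"

# Stand-in for the real template: 5000 bytes of random data
dd if=/dev/urandom of=demo.bin bs=1000 count=5 2>/dev/null

# Cut it into pieces of at most 2048 bytes each.
# The last argument is the prefix for the output names, and split
# appends aa, ab, ac, ... to it.
split -b 2048 demo.bin demo.bin.part.

# The alphabetic suffixes sort correctly under a plain glob,
# so putting the file back together needs no hand-built list
cat demo.bin.part.* > rebuilt.bin

cmp demo.bin rebuilt.bin && echo "rebuilt file matches"
```

The alphabetic suffixes are the main convenience here: unlike part2 and part10, they always sort in the order they were written.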

Afterwards:

scp ./*.part* user@host:/vmfs/datastore/
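
Given that the whole exercise exists because transfers kept stalling, it may be worth recording a checksum for each part before uploading and checking them at the other end. A minimal sketch, assuming md5sum is available on both machines (worth checking on ESXi before relying on it); the demo files and the parts.md5 name are made up.

```shell
#!/bin/sh
set -e
WORK=$(mktemp -d)
cd "$WORK"

# Stand-ins for the real parts
printf 'first chunk'  > demo.tgz.part0
printf 'second chunk' > demo.tgz.part1

# On the sending side: write one checksum line per part
md5sum demo.tgz.part* > parts.md5

# ...upload the parts and parts.md5 together...

# On the receiving side: re-hash each file and compare with the list.
# md5sum -c exits non-zero if any listed part is missing or differs.
if md5sum -c parts.md5
then
    echo "all parts intact"
else
    echo "at least one part is damaged, upload it again"
fi
```

A failed check names the offending part, so only that one chunk needs to be sent again.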

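Then at the other end you simply concatenate them together. I generated a list of files with `ls -tr1 *.part*` and pasted that into a script. The order is critical, and a plain glob gets it wrong because part10 sorts before part2. Sorting by time with the oldest first (-t sorts by modification time, -r reverses it) gives the correct order, as long as the timestamps still reflect the order in which the parts were created.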

#!/bin/sh

#FLIST=`ls -tr1 *.part*`
FLIST="devbox.tgz.part0 devbox.tgz.part1 devbox.tgz.part2 devbox.tgz.part3 devbox.tgz.part4 devbox.tgz.part5 devbox.tgz.part6 devbox.tgz.part7 devbox.tgz.part8 devbox.tgz.part9 devbox.tgz.part10 devbox.tgz.part11 devbox.tgz.part12 devbox.tgz.part13 devbox.tgz.part14 devbox.tgz.part15 devbox.tgz.part16 devbox.tgz.part17 devbox.tgz.part18 devbox.tgz.part19 devbox.tgz.part20 devbox.tgz.part21"

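OUTPUT="output.tgz"

# Start from an empty file so a re-run doesn't append to an earlier attempt
: > "$OUTPUT"

# $FLIST is left unquoted on purpose so it splits into one name per word
for F in $FLIST
do
    cat "$F" >> "$OUTPUT"
done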
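
Pasting the file list by hand works, but the order can also be derived from the part numbers themselves. Here is a minimal sketch that counts up from part0 until it runs out of files; the demo.tgz name and the dummy part contents are made up for illustration.

```shell
#!/bin/sh
set -e
WORK=$(mktemp -d)
cd "$WORK"

# Twelve dummy parts, each holding its own index, so that part10 and
# part11 exist and a plain glob would put them before part2
n=0
while [ "$n" -lt 12 ]
do
    echo "$n" > "demo.tgz.part$n"
    n=$(( $n + 1 ))
done

BASE="demo.tgz"
OUTPUT="output.tgz"

# Start from an empty output file
: > "$OUTPUT"

# Append part0, part1, part2, ... in numeric order until one is missing
i=0
while [ -f "$BASE.part$i" ]
do
    cat "$BASE.part$i" >> "$OUTPUT"
    i=$(( $i + 1 ))
done

echo "Joined $i parts into $OUTPUT"
```

A part missing from the middle would end the loop early, so it is worth comparing the reported count against the number of parts that were uploaded.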
