Moving large files? Try split first.

I often need to move large files between servers, usually between different data centers. If I have a 100 GB file to move (SQL backups, for example), it can take a VERY long time to transfer the data via FTP (rsync is not an option in this example).

I find that a single file moving via FTP might transfer at 250 KB per second. But it is possible to transfer multiple files via FTP at the same time, each at 250 KB per second (seems strange, right?).

To speed up the transfer, use the split command on Unix to break the 100 GB file into pieces, transfer the pieces to the destination, and then rebuild them into the original file. Here is an example:

split --bytes=500m database.sql.gz database_

This splits the large file into 500 MB pieces. The file being processed is database.sql.gz, and the result is pieces named database_aa, database_ab, and so on.
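
Before transferring anything, it can be worth recording a checksum of the original so you can verify the rebuilt file later, and taking a quick look at the pieces. This is just a sketch, assuming md5sum is available (any checksum tool will do):

# record a checksum of the original for later verification
md5sum database.sql.gz > database.sql.gz.md5

# sanity-check the pieces that split created
ls -lh database_*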

Then transfer the collection of pieces via FTP in parallel; the process completes much faster than transferring a single large file.
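
How you run the parallel uploads is up to you. One option is to drive curl from a small shell pipeline; this is only a sketch, the host, directory, and credentials below are placeholders, and it assumes curl was built with FTP support and that your xargs supports -P:

# upload the pieces four at a time to a placeholder FTP server
ls database_* | xargs -I {} -P 4 curl -sS -T {} ftp://ftp.example.com/backups/ --user username:password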

Once all the pieces are at the destination, you need to put them back together. To do this, use cat.

cat database_* > database.sql.gz
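
If you saved a checksum before splitting (and copied the .md5 file across along with the pieces), you can confirm the rebuilt file matches the original. Again, this assumes md5sum is available on the destination:

# compare the rebuilt file against the checksum recorded before the split
md5sum -c database.sql.gz.md5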

If you are looking to combine the files back into one under Windows, try something like this from the command prompt:

copy /b database_* database.sql.gz

That’s it. I now have my original file restored and ready for processing.
