ZIP Up Files from the Command Line
One of the most useful tools in any computer user's toolbox is an archiver/compression tool. Examples include 7-Zip, WinZip, PKZip, and WinRAR for the Windows platform, and gzip/tar for the Linux platform. Most people are familiar with the graphical user interfaces that these tools provide, dragging and dropping files into and out of archives. But did you know that all of these tools also have commandline versions available? These are great for use in batch files to automate the process of compressing, or uncompressing, files. This makes it possible, with one click of a mouse button, to do things like:
- Make a quick backup for safekeeping of the 20 most critical files and folders, no matter that they are scattered across two hard drives in a dozen different locations.
- Copy the files for any current projects onto a USB drive, to take home for the weekend, or take on the road.
- “Snapshot” a set of files, freezing them in time — just in case it is ever necessary to revert them back to the way they were.
- Archive old files before deleting them, as a simple precaution.
(Not) Automating Extractions: As you can see, most of the cases that call for automation are ones where original files are being compressed and stored into an archive, rather than files being extracted from an archive. For file extraction, it’s typically easiest to use the GUI: double-click on the archive file and then click-and-drag the desired files out of the archive. So, the examples below will focus on the former, but don’t hesitate to post a comment should you need to see examples of the latter.
7-Zip: For this example, I will use 7-Zip, a free open source archiver for the Windows platform (see Quick Link: 7-Zip 4.47 in Beta), but the concepts are the same no matter which tool you prefer. Only the details of syntax will differ.
When you run the 7-Zip installer, it actually installs several versions of itself as separate executables. The executable that all of the shortcuts link to is 7zFM.exe, known as the “7-Zip File Manager.” That’s the one that presents a graphical user interface. There are also a couple of commandline versions of the executable:
7z.exe — This version of the commandline program is fully featured, because it utilizes all of the plug-in modules that are included with the 7-Zip package. For example, if you want to be able to unarchive a RAR file, then you will need to use this version. Also, if you want to create an archive that is self-extracting, then you’ll need to use this version.
7za.exe — This is the standalone version of the commandline program. It only supports certain built-in compression formats (7z, zip, gzip, bzip2, Z and tar). 7za.exe doesn’t depend on any other files besides the EXE itself. So, this version is particularly handy for carrying around on a USB drive, or for any other need where it is nice to only have to worry about the one EXE file being in place.
    C:
    CD \books
    7z.exe u -tzip classics.zip classics -r
The general syntax for calling 7-Zip from the command line is: the name of the 7-Zip executable (either 7z.exe or 7za.exe), followed by one of seven command letters (e.g. “a” for adding to an archive, “u” for updating an archive, “x” for extracting from an archive), followed by the name of the archive file to be accessed (or created), followed by whatever additional information the command requires. The “a” and “u” commands, for example, need the names of the files and/or folders that are to be archived.
There are also 20 option switches available that customize how the commands operate. For example, “-r” tells 7-Zip to recurse through all subdirectories, thus including all children, grandchildren, etc. of the specified folder(s).
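As a rough illustration of that syntax, the example command shown earlier can be read piece by piece. The REM comments are annotations added here; the archive and folder names are the ones from the example.

```batch
REM  7z.exe        the executable (the full-featured commandline version)
REM  u             command letter: update the archive
REM  -tzip         switch: use the zip format instead of the default 7z
REM  classics.zip  the archive to update (created if it does not exist yet)
REM  classics      the folder whose files are to be archived
REM  -r            switch: recurse into subdirectories
7z.exe u -tzip classics.zip classics -r
```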
Here are the seven commands that are available in 7-Zip:
| a | Add - create a new archive, or add files to an existing archive |
| d | Delete - remove files from an existing archive |
| e | Extract - unarchive files into a single directory, ignoring the folder paths stored in the archive |
| l | List - display the contents of an archive |
| t | Test - validate the integrity of an archive |
| u | Update - replace older files in an existing archive and add new ones |
| x | Extract with full paths - same as “e”, except that the folder structure stored in the archive is recreated |
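The read-only and extraction commands can be sketched in the same way. This assumes an archive named research.7z in the current directory; the destination folders C:\restore and C:\restore\flat are hypothetical names chosen for illustration.

```batch
REM List the contents of the archive without extracting anything.
7z.exe l research.7z

REM Test the integrity of the archive.
7z.exe t research.7z

REM Extract everything into C:\restore, recreating the stored folder structure.
REM Note that there is no space between -o and the directory name.
7z.exe x research.7z -oC:\restore

REM Extract only the .txt files into one flat folder, ignoring stored paths.
7z.exe e research.7z -oC:\restore\flat *.txt -r
```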
The difference between Add and Update is subtle. If the archive file you are creating does not already exist, then there is no practical difference between them. The difference matters only when the archive file already exists. In that case, both commands look for existing files in the archive that match the names (and paths) of the incoming files. With Add, every incoming file is compressed again and replaces its match in the archive. With Update, a matching file is replaced only if the copy on disk is newer than the copy in the archive; files that have not changed are left alone, which makes Update the faster choice for repeated backups. (These are the default behaviors. The -u switch, described further down, can change them.)
Making a Quick Backup: So, let’s say that there are two folders on your C: drive called C:\research and C:\papers that contain critical files that you are constantly editing. So, at the end of every day, you would like to be able to double-click an icon on your desktop which will archive all of the files in those folders. To do this, create a batch file that contains the following commands:
    C:
    CD \backups
    7z.exe a research.7z C:\research\* -r
    7z.exe a papers.7z C:\papers\* -r
    pause
My personal convention is to place such batch files in C:\sys\scripts (where “sys” stands for “system-level stuff”). Thus, this batch file would then be named something like “C:\sys\scripts\nightly_bu.bat”. To run the batch file, navigate to C:\sys\scripts (in the Windows Explorer), and then double-click on the name of the BAT file. (Or, create a shortcut to it on your desktop for convenience, and double-click on that.)
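If the backup is meant to run unattended, it may be worth checking whether 7-Zip actually succeeded. The following is only a sketch of how nightly_bu.bat could do that, assuming (as the batch file above does) that 7z.exe can be found on the PATH and that C:\backups exists. The labels "failed" and "done" are names made up for this example.

```batch
@echo off
C:
CD \backups

7z.exe a research.7z C:\research\* -r
REM 7-Zip exits with 0 on success and 1 on a non-fatal warning
REM (for example, a file that was locked). 2 or higher is an error.
REM "if errorlevel 2" is true for any exit code of 2 or more.
if errorlevel 2 goto failed

7z.exe a papers.7z C:\papers\* -r
if errorlevel 2 goto failed

echo Backup finished.
goto done

:failed
echo Backup FAILED. Check the messages above.

:done
pause
```

The final pause keeps the window open after a double-click, so the result stays visible until a key is pressed.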
Excluding Files: Let’s say that the first time you run your batch file, it takes longer than you would like. Plus, you see that the research.7z file is much larger than you were expecting. Taking a closer look at the contents of C:\research\, you realize that there are quite a number of *.PDF documents that don’t really need to be backed up, since they never change and can easily be obtained again in case they are lost. Also, you use a text editor that creates backup copies of the files that are edited, by tacking on a “.bak” extension, and those don’t need to be archived either. To exclude such *.PDF and *.BAK files, change the batch file to look like this:
    C:
    CD \backups
    7z.exe a research.7z C:\research\* -r -x!*.pdf -x!*.bak
    7z.exe a papers.7z C:\papers\* -r -x!*.pdf -x!*.bak
    pause
-x is the file exclusion option switch. It is followed by either an exclamation point (!) or an at-sign (@). An exclamation point means that what follows is a wildcard pattern to be compared against the filenames found. An at-sign means that what follows is the name of a file that contains multiple wildcard patterns to be considered. So, in this case we actually had a choice. Instead of giving the -x switch twice, once for the PDF files and once for the BAK files, we could have used something like “-x@exclude.txt”, where exclude.txt is a file that looks like this:
    *.pdf
    *.bak
(Note: 7-Zip is unique in requiring the exclamation point before a literal wildcard. Usually, Windows programs that accept the @ notation for an option’s value will assume that the absence of an at-sign means that the option’s value is a literal.)
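The list-file alternative could be sketched as follows. For the sake of a self-contained example, this version writes exclude.txt itself each time it runs. That is an assumption of the sketch; the list file could just as well be created once by hand in a text editor.

```batch
C:
CD \backups

REM Write one wildcard pattern per line into the list file.
echo *.pdf> exclude.txt
echo *.bak>> exclude.txt

REM -x@ tells 7-Zip to read the exclusion patterns from the list file.
7z.exe a research.7z C:\research\* -r -x@exclude.txt
7z.exe a papers.7z C:\papers\* -r -x@exclude.txt
pause
```

With a list file, adding another pattern later means editing one text file rather than every command line in the batch file.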
Other Option Switches: Some of 7-Zip’s more interesting option switches are:
| -x | Exclude file(s), as shown above |
| -t | The type of archive to create (-t7z, -tzip, -tgzip, -tbzip2 or -ttar). -t7z is the default. |
| -sfx | Create a self-extracting archive |
| -mx=9 | This can be any number from 0 to 9, where 0 means no compression (just store the files), and 9 means maximum compression (takes longer). -mx=5 is the default, a compromise between the amount of compression obtained and the time required to perform the compression. |
| -o | Specifies the output directory (for when extracting). The default is to use the current directory. |
| -u | Update options. This switch works in conjunction with the add, delete, and update commands to determine conflict resolution. For example, it decides what happens when a file being added already exists in the archive and the timestamp on the source file is older than the timestamp in the archive: should the file in the archive be left alone, or overwritten? |
| -v | Create Volumes - This switch allows you to specify the maximum size for an archive file. If the archive would be bigger than that, 7-Zip will automatically split it into multiple volumes. This ensures that the archive files can fit on whatever storage media you have at hand. |
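A few of these switches in use, as a sketch. The source folders are the ones from the backup example; the output names papers.zip and papers.exe, the 700 MB volume size, and the C:\restore folder are arbitrary choices made for illustration.

```batch
REM Zip format with maximum compression.
7z.exe a -tzip -mx=9 papers.zip C:\papers\* -r

REM Self-extracting archive. The result is an EXE file.
7z.exe a -sfx papers.exe C:\papers\* -r

REM Split the archive into volumes of at most 700 MB each.
REM The volumes are named research.7z.001, research.7z.002, and so on.
7z.exe a -v700m research.7z C:\research\* -r

REM To extract a split archive, point 7-Zip at the first volume.
REM -o sets the output directory (no space after -o).
7z.exe x research.7z.001 -oC:\restore
```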