Mastering the tar command: A comprehensive guide with 15+ examples

Hello friend! Have you ever struggled with archiving or compressing files on your Linux system? If yes, then this guide is for you. We will explore the popular tar utility through 15+ detailed examples.

Whether you are a system administrator backing up configuration files or a developer exchanging codebases, tar is a handy tool for the job.

Let me walk you through everything you need to know to become a tar expert.

A little bit of history

The tar command dates back to the era of tape drives, which Unix systems used in the 1970s to back up data. Tar (short for "tape archive") was developed to combine multiple files into one continuous archive on tape.

Fun fact: Later adaptations of tar could read archives from stdin and write to stdout. This allowed piping archives across networks rather than just tape devices.
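Here is a minimal sketch of that stdin/stdout mode, copying a directory tree through a pipe. The directory and file names are placeholders made up for the demo, and the scratch directory is removed when the shell exits.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir -p src/docs dest
echo "hello" > src/docs/note.txt

# "-" as the archive name means stdout when creating and stdin when extracting
tar cf - src | tar xf - -C dest

ls dest/src/docs
```

The same pipe can be routed through a network tool such as ssh, so the archive never has to exist as a file on either side.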

Over the decades, tar evolved into the de facto standard for archiving files in the Unix world. It is available across the different Unix implementations, and some projects maintain their own variants, such as BSD tar.

Linux consistently ranks among the most widely used operating systems in Stack Overflow's annual developer surveys, and you will find tar pre-installed on virtually every modern Linux distribution.

As a testament to its popularity, tar has over 100 command line options! We will explore some common ones today.

Creating your first archive

Let's start by archiving a couple of files into a tar file.

$ ls
file1.txt file2.txt

$ tar cvf my-archive.tar file1.txt file2.txt
file1.txt
file2.txt

Here:

  • c – Creates an archive
  • v – Verbose output listing files
  • f – Archive filename

This creates a tar file named my-archive.tar containing the two files. Pretty easy, right?

Note that tar on its own does not compress anything: it stores each file as-is behind a small header. The contents can be extracted again without any loss.
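To see that a plain tar file saves no space, here is a small sketch that compares the raw file sizes with the archive size. It assumes a standard tar (GNU or BSD), and the file contents are invented for the demo.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'first file\n'  > file1.txt
printf 'second file\n' > file2.txt
tar cf my-archive.tar file1.txt file2.txt

# Total size of the input files versus the size of the archive
raw_bytes=$(cat file1.txt file2.txt | wc -c)
tar_bytes=$(wc -c < my-archive.tar)
echo "files: $raw_bytes bytes, archive: $tar_bytes bytes"
```

The archive comes out larger than its inputs because each member gets a header and the data is padded to fixed-size blocks.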

Fun fact: The largest tar archive ever created is believed to be a 2 petabyte backup archive at University of Manchester!

Compressing archives

While uncompressed tar is handy for archiving small sets of files, larger files and directories warrant compression.

Tar supports different compression formats:

Format   Flag     Extension
Gzip     z        .tar.gz or .tgz
Bzip2    j        .tar.bz2
Lzip     --lzip   .tar.lz

Example (gzip):

$ tar czf my-archive.tar.gz large-directory/ 

This creates a gzip-compressed archive of large-directory.

Among these, gzip is typically the fastest, while bzip2 usually achieves a higher compression ratio. So there is a trade-off between speed and archive size.
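You can measure the effect on your own data with a sketch like the one below. It generates repetitive sample text (an assumption made so the demo compresses well), then compares archive sizes. The bzip2 step only runs if that tool is installed.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a directory of repetitive, highly compressible sample text
mkdir large-directory
for i in 1 2 3 4 5; do
    yes "sample line $i for the compression test" | head -n 20000 > "large-directory/data$i.txt"
done

tar cf  plain.tar      large-directory/
tar czf archive.tar.gz large-directory/

plain_bytes=$(wc -c < plain.tar)
gzip_bytes=$(wc -c < archive.tar.gz)
echo "uncompressed: $plain_bytes bytes"
echo "gzip:         $gzip_bytes bytes"

# bzip2 is not installed everywhere, so only try it when the tool exists
if command -v bzip2 >/dev/null 2>&1; then
    tar cjf archive.tar.bz2 large-directory/
    echo "bzip2:        $(wc -c < archive.tar.bz2) bytes"
fi
```

Real-world ratios depend heavily on the data: text and logs shrink a lot, while already-compressed media barely changes.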

Listing archive contents

You can view the contents of a tar archive without extracting it using:

$ tar tvf my-archive.tar

The t option lists the files in the archive; combined with v, the listing includes details like permissions, ownership and size:

-rw-r--r-- abhishek/staff  1188 2017-04-05 09:51 file1.txt  
-rw-r--r-- abhishek/staff   723 2017-04-05 13:20 file2.txt

This works for compressed archives too.
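As a quick sketch of listing a compressed archive, the snippet below builds a small .tar.gz and lists it two ways. The file names are placeholders.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo one > file1.txt
echo two > file2.txt
tar czf my-archive.tar.gz file1.txt file2.txt

# Names only, one per line
tar tzf my-archive.tar.gz

# Long listing restricted to a single member
tar tzvf my-archive.tar.gz file2.txt
```

Dropping v gives output that is easy to feed into grep or wc in scripts.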

Fun fact: Tim Kientzle wrote the open source libarchive library in the early 2000s for FreeBSD, building on a close reading of the tar file formats. It powers bsdtar and makes tar-reading capabilities available to other programs and language bindings.

Extracting specific files

You can choose to extract specific files from a tar archive without unpacking everything:

$ tar xvf my-archive.tar file1.txt

This extracts just file1.txt into the current directory.

For compressed archives, add the z or j flag (GNU tar also detects the compression automatically when extracting):

$ tar xvzf my-archive.tar.gz file1.txt

You can specify multiple file names to extract several specific files in one go.
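For instance, this sketch pulls two of three members out of an archive. The names are placeholders, and each member name must be given exactly as it appears in the listing.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo a > file1.txt
echo b > file2.txt
echo c > file3.txt
tar cf my-archive.tar file1.txt file2.txt file3.txt

# Extract only file1.txt and file3.txt, inside a separate directory
mkdir out
(cd out && tar xvf ../my-archive.tar file1.txt file3.txt)

ls out
```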

Appending new files

Uncompressed tar files provide the flexibility of incrementally adding more files easily:

$ tar rvf my-archive.tar another-file.txt

The r option appends files to the end of the existing archive instead of overwriting it.

This avoids having to recreate entire archives when new files need to be added.

Real-world example: Log archiving tasks where active logs are periodically added to existing archives.
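A rough sketch of that log-archiving pattern might look like this. The log file names are hypothetical, and the compression step at the end is one possible way to finish, since r only works while the archive is uncompressed.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# First run: start the archive with one rotated log
echo "entry 1" > app.log.1
tar cf logs.tar app.log.1

# Later run: append the next rotated log to the same archive
echo "entry 2" > app.log.2
tar rf logs.tar app.log.2

tar tf logs.tar

# Compress only once no more appends are expected
gzip -c logs.tar > logs.tar.gz
```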

Updating existing archives

In addition to adding new files, tar can also pick up newer versions of files that are already archived, without touching the other contents.

For example, suppose my-archive.tar contains file.txt, and the file then gets modified:

$ echo "new data" > file.txt 

$ tar uvf my-archive.tar file.txt

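The u option appends file.txt only if the copy on disk is newer than the one in the archive. The older copy is not rewritten in place; it stays in the archive, and the newer copy, stored after it, is the one that wins on extraction. Like r, this works only on uncompressed archives.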
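To see what u actually does to the archive, here is a small sketch. It pins the modification times with touch so the "newer" check is unambiguous; the timestamps are arbitrary values picked for the demo.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "old data" > file.txt
touch -t 202001010000 file.txt    # pin an old modification time
tar cf my-archive.tar file.txt

echo "new data" > file.txt
touch -t 202101010000 file.txt    # clearly newer than the archived copy
tar uf my-archive.tar file.txt

# file.txt now appears twice: the old copy, then the appended newer one
tar tvf my-archive.tar

# On extraction the later copy overwrites the earlier one
mkdir restored
(cd restored && tar xf ../my-archive.tar)
cat restored/file.txt             # prints: new data
```

Because the old copies are kept, an archive that is updated often keeps growing, so recreating it from scratch now and then can save space.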

Excluding files/patterns

When archiving specific directories, you might need to exclude a few files or patterns.

This can be done using the --exclude option:

$ tar czf my-archive.tar.gz --exclude="*.temp" stuff/

This excludes all .temp files inside the stuff/ folder from the archive.

More examples:

$ tar czf archive.tgz --exclude="logs/*.log" /var/www
# Excludes .log files under any logs/ directory in /var/www

You can supply multiple --exclude options in the same command.

Real-world cases: Ignoring editor temporary files, node_modules folders etc.
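Here is a sketch that combines two exclusions and then checks the result. The project layout is made up for the demo, and the pattern behaviour shown assumes GNU tar's defaults.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir -p stuff/src stuff/node_modules/pkg
echo code  > stuff/src/main.js
echo cache > stuff/src/main.js.temp
echo dep   > stuff/node_modules/pkg/index.js

# The --exclude options go before the paths being archived
tar czf my-archive.tar.gz --exclude="*.temp" --exclude="node_modules" stuff/

# Only the directories and main.js should be listed
tar tzf my-archive.tar.gz
```

Listing the archive right after creating it is a cheap way to confirm that a pattern matched what you intended.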

Preserving permissions

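Tar always records Unix permissions and attributes like timestamps when it creates an archive.

The p option matters mainly at extraction time: it asks tar to restore the recorded permissions exactly instead of filtering them through the user's umask. (It is already the default when extracting as root.)

$ tar cpf archive.tar /etc/*.conf

When users extract it with p (for example, tar xpf archive.tar), they get the same permissions as the source.

Useful for distributing admin scripts, cron jobs or web apps across a team.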
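The effect of p is easiest to see on the extracting side. This sketch extracts the same archive with and without p under a restrictive umask. The script name is a placeholder, and when run as root both copies keep the original mode, because p is then the default.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo '#!/bin/sh' > deploy.sh
chmod 755 deploy.sh
tar cf scripts.tar deploy.sh      # mode 755 is recorded with or without p

umask 077                         # restrictive umask on the extracting side
mkdir plain preserved
(cd plain     && tar xf  ../scripts.tar)   # non-root: mode is filtered by the umask
(cd preserved && tar xpf ../scripts.tar)   # mode is restored exactly as recorded

ls -l plain/deploy.sh preserved/deploy.sh
```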

Verifying archives

When creating uncompressed archives, you can enable content verification using:

$ tar cWvf archive.tar *.c

The W flag (--verify in GNU tar) re-reads the archive after writing it and compares it against the files on disk. This catches files that changed while the archive was being created, as well as write errors.
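A minimal sketch of a verified write, assuming GNU tar (other implementations may not support W in this sense). The C source files are dummy content created for the demo.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'int main(void) { return 0; }\n'   > main.c
printf 'int helper(void) { return 1; }\n' > helper.c

# v prints each file as it is added and again as it is verified
tar cWvf archive.tar main.c helper.c && echo "archive written and verified"
```

In a backup script, checking tar's exit status after a verified write is a simple way to avoid keeping a bad archive.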

Estimating compressed sizes

A listing alone does not tell you how much space compression saved. To estimate it, measure the total uncompressed size of the archived contents:

$ tar xzf archive.tar.gz --to-stdout | wc -c 
# replace xzf with xjf for bzip2

This decompresses the archive to stdout and counts the bytes with wc. Comparing that number with the size of the compressed archive on disk shows the actual savings.
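Putting the two numbers side by side could look like the sketch below. The sample data is generated text, chosen only because it compresses well, and the --to-stdout option assumes GNU tar.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

yes "compressible sample text" | head -n 50000 > data.txt
tar czf archive.tar.gz data.txt

# Bytes of file content inside the archive versus bytes of the archive on disk
unpacked_bytes=$(tar xzf archive.tar.gz --to-stdout | wc -c)
archive_bytes=$(wc -c < archive.tar.gz)

echo "contents: $unpacked_bytes bytes, archive on disk: $archive_bytes bytes"
echo "archive is $(( archive_bytes * 100 / unpacked_bytes ))% of the original size"
```

Nothing is written to disk during the measurement, so it is safe to run even when free space is tight.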

Extracting into custom directories

By default tar extracts archive contents into the current directory. To extract somewhere else:

$ tar xvf archive.tar -C /tmp/special-folder

The -C argument changes the destination to /tmp/special-folder, which must already exist.

This is handy in scripts, since there is no need to cd into the target directory first.
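In a script, that usually means creating the target first. Here is a minimal sketch with placeholder names that uses a scratch directory instead of /tmp/special-folder.

```shell
set -e
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "payload" > file.txt
tar cf archive.tar file.txt

# tar does not create the -C target for you, so make sure it exists
mkdir -p special-folder
tar xvf archive.tar -C special-folder

ls special-folder
```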

Finding file differences

Ever compared an existing filesystem file with its archived version inside tar?

tar provides a diff mode just for this!

Suppose file.txt exists both on disk and in the archive. If the copy on disk changes:

$ tar df archive.tar file.txt
file.txt: Mod time differs

The output shows that the timestamp differs between the two versions of the file. Useful, right?

It can detect changes in permissions, ownership, timestamps, sizes and file contents.
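Diff mode is also scriptable, because GNU tar exits with a non-zero status when something differs. The sketch below assumes GNU tar. The file contents and timestamps are invented for the demo, and LC_ALL=C keeps the messages in English.

```shell
set -e
export LC_ALL=C                   # keep tar's messages in English
workdir=$(mktemp -d)              # scratch area for the demo
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "hello" > file.txt
touch -t 202001010000 file.txt
tar cf archive.tar file.txt

# Nothing has changed yet, so the comparison succeeds silently
tar df archive.tar file.txt && echo "no differences yet"

# Change both the content and the modification time on disk
echo "hello world" > file.txt
touch -t 202101010000 file.txt

# Capture the status instead of letting set -e abort the script
status=0
report=$(tar df archive.tar file.txt 2>&1) || status=$?
echo "$report"
echo "tar exit status: $status"
```

A backup job could use that exit status to decide whether a fresh archive is needed at all.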

Use cases

Now that you have tried tar's capabilities hands-on, where can it help in real-world scenarios?

For system administrators:

  • Archiving log files from across servers and freeing disk space
  • Backing up and restoring critical system files like /etc
  • Distributing scripts and cron jobs to managed Linux boxes
  • Transferring or exchanging sets of files via tapes/drives/clouds

For software developers:

  • Sharing codebases and directories with teammates
  • Exchanging data bundles with clients or third-party systems
  • Attaching sets of files while reporting bugs or submitting patches
  • Archiving stale repositories or gists for future reference

Hope these give you ideas for applying tar to streamline your workflows!

Closing thoughts

Tar remains one of the most flexible archiving tools decades after its inception, and as storage volumes keep growing it continues to find new uses.

The examples in this guide only scratch the surface of what tar can do. I encourage you to explore the other advanced features by reading the manual: man tar.

Did you find this helpful? What are your favorite tar tricks? Ping me, would love to know!

Written by Alexis Kestler

A female web designer and programmer, now a 36-year-old IT professional with over 15 years of experience, living in NorCal. I enjoy keeping my feet wet in the world of technology through reading, working, and researching topics that pique my interest.