Introduction to File compression and archiving
20 useful tar and zip commands
It is useful to store a group of files in one file for easy backup, for transfer to another directory, or for transfer to another computer. It is also useful to compress large files; compressed files take up less disk space and download faster via the Internet.
It is important to understand the distinction between an archive file and a compressed file. An archive file is a collection of files and directories stored in one file. The archive file is not compressed — it uses the same amount of disk space as all the individual files and directories combined. A compressed file is a collection of files and directories that are stored in one file and stored in a way that uses less disk space than all the individual files and directories combined. If disk space is a concern, compress rarely-used files, or place all such files in a single archive file and compress it.
Note: a tar file is an archive, not a compressed file. A compressed tar file (for example .tar.gz) is an archive that has also been compressed.
The tar command can compress an archive in several formats, each identified by its file extension. All of them work for files and directories, but their compression ratios differ, and each format has its own tar option. This article uses the following formats as examples (zip is a separate command, not a tar option):
1. gzip
2. bzip2
3. zip
Syntax: tar <options> <File Name.tar> <directory / file path>
1. Archiving files using tar command
Archiving does not compress files and directories; it only groups them together into a single file instead of many. After creating an archive, there is no meaningful size difference between the original files and the archive file.
Let's see an example below
[root@TechTutorial tar]# du -h *.txt        # file sizes before creating an archive
44K d.txt
44K g.txt
44K kumar.txt
44K ravi.txt
44K tech.txt
44K test1.txt
44K test2.txt
44K test3.txt
44K test4.txt
[root@TechTutorial tar]# tar -cvf ravi.tar *.txt        # create the archive file
d.txt
g.txt
kumar.txt
ravi.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
[root@TechTutorial tar]# du -h ravi.tar        # archive size after creation
380K ravi.tar
Explanation of the tar command options
-c Create an archive file
-v Verbose (list each file as it is added to the archive)
-f Use the archive file named in the next argument
2. Extracting an archive file
To extract an archive file, use the -x option with the tar command.
[root@TechTutorial tar]# tar -xvf ravi.tar
d.txt
g.txt
kumar.txt
ravi.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
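When the extracted files should not land in the current directory, tar's -C option switches to a target directory before extracting. The following is a minimal sketch of that idea, not part of the original walkthrough; the scratch directory and the names a.txt, b.txt, demo.tar and restore/ are made up for illustration.

```shell
#!/bin/sh
# Sketch: extract an archive into a separate directory with -C.
# All file and directory names here are hypothetical.
set -eu

workdir=$(mktemp -d)            # scratch area so nothing real is touched
cd "$workdir"

printf 'first\n'  > a.txt
printf 'second\n' > b.txt
tar -cf demo.tar a.txt b.txt    # create the archive

mkdir restore                   # the target directory must already exist
tar -xvf demo.tar -C restore    # extract into restore/ instead of "."

ls restore
# Remove "$workdir" by hand once you have finished experimenting.
```

Extracting into an empty directory first is a safe way to inspect an unfamiliar archive without overwriting files that share the same names.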
3. Updating an archive file with newly created files
Sometimes we need to update an existing archive with only the files created since it was built. Appending just those files saves a lot of time compared with recreating the whole archive.
As the example below shows, the -u option makes tar append files that are not yet in the archive or that are newer than the archived copy.
[root@TechTutorial tar]# touch Techtutorials.txt
[root@TechTutorial tar]# tar -uvf ravi.tar *.txt
Techtutorials.txt
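Two properties of -u are worth keeping in mind: it appends to the archive rather than replacing entries, and GNU tar refuses to update a compressed archive, so it is used on plain .tar files. The sketch below illustrates the append step under those assumptions; old.txt, new.txt and demo.tar are hypothetical names.

```shell
#!/bin/sh
# Sketch: tar -u appends files that are new, or newer than the archived copy.
# File names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

printf 'one\n' > old.txt
tar -cf demo.tar old.txt            # the archive holds only old.txt

printf 'two\n' > new.txt            # created after the archive was built
tar -uvf demo.tar old.txt new.txt   # append what the archive lacks or what is newer

tar -tf demo.tar                    # list the members after the update
```

Because updates are appended, an archive can end up holding several copies of a file that changed between runs; on extraction the last copy is the one that remains on disk.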
4. List files from archive without extracting them
We do not always need to extract an archive to see its contents. Extracting a large archive takes a long time and also needs enough free disk space for the extracted files.
Use the '-t' option to list all the files in an archive file.
[root@TechTutorial tar]# tar -tf ravi.tar
d.txt
g.txt
kumar.txt
ravi.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
Techtutorials.txt
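Adding -v to the listing shows permissions, owner, size and date for each member, and the plain listing can be piped to grep to check for a specific file. This is a small sketch of both uses, assuming hypothetical files notes.txt and report.txt in a scratch directory.

```shell
#!/bin/sh
# Sketch: inspect an archive without extracting it.
# File names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

printf 'alpha\n' > notes.txt
printf 'beta\n'  > report.txt
tar -cf demo.tar notes.txt report.txt

tar -tvf demo.tar               # long listing: mode, owner, size, date, name

# Check for one member by exact name; nothing is written to disk.
if tar -tf demo.tar | grep -qx 'report.txt'; then
    echo "report.txt is in the archive"
fi
```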
5. Extract single file from archive
This is very handy when we have a large archive and need to restore only a single file from it. To restore a single file, give its name after the archive name.
[root@TechTutorial tar]# rm -rf *.txt        # delete all the .txt files from the current location
[root@TechTutorial tar]# ls        # files remaining after the deletion
3 arkit10.doc arkit1.doc arkit2.doc arkit3.doc arkit5.doc arkit6.doc arkit7.doc arkit8.doc arkit9.doc ravi.tar
[root@TechTutorial tar]# tar -xvf ravi.tar Techtutorials.txt        # restore a single file from the archive
Techtutorials.txt
[root@TechTutorial tar]# ls        # files present after the restore
3 arkit1.doc arkit3.doc arkit6.doc arkit8.doc ravi.tar
arkit10.doc arkit2.doc arkit5.doc arkit7.doc arkit9.doc Techtutorials.txt
The example above shows how to restore a single file from an archive.
6. Extract multiple files from archive (not all files)
In step 5 we extracted a single file from the archive. In the same way, we can extract several files from the archive without extracting all of them.
Note: to extract specific files you have to know their exact names in the archive. Use '-t' to list all the files in the archive.
[root@TechTutorial tar]# rm -rf Techtutorials.txt        # delete the previously restored file for clarity
[root@TechTutorial tar]# tar -xvf ravi.tar "Techtutorials.txt" "test1.txt"
test1.txt
Techtutorials.txt
[root@TechTutorial tar]# ls
3 arkit1.doc arkit3.doc arkit6.doc arkit8.doc ravi.tar test1.txt
arkit10.doc arkit2.doc arkit5.doc arkit7.doc arkit9.doc Techtutorials.txt
[root@TechTutorial tar]# rm -rf Techtutorials.txt test1.txt
[root@TechTutorial tar]# tar -xvf ravi.tar --wildcards '*.txt'
d.txt
g.txt
kumar.txt
ravi.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
Techtutorials.txt
Note: the files were deleted here for demonstration only. DO NOT DELETE FILES in your own environment.
You can list multiple file names, or use the --wildcards option with a pattern, to restore multiple files as shown in the example above.
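One detail that is easy to miss with --wildcards: if the pattern is left unquoted and files matching it exist in the current directory, the shell expands it before tar sees it. Quoting the pattern avoids that. The sketch below assumes GNU tar and uses hypothetical file names; it extracts into a separate out/ directory so the effect is easy to see.

```shell
#!/bin/sh
# Sketch (GNU tar): quote the pattern so tar, not the shell, expands it.
# File names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

printf 'a\n' > a.txt
printf 'b\n' > b.txt
printf 'c\n' > c.doc
tar -cf demo.tar a.txt b.txt c.doc

mkdir out
# Single quotes stop the shell from replacing *.txt with names from the
# current directory before tar ever sees the pattern.
tar -xvf demo.tar -C out --wildcards '*.txt'

ls out                          # only the .txt members are restored
```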
7. Compressing files in gzip
So far we have seen how to archive files (group them together in a single file). Creating an archive did not save any space, because archiving does not compress the files and their size does not decrease. Compressing the files is what saves disk space. To create a gzip-compressed archive with the '.gz' extension, use the '-z' option with the 'tar' command.
Let's see an example
[root@TechTutorial tar]# tar -czvf tech.tar.gz *.txt
d.txt
g.txt
kumar.txt
ravi.txt
Techtutorials.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
[root@TechTutorial tar]# ls
3 arkit2.doc arkit6.doc arkit9.doc kumar.txt tech.tar.gz test1.txt test4.txt
arkit10.doc arkit3.doc arkit7.doc d.txt ravi.tar Techtutorials.txt test2.txt
arkit1.doc arkit5.doc arkit8.doc g.txt ravi.txt tech.txt test3.txt
[root@TechTutorial tar]# du -h tech.tar.gz
4.0K tech.tar.gz
[root@TechTutorial tar]# du -h *.txt
44K d.txt
44K g.txt
44K kumar.txt
44K ravi.txt
0 Techtutorials.txt
44K tech.txt
44K test1.txt
44K test2.txt
44K test3.txt
44K test4.txt
As the example above shows, compressing the text files with '-z' produced a 4 KB file, compared with 380 KB for the uncompressed archive.
8. Compressing files using bzip2
This works the same way as 'gzip', but '.bz2' usually achieves a higher compression ratio than '.gz'. We compress the same files as in the previous example to see what compressed size we get. For bzip2, use the '-j' option.
[root@TechTutorial tar]# tar -cjvf 1tech.tar.bz2 *.txt
d.txt
g.txt
kumar.txt
ravi.txt
Techtutorials.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
[root@TechTutorial tar]# du -h 1tech.tar.bz2
4.0K 1tech.tar.bz2
Both results show as 4.0K here, so the next section compares the '.gz' and '.bz2' methods on a larger set of files.
9. Compression ratio of .gz (gzip) and .bz2 (bzip2)
After compressing 34 MB of files with gzip, the '.gz' output file is 8.6 MB.
The same files compressed with bzip2 give a '.bz2' output file of 7.2 MB, so in this test .bz2 achieves a higher compression ratio than .gz.
[root@TechTutorial tar]# du -h tarr.tar.gz
8.6M tarr.tar.gz
[root@TechTutorial tar]# du -h tarr.tar.bz2
7.2M tarr.tar.bz2
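To repeat this comparison on your own data, the same directory can be archived three ways and the byte counts compared. The sketch below generates its own repetitive sample text, so the ratios it prints say nothing about real workloads; the directory name data/ and the archive names are assumptions for illustration, and bzip2 must be installed.

```shell
#!/bin/sh
# Sketch: compare plain, gzip and bzip2 archive sizes for the same input.
# Directory and file names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

mkdir data
# Generate repetitive sample text; real-world ratios will differ.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of sample text for the compression test"
    i=$((i + 1))
done > data/sample.txt

tar -cf  data.tar     data      # archive only
tar -czf data.tar.gz  data      # archive + gzip
tar -cjf data.tar.bz2 data      # archive + bzip2

# wc -c prints the exact size of each file in bytes.
wc -c data.tar data.tar.gz data.tar.bz2
```

Using wc -c instead of du -h avoids the rounding to whole disk blocks that makes small files all appear as 4.0K.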
10. Extracting compressed files from 'gzip' and 'bzip2'
To extract 'gzip' and 'bzip2' archives, use the '-x' option together with the format's own option: '-z' for gzip and '-j' for bzip2.
Below is an example of extracting the 'bzip2' file
[root@TechTutorial tar]# tar -xjvf 1tech.tar.bz2
d.txt
g.txt
kumar.txt
ravi.txt
Techtutorials.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
Below is the practical example for extracting the 'gzip' file
[root@TechTutorial tar]# tar -xzvf tech.tar.gz
d.txt
g.txt
kumar.txt
ravi.txt
Techtutorials.txt
tech.txt
test1.txt
test2.txt
test3.txt
test4.txt
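On current GNU tar (and bsdtar), the compression flag is only strictly needed when creating an archive; when reading from a file, tar detects the format itself. The sketch below shows this under that assumption, with hypothetical names hello.txt and demo.tar.gz. Older or minimal tar implementations may still require -z or -j.

```shell
#!/bin/sh
# Sketch: extraction without -z, relying on tar's format auto-detection.
# File names are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

printf 'hello\n' > hello.txt
tar -czf demo.tar.gz hello.txt  # -z is still required when creating
rm hello.txt

tar -xvf demo.tar.gz            # no -z: tar detects gzip from the file itself
cat hello.txt
```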
11. Zipping files using the zip command
The zip command compresses files into an archive with the .zip extension. zip is available on many platforms, such as Unix, Linux, Windows and Mac.
Syntax: zip <Destination File Path and Name>.zip <source files to compress>
Below is an example of compressing files using the 'zip' command
[root@TechTutorial tar]# zip docfiles.zip *.txt
adding: d.txt (deflated 100%)
adding: g.txt (deflated 100%)
adding: kumar.txt (deflated 100%)
adding: ravi.txt (deflated 100%)
adding: Techtutorials.txt (stored 0%)
adding: tech.txt (deflated 100%)
adding: test1.txt (deflated 100%)
adding: test2.txt (deflated 100%)
adding: test3.txt (deflated 100%)
adding: test4.txt (deflated 100%)
12. Zipping files and directories along with subdirectories and their files
When we compress a directory with the 'zip' command, it does not include the subdirectories and their contents by default. To include all subdirectories and their files, use the '-r' option with the zip command.
[root@TechTutorial tar]# zip -r subdir.zip ravi/
adding: ravi/ (stored 0%)
adding: ravi/kumar/ (stored 0%)
adding: ravi/kumar/tech/ (stored 0%)
adding: ravi/kumar/tech/d.txt (deflated 100%)
adding: ravi/kumar/tech/g.txt (deflated 100%)
adding: ravi/kumar/tech/kumar.txt (deflated 100%)
adding: ravi/kumar/tech/ravi.txt (deflated 100%)
13. Compressing with a high compression ratio
The zip command also lets us choose a compression level from 1 to 9, where 1 is the fastest and 9 gives the highest compression.
[root@TechTutorial tar]# zip -9 -r deepcompress.zip ravi/
adding: ravi/ (stored 0%)
adding: ravi/kumar/ (stored 0%)
adding: ravi/kumar/tech/ (stored 0%)
adding: ravi/kumar/tech/d.txt (deflated 100%)
adding: ravi/kumar/tech/g.txt (deflated 100%)
adding: ravi/kumar/tech/kumar.txt (deflated 100%)
adding: ravi/kumar/tech/ravi.txt (deflated 100%)
adding: ravi/kumar/tech/Techtutorials.txt (stored 0%)
adding: ravi/kumar/tech/tech.txt (deflated 100%)
adding: ravi/kumar/tech/test1.txt (deflated 100%)
adding: ravi/kumar/tech/test2.txt (deflated 100%)
adding: ravi/kumar/tech/test3.txt (deflated 100%)
adding: ravi/kumar/tech/test4.txt (deflated 100%)
14. Excluding particular file / directory from compression
We can also exclude a file from compression. To do that, use the '-x' option.
[root@TechTutorial tar]# zip -r compress1.zip ravi/ -x ravi/g.txt
adding: ravi/ (stored 0%)
adding: ravi/d.txt (deflated 100%)
adding: ravi/kumar.txt (deflated 100%)
adding: ravi/ravi.txt (deflated 100%)
adding: ravi/Techtutorials.txt (stored 0%)
adding: ravi/tech.txt (deflated 100%)
adding: ravi/test1.txt (deflated 100%)
adding: ravi/test2.txt (deflated 100%)
adding: ravi/test3.txt (deflated 100%)
adding: ravi/test4.txt (deflated 100%)
[root@TechTutorial tar]# ls ravi/
d.txt g.txt kumar.txt ravi.txt Techtutorials.txt tech.txt test1.txt test2.txt test3.txt test4.txt
15. Delete particular file from zip
We can also delete a file from a zip archive using the '-d' option with the zip command
[root@TechTutorial tar]# zip -d compress1.zip ravi/tech.txt
deleting: ravi/tech.txt
16. Update newly created files to zip
The '-u' option updates a zip file: it adds files that are not yet in the archive and replaces entries whose source files have been modified more recently.
[root@TechTutorial tar]# touch Update2.txt
[root@TechTutorial tar]# zip -u compress1.zip *.txt
adding: Update2.txt (stored 0%)
17. Update zip with newly modified files
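To update only files that have been modified, use the '-f' (freshen) option. It replaces entries that already exist in the zip file and have changed, without adding new files. The 'r' in '-fr' makes zip recurse into directories.
[root@TechTutorial tar]# zip -fr compress1.zip *.txt
freshening: Update2.txt (stored 0%)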
18. List all files from zip without extracting them
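Opening a zip file with 'less' lists its contents without extracting them. This relies on less being set up with an input preprocessor (lesspipe), which some Linux distributions enable by default.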
# less compress.zip
19. Check zip file content without extracting
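To view the content of a zipped file without extracting it, you can use the 'zmore' and 'zless' commands. Note that these tools are built on gzip, and gzip itself handles zip files only when they contain a single member compressed with the deflate method.
# zmore compress.zip
# zless compress.zip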
20. De-compress zip file
To extract a zip file, use the 'unzip' command. If the files already exist, it asks for confirmation before overwriting them.
[root@TechTutorial tar]# unzip compress1.zip
Archive: compress1.zip
replace d.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
inflating: d.txt
replace g.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
inflating: g.txt
replace kumar.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
inflating: kumar.txt
:: Conclusion ::
Archiving groups files and directories into a single file, but it does not save disk space on its own. To save disk space, compress the files and directories as well.
Thanks for your precious time, please write your comments below ….