Why doesn't ZIP Compression compress anything?

A 398MB directory was only compressed to 393MB using 7Z and Normal ZIP compression. Is this normal? If so, why do people continue to use ZIP on Windows?

Replies

If you're compressing things that are already compressed (AVI, JPEG, MP3), you won't gain much other than packing everything in a single file.

Compression works by looking for repetitive patterns inside the items to compress. Also, because you do not want to lose any data while compressing your files, the compression must be lossless(*).
Now with that in the back of your head, think about the way files (items) are stored on a computer. At the lowest level, they are all just a bunch of 0's and 1's.

The question can thus be transformed to: "How can I represent a bunch of 1's and 0's in a more compact way than the original representation?"

So let's start from the beginning: how can you compact the normal representation of a single bit (a single 1 or a single 0)?
The answer is really easy: you can't! A single bit is already represented in the most compact manner possible.

Fair enough, let us take a bigger example: how would you compress a binary string like 0111 0111 0100 0111?
Because we already know that looking at the individual bits won't help us at all, we know that we have to look at a bigger scale. For example, let's take 4 bits at a time. We now see that the binary string "0111" occurs 3 times in the example, so why don't we represent it with a single bit: 0? But this still leaves 0100 in the dark, so let us represent that with "1".
We have now compressed the original to: "0010"

That's really good! However, this is just the most basic idea behind the "Huffman encoding algorithm", and in the real world it is a little more complicated than that (you would also need to store a table with the encoding information in it, but that goes a bit too far for answering this question).
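The 4-bit example above can be sketched in Python. This is only a toy illustration of the block-substitution idea, not real Huffman coding (which assigns variable-length codes from a frequency tree). The function name toy_block_encode and the choice of fixed-width codes are assumptions made for this sketch.

```python
from collections import Counter


def toy_block_encode(bits: str, block_size: int = 4):
    """Toy sketch: replace each distinct block with a short fixed-width code."""
    bits = bits.replace(" ", "")
    blocks = [bits[i:i + block_size] for i in range(0, len(bits), block_size)]
    # Rank distinct blocks by frequency; ties keep first-seen order.
    ranked = [block for block, _ in Counter(blocks).most_common()]
    # Use codes just wide enough to number every distinct block.
    width = max(1, (len(ranked) - 1).bit_length())
    table = {block: format(i, f"0{width}b") for i, block in enumerate(ranked)}
    return "".join(table[b] for b in blocks), table


encoded, table = toy_block_encode("0111 0111 0100 0111")
print(encoded)  # 0010
print(table)    # {'0111': '0', '0100': '1'}
```

Note that the table has to travel with the encoded bits, otherwise nobody can decode them. For an input this small the table is bigger than the saving, which is why real formats only pay off on larger inputs.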

Now to really answer your question: why can't all data be compressed that well? Let's take another example: "0001 0110 1000 1111". If we used the same technique as above, we would not be able to compress the data: no 4-bit block repeats, so there is no repetition to exploit, and we would not benefit from compression.
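The same effect is easy to observe with a real compressor. Below is a minimal sketch using Python's standard zlib module (DEFLATE, the method ZIP normally uses). The two inputs are made up for the demonstration: one is a single pattern repeated many times, the other is random bytes, which stand in for data that has no repetition to exploit.

```python
import os
import zlib

# Made-up test data: 16 KiB of one repeated pattern vs. 16 KiB of random bytes.
repetitive = b"0111" * 4096
random_like = os.urandom(len(repetitive))

packed_repetitive = zlib.compress(repetitive, 9)
packed_random = zlib.compress(random_like, 9)

print(f"repetitive: {len(repetitive)} -> {len(packed_repetitive)} bytes")
print(f"random:     {len(random_like)} -> {len(packed_random)} bytes")
```

The repetitive input should shrink to a tiny fraction of its size, while the random input should come out about as large as it went in (often a few bytes larger because of the container overhead).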



(*) There are of course exceptions to this. The best-known example is the compression used for MP3 files: some information about the sound is lost while converting the raw, original file to the MP3 format, so this compression is lossy. Another example is the .JPG format for images.

The process of compressing takes repeatable patterns and tokenizes them to shorter patterns. The output is then mostly non-repeatable and therefore cannot be compressed by much, if at all.
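A quick way to see this is to compress something twice. The sketch below uses Python's zlib module; the word list and the seed are arbitrary choices that only serve to generate some text-like input.

```python
import random
import zlib

# Arbitrary vocabulary and seed, used only to generate text-like test data.
rng = random.Random(0)
words = ["zip", "file", "compress", "archive", "data", "pattern", "bit", "table"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode()

once = zlib.compress(text, 9)   # first pass: the repeated words get tokenized
twice = zlib.compress(once, 9)  # second pass: almost nothing left to find

print(len(text), len(once), len(twice))
```

The first pass should cut the size substantially. The second pass works on output that already looks close to random, so it should gain next to nothing and may even add a few bytes.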

From the Limitations section of the Wikipedia article on Lossless Compression:

Lossless data compression algorithms cannot guarantee compression for all input data sets. In other words, for any (lossless) data compression algorithm, there will be an input data set that does not get smaller when processed by the algorithm. This is easily proven with elementary mathematics using a counting argument. ...

Basically, it's theoretically impossible to compress all possible input data losslessly.
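The counting argument can be checked with a few lines of arithmetic. This sketch counts the possible inputs of exactly n bits and the possible outputs that are strictly shorter; n = 8 is just an example value.

```python
def count_bit_strings(length: int) -> int:
    """Number of distinct bit strings of exactly this length."""
    return 2 ** length


n = 8  # example length, any n >= 1 gives the same conclusion
inputs = count_bit_strings(n)
shorter_outputs = sum(count_bit_strings(k) for k in range(n))  # lengths 0..n-1

print(inputs, shorter_outputs)  # 256 255
```

There is always one fewer shorter string than there are inputs, so a lossless compressor cannot map every n-bit input to a distinct shorter output. At least one input has to stay the same size or grow.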

Is this normal?

No. Not with "normal" files. What kind of files were you compressing? If they were already compressed, e.g. JPGs, GIFs, PNGs, videos or even other zip files, then they won't be compressed much by any algorithm. If you try compressing text, XML, uncompressed BMP, source code and similar files, zip will provide good compression, but probably not the absolute best.

Why do people continue to use ZIP on Windows?

One reason is that there is nice zip handling built into the system - you can right click anywhere and create a new zip file, then drop stuff into it. You can just double click a zip file and it opens like a folder. You can copy stuff out of it and sometimes even use it in place. You don't need to install WinZip or 7z or any other program. I usually recommend people don't.

In a zip archive containing many files, each file is compressed independently. If there is a great deal of similarity between the files, then a different tool might give much better compression.

For example, tar.gz joins the files together, then compresses the results. Likewise a "solid" rar file makes use of similarities between files.

The downside of tar.gz or a solid rar is that you can no longer extract a single file from a large archive without decompressing the archive up to where the file you want is.
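The difference between per-file and "solid" compression can be reproduced with Python's standard zipfile and tarfile modules. This is a sketch under artificial assumptions: 50 made-up small files that share the same random 2 KiB body, so each file is incompressible on its own but nearly identical to its neighbours. The files are kept small on purpose, because gzip only looks back about 32 KiB for repeats.

```python
import io
import os
import tarfile
import zipfile

# Made-up data: 50 files sharing one random body, differing only in a suffix.
shared = os.urandom(2048)
files = {f"file{i:02d}.bin": shared + str(i).encode() for i in range(50)}

# ZIP: every member is compressed independently.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, data in files.items():
        zf.writestr(name, data)

# tar.gz: members are concatenated first, then compressed as one stream.
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w:gz") as tf:
    for name, data in files.items():
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))

zip_size = len(zip_buf.getvalue())
tar_gz_size = len(tar_buf.getvalue())
print(f"zip: {zip_size} bytes, tar.gz: {tar_gz_size} bytes")
```

With data like this the zip should stay at roughly the combined size of the inputs, while the tar.gz should be many times smaller. Real directories rarely look this extreme, so treat the numbers as an illustration of the mechanism rather than a benchmark.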

In the past, WinRAR did a better compression job using the .rar format than the .zip one.

After much research and many problems, I may have found a solution for sending large files through email. I needed to attach several pictures, some PDFs, and some documents for a closeout package. I wanted it to be simple and clean for my boss to open. The files were large: before sending, the pictures were about 32 MB and the rest of the PDFs and text were around 5 MB, well over Gmail's 25 MB limit.

Here is what I did. I downloaded and opened Picasa. I googled "how to resize pictures picasa windows 7". I selected the pictures I needed in Picasa, clicked File: Export to a folder, and left everything the way it was except that I set the resize value to 1600 instead of 800. Once I had the pictures exported to a folder on my desktop, I selected all of them at once (marqueeing them, I believe, is the correct term), right-clicked on them, clicked Send to, and clicked Compressed file. This created an extra file containing all the pictures I selected, and that file is compressed (though I don't think it resized them again). Then I put my PDFs and text documents into the compressed file. They did compress some. The total size went from over 30 MB to 12 MB for the compressed file. So I opened Gmail, clicked Compose, attached the compressed file that had all my resized pictures and documents in it, and sent it... voilà! I could never have figured it out without all the tips others put on this site and elsewhere, so I thought I would write some help for once!

*Lastly, the reason I created the compressed file to put everything in, EVEN though my pictures had already been resized to a good size by Picasa, was not to make them smaller (because I don't think it did) but because that was the only way I could send everything and have my boss receive it in one file. I tried to select all the files and keep the pictures in their own separate (uncompressed) folder, and it didn't attach any pictures.

Category: windows Time: 2008-08-30 Views: 1
