Reading this article requires basic knowledge of convolutional neural networks.
The best way to encode a binary string is Huffman coding. The entropy of two different binary strings differs, so their encoded lengths also differ; that is, they compress by different ratios. During data transmission, data must be encoded into binary format with values 0 and 1. If every character were converted to a binary code of the same length regardless of its frequency, the length of the entire code would be large. Huffman coding therefore follows the rule of assigning short codes to the most frequently used symbols and longer codes to less frequently used ones, which reduces the length of the entire code, as the sketch below illustrates.
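As an illustration of that rule, here is a minimal Huffman coder (a sketch in Python, not code from the article): it repeatedly merges the two least frequent subtrees, so the most frequent symbols end up with the shortest codewords.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Map each symbol to a prefix-free codeword; frequent symbols get short codes."""
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}        # left branch gets a 0
        merged.update({s: "1" + c for s, c in c2.items()})  # right branch gets a 1
        heapq.heappush(heap, (f1 + f2, i, merged))
        i += 1
    return heap[0][2]

print(huffman_code("abracadabra"))  # 'a', the most frequent symbol, gets a 1-bit code
```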
Here are two binary strings: 100001001000000000 and 10010110101101010001.
The first string is easier to compress and ends up shorter after compression than the second.
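You can check this claim with the empirical Shannon entropy, which lower-bounds the achievable bits per symbol for a memoryless coder:

```python
from collections import Counter
from math import log2

def entropy(s):
    """Empirical Shannon entropy in bits per symbol."""
    n = len(s)
    return -sum(c / n * log2(c / n) for c in Counter(s).values())

print(entropy("100001001000000000"))    # ~0.65 bits/symbol: compressible
print(entropy("10010110101101010001"))  # 1.0 bits/symbol: incompressible
```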
Lossy compression, as the name implies, loses some data during the process; JPEG, described below, is the standard example.
To learn more about the JPEG algorithm and how it works, read the paper Image Compression Algorithm and JPEG Standard.
The Joint Photographic Experts Group (JPEG) has developed an international standard for general-purpose, color, still-image compression. There are numerous compression standards in this family, such as JPEG, JPEG-LS, and JPEG 2000. JPEG-LS is a lossless standard: what you see is exactly what you compressed, with no information discarded. The block diagram of the baseline JPEG transformation is shown below.
JPEG goes through the following process:
1. Convert the image from RGB to the YCbCr color space and, optionally, subsample the chroma channels.
2. Split each channel into 8×8 blocks and apply the discrete cosine transform (DCT) to each block.
3. Quantize the DCT coefficients using a quantization table; this is the step where information is lost.
4. Reorder the quantized coefficients in zigzag order and entropy-code them (e.g. with Huffman coding).
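To make steps 2 and 3 concrete, here is a minimal sketch of the transform-and-quantize core on a single 8×8 block (SciPy's DCT; Q is the standard JPEG luminance quantization table):

```python
import numpy as np
from scipy.fft import dctn, idctn

# Standard JPEG luminance quantization table.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

block = np.random.randint(0, 256, (8, 8)).astype(float)  # stand-in pixel block

coeffs = dctn(block - 128, norm="ortho")             # level shift, then 2-D DCT
quantized = np.round(coeffs / Q)                     # the lossy step
restored = idctn(quantized * Q, norm="ortho") + 128  # what the decoder recovers

print(np.abs(block - restored).max())  # nonzero: some information is gone for good
```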
This method has several disadvantages: the fixed 8×8 block transform produces visible blocking artifacts at high compression ratios, and the hand-designed pipeline cannot adapt to the content of a particular image.
Well, you can try it with a Convolutional Neural Network (CNN).
Convolutional Neural Networks are very similar to ordinary Neural Networks: they are made up of neurons that have learnable weights and biases. Each neuron receives some inputs, performs a dot product, and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function, from the raw image pixels on one end to class scores at the other. And they still have a loss function (e.g. SVM/Softmax) on the last (fully-connected) layer, and all the tips and tricks we developed for learning regular Neural Networks still apply.
To convert an image to a binary string and then convert it back, two CNNs are needed: one responsible for encoding (image -> 0-1 bitmap) and one for decoding (0-1 bitmap -> image).
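The article's repository bases its structure on VGG16; the exact layers aren't given here, so the following is a minimal sketch with assumed image and code shapes (tf.keras, a 32×32 RGB image in, an 8×8×16 code out):

```python
import tensorflow as tf

# Encoder: image -> raw logits for the binary code (the sigmoid is applied
# later, together with the noise trick described below).
encoder = tf.keras.Sequential([
    tf.keras.layers.Input((32, 32, 3)),
    tf.keras.layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(128, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(16, 3, padding="same"),
])

# Decoder: 0-1 bitmap -> reconstructed image.
decoder = tf.keras.Sequential([
    tf.keras.layers.Input((8, 8, 16)),
    tf.keras.layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(3, 3, padding="same", activation="sigmoid"),
])
```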
Training: the training architecture is shown below (green arrows indicate the error to be reduced, which is the target for updating the parameters):
To reduce the reconstruction loss, the encoder's best strategy is to drive its outputs ("features") to large magnitudes, which reduces the artifacts caused by the Gaussian noise. By increasing the magnitude of the Gaussian noise, the code is therefore eventually saturated to 0 or 1. The sigmoid function limits the encoder's output to values between 0 and 1. To make the network output more 0s and fewer 1s (allowing further compression), I added a penalty term, mean(code**2) * 0.01, on the generated binary code, after which the code tends to contain more zeros and fewer ones:

loss = tf.reduce_mean((y-x)**2) + tf.reduce_mean(binary_code**2) * 0.01
This structure becomes:
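In code, one training step over this structure might look like the following (a sketch: the optimizer, learning rate, and noise_std default are assumptions; encoder and decoder are the sketches from above):

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(x, noise_std=0.0):
    with tf.GradientTape() as tape:
        logits = encoder(x)
        # noise_std = 0 gives the plain structure shown here; a positive value
        # enables the Gaussian-noise trick introduced below.
        noisy = logits + tf.random.normal(tf.shape(logits), stddev=noise_std)
        binary_code = tf.sigmoid(noisy)
        y = decoder(binary_code)
        # Reconstruction error plus the sparsity penalty from the article.
        loss = tf.reduce_mean((y - x) ** 2) + tf.reduce_mean(binary_code ** 2) * 0.01
    variables = encoder.trainable_variables + decoder.trainable_variables
    optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return loss
```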
After training begins, the situation looks like this:
From left to right:
On the right, from top to bottom:
A closer look shows that the sigmoid output does not actually use 0 and 1; instead it stores information in various shades of gray, and the decoder recovers the image from those shades of gray. If we binarize the sigmoid output, the information is completely lost. So how can we force the neural network to represent information with 0 and 1 rather than with values in between?
The answer is to add noise!
If we add Gaussian noise before the sigmoid function, the encoder and decoder will discover that gray values are unreliable: only information encoded as 0 and 1 can resist the noise.
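To see numerically why this works: a pre-activation near zero yields a sigmoid output that the noise scrambles completely, while a strongly saturated one barely moves (a small sketch; the standard deviation of 15 matches the setting used below):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

noise = rng.normal(0, 15, size=10_000)
print(sigmoid(0.5 + noise).std())   # large: the output is a coin flip, the gray value is lost
print(sigmoid(50.0 + noise).std())  # tiny: the saturated bit survives the noise
```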
After adding Gaussian noise, the architecture becomes like this:
With the noise standard deviation set to 15, the training results are as follows:
You can see that the network has learned to use binary dithering to encode the image content; binarizing the sigmoid output now makes little difference.
Testing:
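At test time, the dithered sigmoid output can simply be thresholded to hard bits before decoding; per the training results above, this now costs almost nothing (a sketch, again assuming the encoder/decoder from above):

```python
import tensorflow as tf

def compress(x):
    """Image -> hard 0/1 code."""
    code = tf.sigmoid(encoder(x))           # soft values in (0, 1)
    return tf.cast(code > 0.5, tf.float32)  # binarize

def decompress(bits):
    """Hard 0/1 code -> reconstructed image."""
    return decoder(bits)
```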
References:
Image Compression Algorithm and JPEG Standard by Suman Kunwar: http://www.ijsrp.org/research-paper-1217/ijsrp-p7224.pdf
CNN structure based on VGG16: https://github.com/ry/tensorflow-vgg16/blob/master/vgg16.py
NPEG by Qin Yongliang: https://github.com/ctmakro/npeg
JPEG Image Compression Algorithm by Suman Kunwar: https://medium.com/@sumn2u/jpeg-image-compression-algorithm-979de35c808d
Image Compression with Neural Networks (Google Research Blog): https://research.googleblog.com/2016/09/image-compression-with-neural-networks.html