Telegram's Client-Side Image Compression on Android: How It Works and Why It Matters

Have you noticed that when you take a picture on your smartphone and share it on a service like Telegram, Instagram, or WhatsApp, the image quality is usually reduced before it is sent to the remote server? The reduction is small enough that, to the naked eye, the picture still looks clear.

Why and how do they do that? That's what I'll talk about here.

I previously wrote How To Create Menu Attachment Popup Like WhatsApp on Android on Medium, if that also piques your interest.

Results

Wouldn't it be nice to see the results first? I created a pilot Android application project that implements the Telegram image optimization algorithm and compares the resolution and file size of the input image (the original) with those of the output image (the scaled one). I share the project's GitHub repository link below for you to try. I then downloaded several high-resolution images from the internet. Here are the optimization results:

The output file sizes range from 86 KB to 400 KB. The highest-resolution input is the famous Van Gogh - Starry Night original file: at 30000 x 23756 and 205.1 MB, it was optimized to 1280 x 1014 and 389 KB 😱, a ~99.8% file size reduction.
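The reduction figure can be checked with a few lines of Kotlin. This is only a sketch: the function name reductionPercent is made up for illustration, and it assumes decimal units (1 MB = 1000 KB).

```kotlin
/**
 * Percentage by which a file shrank.
 * Hypothetical helper for illustration; both sizes must use the same unit.
 */
fun reductionPercent(originalSize: Double, optimizedSize: Double): Double =
    (1.0 - optimizedSize / originalSize) * 100.0
```

Calling reductionPercent(205_100.0, 389.0), with both sizes in KB, gives a value just above 99.8, which matches the Starry Night result.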

💡 I am using an emulator with 1920x1080 resolution at 420 dpi.

These are the optimization results for photos I took with my phone (Samsung S8):

I don't know about you, but in my opinion this is impressive image optimization. Each output is still a clear image with a significantly smaller file size than the original. Of course, the optimized image looks blurrier than the original when you zoom in, but the trade-off is worth it for chat applications like Telegram.

Why Is the Image Quality Reduced?

There may be several reasons, depending on the company. The obvious one is the cost of storage space on the remote server. I guess the second reason is upload and load time: the smaller the image, the faster it is sent and loaded over a network. That also means lower bandwidth usage.

Let's do the math, shall we? This is just based on my assumptions. Let's assume a large social media company has cloud storage somewhere to save all media-related files (all images). The total storage capacity is 10 TB, which is 10,000 GB or 10,000,000 MB, and the price is $0.02 per GB per month (taken from Google Cloud Storage Standard pricing). So the company pays $200/month or $2,400/year for the cloud storage.

Assuming a photo taken with a smartphone is ~3 MB and the company stores every image unoptimized, a.k.a. at original quality, the storage can hold about 3.3 million photos. What if the images are optimized before being uploaded to cloud storage? Assuming the company scales the original image down to 1280x720 (720p) HD resolution, which is what Telegram does as of this post, and the resulting size is ~600 KB/image, the same storage can hold about 16.6 million photos. That is 5x as many images as before. Put differently, storing those 16.6 million photos at original quality would take 50 TB, or $12,000/year, so the company saves $9,600/year.

And that is just the backend side. On the user's side, it saves their internet quota, which may also translate to money. Not to mention the time saved, because image upload/download is faster, which reduces the user's risk of hitting a connection timeout.

I'm not saying that every company must reduce image quality before sending the image to the server. There are trade-offs to consider, and it depends on each company's use case and business model. For example, a marketplace for photography would store the original pictures because people buy photos in high resolution.
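The storage arithmetic can be reproduced with a small Kotlin sketch. The names below are hypothetical, the price is the $0.02 per GB per month quoted above, and decimal units are assumed (1 TB = 1,000 GB = 1,000,000 MB).

```kotlin
// Price quoted in the text, in USD per GB per month.
const val PRICE_PER_GB_MONTH = 0.02

/** Number of whole photos of [photoMb] megabytes that fit into [storageMb] megabytes. */
fun photosThatFit(storageMb: Double, photoMb: Double): Long =
    (storageMb / photoMb).toLong()

/** Yearly cost in USD of keeping [storageGb] gigabytes for twelve months. */
fun yearlyCostUsd(storageGb: Double): Double =
    storageGb * PRICE_PER_GB_MONTH * 12
```

With 10,000,000 MB of storage, photosThatFit gives about 3.3 million photos at 3 MB each and about 16.6 million at 0.6 MB each. Holding those 16.6 million photos at 3 MB each would need roughly 50,000 GB, and yearlyCostUsd(50_000.0) minus yearlyCostUsd(10_000.0) comes to $9,600.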

What is Telegram?

I'm sure most of you are already familiar with this app. It's a robust chat application created by Pavel Durov, available on desktop, mobile, and web. I think it is a strong competitor to WhatsApp, the chat application owned by Facebook. However, based on my experience using both apps, Telegram has a slight upper hand because it supports 60 FPS animated stickers, which WhatsApp does not support yet.

Why Telegram?

As a Software Engineer, I also like Telegram more because the source code for the Android app is open-sourced on this GitHub repo, and the app has been downloaded by about 8 million users on the Google Play Store as of this writing.

What I don't like about it is that, even though the source code is open, there is no code organization whatsoever, which makes it nearly impossible to read. For example, I noticed Telegram does not use XML files for view layouts. Instead, they create all views programmatically. Aaaand there are neither unit nor integration tests 😱.

Put aside the fact that it has extremely low readability: it contains several hidden gems that may pique your interest. As an app that serves over 7 million users on Android, it has several practices worth learning from. One example is the subject of this post: how Telegram optimizes images before uploading them to the remote server.

How I Navigate Through The Source Code

I am using Android Studio (AS) as the IDE for debugging the Telegram source code. In the beginning, it was a bit rough for me to pinpoint which function gets called when users start to send/upload images. So I started by going through the manifest, looking for the application class and the main activity. I then placed breakpoints on every line of the activity lifecycle 😱.

Apparently, Telegram uses a custom fragment add/remove implementation, so it is pretty hard to find which views get rendered or clicked and which functions get called. After a few days of debugging, just as I was about to give up, God slapped me in the face 😵 and told me to stop wasting my time because there is another way: I remembered that AS has the Android Profiler feature, which can profile CPU activity.

With the Sample Java Methods recording feature of the Android Profiler, I can see which functions get called when the user interacts with a view. In our case, that's when the user taps the send image button.

Once I found the suspicious function that triggers the image upload/send, I created a logger class to print the current stack trace to Logcat. The logger class I created:

package org.telegram.logger;

// android.util.Log is the standard Android logger; the ExoPlayer Log class is not needed here.
import android.util.Log;

public class LoggerUtil {

    public static final String DEFAULT_KEY = "LOG_INSPECT";

    public static void printCurrentStackTrace() {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement ste : Thread.currentThread().getStackTrace()) {
            sb.append(ste);
            sb.append("\n");
        }
        Log.d(DEFAULT_KEY, sb.toString());
    }

    public static void printLog(String msg) {
        printLog(DEFAULT_KEY, msg);
    }

    public static void printLog(String key, String msg) {
        Log.d(key, msg);
    }
}

Logcat then prints the stack trace and the other log messages I put in the code. Once I had confirmed which function gets called, I continued debugging line by line using the debugger.

How Does Telegram Do It?

How do they optimize the images? That's what we will explore in this post. Fortunately, as I went through the image optimization code, I found no indication that Telegram uses native C/C++ libraries such as FFmpeg, mozjpeg, or libjpeg-turbo, or any other JNI call, for this step. So I could rewrite it purely in Kotlin in the pilot project.

I did notice that FFmpeg and mozjpeg are declared as dependencies, but I don't know where they are used.

The Big Picture

I selectively copied the compression code from the Telegram source into the pilot project and rewrote it in Kotlin. Then I ran the app to check whether the output image has the same dimensions and file size as when I upload the same image to one of my chat rooms on Telegram.

Long story short, I was able to generate the same output as Telegram does. I worked out what the code does and removed the unnecessary parts of the original. Once I understood the majority of the code, I refactored it to be more readable. Finally, I ended up with a Kotlin object with one public function and several private functions. The public function looks like this:

object ImageOptimizer {

    /**
     * @param context the application environment
     * @param imageUri the input image uri. usually "content://..."
     * @param compressFormat the output image file format
     * @param maxWidth the output image max width
     * @param maxHeight the output image max height
     * @param useMaxScale determine whether to use the bigger dimension
     * between [maxWidth] or [maxHeight]
     * @param quality the output image compress quality
     * @param minWidth the output image min width
     * @param minHeight the output image min height
     *
     * @return output image [android.net.Uri]
     */
    fun optimize(
        context: Context,
        imageUri: Uri,
        compressFormat: Bitmap.CompressFormat,
        maxWidth: Float,
        maxHeight: Float,
        useMaxScale: Boolean,
        quality: Int,
        minWidth: Int,
        minHeight: Int
    ): Uri? {
        /**
         * Decode uri bitmap from activity result using content provider
         */
        val bmOptions: BitmapFactory.Options = decodeBitmapFromUri(context, imageUri)

        /**
         * Calculate scale factor of the bitmap relative to [maxWidth] and [maxHeight]
         */
        val scaleDownFactor: Float = calculateScaleDownFactor(
            bmOptions, useMaxScale, maxWidth, maxHeight
        )

        /**
         * Since [BitmapFactory.Options.inSampleSize] only accept value with power of 2,
         * we calculate the nearest power of 2 to the previously calculated scaleDownFactor
         * check doc [BitmapFactory.Options.inSampleSize]
         */
        setNearestInSampleSize(bmOptions, scaleDownFactor)

        /**
         * 2 things we do here with image matrix:
         * - Adjust image rotation
         * - Scale image matrix based on remaining [scaleDownFactor / bmOption.inSampleSize]
         */
        val matrix: Matrix = calculateImageMatrix(
            context, imageUri, scaleDownFactor, bmOptions
        ) ?: return null

        /**
         * Create new bitmap based on defined bmOptions and calculated matrix
         */
        val newBitmap: Bitmap = generateNewBitmap(
            context, imageUri, bmOptions, matrix
        ) ?: return null
        val newBitmapWidth = newBitmap.width
        val newBitmapHeight = newBitmap.height

        /**
         * Determine whether to scale up the image or not if the
         * image width and height is below minimum dimension
         */
        val shouldScaleUp: Boolean = shouldScaleUp(
            newBitmapWidth, newBitmapHeight, minWidth, minHeight
        )

        /**
         * Calculate the final scaleUpFactor if the image need to be scaled up.
         */
        val scaleUpFactor: Float = calculateScaleUpFactor(
            newBitmapWidth.toFloat(), newBitmapHeight.toFloat(), maxWidth, maxHeight,
            minWidth, minHeight, shouldScaleUp
        )

        /**
         * calculate the final width and height based on final scaleUpFactor
         */
        val finalWidth: Int = finalWidth(newBitmapWidth.toFloat(), scaleUpFactor)
        val finalHeight: Int = finalHeight(newBitmapHeight.toFloat(), scaleUpFactor)

        /**
         * Generate the final bitmap, by scaling up if needed
         */
        val finalBitmap: Bitmap = scaleUpBitmapIfNeeded(
            newBitmap, finalWidth, finalHeight, scaleUpFactor, shouldScaleUp
        )

        /**
         * compress and save image
         */
        val imageFilePath: String = compressAndSaveImage(
            finalBitmap, compressFormat, quality
        ) ?: return null

        return Uri.fromFile(File(imageFilePath))
    }

}

I purposely declare the type of each variable to give you an idea of what each function call returns. Because this function is long-running, the caller in the pilot project executes it on a background thread using Kotlin coroutines.

Telegram also uses this algorithm to generate a thumbnail of the input image. The only difference is the values passed as arguments for the maxWidth, maxHeight, and quality parameters.

Decoding The Input Image

private fun decodeBitmapFromUri(
    context: Context,
    imageUri: Uri
): BitmapFactory.Options {
    val bmOptions = BitmapFactory.Options().apply {
        inJustDecodeBounds = true
    }
    // Only the image bounds are read here; use {} closes the stream even if decoding throws.
    context.contentResolver.openInputStream(imageUri)?.use { input ->
        BitmapFactory.decodeStream(input, null, bmOptions)
    }
    return bmOptions
}

Nothing special here: we decode only the bounds of the input/original image into BitmapFactory.Options. The input is a Uri with the content://... scheme that we got from the activity result.

Don't forget to set inJustDecodeBounds = true, or else the full high-resolution input Bitmap will be allocated in memory, potentially leading to an OOM (Out Of Memory) error.

There is a slight difference here from what Telegram does. They use an input Uri with the file://... scheme. Converting a content Uri to a file Uri is a little tricky. I don't know why they convert it, but I'm guessing it's for a native C/C++ library that only supports the file://... scheme (e.g. FFmpeg). Because I only use Kotlin, I figured I didn't need to convert it.
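To get a feel for why decoding the full image is dangerous, here is a rough estimate of the memory a decoded bitmap needs. It is a sketch under one assumption: 4 bytes per pixel, which is what the default ARGB_8888 config uses. The function name is hypothetical.

```kotlin
/**
 * Approximate bytes needed to hold a fully decoded bitmap.
 * Assumes 4 bytes per pixel (ARGB_8888) unless told otherwise.
 */
fun decodedSizeBytes(width: Int, height: Int, bytesPerPixel: Int = 4): Long =
    width.toLong() * height.toLong() * bytesPerPixel
```

For the 30000 x 23756 Starry Night file, decodedSizeBytes(30000, 23756) returns 2,850,720,000 bytes, roughly 2.85 GB, far more than an app can normally allocate.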

Calculate the Scale Factor

private fun calculateScaleDownFactor(
    bmOptions: BitmapFactory.Options,
    useMaxScale: Boolean,
    maxWidth: Float,
    maxHeight: Float
): Float {
    val photoW = bmOptions.outWidth.toFloat()
    val photoH = bmOptions.outHeight.toFloat()
    val widthRatio = photoW / maxWidth
    val heightRatio = photoH / maxHeight
    var scaleFactor = if (useMaxScale) {
        max(widthRatio, heightRatio)
    } else {
        min(widthRatio, heightRatio)
    }
    if (scaleFactor < 1) {
        scaleFactor = 1f
    }
    return scaleFactor
}

In this part, we calculate the factor by which the original image must be scaled down to fit within the max width and max height. Telegram uses a value of 1280 (HD resolution) for both the maxWidth and maxHeight parameters. For the thumbnail, it uses a value of 90.

Note that the scale factor here is a Float.
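To try the calculation without an Android device, the same logic can be written as a pure function that takes the image dimensions directly. This is a sketch: scaleDownFactor is a hypothetical name, and it takes plain width/height values instead of BitmapFactory.Options.

```kotlin
import kotlin.math.max
import kotlin.math.min

/**
 * Pure-Kotlin version of the scale-down calculation.
 * Returns 1 when the image already fits, so it is never scaled up here.
 */
fun scaleDownFactor(
    photoW: Int,
    photoH: Int,
    maxWidth: Float,
    maxHeight: Float,
    useMaxScale: Boolean
): Float {
    val widthRatio = photoW / maxWidth
    val heightRatio = photoH / maxHeight
    val factor = if (useMaxScale) max(widthRatio, heightRatio) else min(widthRatio, heightRatio)
    return factor.coerceAtLeast(1f)
}
```

For the 30000 x 23756 image with a max of 1280 on both sides and useMaxScale = true, the factor is 23.4375 (the width ratio). A 640 x 480 image gives 1, meaning no scale-down.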

Calculate Nearest inSampleSize

private fun setNearestInSampleSize(
    bmOptions: BitmapFactory.Options,
    scaleFactor: Float
) {
    bmOptions.inJustDecodeBounds = false
    bmOptions.inSampleSize = scaleFactor.toInt()
    if (bmOptions.inSampleSize % 2 != 0) {
        var sample = 1
        while (sample * 2 < bmOptions.inSampleSize) {
            sample *= 2
        }
        bmOptions.inSampleSize = sample
    }
}

This is the first part of the image scaling. We can use BitmapFactory.Options.inSampleSize to scale down the image while decoding. According to its documentation, the decoder uses a final value based on powers of 2 (2, 4, 8, 16, ...) and rounds any other value down to the nearest power of 2. Note that the check above (inSampleSize % 2 != 0) only catches odd values, so an even value that is not a power of 2 is passed through unchanged. Something strange happened when I debugged an image with an inSampleSize of 14: the nearest power of 2 below 14 is 8, but the image was still scaled down to 1/14 of the original. Why? I have no idea. Maybe the underlying logic has changed and the documentation was not updated.

I managed to find several other people on Reddit who encountered the same thing.
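The rounding behaviour is easier to see in isolation. The first function below mirrors the loop in setNearestInSampleSize as a pure function. The second is an alternative that always floors to a power of 2; it is not what the code in this post does and is shown only for comparison. Both names are hypothetical.

```kotlin
/** Mirrors the rounding in setNearestInSampleSize: only odd values are changed. */
fun roundedSampleSize(scaleFactor: Float): Int {
    var sampleSize = scaleFactor.toInt()
    if (sampleSize % 2 != 0) {
        var sample = 1
        // Grow by doubling while the next power of 2 is still below the target.
        while (sample * 2 < sampleSize) sample *= 2
        sampleSize = sample
    }
    return sampleSize
}

/** Alternative for comparison: the largest power of 2 that is <= value. */
fun floorPowerOfTwo(value: Int): Int = value.coerceAtLeast(1).takeHighestOneBit()
```

A scale factor of 3 becomes 2 and 23.4375 becomes 16, but 14.2 stays at 14 because 14 is even. floorPowerOfTwo(14), by contrast, returns 8.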

Calculate Image Matrix

private fun calculateImageMatrix(
    context: Context,
    imageUri: Uri,
    scaleFactor: Float,
    bmOptions: BitmapFactory.Options
): Matrix? {
    val input: InputStream = context.contentResolver.openInputStream(imageUri) ?: return null
    val exif = ExifInterface(input)
    val matrix = Matrix()
    val orientation: Int = exif.getAttributeInt(
        ExifInterface.TAG_ORIENTATION,
        ExifInterface.ORIENTATION_NORMAL
    )
    when (orientation) {
        ExifInterface.ORIENTATION_ROTATE_90 -> matrix.postRotate(
            90f
        )
        ExifInterface.ORIENTATION_ROTATE_180 -> matrix.postRotate(
            180f
        )
        ExifInterface.ORIENTATION_ROTATE_270 -> matrix.postRotate(
            270f
        )
    }
    val remainingScaleFactor = scaleFactor / bmOptions.inSampleSize.toFloat()
    if (remainingScaleFactor > 1) {
        matrix.postScale(1.0f / remainingScaleFactor, 1.0f / remainingScaleFactor)
    }
    input.close()
    return matrix
}

After calculating the inSampleSize value, we adjust the image orientation and apply further scaling if part of the scale factor is still unapplied. What does that mean? As I said before, inSampleSize only accepts an integer, whereas scaleFactor is a Float, so the fractional part is dropped. On top of that, odd values are rounded down to a power of 2. With a scale factor of 3, for example, inSampleSize is rounded down to 2, so the decoded image is still larger than the target. Here comes the matrix!

To determine the orientation of an image, we can use the image's Exif data. To support old API levels, I recommend using the AndroidX ExifInterface library. This is where I started to familiarize myself with digital images and their relation to matrices. With matrices, you can translate, scale, rotate, and skew an image.

💡 Did you know that MATRIX is one of the predefined ImageView.ScaleType values?

After adjusting the image orientation, we continue with image scaling. First things first, we need to calculate how much further to scale down the image that inSampleSize has already reduced. Back to my earlier example: how do we calculate the matrix scale factor if we have a scale-down factor of 3 and an inSampleSize of 2? I think it's better for you to see the note I made when I was trying to understand what it's doing.

In the picture above, we have a scale factor of 3, and inSampleSize can only hold a value of 2. With inSampleSize we get 1/2 of the image, but we still want to scale it down further because the original scale-down factor is 3. The matrix scale is 1 / (scale factor / inSampleSize) = 1 / (3 / 2), which is about 0.67. We then pass that value to the Matrix.postScale(sx, sy) function. I'll let that sink in.

I found several sources that can help you familiarize yourself with matrices on digital images:
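The same calculation as a small pure function, for experimenting with other values. It is a sketch: matrixScale is a hypothetical name, and returning 1 for "no extra scaling" is my own convention standing in for the case where postScale is simply not called.

```kotlin
/**
 * Scale still to be applied by the matrix after inSampleSize has done its part.
 * Returns 1 when nothing is left to scale.
 */
fun matrixScale(scaleFactor: Float, inSampleSize: Int): Float {
    val remaining = scaleFactor / inSampleSize
    return if (remaining > 1f) 1f / remaining else 1f
}
```

matrixScale(3f, 2) is about 0.67, matching the example. For a scale factor of 23.4375 and an inSampleSize of 16, the result is about 0.68.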

Creating The Scaled Image

private fun generateNewBitmap(
    context: Context,
    imageUri: Uri,
    bmOptions: BitmapFactory.Options,
    matrix: Matrix
): Bitmap? {
    var bitmap: Bitmap? = null
    val inputStream: InputStream? = context.contentResolver.openInputStream(imageUri)
    try {
        bitmap = BitmapFactory.decodeStream(inputStream, null, bmOptions)
        if (bitmap != null) {
            val matrixScaledBitmap: Bitmap = Bitmap.createBitmap(
                bitmap, 0, 0, bitmap.width, bitmap.height, matrix, true
            )
            if (matrixScaledBitmap != bitmap) {
                bitmap.recycle()
                bitmap = matrixScaledBitmap
            }
        }
        inputStream?.close()
    } catch (e: Throwable) {
        e.printStackTrace()
    }
    return bitmap
}

After calculating the scale factor for inSampleSize and the matrix, we go on to create the bitmap. First, the bitmap is decoded and allocated in memory using the BitmapFactory.Options we prepared.

Then we rotate it or scale it down further with the matrix if needed. If matrixScaledBitmap is not the same object as bitmap (because the matrix scale is not 1 or the image needs to be reoriented), we recycle the first bitmap to free its memory and reassign the variable to matrixScaledBitmap. Lastly, we return the bitmap.

Check If The Image Needs To Be Scaled Up

private fun shouldScaleUp(
    photoW: Int,
    photoH: Int,
    minWidth: Int,
    minHeight: Int
): Boolean {
    return (minWidth != 0 && minHeight != 0 && (photoW < minWidth || photoH < minHeight))
}

At this point, we check whether the previously generated bitmap should be scaled up because one of its dimensions is below the minimum.

Calculate The Scale Up Factor

private fun calculateScaleUpFactor(
    photoW: Float,
    photoH: Float,
    maxWidth: Float,
    maxHeight: Float,
    minWidth: Int,
    minHeight: Int,
    shouldScaleUp: Boolean
): Float {
    var scaleUpFactor: Float = max(photoW / maxWidth, photoH / maxHeight)
    if (shouldScaleUp) {
        scaleUpFactor = if (photoW < minWidth && photoH > minHeight) {
            photoW / minWidth
        } else if (photoW > minWidth && photoH < minHeight) {
            photoH / minHeight
        } else {
            max(photoW / minWidth, photoH / minHeight)
        }
    }
    return scaleUpFactor
}

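If the image does not need scaling up, the scaleUpFactor is simply the larger of the width and height ratios against the max dimensions: about 1 for an image that was just scaled down to fit, and below 1 for an image that was already smaller than the max (it can exceed 1 when useMaxScale is false). If it does need scaling up, the factor is computed against the min dimensions instead and is below 1, so dividing by it in the next step enlarges the image.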

Calculate the Final Width and Height

private fun finalWidth(
    photoW: Float, scaleUpFactor: Float
): Int {
    return (photoW / scaleUpFactor).toInt()
}

private fun finalHeight(
    photoH: Float, scaleUpFactor: Float
): Int {
    return (photoH / scaleUpFactor).toInt()
}

As the function names imply, we calculate the final width and height of the output image. As before, the width and height remain the same, i.e. the image is not scaled up, if the scaleUpFactor is 1.

Generate the Final Bitmap

private fun scaleUpBitmapIfNeeded(
    bitmap: Bitmap,
    finalWidth: Int,
    finalHeight: Int,
    scaleUpFactor: Float,
    shouldScaleUp: Boolean
): Bitmap {
    val scaledBitmap: Bitmap = if (scaleUpFactor > 1 || shouldScaleUp) {
        Bitmap.createScaledBitmap(bitmap, finalWidth, finalHeight, true)
    } else {
        bitmap
    }
    if (scaledBitmap != bitmap) {
        bitmap.recycle()
    }
    return scaledBitmap
}

In this function, we generate the final bitmap. A new bitmap is allocated in memory only if we need to rescale. If a new bitmap is generated, we also recycle the previous one because we no longer need it.
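The scale-up check, the factor, the final size, and the rescale condition can be combined into one pure function to see how they interact. This is a sketch with hypothetical names. It works on plain dimensions rather than a Bitmap, uses one value for both sides of the max and min bounds, and the min value of 100 in the example below is made up for illustration.

```kotlin
import kotlin.math.max

fun shouldScaleUp(w: Int, h: Int, minW: Int, minH: Int): Boolean =
    minW != 0 && minH != 0 && (w < minW || h < minH)

fun scaleUpFactor(
    w: Float, h: Float, maxW: Float, maxH: Float, minW: Int, minH: Int, scaleUp: Boolean
): Float = when {
    !scaleUp -> max(w / maxW, h / maxH)
    w < minW && h > minH -> w / minW
    w > minW && h < minH -> h / minH
    else -> max(w / minW, h / minH)
}

/** Final (width, height), truncated to Int like finalWidth and finalHeight. */
fun finalSize(w: Int, h: Int, maxSide: Float, minSide: Int): Pair<Int, Int> {
    val scaleUp = shouldScaleUp(w, h, minSide, minSide)
    val factor = scaleUpFactor(w.toFloat(), h.toFloat(), maxSide, maxSide, minSide, minSide, scaleUp)
    // Same condition as scaleUpBitmapIfNeeded: otherwise the bitmap is returned untouched.
    if (factor <= 1f && !scaleUp) return Pair(w, h)
    return Pair((w / factor).toInt(), (h / factor).toInt())
}
```

With a max of 1280 and a min of 100, a 50 x 200 bitmap becomes 100 x 400, while 1280 x 960 and 640 x 480 bitmaps keep their size.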

Compress & Save the Bitmap

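private fun compressAndSaveImage(
    bitmap: Bitmap,
    compressFormat: Bitmap.CompressFormat,
    quality: Int
): String? {
    val uniqueID = UUID.randomUUID().toString()
    // The .jpg extension assumes compressFormat is JPEG.
    val fileName = "test_optimization_$uniqueID.jpg"
    // Hardcoded for the sake of the tutorial only.
    val fileDir = File("/storage/emulated/0/Download/")
    val imageFile = File(fileDir, fileName)
    return try {
        // use {} closes the stream even if compress() throws.
        val saved = FileOutputStream(imageFile).use { stream ->
            bitmap.compress(compressFormat, quality, stream)
        }
        if (saved) imageFile.absolutePath else null
    } catch (e: IOException) { // java.io.IOException
        e.printStackTrace()
        null
    } finally {
        bitmap.recycle()
    }
}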

Finally, in the last function, we compress and save the generated bitmap. Telegram uses a compression quality of 80 and Bitmap.CompressFormat.JPEG as the image format. For the thumbnail, it uses a quality value of 55.

For the sake of the tutorial, I don't save the image in a graceful way. You might want to use the content resolver to save media-related files. You can also save to app-specific storage if you don't want other apps to discover the compressed images.

Note that this algorithm creates a new image file in your Download folder for each optimization. You might want to delete the files afterwards, or you'll end up with a bunch of them.
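One way to clean up is to delete every file that carries the test_optimization_ prefix used above. This is a minimal sketch with a hypothetical function name. It uses plain java.io.File and assumes the app is allowed to read and delete files in the given directory.

```kotlin
import java.io.File

/**
 * Deletes regular files in [dir] whose names start with [prefix].
 * Returns the number of files actually removed.
 */
fun deleteOptimizedImages(dir: File, prefix: String = "test_optimization_"): Int {
    // listFiles returns null when dir does not exist or cannot be read.
    val matches = dir.listFiles { file -> file.isFile && file.name.startsWith(prefix) } ?: return 0
    return matches.count { it.delete() }
}
```

Passing the same directory that compressAndSaveImage writes to removes the generated images and leaves every other file alone.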

Try It Yourself

For the full code file of the ImageOptimizer object, you can check it here. You can also check and run the code yourself in this GitHub repository.

I hope this post gives you some insight into how to deal with image scaling and optimization in Android development. Thank you.