
WhatsApp System Design: All You Need To Know

Published: 22nd May, 2023

Vaideswar Reddy

Technical Support Engineer-Data Science at Skill-Lync at almaBetter

In this blog, we are going to discuss the system design of WhatsApp. We are going to focus on implementing some key features and will extend our idea further.

WhatsApp is a fast, simple, and convenient way for family and friends to chat, create group texts, share photos and videos, send and receive documents, and engage in private, secure conversations anytime, day or night.

WhatsApp now leads all other messaging channels in active users. During the pandemic, WhatsApp saw a 40% increase in usage.


Today, WhatsApp delivers roughly 100 billion messages a day.

Let’s see the features required for WhatsApp system design one by one.

Direct Messaging

We can chat with a friend or any number saved in our contacts. If just two devices on the same network need to exchange messages, each only needs to know the other’s address, and they can exchange messages directly. But as the number of devices and networks grows, this becomes increasingly difficult, so we need a server in between that is responsible for receiving messages and sending them to the appropriate clients.


The diagram above shows two clients, Client A and Client B. Client A needs to send a message to Client B, so it initiates a request to the server. A TCP connection is established between the user and the gateway for communication.

As soon as a user comes online, a new TCP connection is established. When the user sends a message, it travels through the gateway, which abstracts concerns such as security, reverse proxying, and rate limiting. The request then goes to the load balancer, which distributes the load across multiple servers. The messaging server then checks the database to find which server client B is connected to and forwards the message to client B.
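The lookup-and-forward step can be sketched roughly as follows. This is a minimal in-memory illustration, not WhatsApp’s actual implementation: the names `SessionRegistry`, `MessagingServer`, and `route` are hypothetical, and a plain dictionary stands in for the database that records which server each client is connected to.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionRegistry:
    """Stand-in for the database mapping each online user to a server."""
    user_to_server: dict = field(default_factory=dict)

    def connect(self, user: str, server: str) -> None:
        self.user_to_server[user] = server

    def disconnect(self, user: str) -> None:
        self.user_to_server.pop(user, None)

    def lookup(self, user: str) -> Optional[str]:
        return self.user_to_server.get(user)


@dataclass
class MessagingServer:
    name: str
    registry: SessionRegistry
    forwarded: list = field(default_factory=list)

    def route(self, sender: str, recipient: str, text: str) -> Optional[str]:
        """Return the server holding the recipient's connection, or None."""
        target = self.registry.lookup(recipient)
        if target is not None:
            # A real system would forward over the network to `target`.
            self.forwarded.append((target, recipient, text))
        return target


registry = SessionRegistry()
registry.connect("client_a", "server-1")
registry.connect("client_b", "server-2")
server = MessagingServer("server-1", registry)
hop = server.route("client_a", "client_b", "hello")
```

Here `hop` names the server that holds client B’s connection. A return value of `None` means client B is not connected anywhere, which is the offline case.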

There could be different scenarios when sending the message.

  • Client A is offline: In this scenario, the message to be sent is stored in the device’s local database (e.g. SQLite) until the device comes online.
  • Client B is offline: In this scenario, the message is saved in the server-side database. As soon as client B comes online, the stored messages are sent to it.
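A rough sketch of the “Client B is offline” case is shown here, assuming a simple store-and-forward queue on the server. The class `OfflineStore` and its methods are hypothetical names, and in-memory collections stand in for the database. The “Client A is offline” case would follow the same pattern, with the queue held on the device instead.

```python
from collections import defaultdict, deque


class OfflineStore:
    """Holds messages for recipients that are not currently connected."""

    def __init__(self) -> None:
        self.online = set()
        self.pending = defaultdict(deque)  # recipient -> queued messages
        self.inbox = defaultdict(list)     # recipient -> delivered messages

    def send(self, recipient: str, text: str) -> bool:
        """Deliver now if the recipient is online, otherwise queue it."""
        if recipient in self.online:
            self.inbox[recipient].append(text)
            return True
        self.pending[recipient].append(text)
        return False

    def come_online(self, user: str) -> int:
        """Mark the user online and flush queued messages in arrival order."""
        self.online.add(user)
        queue = self.pending.pop(user, deque())
        flushed = len(queue)
        while queue:
            self.inbox[user].append(queue.popleft())
        return flushed


store = OfflineStore()
store.send("client_b", "hi")       # queued: client B is offline
store.send("client_b", "there")    # queued behind the first message
flushed = store.come_online("client_b")
```

Using a FIFO queue keeps the stored messages in the order they were sent, so client B sees the conversation in sequence once it reconnects.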

Group Messaging

Group messaging is another important WhatsApp feature. For years a group could hold up to 256 people (WhatsApp has since raised the limit), and all members can interact together. It is helpful for holding a collective conversation, and it works in the same way as direct messaging.
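Since group messaging is described as working like direct messaging, one way to picture it is as a fan-out: the server expands a single group message into one direct message per member. The function `fan_out` below is a hypothetical illustration of that idea, not a description of WhatsApp’s internals.

```python
def fan_out(sender: str, members: list, text: str) -> list:
    """Expand one group message into one direct message per other member."""
    return [(sender, member, text) for member in members if member != sender]


group = ["alice", "bob", "carol"]
deliveries = fan_out("alice", group, "Dinner at 8?")
```

Each tuple in `deliveries` could then go through the same routing and offline handling as an ordinary direct message.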


WhatsApp Pay

WhatsApp Pay is an in-chat payment feature that allows users to send money to people in their contact list from within WhatsApp. It is a UPI-based payment service that allows you to both send and receive money. UPI (Unified Payments Interface) was developed by the National Payments Corporation of India (NPCI).


Online/Last Seen

This feature shows whether a person is currently online or, if not, when they were last online.


Whenever a client performs any activity, such as sending a message, opening a new message, uploading documents, or making a phone call, that activity is recorded for the user along with a timestamp. With each new activity, the user’s timestamp is updated. Whenever user B wants to check the last seen of user A, it requests user A’s timestamp from the database. If the timestamp is later than the current time minus a threshold value (e.g. 1 minute), we display the status as online; otherwise we display user A’s last activity timestamp as the last seen.
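The threshold check can be written as a small function. This is a sketch under the assumptions stated in the text: the 1-minute threshold is the example value from the paragraph, and the function name `presence` and the display format are illustrative choices.

```python
from datetime import datetime, timedelta

ONLINE_THRESHOLD = timedelta(minutes=1)  # example value from the text


def presence(last_activity: datetime, now: datetime,
             threshold: timedelta = ONLINE_THRESHOLD) -> str:
    """Return 'online' if the last activity is recent, else a last-seen string."""
    if now - last_activity <= threshold:
        return "online"
    return f"last seen {last_activity:%Y-%m-%d %H:%M}"


now = datetime(2023, 5, 22, 12, 0, 0)
recent = presence(datetime(2023, 5, 22, 11, 59, 30), now)  # 30 seconds ago
stale = presence(datetime(2023, 5, 22, 9, 15, 0), now)     # hours ago
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check easy to test.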

Message Acknowledgments

Let’s talk about message acknowledgments, i.e. the single, double, and blue ticks.

WhatsApp supports 3 types of message acknowledgments:

  • Message sent
  • Message delivered
  • Message read


As soon as a server thread processes the message from the received-message queue, it sends the sending client an acknowledgment that the server received the message. When the receiver receives or reads the message, a new request is sent to the server, which then updates the sender about the status of the message.
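The three acknowledgment states can be modelled as an ordered enumeration. The sketch below assumes that acknowledgments may arrive late or out of order, so the status only ever moves forward. The names `Status` and `apply_ack` are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    SENT = 1       # single tick: server received the message
    DELIVERED = 2  # double tick: recipient's device received it
    READ = 3       # blue tick: recipient opened it


def apply_ack(current: Status, ack: Status) -> Status:
    """Advance the status monotonically; late or duplicate acks are ignored."""
    return max(current, ack)


status = Status.SENT
status = apply_ack(status, Status.READ)       # read ack arrives first
status = apply_ack(status, Status.DELIVERED)  # late delivered ack changes nothing
```

Taking the maximum means a blue tick never falls back to a double tick, whatever order the acknowledgments reach the sender in.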

Image Sharing

In a chat messaging system it’s common to share images, videos, and file attachments. This could be a long-running process and should be separated into another microservice that could handle such requests. As shown in the diagram below there’s another multimedia server that handles these types of file sharing requests. The client initiates an HTTP request to the Multimedia server. The multimedia server checks if the receiving client B is online. If yes then it sends the hash of the file to client B. Client B uses this hash to download the file.


If client B is offline, the file is stored in secure cloud storage. As soon as client B comes online, it receives the hash and fetches the file from the cloud. To make file sharing faster, we could use a CDN. CDN stands for Content Delivery Network; it caches content geographically near the client’s location to reduce latency and increase download speed.
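The hash-based handoff can be illustrated with a small content-addressed store. This is only a sketch: the article does not say which hash function is used, so SHA-256 is an assumption, and `MediaStore` with its in-memory dictionary is a hypothetical stand-in for the multimedia server and cloud storage.

```python
import hashlib


class MediaStore:
    """Stand-in for cloud storage keyed by the SHA-256 hash of the file."""

    def __init__(self) -> None:
        self._blobs = {}

    def upload(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data  # identical files are stored only once
        return digest

    def download(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Verify integrity before handing the file to the client.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored file does not match its hash")
        return data


store = MediaStore()
file_hash = store.upload(b"holiday-photo-bytes")  # client A uploads
received = store.download(file_hash)              # client B fetches by hash
```

Only `file_hash` needs to travel through the messaging path. The file itself is fetched separately, which is also what makes it straightforward to serve from a CDN.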

How WhatsApp is using Data Science

The massive increase in WhatsApp usage over the last couple of years has opened many opportunities for businesses. The WhatsApp chatbot is an important feature for businesses on WhatsApp.

WhatsApp Chatbots

A WhatsApp chatbot is AI-powered software that runs on the WhatsApp platform. People communicate with a WhatsApp chatbot through the chat interface, as if talking to a real person. It is a set of automated replies that simulates a human conversation on WhatsApp.


WhatsApp chatbots help companies automate the products and services they provide.

Several companies specialize in building chatbots for WhatsApp, and they work continuously to make this process smooth.

Many of these chatbot development companies partner with official WhatsApp Business solution providers. They help us get API access and guide us through the application process, after which we have access to WhatsApp chatbots.

Checking Fake Messages

WhatsApp has revolutionized the way information is spread. By earlier counts, it connected more than 1.5 billion people and carried some 65 billion messages daily. Though WhatsApp is meant for connecting with family, friends, and colleagues, some misuse it to spread rumours and fake messages. This has led to undue violence and social disorder. After a series of mob-lynching incidents triggered by rumours circulating on WhatsApp, the Indian Government began drafting an amendment to the Indian IT Act. The draft as of now requires “intermediaries” like WhatsApp and Google to deploy technology-based automation tools with appropriate controls for proactively identifying and disabling access to unlawful information or content.

Most fake WhatsApp messages share features and motives such as:

  • A request to forward the message to many people
  • A threat of consequences if the request is ignored
  • A fake information source to add credibility
  • No details on the author or origin of the information
  • No clear point in time, only generic references such as ‘last year’ or ‘yesterday’

Data Science can be used to classify a WhatsApp text message into one of five categories: fully true, mostly true, mostly false, fully false, and undetermined. With Data Science built into the application, an individual receiving a forwarded text message would be able to get it verified.
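As a rough illustration, some of the warning signs listed above can be turned into features that a classifier might consume. The sketch below is a plain keyword matcher, not a trained model and not WhatsApp’s method. The patterns in `SIGNALS` and the function `extract_signals` are hypothetical and cover only three of the listed signs.

```python
import re

# Hypothetical keyword patterns for three of the warning signs listed above.
SIGNALS = {
    "asks_to_forward": re.compile(r"\b(forward|share) (this|it)\b", re.I),
    "threatens": re.compile(r"\b(or else|bad luck|will be (blocked|deleted))\b", re.I),
    "vague_time": re.compile(r"\b(yesterday|last (week|month|year)|recently)\b", re.I),
}


def extract_signals(text: str) -> dict:
    """Flag which warning signs appear in a message (keyword match only)."""
    return {name: bool(pattern.search(text)) for name, pattern in SIGNALS.items()}


msg = ("Forward this to 10 people or else your account will be deleted. "
       "It happened last year!")
flags = extract_signals(msg)
suspicion = sum(flags.values())  # number of warning signs present
```

In practice such flags would be only a few inputs among many, and mapping a message to one of the five categories would require labelled training data.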


In 2019, WhatsApp announced that it bans more than 2 million accounts per month for sending bulk automated messages and fake news. This initiative relies on machine learning. Using machine learning, we can also restrict how many times a specific message can be forwarded, including the number of texts.
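A forwarding restriction can be pictured as a per-message counter with a cap. The sketch below is a simplification under stated assumptions: the cap of 5 is chosen only for illustration, `ForwardLimiter` is a hypothetical name, and the counter is keyed on a hash of the text, whereas an end-to-end encrypted system would need to track the count without the server reading the message.

```python
import hashlib
from collections import Counter

FORWARD_LIMIT = 5  # assumed cap, chosen only for illustration


class ForwardLimiter:
    """Counts forwards per message fingerprint and blocks past the cap."""

    def __init__(self, limit: int = FORWARD_LIMIT) -> None:
        self.limit = limit
        self.counts = Counter()

    def try_forward(self, text: str) -> bool:
        """Record one forward and return True, or return False at the cap."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


limiter = ForwardLimiter(limit=2)
results = [limiter.try_forward("Breaking news!") for _ in range(3)]
```

With a limit of 2, the third attempt to forward the same text is refused, while other messages are unaffected.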


How WhatsApp can use Data Science in the near future

Virtual Assistants

A Virtual Assistant or Digital Assistant is an application program that understands natural language voice commands and completes tasks for the user. It uses Natural Language Processing (NLP) to match the user’s text or voice input to executable commands. Virtual assistants are essentially cloud-based programs that need internet-connected devices or applications to work. Using Machine Learning algorithms, an assistant learns from past interactions, so it tends to provide a better experience each time a command is given. Some of the popular virtual assistants include Amazon Alexa, Cortana (Microsoft), Siri (Apple), and Google Assistant.


In our case, WhatsApp can use virtual assistants to send messages and read them out through voice commands.

For this, WhatsApp does not necessarily have to build its own virtual assistant; it can integrate with the assistant already present in the device’s operating system (e.g. Siri on Apple devices and Google Assistant on Android devices).

Search Keyword and show related advertisements

Although WhatsApp is end-to-end encrypted, it could use ML models that do not break this security to derive a keyword list for personalized advertisements. Instead of sending the raw decrypted messages back to the WhatsApp servers, the app could do some machine learning on-device, create a local advertising profile tailored to our preferences, and send only limited data derived from that profile, so that security is not breached. As we all know, WhatsApp is part of Facebook (now Meta). This means Facebook could know that we are interested in cats without actually knowing the exact content of any of our messages. According to some reports, WhatsApp may monetize the platform by showing relevant advertisements based on Facebook data alone, but it could also monetize in this way.
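The on-device idea can be sketched as a function that reduces messages to a few coarse interest labels, so that only the labels would ever leave the device. Everything here is hypothetical: the `INTEREST_KEYWORDS` mapping, the function `local_interest_profile`, and the simple keyword counting that stands in for a real on-device model.

```python
from collections import Counter

# Hypothetical keyword-to-interest mapping shipped with the app.
INTEREST_KEYWORDS = {
    "cat": "pets", "kitten": "pets", "dog": "pets",
    "flight": "travel", "hotel": "travel",
}


def local_interest_profile(messages: list, top_n: int = 2) -> list:
    """Runs on the device: reduce messages to a few coarse interest labels."""
    counts = Counter()
    for message in messages:
        for word in message.lower().split():
            label = INTEREST_KEYWORDS.get(word.strip(".,!?"))
            if label:
                counts[label] += 1
    # Only these labels would leave the device, never the message text.
    return [label for label, _ in counts.most_common(top_n)]


profile = local_interest_profile([
    "My cat knocked over the plant again!",
    "Look at this kitten",
    "Booked the hotel for Friday.",
])
```

The resulting profile says the user is interested in pets and travel without revealing any individual message, which matches the “interested in cats” example in the text.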

They may use multiple filters for video transformation

Because WhatsApp is a platform for interacting with people, and business grows as interaction increases, WhatsApp could add features that help people interact more, for example with the help of deepfakes.


Deepfakes are synthetic media in which a person in an existing image or video is replaced with someone else’s likeness. While the act of faking content is not new, deepfakes leverage powerful techniques from machine learning and artificial intelligence to manipulate or generate visual and audio content with a high potential to deceive. The main machine learning methods used to create deepfakes are based on deep learning and involve training generative neural network architectures, such as autoencoders or generative adversarial networks (GANs).



  • WhatsApp leads all other messaging channels, with 2.5 billion users around the globe.
  • It offers many features that help its users interact virtually, anytime and anywhere.
  • It uses Data Science for checking fake content, powering chatbots, and predicting fraud in WhatsApp payments.
  • There is huge scope for using data science with WhatsApp in the near future without breaching the security of its users.

That’s it for this article. We hope you learned a lot from it!

It feels good to work with a team that is so self-reliant and motivated. I would like to thank Bhanu Shahi, Jatin, and Rishab Kesarwani for contributing to this article.
