How does Zomato use Machine Learning?

Last Updated: 9th June, 2023

Harshini Bhat

Data Science Consultant at almaBetter

Have you ever wondered how Zomato knows what you want to eat, whether it's biryani or your favorite smoothie? Or how orders get delivered so accurately in bustling cities 🏙️ like Delhi, Mumbai, Kolkata, and Bangalore? No surprises here. The answer to all your questions lies in the vast expanse of Machine Learning 🤖.

Let's dive deep into how Zomato leverages Machine Learning in its everyday operations to continuously improve its product and provide the best experience to its users.

What is Machine Learning?

Machine Learning is a branch of Artificial Intelligence that trains computers to learn patterns from data and make predictions or decisions based on that learning. Its core components are the input, the brain, and the output, which form a simple equation: output = brain(input). The brain is the Machine Learning algorithm that takes the input and produces the output.

Model Training is the process of determining the brain by feeding it a set of known input and output values. This process trains the algorithm to recognize patterns and make predictions based on new input values. Model Prediction is computing the output based on known input and brain values.
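To make training and prediction concrete, here is a minimal sketch in Python. It assumes the simplest possible brain, a straight line fitted by ordinary least squares, and uses invented toy data (number of dishes in an order versus minutes to prepare). The function names train and predict are illustrative only and are not taken from Zomato.

```python
def train(inputs, outputs):
    """Model Training: fit y = w * x + b from known input/output pairs."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b  # the learned "brain"


def predict(brain, x):
    """Model Prediction: compute the output from a new input and the brain."""
    w, b = brain
    return w * x + b


# Hypothetical toy data: dishes per order -> preparation minutes.
brain = train([1, 2, 3, 4], [12, 14, 16, 18])
print(predict(brain, 5))  # prediction for an unseen order of 5 dishes
```

Real models have many more inputs and far more complex brains, but the two steps stay the same: learn the brain from known examples, then apply it to new inputs.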

To make Machine Learning models accessible to applications and enable smarter decision-making, a Model Server is set up during production deployment. Applications call this server through a remote API to get predictions from the model. Deploying additional Model Servers adds serving capacity, enabling faster turnaround time, increased experimentation, and better models.
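As a rough illustration of what a Model Server does, the sketch below wraps a model behind an HTTP endpoint using only Python's standard library. Everything specific here is an assumption made for the example: the JSON field names, the port, and the hard-coded WEIGHT and BIAS that stand in for a trained model. It does not describe Zomato's actual serving stack.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical pre-trained model: minutes = WEIGHT * dish_count + BIAS.
WEIGHT, BIAS = 2.0, 10.0


def handle_predict(request_body: bytes) -> bytes:
    """Turn a JSON request such as {"dish_count": 3} into a JSON prediction."""
    payload = json.loads(request_body)
    minutes = WEIGHT * payload["dish_count"] + BIAS
    return json.dumps({"predicted_minutes": minutes}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client said it sent.
        length = int(self.headers.get("Content-Length", 0))
        response = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling serve() starts the endpoint, and any application can then POST a JSON body to it to receive a prediction. Keeping the prediction logic in handle_predict, separate from the HTTP handler, makes it easy to test without starting a server.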

Now, let us go through the different Machine Learning techniques used by Zomato.

Machine Learning at Zomato

Menu Digitization

Zomato has leveraged Machine Learning to enhance its menu digitization capabilities in several ways. One way is through Optical Character Recognition (OCR) technology, which enables text extraction from images of menus. Zomato's OCR system uses Convolutional Neural Networks (CNNs) to accurately recognize and extract text from menu images, even in low-light or low-resolution images.

In addition to OCR, Zomato's menu digitization process involves Natural Language Processing (NLP) techniques to extract structured data from unstructured menu text. It involves identifying and classifying menu items such as appetizers, entrees, desserts, and beverages. NLP also helps to identify ingredients, cooking methods, and other relevant details about each menu item.
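The step from raw menu text to structured data can be sketched as follows. This is a deliberately simple rule-based stand-in for the NLP models described above, not Zomato's method: it assumes each OCR line ends with a price, and it uses a small hypothetical keyword list to assign a category.

```python
import re

# Hypothetical keyword lists; a production system would learn these from labelled menus.
CATEGORY_KEYWORDS = {
    "beverage": ["lassi", "smoothie", "coffee", "tea"],
    "dessert": ["gulab jamun", "ice cream", "kheer"],
    "main course": ["biryani", "curry", "thali"],
}

# Matches "<item name> <price>", where the price is the trailing number on the line.
LINE_PATTERN = re.compile(r"^(?P<name>.+?)\s+(?P<price>\d+(?:\.\d+)?)$")


def parse_menu_line(line):
    """Return {"name", "price", "category"} for a menu line, or None if no price is found."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None  # e.g. a section heading such as "Starters"
    name = match.group("name")
    lowered = name.lower()
    category = "other"
    for label, keywords in CATEGORY_KEYWORDS.items():
        # Whole-word match so that "tea" does not fire on "steak".
        if any(re.search(r"\b" + re.escape(k) + r"\b", lowered) for k in keywords):
            category = label
            break
    return {"name": name, "price": float(match.group("price")), "category": category}


print(parse_menu_line("Chicken Biryani 250"))
print(parse_menu_line("Mango Lassi 80.50"))
```

Rules like these break quickly on real menus, with their varied layouts, languages, and spellings, which is why learned NLP models are used for this step in practice.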

Zomato's ML algorithms also analyze images of food dishes to identify and classify menu items visually. This process involves using Convolutional Neural Networks (CNNs) to analyze images and recognize patterns in the dishes, which are then matched to the corresponding menu items.

Zomato has used ML to its advantage by automating menu digitization. Automation has improved the accuracy and completeness of its menu data, making it easier for users to find what they want and order confidently. It has also reduced the time and cost associated with manual menu digitization, allowing Zomato to scale its operations more efficiently while keeping the platform user-friendly.

Personalized Homepage Restaurant Listings

Zomato's personalized homepage restaurant listings are created using a combination of collaborative filtering and content-based filtering techniques. Collaborative filtering involves analyzing user behavior, such as which restaurants they have previously visited or rated, to find patterns and similarities between users. Content-based filtering, on the other hand, involves analyzing the content of restaurant listings, such as the cuisine, price range, and location, to find restaurants that are similar to a user's preferences.

Combining these techniques allows Zomato to create a personalized restaurant listing for each user. For example, suppose a user has previously preferred South Indian cuisine. In that case, Zomato's algorithms may suggest South Indian restaurants in their area that have high ratings or are similar to other South Indian restaurants they have visited before.
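One common way to combine the two techniques is a weighted blend of a collaborative score and a content score. The sketch below shows that idea with invented users, restaurants, and features. The blending weight alpha and all data are assumptions for illustration, and Zomato's production system is certainly more elaborate.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical ratings: user -> {restaurant: rating out of 5}.
ratings = {
    "asha": {"Dosa Hut": 5, "Idli House": 4},
    "ravi": {"Dosa Hut": 5, "Idli House": 5, "Udupi Cafe": 5},
    "meera": {"Pizza Place": 5, "Burger Barn": 4},
}
# Hypothetical content features: [south_indian, italian, fast_food].
features = {
    "Dosa Hut": [1, 0, 0], "Idli House": [1, 0, 0], "Udupi Cafe": [1, 0, 0],
    "Pizza Place": [0, 1, 0], "Burger Barn": [0, 0, 1],
}
restaurants = sorted(features)


def rating_vector(user):
    return [ratings[user].get(r, 0) for r in restaurants]


def collaborative_score(user, candidate):
    """Similarity-weighted average rating from other users, scaled to 0..1."""
    num = den = 0.0
    for other in ratings:
        if other != user and candidate in ratings[other]:
            sim = cosine(rating_vector(user), rating_vector(other))
            num += sim * ratings[other][candidate]
            den += sim
    return num / den / 5 if den else 0.0


def content_score(user, candidate):
    """Similarity between the candidate and the average of restaurants the user rated."""
    rated = [features[r] for r in ratings[user]]
    profile = [sum(column) / len(rated) for column in zip(*rated)]
    return cosine(profile, features[candidate])


def recommend(user, alpha=0.5):
    """Rank restaurants the user has not rated by a blend of both scores."""
    unseen = [r for r in restaurants if r not in ratings[user]]
    scored = {
        r: alpha * collaborative_score(user, r) + (1 - alpha) * content_score(user, r)
        for r in unseen
    }
    return sorted(scored, key=scored.get, reverse=True)


print(recommend("asha"))
```

In this toy data, asha has only rated South Indian restaurants and resembles ravi, so the South Indian restaurant she has not tried yet ranks first on both signals.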

In addition to personalized restaurant listings, Zomato also uses ML algorithms to predict which restaurants will most likely be popular soon. By analyzing data such as reservation patterns, user ratings, and social media activity, Zomato's algorithms can identify up-and-coming restaurants likely to be popular in the coming weeks or months. This information is used to create a "Trending This Week" section on the homepage, which showcases the hottest new restaurants in each user's area. By tailoring its recommendations to each user's preferences, Zomato can create a more engaging and personalized experience that encourages users to discover new restaurants and order from the platform more frequently.

Predicting Food Preparation Time (FPT)

FPT is the time it takes for the food to be prepared and ready for delivery. It is a crucial factor that affects the overall customer experience of ordering food online. However, FPT is not as simple as it seems. It depends on numerous factors such as the quantity and type of dishes ordered, restaurant behavior, time of day, day of the week, footfall in the restaurant, and much more. Predicting FPT accurately is essential for restaurants to manage their kitchen operations efficiently and deliver food on time.

To achieve accurate and real-time FPT prediction, advanced Machine Learning techniques such as Bidirectional Long Short-Term Memory (LSTM) are used. A Bidirectional LSTM-based Deep Learning model takes into account all the relevant factors affecting FPT and provides an estimate of FPT for each order in real time.

What makes the Bidirectional LSTM model so effective in predicting FPT? It takes into account the sequential nature of the input features. For example, the type of dishes ordered and their preparation time can affect the FPT of subsequent orders. The Bidirectional LSTM model can capture these dependencies and provide more accurate predictions, which in turn helps restaurants manage kitchen operations more efficiently. This ensures that the food arrives on time, hot and fresh, making the online ordering experience a delightful one.
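The "bidirectional" part can be illustrated without a Deep Learning library. The sketch below is not an LSTM: it uses a plain recurrent update with a single hidden value and fixed, hand-picked weights, purely to show how a sequence is read in both directions and the two summaries are combined. A real Bidirectional LSTM adds gating, learns its weights from data, and feeds the combined summary into further layers that output the FPT. The per-dish scores are invented.

```python
from math import tanh


def run_direction(sequence, w_in, w_rec):
    """Simple recurrent pass: h_t = tanh(w_in * x_t + w_rec * h_(t-1)); returns the final h."""
    h = 0.0
    for x in sequence:
        h = tanh(w_in * x + w_rec * h)
    return h


def bidirectional_summary(sequence, fwd=(0.5, 0.8), bwd=(0.5, 0.8)):
    """Read the sequence left-to-right and right-to-left, then concatenate both summaries."""
    forward = run_direction(sequence, *fwd)
    backward = run_direction(list(reversed(sequence)), *bwd)
    return [forward, backward]


# Hypothetical per-dish complexity scores for one order.
order = [0.2, 0.9, 0.4]
print(bidirectional_summary(order))
```

Because each direction ends on a different element, the two values differ, and together they describe the sequence with context from both sides.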

Enhancing Navigation

Zomato delivery executives need accurate and reliable road data to navigate city streets efficiently. However, road data is not always complete or accurate, especially in areas that are poorly mapped. To overcome this challenge, Zomato uses a combination of ground truth data, vanilla OSM (unmodified OpenStreetMap data), and augmented geometry to enhance road detection.

Ground truth data is collected and verified data that is highly accurate. By using ground truth data as a reference, Machine Learning algorithms can be trained to detect roads more accurately. Vanilla OSM provides a base layer of road data that can be used to build upon. However, it may not be complete or accurate in some areas. To fill in the missing road data, augmented geometry can be used, which involves using computer vision techniques to identify road features from satellite imagery, LiDAR data, or other sources.

By combining these three approaches, Zomato can create a more accurate and up-to-date road map, ensuring its delivery executives can navigate the city streets more efficiently. This results in faster and more timely food deliveries, enhancing the overall customer experience.

Delivery Partner Grooming, Audit, and Compliance

Zomato's delivery fleet plays a critical role in providing an exceptional customer experience, and to ensure its smooth operation, Zomato has implemented various audit mechanisms. One such mechanism is the 'DP selfie audit', which checks delivery partner (DP) grooming compliance through an asset audit and a mask audit. The asset audit verifies that the delivery partner is wearing a Zomato t-shirt and carrying a Zomato bag, while the mask audit confirms that they are wearing a mask for safe and secure deliveries.

To streamline this process, an automated system schedules audits and approves or rejects DP grooming automatically. Audits can be triggered in the DP app during login or order deliveries, and the DP must submit a selfie within a short time frame, which increases the accuracy of the checks. These images are then passed to Zomato's DP service, which uses Deep Learning models, including Convolutional Neural Networks and image processing algorithms, to detect faces with and without masks. The asset audit follows the same process to check for the presence of Zomato t-shirts and bags.

Integrating automated systems with manual moderation has enabled Zomato to conduct more frequent audits at scale and in real time without the costs of fully manual audits. This gives delivery partners quick feedback and helps ensure compliance with grooming standards and safe delivery practices.
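One plausible way to combine automated checks with manual moderation is to let the model decide only when it is confident and to send everything else to a human reviewer. The sketch below assumes the model returns a probability that the selfie is compliant. The thresholds and names are hypothetical and not taken from Zomato.

```python
from dataclasses import dataclass

# Hypothetical confidence thresholds; in practice these would be tuned on audited data.
APPROVE_AT = 0.90
REJECT_AT = 0.10


@dataclass
class AuditResult:
    check: str      # e.g. "mask" or "asset"
    decision: str   # "approved", "rejected", or "manual_review"


def route_audit(check, compliant_probability):
    """Auto-decide confident cases; send uncertain ones to a human moderator."""
    if compliant_probability >= APPROVE_AT:
        decision = "approved"
    elif compliant_probability <= REJECT_AT:
        decision = "rejected"
    else:
        decision = "manual_review"
    return AuditResult(check, decision)


print(route_audit("mask", 0.97))
print(route_audit("asset", 0.55))
```

Widening the gap between the two thresholds sends more selfies to human reviewers, trading moderation cost for fewer automated mistakes.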

Zomato's use of Machine Learning touches nearly every step of food delivery. By automating menu digitization, creating personalized restaurant listings, predicting food preparation times, improving road data, and auditing delivery partner compliance, Zomato has provided its users with a more seamless and enjoyable experience. As Zomato continues to refine its Machine Learning models and deploy more Model Servers, we can expect to see even more intelligent and personalized food delivery experiences.

Interview Questions

1. Can you explain how NLP works and why it is useful in the context of restaurant recommendations?

Answer: Natural Language Processing (NLP) is a subfield of Artificial Intelligence that focuses on enabling computers to understand, interpret, and generate human language. NLP techniques are useful in the context of restaurant recommendations as they allow the system to analyze user-generated text data, such as customer reviews and comments, to extract relevant information and insights about restaurants.

For example, NLP can be used to identify the sentiment of a review, whether it is positive or negative, and extract specific features or aspects of a restaurant that a user liked or disliked. This information can then be used to generate personalized recommendations for users based on their preferences and past behavior.
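A very small version of this idea is a lexicon-based, aspect-level sentiment scorer. The word lists below are hypothetical and tiny, and real systems use trained language models rather than fixed lists, but the sketch shows how a review can be turned into per-aspect signals.

```python
import re

# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"delicious", "great", "fresh", "friendly", "quick"}
NEGATIVE = {"cold", "slow", "bland", "rude", "stale"}
ASPECTS = {
    "food": {"biryani", "food", "dosa"},
    "service": {"service", "staff", "delivery"},
}


def analyze_review(text):
    """Score each aspect as (positive words - negative words) in sentences that mention it."""
    result = {}
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for aspect, terms in ASPECTS.items():
            if words & terms:
                result[aspect] = result.get(aspect, 0) + score
    return result


print(analyze_review("The biryani was delicious and fresh. Delivery was slow."))
```

A recommender could then favor restaurants whose food aspect scores well for users who care most about taste, and so on for other aspects.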

2. How does the use of Convolutional Neural Networks (CNNs) improve the accuracy of image recognition in Zomato's restaurant listings?

Answer: CNNs are a type of Deep Learning algorithm that can automatically learn features from raw data such as images. In the context of Zomato's restaurant listings, CNNs can be used to analyze the photos uploaded by users and extract visual features that are relevant for recognizing dishes and ingredients.

For example, a CNN can learn to detect the shape, texture, and color of a pizza or a burger and use this information to classify the corresponding dishes. By training a CNN on a large dataset of labeled images, Zomato can improve the accuracy of its image recognition system and provide more relevant and personalized recommendations to users. Additionally, CNNs can be used to generate captions or descriptions of images, which can further enhance the user experience and help users find the exact dish they are looking for.
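The core operation inside a CNN layer is sliding a small kernel over the image and summing the element-wise products. The sketch below implements that operation in plain Python on a tiny invented image. The kernel here is written by hand to respond to vertical edges, whereas a trained CNN learns its kernel values from labeled images.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Tiny grayscale "image": dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written vertical-edge kernel; a trained CNN would learn such values.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only where the dark region meets the bright one. Stacking many learned kernels like this is how a CNN builds up from edges to textures to whole dishes.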