DropBox System Design

Most of us have used DropBox at some point without knowing how the system actually works. This blog covers the goals, the functional and non-functional requirements, and the system design of DropBox. So let's start learning...

Before starting, I would like to thank Shweta Singh and Keerthana for their support and the constant research they have been doing on this topic.

What is DropBox?

Dropbox is a cloud file storage service. It enables users to store files on remote storage servers and to share them across their devices. It does that by allowing users to create a special folder on each of their computers or mobile devices, which the service then synchronizes so that it appears to be the same folder regardless of which device is used to view it. Files placed in this folder are also typically accessible through a website and mobile apps, and can be easily shared with other users for viewing or collaboration.

This is the table of contents that will be followed throughout the blog:

1_31WrLaLAWR9chf4mVk5tuw.png

The above flowchart describes the entire architecture of DropBox, starting from the client side, passing through the cloud server, and ending with the several services that DropBox provides.

Conclusions:

  • The introduction of cloud-based services when it comes to storing large amounts of files has made our lives so much easier.
  • We all have used storage-based cloud services at one point in time, but without knowing the entire architecture behind it.
  • DropBox saw an increase in its revenue in the first quarter of 2020, with paying users ending at 15.48 million, compared with 14.31 million for the same period in the previous year.

A key reason for this increase in revenue is the application of ML algorithms. DropBox had a clear problem statement: it wanted a broader user base and, in turn, higher revenue. Clustering algorithms were used to segment users into groups, and benefits were offered to the users who were less likely to use DropBox.

Goals of a DropBox System:

Given below are the goals/objectives that are taken into consideration before designing the DropBox system:

  • Users should have the flexibility to upload and download files whenever they wish to.
  • Users should have the flexibility to share files/folders with anyone.
  • Synchronization between devices.
  • The system should allow the storage of large files.
  • File operations should follow an ACID approach.
  • The system should support offline editing. Users should be able to create/delete/edit files even when they are offline, and these changes should get synced as soon as the user comes back online.
  • The system should also provide security in the form of encryption of the stored files.
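The offline-editing goal can be illustrated with a small sketch. It assumes a hypothetical `OfflineSyncQueue` class and a caller-supplied `upload` function; neither name comes from DropBox itself. Edits made while offline are queued locally and replayed in order once the client is online again.

```python
from collections import deque


class OfflineSyncQueue:
    """Queues edits made offline and replays them once the client is online.

    Hypothetical illustration; `upload` is any callable(path, content)
    supplied by the caller, not a real DropBox function.
    """

    def __init__(self, upload):
        self._upload = upload
        self._pending = deque()
        self.online = False

    def record_edit(self, path, content):
        # Online: send the change straight away. Offline: keep it for later.
        if self.online:
            self._upload(path, content)
        else:
            self._pending.append((path, content))

    def go_online(self):
        # Replay queued edits in the order they were made. An edit is only
        # removed from the queue after its upload succeeded.
        self.online = True
        while self._pending:
            path, content = self._pending[0]
            self._upload(path, content)
            self._pending.popleft()

    def pending_count(self):
        return len(self._pending)
```

A real client would also persist the queue to disk so that pending edits survive a restart; that part is left out here to keep the sketch short.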

Functional Requirements:

Functional requirements define the basic behavior of the system: what the system does, or focuses on doing. In the simplest of terms, these requirements must be met for the system to function properly.

  • Users can be divided into two tiers: free users and premium users. Free users are provided with limited capabilities, limited network bandwidth, and limited storage. Premium is a paid service with a larger storage size than that of free users.
  • A user can create a root folder. The benefit of the cloud service is that changes made to the files, once synchronized with the cloud, also get reflected on all of the user's devices.
  • The maximum supported file size is 2 GB.
  • A user should be able to share files or folders with other users. Once a file or folder is shared with other users, any updates to the file or the folder are synchronized automatically to all other users' devices. Sharing of a folder enables sharing of all files and subfolders under that folder.
  • One of the major advantages of DropBox is that the user can also work in offline mode. The changes get synchronized to the cloud whenever the user switches back to online mode.
  • File versioning and easy rollback: DropBox keeps multiple versions of a file or folder that the user is working on, so that if the user runs into a serious problem with the present version, they can easily roll back to a previous version.
  • File search is another major requirement. If a user has thousands of files stored in DropBox, they should be able to search for the file they are looking for instead of browsing for it manually, which would be time-consuming.
  • The system should provide a mechanism for resolving conflicts when the same file is updated by a single user on several devices or by multiple users.
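The versioning, rollback, and conflict requirements above can be sketched together. This is a minimal in-memory illustration, not DropBox's actual mechanism: the `VersionedFile` class and its `base_version` check are assumptions made for the example. An update is accepted only if it was made on top of the latest version; otherwise it is reported as a conflict.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedFile:
    """Keeps every saved version of a file so older ones can be restored.

    Hypothetical illustration of versioning and conflict detection.
    """

    versions: list = field(default_factory=list)

    @property
    def current_version(self) -> int:
        # Versions are numbered from 1; 0 means nothing has been saved yet.
        return len(self.versions)

    def save(self, content: bytes, base_version: int) -> bool:
        """Save `content` if it was edited on top of the latest version.

        Returns False (a conflict) when another update landed first.
        """
        if base_version != self.current_version:
            return False
        self.versions.append(content)
        return True

    def rollback(self, version: int) -> None:
        """Restore an older version by saving it again as the newest one."""
        if not 1 <= version <= self.current_version:
            raise ValueError(f"unknown version {version}")
        self.versions.append(self.versions[version - 1])

    def read(self) -> bytes:
        if not self.versions:
            raise ValueError("file has no saved versions")
        return self.versions[-1]
```

Rolling back by appending a copy, rather than deleting newer versions, keeps the full history intact, so a rollback can itself be undone.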

Non-Functional Requirements:

As we learned in the previous section, functional requirements describe what the system must do, whereas non-functional requirements specify how the system goes about meeting them, for example how available, scalable, and efficient it is. They don’t affect the basic functionality of the system: even if they are not met, the system will still perform its basic operations.

  • DropBox should be highly available and fault-tolerant.
  • Service needs to be highly scalable with increasing load.
  • Network bandwidth consumption and file transfer latency during file synchronization should be minimized. (This is one of the most important non-functional requirements, as it is the reason files get divided into smaller segments, also known as ‘chunks’. Chunking enables the client to upload/download only the modified chunks of a file. If a transfer fails, only that particular chunk needs to be retried, instead of the entire file.)
  • File operations should provide ACID guarantees.
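The chunking idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: fixed-size chunks, SHA-256 hashes to compare them, and a tiny `CHUNK_SIZE` chosen only so the example is easy to follow. The function names are hypothetical and the real DropBox chunking scheme may differ.

```python
import hashlib

# Tiny chunk size for demonstration only; a real system would use
# chunks measured in megabytes.
CHUNK_SIZE = 4


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Cut the file content into consecutive fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return one SHA-256 hex digest per chunk."""
    return [hashlib.sha256(chunk).hexdigest()
            for chunk in split_into_chunks(data, chunk_size)]


def changed_chunks(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return the indexes of chunks in `new` that must be uploaded.

    A chunk needs uploading if it is new (beyond the end of the old file)
    or if its hash differs from the old chunk at the same index.
    Chunks removed from the end of the file are not reported here.
    """
    old_hashes = chunk_hashes(old, chunk_size)
    new_hashes = chunk_hashes(new, chunk_size)
    return [index for index, digest in enumerate(new_hashes)
            if index >= len(old_hashes) or old_hashes[index] != digest]
```

With fixed-size chunks, an edit that overwrites bytes in place touches only the chunks it falls into, while an insertion shifts every later chunk. That trade-off is one reason this sketch should be read as an illustration rather than a production design.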

What is the ACID system in DropBox Design?

Whenever we deal with the system design of a cloud storage platform, the concept of ACID is extremely important. ACID stands for four properties: Atomicity, Consistency, Isolation, and Durability. Let’s get an idea of these four properties and what they represent.

  • Atomicity: In a cloud storage platform like DropBox, when a file is changed from one version to another, that change is replicated to all the other devices as well. Suppose the user tries to access the updated file from a different device. Files are updated in the form of ‘chunks’, so without any safeguard the user could see the file in a transient, partially updated state while the chunks arrive. The user should never see the changes appearing chunk by chunk. To prevent this, the DropBox application on the user's device first writes all the chunks to a temporary file in a temporary location, where all the changes are applied. Only after all the changes have been applied is the temporary file linked in place of the actual file.

  • Consistency: This property means users should not be able to read files that don’t make sense together. If a user has updated a file on one device, a second user working on a different device should not see a partial change in that file. Operations on a partially updated file may fail or crash.

  • Isolation: In terms of the DropBox service, this means that operations on two different files should be totally independent of one another. Changing or modifying one file should not modify a different file. Secondly, if the same file gets updated on multiple devices at the same time, one write will have to wait for the other write to complete first.

  • Durability: Changes made to a file will not get lost. If a file has been created and then modified based on the choices of the user, and that file has been updated to the remote server, these changes should not be lost. This is one of the best advantages of cloud storage services, where changes are preserved.
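The temporary-file technique described under Atomicity can be sketched in a few lines. This is a generic illustration, assuming the temporary file lives in the same directory as the target so that the final rename stays on one filesystem; the function name `apply_chunks_atomically` is hypothetical and this is not DropBox's client code.

```python
import os
import tempfile


def apply_chunks_atomically(target_path: str, chunks: list) -> None:
    """Write all chunks to a temporary file, then swap it in for the target.

    Readers of `target_path` see either the complete old content or the
    complete new content, never a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temporary file next to the target so the rename below
    # does not have to cross filesystems.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as temp_file:
            for chunk in chunks:
                temp_file.write(chunk)
            # Push the data to disk before the swap (durability).
            temp_file.flush()
            os.fsync(temp_file.fileno())
        # os.replace swaps the files in a single step.
        os.replace(temp_path, target_path)
    except BaseException:
        # On any failure, remove the temporary file and leave the
        # original file untouched.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

If writing fails halfway through, the original file is left exactly as it was, which is the all-or-nothing behavior that atomicity asks for.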

In the next section we will be looking at some of the basic design APIs that DropBox uses:

0_jYShK8d7BEgDhztm.jpg

DropBox Design APIs:

API stands for Application Programming Interface. It acts as a medium through which two or more applications connect with each other: a messenger that delivers a user's request to the application holding the requested information, and delivers the response back to the user. The different design APIs for DropBox are as follows:

  • File Upload (user token, File Metadata, File content)
  • Update file metadata (user token, file id, file metadata)
  • Delete file (user token, file id)
  • Share file (user token, file id, second user-id, permissions)

Overall Architecture of DropBox:

Given below is the entire architecture of DropBox, where the box on the left represents the operations on the client side, and the boxes on the right represent all the services that DropBox provides.

1_Az--TfTDrOSfdeAUUxwnqg.png

Subhadip Ghosh
Data Scientist at Zip Co


© 2022 AlmaBetter