MongoDB Schema Design, Indexes & More Best Practices

Claudio Ctin | 2 months ago | 13 mins

Schema Design Approaches – Relational vs. MongoDB:
Relational Databases: Best for applications requiring complex queries, transactions, and data integrity. Ideal for structured data and systems where schema stability is important (e.g., financial applications, ERP systems).
MongoDB: Best for applications requiring scalability, flexibility, and rapid development. Ideal for handling large volumes of unstructured or semi-structured data, real-time analytics, and applications where data structures evolve frequently (e.g., content management systems, IoT applications).

Embedding vs. Referencing:

Embedding
Example:
Imagine a blog application where each post has comments.

{
  "_id": 1,
  "title": "First Post",
  "content": "This is the content of the first post",
  "comments": [
    {
      "user": "Alice",
      "comment": "Great post!",
      "date": "2024-07-16"
    },
    {
      "user": "Bob",
      "comment": "Thanks for sharing!",
      "date": "2024-07-17"
    }
  ]
}

Advantages:

Performance: Faster read operations since related data is stored together.
Atomicity: Updates to a document are atomic, ensuring consistency.
Simplicity: Easy to manage and query related data within a single document.
Limitations:

Document Size: Limited to 16 MB, so embedding large amounts of related data can exceed this limit.
Data Redundancy: Duplicating data in multiple places can lead to inconsistencies and increased storage usage.
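To put the 16 MB limit in perspective, here is a rough sketch in plain JavaScript (runnable under Node.js, no database needed). The helpers approxSizeBytes and maxEmbeddedComments are hypothetical names, and the JSON byte length is only an approximation of the real BSON size, so treat the result as an order of magnitude rather than an exact figure.

```javascript
// MongoDB's maximum BSON document size is 16 mebibytes.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// Rough size estimate: UTF-8 length of the JSON text.
// Assumption: real BSON sizes differ, so this is only a guide.
function approxSizeBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Estimate how many comments shaped like `sampleComment` fit into `post`
// before the whole document would exceed the limit.
function maxEmbeddedComments(post, sampleComment) {
  const base = approxSizeBytes({ ...post, comments: [] });
  const perComment = approxSizeBytes(sampleComment) + 1; // +1 for the separating comma
  return Math.floor((MAX_DOC_BYTES - base) / perComment);
}

const post = { _id: 1, title: "First Post", content: "This is the content of the first post" };
const comment = { user: "Alice", comment: "Great post!", date: "2024-07-16" };

console.log("approximate bytes per comment:", approxSizeBytes(comment));
console.log("approximate comments that fit:", maxEmbeddedComments(post, comment));
```

Small comments leave a lot of headroom, but the estimate shrinks quickly as each embedded item grows, and a document well below the hard limit can already be costly to read and rewrite in full.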

Referencing
Example:
Using the same blog application, comments are stored in a separate collection.

Posts Collection:

{
  "_id": 1,
  "title": "First Post",
  "content": "This is the content of the first post",
  "comments": [
    101,
    102
  ]
}

Comments Collection:

{
  "_id": 101,
  "user": "Alice",
  "comment": "Great post!",
  "date": "2024-07-16",
  "postId": 1
}
{
  "_id": 102,
  "user": "Bob",
  "comment": "Thanks for sharing!",
  "date": "2024-07-17",
  "postId": 1
}
Advantages:

Flexibility: Each comment is a separate document, so a growing discussion does not push the post toward the 16 MB limit as quickly.
Data Normalization: Reduces data redundancy and potential inconsistencies.
Scalability: Easier to manage large volumes of related data.

Limitations:

Complexity: Fetching related data takes more complex queries, using either a $lookup stage or several round trips from the application.
Performance: Slower read operations since data is spread across multiple documents.
Atomicity: Updates that span multiple documents are not atomic unless they are wrapped in a multi-document transaction, which can otherwise lead to inconsistencies.
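The extra work that referencing implies can be sketched as follows. The lookupPipeline constant shows the shape of a $lookup stage for the posts and comments collections above, while attachComments is a hypothetical in-memory stand-in for that join so the example runs under plain Node.js. The third comment (Carol, postId 2) is invented here purely to show that unrelated documents are filtered out.

```javascript
// Sample data mirroring the posts and comments collections above.
const posts = [{ _id: 1, title: "First Post" }];
const comments = [
  { _id: 101, user: "Alice", comment: "Great post!", postId: 1 },
  { _id: 102, user: "Bob", comment: "Thanks for sharing!", postId: 1 },
  { _id: 103, user: "Carol", comment: "Belongs to another post", postId: 2 }, // hypothetical
];

// What the database would be asked to do, e.g. db.posts.aggregate(lookupPipeline):
// join comments.postId against posts._id and store the matches in "comments".
const lookupPipeline = [
  { $match: { _id: 1 } },
  { $lookup: { from: "comments", localField: "_id", foreignField: "postId", as: "comments" } },
];

// In-memory stand-in for that join (not how the server implements it).
function attachComments(postList, commentList) {
  return postList.map((post) => ({
    ...post,
    comments: commentList.filter((c) => c.postId === post._id),
  }));
}

const joined = attachComments(posts, comments);
console.log(joined[0].comments.map((c) => c.user));
```

With embedding, the same read is a single document fetch; with referencing, either the server performs the join or the application issues a second query.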

Recap:

One-to-One – Prefer key value pairs within the document
One-to-Few – Prefer embedding
One-to-Many – Prefer embedding while the many side stays bounded; reference otherwise (see Rule 4)
One-to-Squillions – Prefer Referencing
Many-to-Many – Prefer Referencing

General Rules for MongoDB Schema Design:

Rule 1:
Favor embedding unless there is a compelling reason not to.

Rule 2:
Needing to access an object on its own is a compelling reason not to embed it.

Rule 3:
Avoid joins and lookups if possible, but don’t be afraid if they can provide a better schema design.

Rule 4:
Arrays should not grow without bound. If there are more than a couple of hundred documents on the many side, don’t embed them; if there are more than a few thousand documents on the many side, don’t use an array of ObjectID references. High-cardinality arrays are a compelling reason not to embed.

Rule 5:
As always, with MongoDB, how you model your data depends entirely on your particular application’s data access patterns. You want to structure your data to match the ways that your application queries and updates it.
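As a rough illustration, the rules above can be folded into a small decision helper. This is only a sketch: suggestModel and both thresholds are assumptions that interpret "a couple of hundred" and "a few thousand" from Rule 4, and Rule 5 still applies, so real access patterns should override whatever it returns.

```javascript
// Assumed thresholds, loosely based on Rule 4; tune them for your workload.
const EMBED_LIMIT = 200; // "a couple of hundred"
const REF_ARRAY_LIMIT = 3000; // "a few thousand"

// Suggest how to model the "many" side of a relationship.
// manyCount: expected number of related items per parent document.
// accessedAlone: true if the related items are queried on their own (Rule 2).
function suggestModel(manyCount, accessedAlone) {
  if (manyCount > REF_ARRAY_LIMIT) {
    return "parent-reference"; // each child stores the parent's _id
  }
  if (manyCount > EMBED_LIMIT || accessedAlone) {
    return "child-reference-array"; // parent stores an array of child _ids
  }
  return "embed"; // Rule 1: the default choice
}

console.log(suggestModel(3, false)); // a handful of addresses
console.log(suggestModel(500, false)); // parts of a product
console.log(suggestModel(1000000, false)); // log lines for a host
```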

Relationships

One-to-One:

Let’s take a look at our User document. This example has some great one-to-one data in it. For example, in our system, one user can only have one name. So, this would be an example of a one-to-one relationship. We can model all one-to-one data as key-value pairs in our database.
{
  "_id": "ObjectId('AAA')",
  "name": "Joe Karlsson",
  "company": "MongoDB",
  "twitter": "@JoeKarlsson1",
  "twitch": "joe_karlsson",
  "tiktok": "joekarlsson",
  "website": "joekarlsson.com"
}

One-to-Few:

Okay, now let’s say that we are dealing with a small set of data that’s associated with our users. For example, we might need to store several addresses associated with a given user. It’s unlikely that a user of our application would have more than a couple of different addresses. We would define a relationship like this as one-to-few.
{
  "_id": "ObjectId('AAA')",
  "name": "Joe Karlsson",
  "company": "MongoDB",
  "twitter": "@JoeKarlsson1",
  "twitch": "joe_karlsson",
  "tiktok": "joekarlsson",
  "website": "joekarlsson.com",
  "addresses": [
    { "street": "123 Sesame St", "city": "Anytown", "cc": "USA" },
    { "street": "123 Avenue Q", "city": "New York", "cc": "USA" }
  ]
}

One-to-Many:

Alright, let’s say that you are building a product page for an e-commerce website, and you are going to have to design a schema that will be able to show product information. In our system, we save information about all the many parts that make up each product for repair services. How would you design a schema to save all this data, but still make your product page performant? You might want to consider a one-to-many schema since your one product is made up of many parts.

Product
{
  "name": "left-handed smoke shifter",
  "manufacturer": "Acme Corp",
  "catalog_number": "1234",
  "parts": ["ObjectID('AAAA')", "ObjectID('BBBB')", "ObjectID('CCCC')"]
}

Parts
{
  "_id": "ObjectID('AAAA')",
  "partno": "123-aff-456",
  "name": "#4 grommet",
  "qty": 94,
  "cost": 0.94,
  "price": 3.99
}

One-to-Squillions:

What if we have a schema where there could be potentially millions of subdocuments, or more? That’s when we get to the one-to-squillions schema. And, I know what you’re thinking: Is squillions a real word?
And the answer is yes, it is a real word.
Let’s imagine that you have been asked to create a server logging application. Each server could potentially save a massive amount of data, depending on how verbose your logging is and how long you keep the server logs. Instead of the host holding an array of log entries, each log message stores a reference to its host.

hosts
{
  "_id": ObjectID("AAAB"),
  "name": "goofy.example.com",
  "ipaddr": "127.66.66.66"
}

Log messages
{
  "time": ISODate("2014-03-28T09:42:41.382Z"),
  "message": "cpu is on fire!",
  "host": ObjectID("AAAB")
}
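Because each log message carries its host reference, reading the latest entries for one host means filtering on that reference, sorting by time and limiting the result. The sketch below mimics that access pattern in memory so it runs under plain Node.js. recentLogs is a hypothetical helper, the ids are simplified to strings, and the second and third log lines are invented sample data.

```javascript
// Sample data; ids are plain strings here instead of ObjectIDs.
const logs = [
  { time: new Date("2014-03-28T09:42:41.382Z"), message: "cpu is on fire!", host: "AAAB" },
  { time: new Date("2014-03-28T09:43:10.000Z"), message: "fan speed increased", host: "AAAB" }, // hypothetical
  { time: new Date("2014-03-28T09:44:00.000Z"), message: "disk almost full", host: "AAAC" }, // hypothetical
];

// Roughly what a find({ host: hostId }).sort({ time: -1 }).limit(limit)
// query would return, computed over an in-memory array.
function recentLogs(allLogs, hostId, limit) {
  return allLogs
    .filter((log) => log.host === hostId) // keep only this host's messages
    .sort((a, b) => b.time - a.time) // newest first
    .slice(0, limit); // cap the result size
}

console.log(recentLogs(logs, "AAAB", 1).map((log) => log.message));
```

The host document never grows, no matter how many messages are written, which is the point of referencing from the many side.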

Many-to-Many:

The last schema design pattern we are going to be covering in this post is the many-to-many relationship. This is another very common schema pattern that we see all the time in relational and MongoDB schema designs. For this pattern, let’s imagine that we are building a to-do application. In our app, a user may have many tasks and a task may have many users assigned to it.
In order to preserve these relationships between users and tasks, there will need to be references from the one user to the many tasks and references from the one task to the many users. Let’s look at how this could work for a to-do list application.

users:
{
  "_id": ObjectID("AAF1"),
  "name": "Kate Monster",
  "tasks": [ObjectID("ADF9"), ObjectID("AE02"), ObjectID("AE73")]
}

tasks:
{
  "_id": ObjectID("ADF9"),
  "description": "Write blog post about MongoDB schema design",
  "due_date": ISODate("2014-04-01"),
  "owners": [ObjectID("AAF1"), ObjectID("BB3G")]
}
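Two-way referencing means every assignment touches two documents. A minimal in-memory sketch of keeping both sides in step is shown below. The assign and unassign helpers are hypothetical, the ids are simplified to strings, and in MongoDB each side would be a separate update (for example with $addToSet), so the pair is only atomic inside a transaction.

```javascript
const user = { _id: "AAF1", name: "Kate Monster", tasks: ["ADF9"] };
const task = { _id: "AE02", description: "Review schema", owners: [] }; // hypothetical task

// Add the reference on both sides, skipping ids that are already present.
function assign(userDoc, taskDoc) {
  if (!userDoc.tasks.includes(taskDoc._id)) userDoc.tasks.push(taskDoc._id);
  if (!taskDoc.owners.includes(userDoc._id)) taskDoc.owners.push(userDoc._id);
}

// Remove the reference from both sides.
function unassign(userDoc, taskDoc) {
  userDoc.tasks = userDoc.tasks.filter((id) => id !== taskDoc._id);
  taskDoc.owners = taskDoc.owners.filter((id) => id !== userDoc._id);
}

assign(user, task);
assign(user, task); // repeating the call does not create duplicates
console.log(user.tasks, task.owners);
```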

Optimizing query patterns:

Optimizing your query patterns is crucial for reducing execution time and resource usage:

Projection:
Use projection to limit the fields returned by your queries, minimizing data transfer and processing load. The _id field is returned by default, so exclude it with _id: 0 (false) when the application does not need it, for example when it is only the value auto-generated by MongoDB.
db.collection.find({ field: value }, { field1: 1, field2: 1, _id: 0 })
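The effect of an inclusion projection can be imitated in a few lines of plain JavaScript. The project helper below is a hypothetical, simplified stand-in that only handles top-level fields with 1 and 0 flags; it is meant to show why _id appears in results unless it is switched off explicitly.

```javascript
// Simplified inclusion projection for top-level fields:
// listed fields are kept, and _id is kept unless explicitly set to 0 or false.
function project(doc, projection) {
  const out = {};
  const keepId = projection._id !== 0 && projection._id !== false;
  if (keepId && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const userDoc = { _id: "AAA", name: "Joe Karlsson", company: "MongoDB", website: "joekarlsson.com" };

console.log(project(userDoc, { name: 1 })); // _id is still included
console.log(project(userDoc, { name: 1, _id: 0 })); // only the name remains
```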

Aggregation framework:
Leverage MongoDB’s aggregation framework for complex data processing. Place $match stages on indexed fields as early as possible in the pipeline so that they can use an index.
db.collection.aggregate([ { $match: { field: value } }, { $group: { _id: "$field", total: { $sum: "$amount" } } } ])
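To make the $match and $group stages concrete, the sketch below writes a pipeline of the same shape as data and then computes the same totals over an in-memory array. The orders sample, the field names and the matchThenSum helper are all hypothetical; the helper only imitates this one pipeline shape, not the aggregation framework in general.

```javascript
// Hypothetical sample documents.
const orders = [
  { customer: "alice", status: "paid", amount: 30 },
  { customer: "bob", status: "paid", amount: 20 },
  { customer: "alice", status: "paid", amount: 12 },
  { customer: "alice", status: "open", amount: 99 },
];

// The pipeline as it could be passed to aggregate(): filter first, then group.
const totalsPipeline = [
  { $match: { status: "paid" } },
  { $group: { _id: "$customer", total: { $sum: "$amount" } } },
];

// In-memory imitation of that $match + $group/$sum combination.
function matchThenSum(docs, matchField, matchValue, groupField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    if (doc[matchField] !== matchValue) continue; // $match
    const key = doc[groupField]; // $group _id
    totals.set(key, (totals.get(key) || 0) + doc[sumField]); // $sum
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
}

console.log(matchThenSum(orders, "status", "paid", "customer", "amount"));
```

Filtering before grouping keeps the open order out of the totals and reduces the number of documents the later stage has to process.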

Avoid $where:
The $where operator can be slow and resource-intensive because it evaluates JavaScript and cannot take advantage of indexes. Use it sparingly and only when necessary. Where possible, prefer $expr with aggregation operators that do not run JavaScript (that is, anything other than $function and $accumulator), which is faster. If you must write a custom expression, $function is preferred over $where.
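As an illustration, here is the same condition ("spent is greater than budget") written both ways as filter documents. The field names spent and budget are assumptions chosen for the example, and matchesGt is a hypothetical in-memory stand-in that evaluates only the $gt form, so the snippet runs under plain Node.js.

```javascript
// $where form: a JavaScript expression the server has to evaluate.
const whereFilter = { $where: "this.spent > this.budget" };

// $expr form: the same comparison using aggregation operators only.
const exprFilter = { $expr: { $gt: ["$spent", "$budget"] } };

// Minimal evaluator for a $gt operand pair: "$name" operands are read
// from the document, anything else is treated as a literal value.
function matchesGt(doc, [left, right]) {
  const resolve = (operand) =>
    typeof operand === "string" && operand.startsWith("$") ? doc[operand.slice(1)] : operand;
  return resolve(left) > resolve(right);
}

const budgets = [
  { category: "food", budget: 400, spent: 450 },
  { category: "drinks", budget: 100, spent: 150 },
  { category: "clothes", budget: 100, spent: 50 },
];

// Either filter could be passed to find(); here the $expr form is evaluated in memory.
console.log(budgets.filter((doc) => matchesGt(doc, exprFilter.$expr.$gt)).map((doc) => doc.category));
```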

Summary:

As you can see, there are many ways to express your schema design beyond the normalization you may be used to from SQL. By embedding data within a document, or by referencing other documents and joining them with the $lookup aggregation stage, you can build powerful, scalable, and efficient queries tailored to your application. This post only scratches the surface of the ways you can model your data in MongoDB.
