📕
Dan Fitz's Notes
  • README
  • Ai
    • Supervised Machine Learning
      • Introduction To Machine Learning
      • Regression With Multiple Input Variables
      • Classification
  • Csharp
    • C Sharp Advanced
      • Generics
      • Delegates
      • Lambda Expressions
      • Events
    • C Sharp Fundamentals
      • Intro To C
      • Primitive Types And Expressions
      • Non Primitive Types
      • Control Flow
      • Arrays And Lists
      • Working With Dates
      • Working With Text
      • Working With Files
      • Debugging Applications
    • C Sharp Intermediate
      • Classes
      • Association Between Classes
      • Inheritance
      • Polymorphism
      • Interfaces
  • Java
    • Inheritance Data Structures Java
      • Inheritance Polymorphism Using Overriding And Access Modifiers
      • Abstract Classes And Debugging
      • File I O And Exceptions
      • Collections Maps And Regular Expressions
    • Intro To Java
      • Introduction To Java Classes And Eclipse
      • Unit Testing Arrays And Array Lists
      • Static Variables Methods And Polymorphism Using Overloading
  • Javascript
    • Algorithms Data Structures
      • Big O Notation
      • Analyzing Performance Of Arrays And Objects
      • Problem Solving Approach
      • Problem Solving Patterns
      • Recursion
      • Searching Algorithms
      • Bubble Selection And Insertion Sort
      • Merge Sort
      • Quick Sort
      • Radix Sort
      • Data Structures Introduction
      • Singly Linked Lists
      • Doubly Linked Lists
      • Stacks And Queues
      • Binary Search Trees
      • Tree Traversal
      • Binary Heaps
    • Complete Nodejs
      • Understanding Node.js
      • REST AP Is And Mongoose
      • API Authentication And Security
      • Node.js Module System
      • File System And Command Line Args
      • Debugging Node.js
      • Asynchronous Node.js
      • Web Servers
      • Accessing API From Browser
      • Application Deployment
      • Mongo DB And Promises
    • Complete React Native
      • Working With Content
      • Building Lists
      • Navigating Users Between Screens
      • State Management
      • Handling Screen Layout
      • Setting Up An App
      • More On Navigation
      • Advanced Statement Management With Context
      • Building A Custom Express API
      • In App Authentication
    • Epic React
      • React Fundamentals
      • React Hooks
      • Advanced React Hooks
      • Advanced React Patterns
      • React Performance
    • Fireship Firestore
      • Firestore Queries And Data Modeling Course
      • Model Relational Data In Firestore No SQL
    • Functional Light Javascript
      • Intro
      • Function Purity
      • Argument Adapters
      • Point Free
      • Closure
      • Composition
      • Immutability
      • Recursion
      • List Operations
      • Transduction
      • Data Structure Operations
      • Async
    • Js Weird Parts
      • Execution Contexts And Lexical Environments
      • Types And Operators
      • Objects And Functions
      • Object Oriented Java Script And Prototypal Inheritance
      • Defining Objects
    • Mastering Chrome Dev Tools
      • Introduction
      • Editing
      • Debugging
      • Networking
      • Auditing
      • Node.js Profiling
      • Performance Monitoring
      • Image Performance
      • Memory
    • React Complete Guide
      • What Is React
      • React Basics
      • Rendering Lists And Conditionals
      • Styling React Components
      • Debugging React Apps
      • Component Deep Dive
      • Building A React App
      • Reaching Out To The Web
      • Routing
    • React Testing
      • Intro To Jest Enzyme And TDD
      • Basic Testing
      • Redux Testing
      • Redux Thunk Testing
    • Serverless Bootcamp
      • Introduction
      • Auction Service Setup
      • Auction Service CRUD Operations
      • Auction Service Processing Auctions
    • Testing Javascript
      • Fundamentals Of Testing
      • Static Analysis Testing
      • Mocking Fundamentals
      • Configuring Jest
      • Test React Components With Jest And React Testing Library
    • Typescript Developers Guide
      • Getting Started With Type Script
      • What Is A Type System
      • Type Annotations In Action
      • Annotations With Functions And Objects
      • Mastering Typed Arrays
      • Tuples In Type Script
      • The All Important Interface
      • Building Functionality With Classes
    • Web Performance With Webpack
      • Intro
      • Code Splitting
      • Module Methods Magic Comments
  • Other
    • Algo Expert
      • Defining Data Structures And Complexity Analysis
      • Memory
      • Big O Notation
      • Logarithm
      • Arrays
      • Linked Lists
      • Hash Tables
      • Stacks And Queues
      • Strings
      • Graphs
      • Trees
    • Aws Solutions Architect
      • AWS Fundamentals IAM EC 2
    • Fundamentals Math
      • Numbers And Negative Numbers
      • Factors And Multiples
      • Fractions
    • Mysql Bootcamp
      • Overview And Installation
      • Creating Databases And Tables
      • Inserting Data
      • CRUD Commands
      • The World Of String Functions
      • Refining Our Selections
      • The Magic Of Aggregate Functions
    • Random Notes
      • Understanding React Hooks
  • Python
    • Data Analysis Using Python
      • Loading Querying And Filtering Data Using The Csv Module
      • Loading Querying Joining And Filtering Data Using Pandas
      • Summarizing And Visualizing Data
    • Intro To Python
      • Course Introduction Intro To Programming And The Python Language Variables Conditionals Jupyter Notebook And IDLE
      • Intro To Lists Loops And Functions
      • More With Lists Strings Tuples Sets And Py Charm
      • Dictionaries And Files
Powered by GitBook
On this page
  • Core Concepts
  • Relational Data Modeling
  • Advanced Techniques for Efficiency
  • Aggregation via cloud functions
  1. Javascript
  2. Fireship Firestore

Firestore Queries And Data Modeling Course

Core Concepts

What is Firestore?

Firebase realtime database was the first database management system created by Firebase.

The pain points to it were:

  • You couldn't select a subset of data in a node; pulling data from a node pulled all the data.

  • Querying and sorting of data is limited; required lots of client-side code to handle queries and sorting and filtering.

Cloud Firestore was created to address these issues. It's a document-oriented database (like MongoDB).

Collections and documents

All data is stored in documents. Collections are just containers for documents.

Note: You can nest sub-collections underneath documents. And when you read a parent document, you actually don't pull any data from its sub-collections.

Firestore is denormalized

Relational databases are normalized: they break down data into the smallest parts possible, so there's no data duplication.

Mapping of terms between SQL and Firestore:

Table => Collection
Row => Document
Primary Key => Document ID
Foreign Key => Document ID references

Note: There are no JOINs in Firestore. You can "join" data client-side, but this doesn't happen directly with the database. This is by design: JOINs aren't efficient.

Firestore is schemaless

You can add data to any document on the fly, and it doesn't have to match the data types in other documents in the collection.

The fundamental idea with Firestore is to model your data in a way that doesn't require a JOIN. You are prerendering data for read efficiency.

Mental Framework for Modeling Data

Aim to model your data based on how it's going to be consumed by the app.

Think about the screens/views in your app, then design your data model that's the path of least resistance to get that view.

In contrast: SQL is designed via preset rules.

Note: Given how open-ended Firestore can be, there is usually more than one way to model your data. You have to consider the trade-offs of each.

Data Types

Firestore has the typical primitive data types: strings, number, booleans, null.

Other data types include:

  • Geopoints (latitude and longitude)

  • References to other documents

  • Map (like an object)

  • Array (ordered data)

Techniques for Data Modeling

  1. Embedding: adding data directly into a document.

  • This is the first technique you should consider because it's the most cost-effective and performant.

  • You only need 1 document read!

  • Note: It's only practical if the data is small because documents are limited to 1MB in size. Plus queries to the document will download the entire document.

  1. Root collection: creating a collection at root and creating documents in there.

  • This is normalizing your data.

  • Great for maintaining relationships across documents.

  • Great for querying across multiple properties.

  1. Subcollection: scopes data to individual document.

  • Great for when you need to set up a one-to-many relationship.

  • The only limitation is that queries across parents don't include subcollections (e.g. filtering a collection).

  • Note: When you query a document, you have to explicitly request its subcollections!

  1. Bucketing: a collection of documents that embed relational data.

  • Great for decreasing number of queries to get data.

Reading Data

Rule of thumb: If your query code is simple, you probably have a good data model. If it's complicated, you probably need to rethink the data model.

// Document read
const userPromise = db.collection('users').doc('userId')

// Subcollection read
const subPromise = db.collection('users').doc('userId').collection('tags')

// Bucket read
const userPromise = db.collection('users').doc('userId')
const tagPromise = db.collection('tags').doc('userId')

// ** Multi-document read **
// Given an array of doc IDs, read them all at a given collection
const user = await db.collection('users').doc('userId')
const tagIds = user.data().tags // grabs array of tag IDs

const readIds = async (collection, ids) => {
  const promises = ids.map(id => db.collection(collection).doc(id).get()) // returns array of promises
  const responses = await Promise.all(promises) // returns array of responses
  const data = responses.map(response => response.data()) // returns array of data
  return data
}

const tagData = readIds('tags', tagIds) // MULTI-DOCUMENT READ!

Note: Under the hood, the Firebase SDK performs pipelining, which combines requests and runs them concurrently.

Massaging Data

Some useful methods for massaging data server-side include:

  • where('key', '>=', today)

    • You'll use this the most!

    • Note: Range operators like >= can only be used once, while equality operators can be used infinitely.

    • Note 2: There is no inequality operator, so you have to combine an above and below query

// Excludes all users who are age 25
const above = db.collection('users').where('age', '>', 25)
const below = db.collection('users').where('age', '<', 25)
  • orderBy('key', 'desc')

    • Note: If some documents don't contain your chosen key, they'll be filtered out.

  • limit(20)

  • startAfter and endBefore (exclusive) plus startAt and endAt (inclusive)

    • Great for pagination

Pro tip:

  • You can chain these methods. These chains work like the AND operator.

  • However, there is no OR operator. To mimic OR, just create multiple queries.

Best practice: It's often much easier to perform many small queries and combine them client-side than to try to perform one giant query.

Bonus: where has a special array-contains operator that checks if an array contains a value, returning the array if there's a match.

Indexes

What are indices? I don't get it...

In any case, Firestore will throw an error if it recommends you create an index.

Security

The most important thing to keep in mind when it comes to security is that you cannot partially read a document. So whatever access you provide, they have access to all of the data.

Some models of security:

// No read/write access from client-side at all
// ONLY accessible server-side via dashboard
match /accounts/{id} {
  alow read, write: if false;
}

// Read/write access for user only
// Great for private data like email
match /users/{id} {
  allow read, write: if id == request.auth.uid;
}

// Public read access, private write access
// Great for visible data like usernames
match /profiles/{id} {
  allow read;
  allow write: if id == request.auth.uid;
}

Best practice: If you want a data set to have its own security rules, it's best to place it into its own collection.

Relational Data Modeling

Cardinality

Ask yourself: how many items can be in a set? A few? Hundreds? Billions?

  • If only a few, it makes the most sense to embed in the parent document.

  • If hundreds, it makes sense to create a bucket document with the same document ID.

  • If billions, opt for a collection because it can scale indefinitely.

One-to-one

One-to-one relationships are the easiest to model. There are 3 ways of creating the relationship:

  1. Embed data inside document.

  • If the relationship is truly unique, this should work.

  1. Create 2 documents in different collections using the same document ID. (Like buckets.)

  • Pro tip: Using the same document ID enforces the one-to-one relationship as well. It makes it impossible to accidentally create a duplicate relationship.

  1. Create a reference ID pointing to the other document.

  • Note: This is less efficient because it requires an extra query: getting the reference ID.

Code implementation:

// 1. Embedded one-to-one
const user = db.collection('users').doc('userId')

// 2. Bucket one-to-one
const user = db.collection('users').doc('userId')
const posts = db.collection('posts').doc('userId')

// 3. Reference one-to-one
// REQUIRES WAITING FOR FIRST PROMISE TO COMPLETE
let posts

db.collection('users').doc('userId').get()
  .then(snapshot => {
    const user = snapshot.data()

    posts = db.collection('posts').doc(user.postId)
  })

One-to-many

One-to-many relationships are the most common. There are 3 ways to model it:

  1. Embed an array of maps

  • Great when you have a small number of items in a set and don't need to query the data.

  • If you need to query the data, you have to read all parent documents and then query client-side, which is expensive.

So, if data needs to be queried, opt for:

  1. Subcollection

  • However, you can only query scoped to the specific document

So, if you need to query across parent documents, opt for:

  1. Root collection

  • Here, you reference the relevant document ID in your documents, creating the relationship.

const authorId = 'ayn-rand'

// 1. Embedded array of maps
const authorWithBooks = db.collection('authors').doc(authorId)

// 2. Subcollection
const books = db.collection('authors').doc(authorId).collection('books')

// 3. Root Collection, requires index
const booksFrom1971 = db.collection('books')
  .where('author', '==', authorId)
  .where('published', '>', 1971)

Many-to-many

Many-to-many relationships are the most complex to model.

  1. Create an middle-man collection

  • In a middle-man collection, the 2 document IDs are keys in the document.

  • The original documents don't have data about the relationship. They have to go to the middle-man collection to get the data.

  • Note: If you want to create a unique relationship (e.g. user can only heart a post once), make the document ID into a composite ID, which combines the 2 document IDs together.

  1. Embedded map with document IDs as keys

  • You're essentially creating a reference point for related data in the document.

  • The downside to this is that when we're querying the data, we're pulling in data from the parent document as well, which could be unnecessarily expensive.

  1. Embedded array with document IDs as items

  • This is similar to an embedded map but uses an array instead.

  • Note: You can only use the array-contains operator once at a time.

const authorId = 'dr-seuss';
const bookId   = 'lorax';

// 1. Middle Man Collection
const userReviews = db.collection('reviews').where('author', '==', authorId);
const bookReviews = db.collection('reviews').where('book', '==', bookId);

// Single read with composite key
const specificReview = db.collection('reviews').doc(`${bookId}_${authorId}`);

// 2. Map
// Reviews embedded on books
const bookWithReviews = db.collection('books').doc(bookId);
const userReviews = db.collection('books').orderBy('reviews.jeff-delaney');

// 3. Array
const books = db.collection('books').where('categories', 'array-contains', 'fiction');

// 4. Bucket
// Get a collection of documents with an array of IDs
const getLikedBooks = async () => {

  // Get books through user likes
  const userLikes = await db.collection('likes').orderBy('jeff-delaney').get();
  const bookIds = userLikes.docs.map(snap => snap.id);

  const bookReads = bookIds.map(id => db.collection('books').doc(id).get() );
  const books = Promise.all(bookReads)
}

getLikedBooks()

Advanced Techniques for Efficiency

Data duplication

Data duplication is about duplicating data at different places in the database to decrease the number of reads required to get the data. (This is a no-no in SQL but normal in NoSQL.)

An example of data duplication is a one-to-many relationship where a user can have multiple comments. Each comment is given the user's document ID for reference. However, if we want to avoid reading the user's document to get more data on them, we can duplicate some of that data in the comment document itself.

Data duplication makes the most sense when you have information with lots of reads but few writes: you're decreasing number of reads required to gather data, but you're increasing number of places you have to write when that data changes.

Pro tip:

  • Good candidates for data duplication are things like usernames, which almost never change. When they do change, however, you may have to update them in 1000s of documents.

  • Bad candidates are things like

Aggregation via cloud functions

We can create cloud functions like onCreate, onDelete, onUpdate, and more that run when an event happens.

The most common use case for this is incrementing and decrementing counters (e.g. like count). By creating aggregate data, you decrease the number of reads required to get that data, increasing performance.

PreviousFireship FirestoreNextModel Relational Data In Firestore No SQL

Last updated 3 years ago