Firestore (and Firebase) is a really great solution for many different use cases. But anything that does so much gets complicated very quickly, even if it looks simple on the surface. Here are my top disadvantages of Firestore that can bite you badly, based on more than ten mobile applications developed in Flutter and React Native and on the code audits we have carried out at LeanCode.
This series of articles comprehensively describes the pros and cons of using Firestore as the backend for your next mobile application. In this series, we will show you that making this decision is not a simple process, and you need to analyze your app from multiple perspectives.
Would you like to learn from our experience what can go wrong with Firestore and Firebase and their biggest disadvantages and limitations? Read on to find out why and what the solution is!
Accessing the data directly from the end-user devices creates some non-trivial problems. Typically, your backend system would act as an intermediary that handles all the cross-cutting concerns. Here there isn’t one. Everything that would normally be done there is now the responsibility of Firestore. And this is a bad idea.
First and foremost, backend systems do validation & authorization. Most of the time, those are separate concerns. Firestore, on the other hand, treats them as equal and presents you with the same solution for both. You basically have to write authorization rules that sometimes do the validation. You have to do that in a custom language that somewhat resembles JavaScript but is not JavaScript.
Usually, I would say that using a custom language is a good idea. Not in Firestore. These things are too complex to be handled with a language that can only make simple comparisons, retrieve related documents by id or do a simple query. And pretend to have functions. You can’t express anything meaningful and stay sane. Adding validation to the mix doesn’t help (so most of the time, there is no validation at all!).
When it comes to security rules, of course, there are cases when that is not a problem. For example, you can “easily” shield yourself from bad actors by just ignoring/sanitizing malformed data on the client. There are also cases in Firestore when you don’t need any authorization besides a simple “these users can only write these documents and read from those.” I would even say that most projects initially fall under this category. Then the complexity creeps in, and you find yourself deep in broken Firestore rules, with validation code everywhere in your app. And no security rules.
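A minimal sketch of the "ignore/sanitize malformed data on the client" approach, in TypeScript. The `ChatMessage` shape, its field names, the 2000-character limit, and both helper functions are hypothetical and not part of any Firebase SDK. The only point is that the client treats every document it reads as untrusted input.

```typescript
// Hypothetical document shape; the field names are assumptions for illustration.
interface ChatMessage {
  authorId: string;
  text: string;
  sentAtMillis: number;
}

// Returns a typed message, or null when the raw document is malformed.
function parseChatMessage(raw: unknown): ChatMessage | null {
  if (typeof raw !== "object" || raw === null) return null;
  const doc = raw as Record<string, unknown>;
  const { authorId, text, sentAtMillis } = doc;
  if (typeof authorId !== "string" || authorId.length === 0) return null;
  if (typeof text !== "string" || text.length > 2000) return null;
  if (typeof sentAtMillis !== "number" || !Number.isFinite(sentAtMillis)) return null;
  return { authorId, text, sentAtMillis };
}

// Drops every document that does not parse instead of crashing on it.
function sanitizeMessages(rawDocs: unknown[]): ChatMessage[] {
  const result: ChatMessage[] = [];
  for (const raw of rawDocs) {
    const parsed = parseChatMessage(raw);
    if (parsed !== null) result.push(parsed);
  }
  return result;
}
```

Note that this protects only the reading client. It does nothing to stop the malformed documents from being written and billed, which is why it is no substitute for rules.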
Uh oh. I think this is the thing that made me (and well… my clients) scream in agony. Firestore pricing looks normal — you pay for ingress/egress, storage & operations. That’s understandable. What makes Firestore pricing hard to maintain is expense monitoring. Or, to be more specific — lack of it.
Firestore doesn’t give you a way to check how much you use. You can see how many ops you’ve already used, but when it comes to storage and ingress/egress, you’re pretty much left to yourself. Firestore doesn’t give you anything meaningful there. You only have a single “storage used” entry on your GCP bill. It doesn’t tell you how much stored data you really have or how much new data you’re creating. It only tells you how much they’ve billed you, nothing else. You can try to derive the changes from it, but that won’t be anywhere near “accurate.”
Theoretically, the documentation tells you how to calculate the storage you use (or will use). You can calculate everything yourself, but that requires you to download every single document in the Firestore database or do the calculation up-front when uploading the document for the first time. It’s also painfully complicated (for such a simply stated problem), terribly slow, and will cost you money just to calculate how much you will pay.
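To give a feel for what "calculate everything yourself" means, here is a rough TypeScript sketch of a per-document size estimate. The constants (1 extra byte per string, 8 bytes per number, 1 byte per boolean or null, 16 bytes added to the document name, 32 bytes of per-document overhead) reflect my reading of the Firestore storage size documentation and should be verified against it. The function names are hypothetical, only a few value types are handled, and index entries are not counted at all.

```typescript
type FieldValue =
  | string
  | number
  | boolean
  | null
  | FieldValue[]
  | { [key: string]: FieldValue };

// Number of bytes the string occupies when encoded as UTF-8.
function utf8Length(s: string): number {
  let bytes = 0;
  for (const ch of s) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// Assumed rule: a string costs its UTF-8 length plus 1 byte.
function stringSize(s: string): number {
  return utf8Length(s) + 1;
}

function valueSize(v: FieldValue): number {
  if (v === null) return 1;
  if (typeof v === "string") return stringSize(v);
  if (typeof v === "number") return 8;
  if (typeof v === "boolean") return 1;
  if (Array.isArray(v)) {
    let total = 0;
    for (const item of v) total += valueSize(item);
    return total;
  }
  return fieldsSize(v); // nested map: sum of its field names and values
}

function fieldsSize(fields: { [key: string]: FieldValue }): number {
  let total = 0;
  for (const [name, value] of Object.entries(fields)) {
    total += stringSize(name) + valueSize(value);
  }
  return total;
}

// Assumed rule: every path segment counts as a string, plus 16 bytes.
function documentNameSize(path: string): number {
  let total = 16;
  for (const segment of path.split("/")) total += stringSize(segment);
  return total;
}

// Assumed rule: name + fields + 32 bytes of overhead.
function estimateDocumentSize(path: string, fields: { [key: string]: FieldValue }): number {
  return documentNameSize(path) + fieldsSize(fields) + 32;
}
```

Under these assumed rules, a document at `users/jeff/tasks/my_task_id` with the fields `type: "Personal"`, `done: false`, `priority: 1`, and `description: "Learn Cloud Firestore"` comes to 44 + 71 + 32 = 147 bytes. You would still have to run this over every document and then account for the index entries separately.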
What counts as storage used? Well, everything. Documents, collections (i.e., paths, as a collection isn’t really a thing when we’re talking about storage space), indices, you name it. You pay for every byte that you create and every byte that Firestore creates for you. And it creates a lot.
By default, Firestore indexes all of the fields in your documents. All of them. You have to disable indices explicitly, and you can only create 200 exemption rules (as of 21-09-2020). This makes it extremely important to model your data carefully because one wrong index can result in a tremendous amount of unused data. I am guilty of overlooking this. In Activy, where we used Firestore to sync activities and rankings, we generated almost 24GB of indices for every 1GB of data. We never used any of them.
So, when it comes to the Firestore pricing, you have to be really, really, really careful, even for simple cases. As I say — it only takes one bad actor to pollute your data.
Google does not make any promises regarding latency. That alone might be the key to rejecting Firestore as your database. Without any assurance, you can’t design your product well. If there were a guarantee, then even if the latency were high, you would know it and be able to work around it. For example, you could hide the latency by starting the request earlier in the process or by doing it entirely in the background. This would increase complexity (which Firestore tries to avoid) but would be doable. Without a known RTT (round-trip time, i.e., latency times 2), you can only measure and hope it will be consistent.
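The "start the request earlier" workaround can be sketched in a few lines of TypeScript. The `prefetch` helper below is hypothetical and not tied to any Firestore API; `load` stands for whatever call actually fetches your data.

```typescript
// Starts the request immediately and returns a getter for the pending result.
function prefetch<T>(load: () => Promise<T>): () => Promise<T> {
  const pending = load(); // the request is in flight from this point on
  return () => pending; // later callers share the same in-flight request
}
```

A screen could call `prefetch` when the user starts navigating and await the getter only when the data is rendered, so part of the round trip overlaps with the transition. It hides latency you have measured; it cannot bound latency nobody has promised.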
And the measurements of Firestore aren’t that good. Over one second for a small query is a really long time. This mostly coincides with our benchmarks in the mobile app of our client - Activy, which uses Firestore quite extensively. It works more or less the same as in the article, i.e.:
Uploading the document (to a known path) takes more than 300ms in the Activy product. Waiting for processing and sending notifications takes another couple hundred ms (~200ms in our case). All of this gives 1s at best. Compared to a simple WebSockets server running on the smallest GCP instance (as per the article), Firestore looks terrible. Even if you add message processing, some small database, and such, you won’t get more than 500ms RTT on the smallest instances possible.
Accepting this kind of latency might be feasible for some applications, especially in their early stages, but using Firestore for near real-time communication is shooting yourself in the foot. You wouldn’t be fast even if you did your best.
Firestore, even though it is somewhat powerful, is rather limited compared to traditional databases (be it an RDBMS or another document database). Combining the basic queries with the index-all-by-default approach gives you a great starting point, but you need to model your data for searchability upfront.
There are several limitations that make using Firestore painful. Some of them (for example, the limits of OR or array-contains/array-contains-any) are not that awkward, but the first limitation, namely that you can do range queries only on a single field, is irritating. It’s pretty common to do “get me all transactions from this date range that are valued no less than X,” and this single rule disallows that. Also, Firestore does not support “negated” queries (like not-in or plain old !=), making common queries unrepresentable.
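Assuming the single-field range restriction described above, the usual workaround is to run the range query on one field and apply the second condition in memory. The TypeScript sketch below simulates this with a plain array; `Transaction`, `queryByDateRange`, and the field names are hypothetical, and `queryByDateRange` merely stands in for the one range query the database would execute.

```typescript
interface Transaction {
  id: string;
  dateMillis: number;
  amount: number;
}

// Stand-in for the single range query the database can run (date range only).
function queryByDateRange(all: Transaction[], from: number, to: number): Transaction[] {
  return all.filter((t) => t.dateMillis >= from && t.dateMillis <= to);
}

// The second range condition (amount >= minAmount) is applied on the client.
function transactionsInRangeWithMinAmount(
  all: Transaction[],
  from: number,
  to: number,
  minAmount: number,
): Transaction[] {
  return queryByDateRange(all, from, to).filter((t) => t.amount >= minAmount);
}
```

The cost is that every transaction in the date range is downloaded and billed as a read, even if only a handful pass the amount filter.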
Document databases tend to have limited processing capabilities. That means you can’t compute values based on query results directly in the database as you can in, e.g., SQL, nor can you do joins. To overcome this, Google introduced the MapReduce approach, which somewhat mitigates this issue. MapReduce became the de facto standard for document databases (even MongoDB supports it!).
Unfortunately, Firestore does not have anything like that. You can implement it yourself using so-called aggregation queries, but that solution is really far from perfect. You have control over the process, but Firestore cannot optimize any of this. You effectively have to do optimistically-concurrent transactions to update a collection that works as a map-reduce index. This can work for simple cases where your source collections aren’t modified frequently, but your retries will eat all of your performance if there is some load.
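To illustrate why retries eat performance under load, here is a small in-memory TypeScript simulation of an optimistically-concurrent update to an aggregate document. Nothing here is a Firestore API: `AggregateStore`, the `version` counter, and the `beforeCommit` hook (which only exists so a competing writer can be simulated) are all assumptions made for the sketch.

```typescript
// In-memory stand-in for an aggregate document guarded by a version counter.
interface AggregateDoc {
  version: number;
  total: number;
  count: number;
}

class AggregateStore {
  private doc: AggregateDoc = { version: 0, total: 0, count: 0 };

  read(): AggregateDoc {
    return { ...this.doc };
  }

  // Commits only if nobody else has written since `expectedVersion` was read.
  tryCommit(expectedVersion: number, total: number, count: number): boolean {
    if (this.doc.version !== expectedVersion) return false;
    this.doc = { version: expectedVersion + 1, total, count };
    return true;
  }
}

// Adds one value to the aggregate, retrying on conflict.
// Returns the number of attempts that were needed.
function addToAggregate(
  store: AggregateStore,
  value: number,
  maxAttempts = 5,
  beforeCommit?: (attempt: number) => void,
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = store.read();
    beforeCommit?.(attempt); // simulation hook: a competing writer may run here
    if (store.tryCommit(current.version, current.total + value, current.count + 1)) {
      return attempt;
    }
  }
  throw new Error("too much contention, giving up");
}
```

Every conflicting writer forces a full re-read and recompute, so the more writers touch the same aggregate document, the more of the work is thrown away.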
You develop version 1 of your mobile and web application with Firestore. Everything is great. Everything syncs correctly, everything works fast, and the development was a pleasure. Your user base grows, you become famous, and money starts flowing. To make you even richer, you begin to think about new features. And you decide to implement them.
This is where the pain starts. You’ve developed your app, you’ve gained a userbase, all of your business logic is in the app, running directly on end-user devices, and you need to migrate the data. Migrations are always tricky (especially in NoSQL databases), but you’re in an even worse situation here. You need to migrate some data and build your app to handle multiple versions of said data.
Consider the situation where you need to modify the model. It doesn’t really matter whether it is just a quick fix that adds a field or a complete revamp of the data, although adding things is much easier to migrate. If you have to do it once, it is doable but needs to be accounted for upfront: the previous version of your app must have been written so that it does not crash when the model changes slightly. The new version needs to handle the data from the old version (that is, migrate it) and save the new data in a way that won’t break the old app.
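One way to make this manageable is to store an explicit schema version in every document and to upgrade on read. The TypeScript sketch below is only an illustration under assumptions: the `schemaVersion` field, the profile shapes, and the naive name splitting are hypothetical, and it assumes the old app ignores fields it does not know and keeps reading `fullName`.

```typescript
// Shape written by the old app.
interface ProfileV1 {
  schemaVersion: 1;
  fullName: string;
}

// Shape written by the new app. `fullName` is kept so the old app still works.
interface ProfileV2 {
  schemaVersion: 2;
  fullName: string;
  firstName: string;
  lastName: string;
}

type StoredProfile = ProfileV1 | ProfileV2;

// Upgrade-on-read: the new app accepts both versions and always works with V2.
function upgradeProfile(stored: StoredProfile): ProfileV2 {
  if (stored.schemaVersion === 2) return stored;
  const [firstName = "", ...rest] = stored.fullName.trim().split(/\s+/);
  return {
    schemaVersion: 2,
    fullName: stored.fullName,
    firstName,
    lastName: rest.join(" "),
  };
}

// Save path of the new app: writes the new fields and the legacy one.
function buildProfileForSave(firstName: string, lastName: string): ProfileV2 {
  return {
    schemaVersion: 2,
    fullName: `${firstName} ${lastName}`.trim(),
    firstName,
    lastName,
  };
}
```

Each additional version adds another branch to `upgradeProfile` and another set of legacy fields to keep writing, which is where the grief starts to accumulate.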
Why not just abandon the old version, you might ask? Because you can’t force your users to upgrade the app. Some folks will plainly refuse to upgrade, but even if you can ignore them, the upgrade isn’t instantaneous, and you can’t control it. Doing extensive upgrades with standard backend systems isn’t easy either, but at least you control everything there, and it is up to you when the upgrade will (or will not) be finished.
It gets even trickier when you need to migrate data more frequently. Having to handle three or even four versions will result in much grief and many bugs. And if you deploy a rogue version of the app with a critical error… You can’t really take it back fast enough. It might do immense damage before you can revert it.
Because Firestore querying is limited and there is no map-reduce, one data model won’t handle all the cases. You will end up either downloading much more data than you need and doing everything in memory, or duplicating the data. You can leverage Cloud Functions to automate the duplication, but you must handle all the CRUD actions yourself.
This also means that there will be some delay between adding/modifying the document and propagating it to other collections. One of the best practices is to design your app so that it handles eventual consistency well, but sometimes that is just overkill. Or worse, your business (or regulatory) requirements prevent you from being eventually consistent.
If you go with the data duplication (and doing the mapping yourself), you will end up with multiple separate copies of the same data, just structured differently. This not only increases the storage cost but also highly increases the complexity. You now have a single document to version and multiple related documents that need to be upgraded with care (and possibly atomically, which might make the process even harder).
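As a rough illustration of what "doing the mapping yourself" involves, the TypeScript sketch below plans the writes needed to keep three differently structured copies of one activity in sync. The collection paths, the `Activity` shape, and `planFanOut` are hypothetical; in a real app the planned writes would have to be committed together (for example, in one batched write) to avoid partial updates.

```typescript
interface Activity {
  id: string;
  userId: string;
  teamId: string;
  distanceKm: number;
}

interface PlannedWrite {
  path: string;
  data: Record<string, string | number>;
}

// One source document fans out into one write per copy of the data.
function planFanOut(activity: Activity): PlannedWrite[] {
  const { id, userId, teamId, distanceKm } = activity;
  return [
    // Canonical copy.
    { path: `activities/${id}`, data: { userId, teamId, distanceKm } },
    // Copy structured for "my activities" screens.
    { path: `users/${userId}/activities/${id}`, data: { distanceKm } },
    // Copy structured for team rankings.
    { path: `teams/${teamId}/activities/${id}`, data: { userId, distanceKm } },
  ];
}
```

Updates and deletes need their own fan-out logic, and every copy becomes another document shape to version and migrate.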
This paragraph will be a short one.
There are no backups in Firestore. Firestore allows you to export data to a GCS bucket, but that isn’t really a backup - it’s just an export. Firestore doesn’t ensure consistency, and the exports are mostly manual (you can script that, but you have to write it yourself). The timings aren’t really predictable.
It’s just not a backup. It simply adds to the other limitations.
What matters the most is to be aware of Firestore limitations before you design your system. All of the above-mentioned problems are not always red flags. There are certain business cases where using Firestore as your backend might still be a very reasonable decision. You have to weigh all the pros and cons and decide for yourself if you want to use Firestore and follow this path.
In the next article from our series, we will describe how to use the benefits of Firestore and Firebase the right way and which steps you should consider (as a client or developer) when you have already found yourself in the traps described above.
For best practices on preparing to use Firestore, read Why Firestore? - 6 things you need to know before using Firestore.