One thing that's sorely needed in the official documentation is a "best practice" for backup/restore from "cold and dark", where you lose your main db in a fire and are now restoring from offsite backups for business continuity. Particularly in the 100-2TB range where probably most businesses lie, and backup/restore can take anywhere from 6 to 72 hours, often in less than ideal conditions. Like many things with SQL there are many ways to do it, but an official roadmap for the order of operations would be very useful for backup/restore of roles/permissions, schema, etc. You will figure it out eventually, but in my experience the dev and prod db size delta is so large that many things that "just work" at the sub-1GB scale really trip you up at 200-500GB. Finding out you did one step out of order (manually, or via a badly written script) halfway through the restore process can mean hours and hours of rework. Heaven help you if you didn't start a screen session on your EC2 instance when you logged in.
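To illustrate the kind of order-of-operations roadmap I mean, here's a rough sketch (just wrapping the standard pg_dumpall/pg_restore tooling from Python; hostnames, database names and file names are made up, and your exact flags will differ):

```python
# Rough sketch of a "cold and dark" restore order, assuming a custom-format
# dump (backup.dump) plus a globals-only dump (globals.sql). All names are
# placeholders.
import subprocess

def run(cmd):
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Cluster-wide objects first: roles, memberships, tablespace definitions
#    (produced earlier with `pg_dumpall --globals-only`).
run(["psql", "-h", "restore-host", "-d", "postgres", "-f", "globals.sql"])

# 2. Create the target database (skip if your dump was taken with -C).
run(["createdb", "-h", "restore-host", "appdb"])

# 3. Schema only (tables, types, functions); no data or indexes yet.
run(["pg_restore", "-h", "restore-host", "-d", "appdb",
     "--section=pre-data", "backup.dump"])

# 4. Bulk data load, parallelized across jobs.
run(["pg_restore", "-h", "restore-host", "-d", "appdb",
     "--section=data", "--jobs=8", "backup.dump"])

# 5. Indexes, constraints and triggers last, so they don't slow the load.
run(["pg_restore", "-h", "restore-host", "-d", "appdb",
     "--section=post-data", "--jobs=8", "backup.dump"])
```

The point is just that globals come before the schema, the schema before the data, and the indexes/constraints after the data; get that order wrong at 500GB and you only find out hours later.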
Postgres is not great with off-site replicas unless the write volume is low; the replication protocol is very chatty. One of the reasons Uber mentioned when moving to MySQL in their engineering blog.
This is oft quoted, but if you read the posts, Uber discovered they didn't want SQL (or apparently transactions, etc.) and implemented a NoSQL store that happened to use MySQL as a backend, and that was a much bigger change than moving off PG.
> One of the reasons Uber mentioned when moving to MySQL in their engineering blog
If I'm not mistaken, this was in 2016 (that's 10 years next year, time flies when you're having fun) -- which is practically an eternity in IT. I'm no DBA but I'm fairly sure many changes have been made to Postgres since then, including logical replication (which can be selective), parallel apply of large transactions in v16, and so on.
I'm not saying this means their points are invalid; I don't know Postgres well enough for that. But any point made almost 10 years ago against one of the most popular and most actively developed options in its field should probably be taken with a pinch of salt.
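For what it's worth, the selectivity looks roughly like this these days; a minimal sketch using psycopg2, with made-up host and table names:

```python
# Minimal sketch of selective logical replication (publication on the
# primary, subscription on the off-site replica). Hosts and tables are
# placeholders.
import psycopg2

# On the primary: publish only the tables you actually want off-site.
conn = psycopg2.connect("host=primary.example.internal dbname=appdb")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("CREATE PUBLICATION offsite_pub FOR TABLE orders, customers;")
conn.close()

# On the off-site node: subscribe to that publication.
conn = psycopg2.connect("host=offsite.example.internal dbname=appdb")
conn.autocommit = True  # CREATE SUBSCRIPTION can't run inside a transaction
with conn.cursor() as cur:
    cur.execute("""
        CREATE SUBSCRIPTION offsite_sub
        CONNECTION 'host=primary.example.internal dbname=appdb'
        PUBLICATION offsite_pub;
    """)
conn.close()
```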
An offsite replica is only applicable if the cause is a failure of the primary. What if I’m restoring a backup because someone accidentally dropped the wrong table?
Nah, on a long enough timeline everything will go wrong. Blaming the person who finally managed to drop the table is dumb: if you can't recover from literally everything that could happen to it, it's not done.
Of course that’s preferable, but OP is specifically asking about the cold restore case, which tends to pose different problems, and is just as important to maintain and test.
If you have a secondary database (at another site or in the cloud) kept up to date with streaming replication, you can switch over very quickly and with little fuss.
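A minimal sketch of standing up such a secondary with pg_basebackup (hostname, replication user and data directory are placeholders):

```python
# Sketch: clone the primary and start the clone as a streaming standby.
# Host, replication user and data directory are placeholders.
import subprocess

DATA_DIR = "/var/lib/postgresql/16/standby"

# Base backup from the primary; -R writes standby.signal and
# primary_conninfo so the clone starts as a standby.
subprocess.run([
    "pg_basebackup",
    "-h", "primary.example.internal",
    "-U", "replicator",
    "-D", DATA_DIR,
    "-R",             # configure as standby
    "-X", "stream",   # stream WAL while the backup runs
    "--checkpoint=fast",
], check=True)

# Start it; from here it follows the primary over streaming replication.
subprocess.run(["pg_ctl", "-D", DATA_DIR, "start"], check=True)
```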
Which is what you must do if minimizing downtime is critical.
And, of course, your disaster recovery plan is incomplete until you've tested it (at scale). You don't want to be looking up Postgres documentation when you need to restore from a cold backup, you want to be following the checklist you have in your recovery plan and already verified.
> in the 100-2TB range where probably most businesses lie
Assuming you mean that range to start at 100GB, I've worked with databases that size multiple times but as a freelancer it's definitely not been "most" businesses in that range.
What about ZFS snapshots and send/recv for backup and restore? For us this is the cleanest approach, since we use it not only for PostgreSQL but for all the data in our organization.
Of course, the underlying filesystem must be ZFS.
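Roughly, the flow looks like the sketch below (dataset names and the off-site host are made up; the snapshot is crash-consistent, which Postgres recovers from like a power loss, assuming data and WAL live on the same dataset):

```python
# Sketch of ZFS snapshot + send/recv for off-site Postgres backups.
# Dataset and remote host are placeholders; assumes PGDATA (including
# pg_wal) lives entirely on the one dataset so the snapshot is atomic.
import subprocess
from datetime import datetime, timezone

DATASET = "tank/pgdata"
REMOTE = "backup@offsite.example.internal"
snap = f"{DATASET}@pg-{datetime.now(timezone.utc):%Y%m%dT%H%M%SZ}"

# Atomic, crash-consistent snapshot of the running database.
subprocess.run(["zfs", "snapshot", snap], check=True)

# Ship it off-site; add `-i previous_snap` to zfs send for incrementals.
send = subprocess.Popen(["zfs", "send", snap], stdout=subprocess.PIPE)
subprocess.run(["ssh", REMOTE, "zfs", "recv", "-F", "backup/pgdata"],
               stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```

Restore is just the reverse send/recv (or cloning the snapshot) and letting Postgres run crash recovery on startup.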
I guess it all depends on your requirements, since this would still cause data loss for the delta between the failure and your last snapshot. But I'm a huge fan of ZFS, and it might be one reason to try out Postgres on FreeBSD, since to my knowledge the only Linux distro that ships ZFS painlessly out of the box is Ubuntu.
I'm also curious how Distributed Replicated Block Device (DRBD) would perform; it would add obvious latency, but perhaps it would be an easier and more efficient solution for a "hot spare" setup than using Postgres's native functionality. To my understanding, DRBD can be configured to protect you from hardware IO errors by "detaching" from an erroring disk.
I also don't know if it's a valid point, but I've heard people say that you don't want a fancy CoW filesystem under a database, since much of what it offers is something databases already solve themselves, so you might be sacrificing performance for safety from things that "should not happen"(tm) anyway, depending on how it's set up, I guess.
Or btrfs. I also think that filesystem snapshots are an underrated backup strategy, assuming your data fits on one disk (which should be the case for almost all applications outside of FAANG).
Postgres backups are tricky for sure. Even if you have a DR plan, you should assume your incremental backups are no good and you need to restore the whole thing from scratch. That’s your real DR SLA.
If things go truly south, just hope you have a read replica you can use as your new master. Most SLAs are not written with 72h+ of downtime in mind. Have you tried the nuclear recovery plan, from scratch? Does it work?
> Even if you have a DR plan you should assume your incremental backups are no good and you need to restore the whole thing from scratch.
"Restore from scratch" can mean a lot of different things, if you have a read replica that you can promote then in relative terms to 72h+ downtime, this should be fairly quick, no?
If you have block-level backups or snapshots, with ZFS for example as someone mentioned, it should also be relatively quick -- although I assume this would make any hypothetical read replica split-brain.
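And the promotion itself is small; a minimal sketch, assuming psycopg2 and a placeholder hostname (the shell equivalent is `pg_ctl promote -D <datadir>`):

```python
# Promote a streaming standby to primary. pg_promote() exists since
# Postgres 12; the hostname is a placeholder.
import psycopg2

conn = psycopg2.connect("host=replica.example.internal dbname=postgres")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("SELECT pg_promote(wait => true);")
conn.close()
```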
While these optimizations are solid improvements, I was hoping to see more advanced techniques beyond the standard bulk insert and deferred constraint patterns. These are well-established PostgreSQL best practices; I'd love to see how pgstream handles more complex scenarios like parallel workers with partition-aware loading, or custom compression strategies for specific data types.