I so badly wish we could change a big Windows business application to use PostgreSQL rather than Microsoft SQL Server, purely because of the licensing costs. SQL Server is a fantastic product, but it is restricted to 128GB of RAM and a handful of CPU cores unless you start paying so much that not even our biggest customers can justify it.
Migration isn't easy, as this venerable application uses ADO.NET DataSets with TableAdapters and plenty of stored procedures. The syntax is almost compatible, but unfortunately not quite enough.
For our next product, we're sure to bet on PostgreSQL instead.
An enormous flaw of Sybase/Microsoft SQL Server is that it does not implement the SQL/PSM standard.
https://en.wikipedia.org/wiki/SQL/PSM
The syntax for this does come largely from Oracle.
"SQL/PSM is derived, seemingly directly, from Oracle's PL/SQL. Oracle developed PL/SQL and released it in 1991, basing the language on the US Department of Defense's Ada programming language."
In any case, I have thousands of lines of PL/SQL written by many people which are currently useless for SQL Server applications.
SQL Server should implement SQL/PSM. The sooner, the better.
For those trying to escape the licensing costs of SQL Server, Babelfish may be an option.
Sybase SQL Server hasn't been a thing for at least a decade. The Microsoft fork happened more than 30 years ago.
And yet.... it lives still....
https://www.sap.com/products/technology-platform/sybase-ase....
> An enormous flaw of Sybase/Microsoft SQL Server is that it does not implement the SQL/PSM standard.
Why is this a problem? I've always enjoyed T-SQL. Right from the start it had a "scripty" feel and stored procedures were easy to code with it. We thought about implementing PSM at Sybase in the 1990s but there was little user demand. (Unlike row level locking, the lack of which undid Sybase SAP implementations...) Internally many of the engineers thought PSM was pretty awful to use. I never liked it myself.
It is an ANSI standard, implemented by many databases.
SQLite's trigger syntax appears to be conformant.
Transact-SQL has far fewer implementations, and standards conformance enables interoperability. The SQL Server family is lacking in this respect.
Yes, this is such a big challenge for commercial database products. PG is so great that it often makes the most sense to reallocate the money to more CPU/RAM/SSD, rather than licenses.
I just wish it had columnstore support.
Citus can do this. Postgres is good, but its extensibility is what makes it best.
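For what it's worth, a minimal sketch of what that looks like with Citus's columnar access method (the table name here is made up):

    -- Requires the Citus extension; USING columnar swaps the default heap
    -- storage for column-oriented storage on this table.
    CREATE EXTENSION IF NOT EXISTS citus;

    CREATE TABLE events_columnar (
        event_time timestamptz NOT NULL,
        user_id    bigint,
        payload    jsonb
    ) USING columnar;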
Test this:
"Goodbye Microsoft SQL Server, Hello Babelfish" - https://aws.amazon.com/blogs/aws/goodbye-microsoft-sql-serve...
Or the Opensource project: https://babelfishpg.org/
I evaluated this for our database and ran into many cases where it wasn't a 1:1 replacement, especially if you lean heavily on stored procedures. Additionally, our db library (it's a Rails app; the database was originally used with another language) needed features Babelfish didn't support. It may be worth another look for us, but I assume it still shouldn't be considered a 100% drop-in.
> I so badly wish we could change a big Windows business application to use PostgreSQL
This ^^^
A small business I (no longer) work for was using Windows Servers, SQL Server, Classic ASP, .NET and other things. It was expensive!
I tried sooo hard to get them to realise the savings of moving over to Linux and Postgres, and to get their DATED software over afterwards!
Well, it was Linux and MySQL/MariaDB, but I have slowly grown fond of Postgres over the last couple of years.
I will always remember (and find funny) when we purchased a server (linux+php) that a third party no longer wanted to support for us, and my boss said "they only pay £300 a year for that server" -- yep.
We are in the same boat. An older application with lots of dependencies on MS SQL Server. We are slowly chipping away at it with our migration effort.
Go Babelfish
LLMs might be able to translate the stored procedure code without too much difficulty, assuming you have test coverage. They might also do a good job of translating DDL such as index definitions etc.
I too was impressed with SQL Server last time I used it (big note: SQL Server is one of the few commercial DB's that does real nested transactions; PG does not), but I get it.
> SQL Server is one of the few commercial DB's that does real nested transactions
More or less. The default ROLLBACK behaviour is to roll back the whole nest. You have to use SAVE TRANSACTION instead of BEGIN TRANSACTION and specify the name in ROLLBACK. If doing this in a procedure that may or may not be part of a nested transaction (an explicit transaction wasn't started before it was called) you have to test to see if you need to BEGIN or SAVE¹ and what to do if a rollback is needed (you likely don't want to ROLLBACK everything if you used SAVE, but have no option if you used BEGIN). Not exactly intuitive. And saved transactions can cause excessive lock escalation, impacting concurrent performance.
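Roughly, the pattern described works out to something like this (a hedged sketch only; the procedure and savepoint names are made up):

    -- Start a transaction when called standalone, but only mark a savepoint
    -- when the caller already has a transaction open.
    CREATE PROCEDURE dbo.DoWork
    AS
    BEGIN
        DECLARE @startedTran bit = 0;

        IF @@TRANCOUNT = 0
        BEGIN
            BEGIN TRANSACTION;           -- no outer transaction: start our own
            SET @startedTran = 1;
        END
        ELSE
            SAVE TRANSACTION DoWorkSave; -- nested call: mark a savepoint instead

        BEGIN TRY
            -- ... actual work goes here ...

            IF @startedTran = 1
                COMMIT TRANSACTION;      -- only commit what we started ourselves
        END TRY
        BEGIN CATCH
            IF @startedTran = 1
                ROLLBACK TRANSACTION;             -- undo our own transaction
            ELSE IF XACT_STATE() = 1
                ROLLBACK TRANSACTION DoWorkSave;  -- undo only back to our savepoint
            THROW;                                -- let the caller decide what happens next
        END CATCH
    END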
SQL Server is generally a damned fine product, both in absolute terms and compared to a lot of other things emitted by Microsoft, but it isn't even close to perfect in some areas.
----
[1] ref: https://learn.microsoft.com/en-us/sql/t-sql/language-element...
Thanks for the clarification.
I'm not one to compliment Microsoft software products, but Analysis Services and SQL Server, at least when I used them at the time over 10 years ago, seemed like darn fine products (which they purchased from someone else, I believe...)
SQL Server itself came from Sybase originally, with them and MS working on it as partners for a number of years when it was an OS/2 product, so it wasn't a purchase-a-working-product-and-rebrand-it deal. As Windows NT gained ground, MS & Sybase parted ways (amicably IIRC; when the partnership agreements expired, MS licensed the source that Sybase owned and bought out exclusive use of the product name in relation to products for Windows – for a time MS's SQL Server still carried Sybase copyright messaging along with MS's own).
SQL Server v7 was a rewrite of significant parts of the internals, so from that point it is probably fair to call it an MS product quite distinct from Sybase's (which itself continued to be separately developed including significant but different improvements/reengineering of the internals), though still showing some signs of its heritage.
What's a real nested transaction? What's postgresql missing? https://www.postgresql.org/docs/current/subxacts.html
Savepoints are not considered legitimately isolated nested transactions.
To the best of my knowledge.
Note: Am not a DBA, merely a full stack dev that has worked with various databases over the years.
Transactions are the only composable concurrency control primitive, but this requires that they be nestable.
It’s a key “primitive” missing from almost all high-level programming languages, with the notable exception of some SQL dialects.
Had a fascinating conversation with Claude about this (don't worry about the unhinged personality I gave it with preprompting):
https://gist.github.com/pmarreck/970e5d040f9f91fd9bce8a4bcee...
Pasted it there for brevity here. Also includes an easy way to export a Claude conversation (made with the help of Claude, of course).
LLMs are not a great source for this type of information. This mess in no way articulates how savepoints and nested transactions actually differ.
Savepoints are fully equivalent to nested transactions with the constraint that only one is concurrently active. This limitation is just from the SQL standard programming model. At least in PostgreSQL implementation it would be reasonably simple to support multiple concurrent subtransactions. They are even called subtransactions internally.
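In SQL terms that programming model looks like this (a minimal sketch; the accounts table is made up):

    BEGIN;
    INSERT INTO accounts (id, balance) VALUES (1, 100);

    SAVEPOINT inner_work;                 -- effectively starts a subtransaction
    UPDATE accounts SET balance = balance - 50 WHERE id = 1;
    ROLLBACK TO SAVEPOINT inner_work;     -- undo only the subtransaction's work
    RELEASE SAVEPOINT inner_work;

    COMMIT;                               -- the outer INSERT still commits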
What on Earth are you using as your system prompt!!?
LOL, it's amazing
> SQL Server is one of the few commercial DB's that does real nested transactions
Not sure where this myth keeps coming from, but no, it does not:
https://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a...
Interesting. Is this still up-to-date? The link you posted is 15 years old
Yep, nothing's changed there around transactions.
You're getting downvoted by gatekeepers but LLMs are amazing with SQL and this is a great use-case.
Anecdata, but I agree. I went from a recursive CTE (that occasionally would take 1GB+ of memory for zero rows) to a loop and used AI to do it, and it did a pretty good job
Possibly related, talked to an LLM about the whole "who actually does nested transactions" thing and commented about it above:
Turns out Postgres Is Enough: https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...
(Gist contains a bunch of Postgres ecosystems things, if anything is missing, please let me know)
Kind of, with caveats. For one, "extensions can interfere with each other and cause incorrect behavior" [1], and the practicality of implementations varies. Say, Cypher queries using AGE can be an order of magnitude slower than using a dedicated graph DB [2].
Yes, you can drive a Ford Falcon coast-to-coast, but how much gas are you going to spend? Postgres is a solid database but has roots in the 80s. Optimizing for modern hardware would reduce hosting costs and improve efficiency. [3]
This isn’t criticism of Postgres as a database but of the idea that it’s inherently sufficient for all use cases.
--
1: https://www.cs.cmu.edu/~pavlo/blog/2025/01/2024-databases-re...
2: https://blog.kuzudb.com/post/what-every-gdbms-should-do-and-...
3: https://cedardb.com/blog/ode_to_postgres/#why-is-everyone-st...
Of course there are better time-series databases than Timescale, better graph databases than AGE, and better GIS databases than PostGIS. The point made when it comes to Postgres is that 9 times out of 10 you don't need the better thing. And unless you have that compelling optimization already ready to go, with receipts to back it up, you should probably pick Postgres.
Not disagreeing, but for more context: these blog posts also have to be taken with a grain of salt, as the authors obviously have their own projects to promote, coloring their point of view.
The algorithms detailed in the graph DB post can be run on top of PostgreSQL, though one has to do some of the work of the DBMS themselves (maintaining indexes, transforming query to correct form). It's extra work and requires some knowhow, but up to some extent it works "well enough".
Similarly, the CedarDB comments, which could be boiled down to the lack of a column-major batched executor model, are mostly applicable to analytical queries. Point-query and traversal-type workloads are not as different. There are various projects that work around this to some extent by sticking DuckDB into PostgreSQL. This gets you vectorized execution on mass data scans, which is usually enough to get into the right ballpark.
The point being that the amount of use cases where Postgres is sufficient is highly dependent on the skill and time put into it. And at some point the extra organizational cost of going heterogeneous is going to be worth the payoff. The hard part is figuring that out ahead of time. Which is why my recommendations for starting developers is to just use Postgres until it starts to hurt, then optimize it to work well again and use the insight gained to see what functionality needs to be peeled off into a specialized system.
> Yes, you can drive a Ford Falcon coast-to-coast, but how much gas are you going to spend? Postgres is a solid database but has roots in the 80s. Optimizing for modern hardware would reduce hosting costs and improve efficiency.
This is disingenuous. If you’re comparing Postgres to CedarDB, a more apt analogy would be Postgres is a late model Ford F-150. A utility vehicle with roots in the 80s but continuously improved since then. Not the most efficient vehicle or the fastest, but has lots of utility and great reliability. People have collectively driven F-150s for billions of miles.
And CedarDB is a concept car that some guy built in his garage. It might be efficient but it certainly isn’t proven to be reliable like Postgres is.
If I’m taking a coast-to-coast trip, I’m driving the F-150. I might not get there quickly or efficiently, but I’m sure I’ll get there. And if my F-150 breaks down, there are thousands of mechanics between the coasts that can help me get back on the road. If I take the CedarDB concept car, when it breaks down there are only one or two people that can help me, and they are thousands of miles away.
Yes, you can do everything with Postgres. Supabase even has an extension that allows you to send HTTP requests from SQL.
But for me, writing stored functions is not that fun; instead it would be better to use a different language and write code that generates all the PL/pgSQL and manages the stored functions for me.
Kind of like code in lisp, then deploy generated code to postgres.
I don't know about you guys, but writing "create or replace function ..." boilerplate again and again manually gives me wrist pain.
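The boilerplate in question: every redeploy of a stored function looks roughly like this (function, table and column names are made up for illustration):

    CREATE OR REPLACE FUNCTION get_active_users(p_since timestamptz)
    RETURNS SETOF users
    LANGUAGE plpgsql
    STABLE
    AS $$
    BEGIN
        RETURN QUERY
            SELECT *
            FROM users
            WHERE last_seen_at >= p_since;
    END;
    $$;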
Replication with Bucardo
After coming back to coding after ~10-15 years away, databases are IMHO the most improved element of the dev/devops experience.
Managed Postgres on Google Cloud is a fantastic product (I'm sure other cloud's offerings are also similarly good). Backups, replication, point-in-time recovery, managed updates, plus quick & easy system & query insight tooling. All for less than £1/day.
While I do love Postgres and use it daily on AWS and Google Cloud, I will add that the managed Postgres on Google Cloud is a mess in some areas. For example they use some EOL extensions outdated for 10+ years (a specific example is GEOS) and refuse to update it and give no control for you to upgrade it either.
I wonder where things will stand in 10 years from now. Will many orgs still be consuming vanilla Postgres, or will most workloads have shifted to ~proprietary implementations behind cloud services like "Aurora PostgreSQL Limitless Database" and "Google AlloyDB for PostgreSQL" due to unrivalled price-performance? In other words, can progress in OSS Postgres keep up with cloud economics or will things devolve into an even messier ecosystem centred purely around the wire protocol and SQL dialect?
Databases seem to grow much slower than other assets, so maybe this price advantage just won't be worth the vendor lock-in. Hell, my current extremely valuable Postgres database, worth literally millions of dollars, is about thirteen gigs and could be hosted on my Mac mini if I really wanted to. Still, the managed hosting is worth it, but only without vendor lock-in!
DBMS (and DBAs) tend to be more conservative, so extensions and implementations diverge much more slowly than, say, js.
There's also the incoming business argument in favor of not diverging too far from baseline.
If I'm AWS/Azure/GCP trying to attract a customer from a competitor service, 'completely rewrite your app to be able to use us' isn't a compelling pitch.
MS SQL Server and Oracle have different incentives, but the cloud services would probably prefer portability and decreased maintenance / fork support load.
If you think those are nice, check out neon.tech - set up was faster than any of the big cloud providers.
Unfortunately they don't have anything in a region where my code runs. Interesting though!
(neon ceo). What region are you in?
Hi! London.
Oh, that's coming very very soon! London and São Paulo. Def in Q1.
I love PG. PostgreSQL already had most of its great features 15 years ago.
Anything new that you particularly enjoy?
I guess it's the zero-hassle managed service that's the improvement rather than the DBMS itself
jsonb
At 10 years old, it is newer than 15 but it's not a new feature by a long shot.
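A tiny illustration of why it sticks around (the table is made up): store documents as jsonb, index them with GIN, and query on a key with the containment operator.

    CREATE TABLE docs (id bigserial PRIMARY KEY, body jsonb NOT NULL);
    CREATE INDEX docs_body_gin ON docs USING gin (body);

    INSERT INTO docs (body) VALUES ('{"kind": "invoice", "total": 42}');

    SELECT id FROM docs WHERE body @> '{"kind": "invoice"}';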
RLS (row level security) came out in 9.5, January 2016. It's a game changer when used in combination with JWT tokens for identity and something like PostgREST. Specifically because a client can talk directly to PostgREST and run queries directly against the DB while letting the RLS policies dictate access. Example, a mobile client is allowed to insert data on the auth'd user's behalf, but the same user can only access rows that are associated with their own identity. Supabase rolls this up into an out of the box solution, which basically eliminates a lot of work writing "middleware APIs".
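A minimal sketch of that setup (table and policy names are made up, and it assumes PostgREST's request.jwt.claims setting for the caller's identity):

    CREATE TABLE todos (
        id      bigserial PRIMARY KEY,
        user_id uuid NOT NULL,
        body    text NOT NULL
    );

    ALTER TABLE todos ENABLE ROW LEVEL SECURITY;

    -- Rows are only visible and writable by the user named in the JWT's sub claim.
    CREATE POLICY todos_owner_only ON todos
        USING (user_id = (current_setting('request.jwt.claims', true)::json ->> 'sub')::uuid)
        WITH CHECK (user_id = (current_setting('request.jwt.claims', true)::json ->> 'sub')::uuid);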
It's a good feature, but I spent many days debugging problems caused by it. It might have been the fault of the supabase.js APIs, but still, not a lot of fun memories.
It's like a lot of things... process orientation and discipline early on go a long way to keeping things sane. There are nice techniques for debugging policies as well (login_as_user functions, etc). We write pgTap tests for everything, and that REALLY helps.
It's been available for less than half the time I've been using PostgreSQL. I still consider it new.
I wish someone with the resources would invest in making replication in PostgreSQL seamless: easy to configure, and easy to switch masters without anything like pgbouncer and friends.
Otherwise, PostgreSQL is fantastic.
Pgbouncer is a connection pooler; it has nothing to do with replication.
That said, it’s not that hard to set up replication [0]. Properly tuning the various parameters, monitoring, and being able to fix issues is another story.
RDBMS is hard. MySQL is IMO the easiest to maintain up to a certain point, but it can still bite you in surprising ways. Postgres appears to be as easy on the surface, buoyed by a million blog posts about it, but as your dataset grows, so does the maintenance burden. Worse, if you don’t know what you should be doing, it just eventually blows up (txid wraparound from vacuum failures probably being the most common).
[0]: https://www.postgresql.org/docs/current/runtime-config-repli...
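For reference, a bare-bones streaming-replication sketch (host names, the replicator role and paths are made up; assumes Postgres 12 or later):

    ## Primary, postgresql.conf (close to the defaults on recent versions):
    wal_level = replica
    max_wal_senders = 10

    ## Primary, pg_hba.conf -- allow the standby to connect for replication:
    host  replication  replicator  10.0.0.2/32  scram-sha-256

    ## Standby -- clone the primary and start it in standby mode:
    pg_basebackup -h primary.example -U replicator -D /var/lib/postgresql/data -R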
> txid wraparound from vacuum failures probably being the most common
Hopefully OrioleDB can upstream all the necessary changes soon. For those who don't know, it's a storage engine for Postgres that uses undo logs instead of vacuuming old records.
OrioleDB maintains a very small set of Postgres patches that are targeted for upstream. The storage engine that mitigates the need for vacuuming [1] is implemented in a standard Postgres extension, so that will still need to be installed by the Postgres host in order to take advantage of it.
But yeah, looking forward to that day too!
[1] https://www.orioledb.com/blog/no-more-vacuum-in-postgresql
Yes, you are right, but I was thinking of the ease of having a high-availability setup, and pgbouncer was the first thing that came to mind.
And yes, I read the documentation, and it is still cumbersome to have an HA setup that is easy to maintain. This is what I meant; I hope it is clearer now.
Problem is, the primary devs of Postgres are consulting shops that make money in this space, so making it seamless would affect their revenue.
I work on PGD but I'm pretty sure we actively support open-source options for HA:
Repmgr is handy, but it is not a replication system per se; rather, it makes managing a bunch of other moving parts easier.
Personally I'd like BDR to be in the main tree, or something else, equivalent to Galera:
It's not perfect [1], but it'll get you a good way for many use cases.
[1] https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
I'm hoping for seamless upgrades at some point, like MySQL does it — you install the next major version and restart the daemon. It's completely unnecessary for major operators, but is a life-saver for small businesses, websites that don't need 99.99999% of uptime (i.e. almost all of them), in development, etc.
How do updates work with postgres?
Well, three ways:
1. Dump a backup to disk, then restore from dump on new version;
2. Stop the old version, then run `pg_upgrade --link` on the data of the old version, which should create a new data directory using hardlinks, then start the new version using the new data directory. This is rather quick (see the sketch after this list); or
3. Use Logical Replication to stand up the new version. This has ... a few caveats.
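A hypothetical run of option 2 (service names, paths and versions are made up):

    systemctl stop postgresql-16
    /usr/pgsql-17/bin/pg_upgrade \
        --old-datadir=/var/lib/pgsql/16/data \
        --new-datadir=/var/lib/pgsql/17/data \
        --old-bindir=/usr/pgsql-16/bin \
        --new-bindir=/usr/pgsql-17/bin \
        --link
    systemctl start postgresql-17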
You do a backup on the old version and a restore on the new one.
From what I understood, sometimes the data is written to disk differently between versions and the formats aren't compatible. I guess due to optimizations or changes in the storage engine?
Out of the box HA would make it complete. Apart from patroni, anyone experience with https://github.com/hapostgres/pg_auto_failover?
This is the exact reason why we went with MySQL. Many asked why: a small team with limited resources. You cannot go to production without HA, and many times the cloud may not be a solution. For small use cases, MySQL InnoDB Cluster is a simple, elegant solution to set up.
I agree, Postgres is great but I think it's just a little crusty for cloud environments. Clustering should be built in and easy to configure.
Isn't citus quite good at that? https://www.citusdata.com/
I think Citus only does sharding not replication. You can use it together with Patroni though.
What we really need is a de-facto open source tool to manage logical replication seamlessly. The devs have built all the pieces, and as of v17 there are very few limitations. But I am constantly tripping over the fact that I cannot observe how the replication is progressing, and there's no good way to mitigate failures.
Take a look at https://github.com/hapostgres/pg_auto_failover; it's quite simple to use and manage. You can use libpq's support for target_session_attrs (it's been there since Pg 10 - https://paquier.xyz/postgresql-2/postgres-10-libpq-read-writ... ), so you don't need an active load balancer in front of your Pg cluster.
You can however create your own health/status check service on top of pg_autoctl show state and use HAProxy if required.
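For example, a libpq connection string along these lines (hosts and database are made up) lets the client settle on whichever node currently accepts writes, with no load balancer in front:

    postgresql://app@node1.example:5432,node2.example:5432,node3.example:5432/appdb?target_session_attrs=read-write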
I don't think there's something easier to setup and manage than pg_auto_failover, Patroni always appeared very complicated to me.
I would say Pgpool-II is still the best we got.
This video covers pretty much all the practical stuff. I timestamped the parts in a comment.
PgPool II Performance and best practices https://www.youtube.com/watch?v=bMnVS0slgU0
Oh man, this takes me back to 2005 when I first moved to Pg and needed replication. Its replication is statement based, i.e. sends each DML query to all nodes. That means every write query must be deterministic. That wasn't always obvious. Though it did work for us until we moved to Slony then block level WAL replication. Thankfully we have logical replication today, even if it's still a bit more painful than MySQL.
> Slony
Gosh, haven't heard that in years. I remember a company I used to work for used it on their old system but the new system didn't use it for whatever reason.
Turning on replication in neon.tech is literally just a switch. And it plays well with a CDC library of mine:
Something pg native would be fantastic.
Patroni is pretty good at this.
Can patroni replicate across mixed versions of Postgres? (i.e. can it do logical replication? [and handle DDL replication?])
I thought patroni only did physical replication (only replicates across the same version of Postgres). But maybe I'm mistaken.
The methodology and general concept of this is a bit silly IMO, but that being said I've never regretted going with postgres whenever I choose it.
DB Engine rankings are not very reliable and are constantly gamed, they shouldn't be used for anything serious.
Methodology details available here [0]. They measure popularity by looking at several flavors of online mentions, then normalizing and averaging them. Specific sources include:
- Number of mentions of the system on websites, measured as number of results in search engines queries. […]
- General interest in the system [via] Google Trends.
- Frequency of technical discussions about the system [on] Stack Overflow and DBA Stack Exchange.
- Number of job offers, in which the system is mentioned [on] Indeed and Simply Hired.
- Number of profiles [mentioning] the system [on] LinkedIn.
- Relevance in social networks [by counting] Twitter (X) tweets, in which the system is mentioned.
Seems like a reasonable approach to me. They should add more social networks now that Twitter/X is hemorrhaging market share. Mastodon, Bluesky and Threads should be included.
Honestly I see that all over… so many sites have Twitter integrations that haven't even been updated to X yet. Gonna take a while for the broader ecosystem to shift.
From their methodology page (https://db-engines.com/en/ranking_definition), it seems they use:
- Number of search results for each DB's name
- Google Trends
- Mentions in Stack Overflow etc.
- Mentions in Tweets
- Mentions in people's LinkedIn profile
Probably gives a vague idea of popularity of the system, but the measure is otherwise pretty useless.
It's a bit strange that SQLite went up only 1 place on the ranking (to the 10th).
There are a lot of new articles on how to improve its performance, and I love that I don't have to care about the N+1 problem so much.
It is now the default production database for Rails for a good reason.
SQLite performance is kinda nutty. My design of the DB at the time was probably poor (but I was 15, so cut me some slack :D), but I made an app that had to run on iPod Touches at the time, so we could use the accelerometer for physics class in school.
Initially the performance was too poor, but after a bunch of reading and some changes in how I was using SQLite, I got it to easily do more than 100k rows per second of insertions (the db wasn't very wide). On an old embedded device, mind you. Didn't need that much, but wowza! was my expression at the time. I've had a love for it in my heart ever since.
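Not my original code, but the usual SQLite bulk-insert tricks look something like this (the table is made up):

    PRAGMA journal_mode = WAL;      -- fewer fsyncs than the default rollback journal
    PRAGMA synchronous = NORMAL;

    BEGIN;                          -- batch thousands of inserts per transaction
    INSERT INTO samples (t, x, y, z) VALUES (?1, ?2, ?3, ?4);
    -- ...rebind and step the same prepared statement in a loop...
    COMMIT;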
Don't get me started on Rails and all the anti-patterns that Active Record introduces...
You don't have to. I believe RoR has good and bad sides, but the multi-threaded debugging stories of _why_ there were timeout problems when the interpreter got a global interpreter lock, and how the SQLite driver got fixed, translate to lots of other programming languages.
Just saying "it's network I/O" and blaming the kernel / network latency for all the delays in other SQL drivers is easy, but with SSDs having so much lower latency than HDDs, one has to seriously consider throwing out network-based databases for lots of programming tasks.
I don't know anything about db-engines.com but I have literally been hearing nothing but good things in regards to postgres for 20 years now. My very first job the boss told me a story of how they helped a client who was throwing hardware at their mysql DB by migrating them to postgres. Ever since I've preferred postgres. Even though I couldn't tell you on a technical level what exactly makes it better.
Obviously there's some level of subjectiveness here, but how many years could they realistically have gotten this award?
I wonder how they came to this conclusion given that the ranking page shows otherwise, with Oracle first, having a score of 1258.76, over PostgreSQL at 4th place having a score of 663.41.
see -> Methodology ( https://db-engines.com/en/blog_post/109 )
"Methodology We calculate the DBMS of the year by subtracting the January 2024 from the January 2025 scores. We use the difference between each DBMS’s score, rather than a percentage growth, which would favor systems with lower popularity at the beginning of the year. The DMBS of the year is an illustration of how much each system increased in popularity during the year. You can read more about our exact methodology here ( https://db-engines.com/en/ranking_definition ), but in short we are ranking popularity of DBMS based on how much people are discussing and communicating about each system, including job offers, professional profile entries and citations on the web. job offers, professional profile entries and citations on the web. "
The listed diff from Jan 2024 puts Snowflake at almost double Pg, and third place MSSQL server is next to last place. Even if they meant Azure when they said MS in third, it is still in negative territory while there are more than 20 dbs with positive ratings.
Postgresql +15
Snowflake +28
MSSQL -78
So DB of the year seems to be based on feels rather than their numbers.
> We calculate the DBMS of the year by subtracting the January 2024 from the January 2025 scores.
Although by that measure Snowflake should be #1 not Postgres, so who knows.
Seems like they are just subjectively picking what is more internet popular.
How is Oracle in the top spot? Everyone I've ever talked to who used it was trying to get off it or complaining they're only on it because of regulatory capture.
Also, where is ValKey?
I was surprised to see such a big jump for ClickHouse (up 7 places). Now it's higher than Spark but lower than Hive and HBase (which were trendy around the first half of the 2010s).
Confused why their ranking has postgres at 4 then?
Because it's more about trends than current rankings.
They specifically look at the absolute change from the score last January. Biggest delta = DB of the year.
From the article:
“We calculate the DBMS of the year by subtracting the January 2024 from the January 2025 scores. We use the difference between each DBMS’s score, rather than a percentage growth, which would favor systems with lower popularity at the beginning of the year. The DBMS of the year is an illustration of how much each system increased in popularity during the year”
From that site’s own info page: “The DB-Engines Ranking does not measure the number of installations of the systems, or their use within IT systems.”
Their methodology, even as stated, seems to be pretty flawed.
Deserved, Postgres is enough for 95% of normal use cases
db-engines.com is the Website of the Year 1994