50 Years of Queries (cacm.acm.org)
Submitted by rbanffy 3 days ago
  • qianli_cs 6 hours ago

    Great summary! I also recommend the "What Goes Around Comes Around" paper written by Mike Stonebraker and Joe Hellerstein: https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/pape...

    It was written 20 years ago, but even today, the relational model and SQL are still the prevailing choices.

  • refset 7 hours ago

    Good overview, although one rather important aspect of the 'resilience' that isn't really covered here is how SQL implementations keep pace with performance trends. Relational databases are sticky in large part because the implementations are always getting faster and more sophisticated.

    SQL being a declarative language means that applications built on top of it can relatively easily take advantage of new hardware and increasing parallelization without changing any code (in addition to all manner of new software tricks, compression/optimization/joins/etc).
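
A minimal sketch of the point about declarative queries, using Python's built-in `sqlite3` module. The `orders` table, its columns, and the index name are made up for illustration. The query text stays byte-for-byte identical; only the physical design changes (an index is added), and the planner picks a different access path on its own. The same principle is what lets engines exploit new hardware or parallelism without application changes, though this toy example only demonstrates the index case.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"c{i % 100}", float(i)) for i in range(1000)],
)

# The application's query: it says WHAT it wants, not HOW to find it.
QUERY = "SELECT SUM(total) FROM orders WHERE customer = ?"


def plan(connection, sql, params):
    """Return SQLite's chosen plan as text (last column of each plan row)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


result_before = conn.execute(QUERY, ("c7",)).fetchone()[0]
plan_before = plan(conn, QUERY, ("c7",))

# A change made entirely on the database side; QUERY is untouched.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

result_after = conn.execute(QUERY, ("c7",)).fetchone()[0]
plan_after = plan(conn, QUERY, ("c7",))

print(plan_before)   # a full scan of the table
print(plan_after)    # a search that uses idx_orders_customer
print(result_before == result_after)
```

The exact wording of the plan text varies between SQLite versions, so the sketch only relies on the index name appearing in the second plan.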

    • gregw2 an hour ago

      How/when/why did the word "base" get added to "data" to form the concept of a "Data Base"?

      The article mentions that the term was already in use in the CODASYL era, but I'm wondering where it started...

      • refset 23 minutes ago

        > The origins of the term “data base” and subsequently “database” go back a long way. The first sighting of the term was its use in 1963 by the System Development Corporation who sponsored a symposium with the title “Development and Management of a Computer-centered Data Base”. The term “data base” was picked up by the contributors to the symposium in the titles of their papers.

        From Section 1 of "Nineteen Sixties History of Data Base Management" https://dl.ifip.org/db/conf/ifip3/histedu2006/Olle06.pdf

      • j-pb 4 hours ago

        Reading this gave me a weird idea. What if SQL is so successful exactly because its syntax makes JOIN operations cumbersome? Other languages where joins are syntactically convenient (e.g. datalog) make it much easier to write slow queries, whereas SQL's syntax already pushes you towards denormalised tables, which later translates into performance gains when the queries are actually executed.

        In a sense SQL is the (typewriter) QWERTY of query languages. Inconvenient by design.
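
To make the comparison above concrete, here is a small sketch in Python with `sqlite3`. The tables `authors`, `posts`, and `posts_flat` are hypothetical, and the Datalog rule in the comment is only a rough rendering of what the same join might look like there. It shows the syntactic weight of a SQL join next to the query against a denormalised table; it does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalised design: author names live in their own table.
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id),
        title TEXT
    );
    -- Denormalised design: the author name is copied into every row.
    CREATE TABLE posts_flat (id INTEGER PRIMARY KEY, author_name TEXT, title TEXT);

    INSERT INTO authors VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'joins'), (2, 1, 'indexes'), (3, 2, 'datalog');
    INSERT INTO posts_flat VALUES
        (1, 'ada', 'joins'), (2, 'ada', 'indexes'), (3, 'bob', 'datalog');
    """
)

# Roughly, in Datalog the join is implied by reusing the variable A:
#   title(T) :- post(_, A, T), author(A, "ada").
# In SQL the join has to be spelled out explicitly:
normalized = conn.execute(
    "SELECT p.title FROM posts AS p "
    "JOIN authors AS a ON a.id = p.author_id "
    "WHERE a.name = ? ORDER BY p.id",
    ("ada",),
).fetchall()

# Against the denormalised table the query is shorter and needs no join:
denormalized = conn.execute(
    "SELECT title FROM posts_flat WHERE author_name = ? ORDER BY id",
    ("ada",),
).fetchall()

print(normalized)
print(denormalized)
```

Both queries return the same rows; the difference is how much the author of the query has to write, and how much duplicated data the schema carries.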

        • abetaha 8 hours ago

          A very enjoyable read of the history of databases through the lens of query language evolution, and the research done at IBM and Berkeley for System R and Ingres.

          • pncnmnp 5 hours ago

            Donald Chamberlin's (the author of this article) oral history is also quite fun - https://archive.computerhistory.org/resources/access/text/20.... A lot of the history discussed in the article is expanded upon there, and vice versa.

            He talks about how System R got its name:

            > Two of the important figures of System R were Leonard Liu, who was the person who hired me into IBM, and Frank King, who was the manager of the Relational Database project. ... One of the things that Frank thought was important was for our project to have a name so that we could make slides about it, write papers about it, and get some recognition. In order to get recognition for something it's good for it to have a name. So Frank called a meeting and said, "You guys are going to have to think of a name for your project." We thought, "Well, that's a waste of time. Don't bother us." But he persisted and said, "You guys are going to have to come up with a name." So for lack of anything else we said, "Well, we'll call ourselves System R." R stood for relations, or maybe it stood for research, or Franco Putzolu even thought it stood for Rufus, which was the name of his dog. It was a little bit of artful ambiguity what the R stood for, but that was the name of our project.

            He also describes his earliest interaction with Larry Ellison:

            > I’d been seeing some things in the trade press once in a while about a company called Software Development Laboratories that claimed to be developing a relational database system. I hadn’t paid much attention to it, but in the summer of 1978, I got a phone call. It was from a guy named Larry Ellison, and he said he was the president of Software Development Laboratories, and they were developing an implementation of the SQL language. Since we were in the research division of IBM, our philosophy of research was to publish our results in the open literature. As you know, many papers came out of the System R project that were published in conferences and journals, describing the language and the internal interfaces of the system and some of the optimization technology and so on. The project was not a secret and, in fact, we’d been telling everybody about it that would listen. And one of the people that had listened and had read some of our papers was Larry. So he called me up and said that he was interested in implementing the SQL language in the UNIX environment. IBM wasn’t interested in UNIX at all. We were primarily a mainframe company at that time. We had some minicomputer products, but they really weren’t robust enough to manage a relational database, and there was little, if any, attention being paid to the UNIX platform. But Larry was really interested in the UNIX platform. He had a PDP-11, I think, that he was using as the basis for his SQL implementation, and he wanted to exchange visits with us and learn whatever he could about what we were doing, and in particular, he wanted to make sure that his implementation was consistent with ours so that there would be a common interface with compatible error codes and everything else. I was very pleased to get this call. I thought, “Terrific. This is somebody in the world who is interested in our work.” But I had some constraints on what I could do because of my position in IBM.
            >
            > I had to get management approval to talk to somebody on the outside, even though there was nothing secret about our project. Everything that Larry had access to was perfectly available in the open literature. So I went to talk to my boss, Frank King, and Frank talked to some lawyers—there were always plenty of lawyers in IBM that could think of a reason not to do most anything. Sure enough, they said, “You better not talk to other companies who are building products that are competitive with ours. We really don’t want you exchanging visits with these guys, so just tell them ‘Thanks for your interest, and have a nice day.’” So that’s what I did. I told Larry that, unfortunately, due to the constraints of the company, we wouldn’t be able to exchange information other than in the public literature. But that didn’t slow down Software Development Laboratories. They released their implementation of SQL. In fact, it was the first commercial implementation of SQL to go on the market. It was delivered by Larry Ellison’s company, initially called Software Development Laboratories, which later changed its name, I think, to Relational Software Incorporated, and later took on the name of the product, which was called Oracle. As you know, if you drive along Highway 101 in Redwood City and look at the giant 100-foot-tall disk drives over there on the edge of the bay, Larry’s had some success with these ideas. And rightly so. He brought a lot of energy and marketing expertise and a completely independent implementation, and was very successful with it, and had a major impact on the industry.