• bane 2 hours ago

    One of my first jobs was helping build an expert system for a computational linguistics problem that is complex even today. The company had a rich corporate library full of academic books on expert systems, decision trees, first-gen (pre-winter) AI, and some early books on ML approaches. I remember seeing this book in particular, and its evocative title led me to dig deeper into the library than I normally would have.

    Our core system was built of thousands upon thousands of hand-crafted rules informed by careful statistical analysis of hundreds of millions of entries in a bulk data system.

    Part of my job was to build the system that analyzed the bulk data and produced the stats; the other part was carefully testing and fixing the rulesets for certain languages. It was mind-numbing work, and looking back, we were freakishly close to having all the bits and pieces needed for then-bleeding-edge ML, had we chosen to go that way.

    However, we chose expert systems because they gave us tremendous insight into what was happening, and the opportunity to debug and test things at an incredibly granular scale. It was fully possible to say "the system has this behavior because of xyz," and it was fully possible to tune the system with individual-character finesse.
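
    To give a flavor of that traceability, here is a purely hypothetical Python sketch (my invention here, not our actual rules) of a character-level rule that carries its own audit trail:

      import re
      from collections import namedtuple

      # Hypothetical rule shape: a pattern, a rewrite, and a note recording
      # why the rule exists, so every firing is explainable after the fact.
      Rule = namedtuple("Rule", ["rule_id", "pattern", "rewrite", "note"])

      RULES = [
          Rule("x-001", re.compile(r"c(?=[ei])"), "s",
               "soften c before e/i; added after a bulk-stats run"),
      ]

      def apply_rules(word, trace):
          for rule in RULES:
              new = rule.pattern.sub(rule.rewrite, word)
              if new != word:
                  # Record rule id, before, after: "behavior because of xyz".
                  trace.append((rule.rule_id, word, new, rule.note))
                  word = new
          return word

    Every transformation is attributable to one named rule, which is the kind of granular debuggability I mean.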

    Had we wanted to dive into ML, we could have used this foundation as a bootstrap for building a massive training set. But the founders were biased towards expert systems, and I think, at the time, it was the right choice.

    The technology was acquired, and I wonder if the current custodians use it for those obvious next-step purposes.

    • nowittyusername 9 minutes ago

      Your post is making me think there may be quite a lot of lost knowledge out there that is still pertinent to modern agentic AI system building. I am currently experimenting with building my own AI system that uses LLMs as the "engine," while the "harness" around the LLM does most of the heavy lifting. It will have internal verification systems, grounding information, metadata, etc. I find myself writing a lot of automated scripts as part of that process, since my personal motto is that it's always better to automate everything possible with scripts first and use LLMs only as a last resort, for things you can't script away. And that is making me look more and more into old techniques that were established way back when...
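
      Something like this minimal Python sketch of the "scripts first, LLMs last" dispatch (the handler and the llm_fallback hook are made up for illustration):

        import re

        # Deterministic handlers are tried first; each returns None when it
        # cannot handle the task.
        def extract_iso_date(task):
            m = re.search(r"\d{4}-\d{2}-\d{2}", task)
            return m.group(0) if m else None

        SCRIPTED_HANDLERS = [extract_iso_date]  # grows as more gets scripted

        def handle(task, llm_fallback):
            for handler in SCRIPTED_HANDLERS:
                result = handler(task)
                if result is not None:
                    return result      # cheap, testable, reproducible
            return llm_fallback(task)  # last resort: the expensive engine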

    • andrehacker 2 hours ago

      Ah, the early days of AI.

      If a book or movie is ever made about the history of AI, the script would include this period of AI history and would probably go something like this…

      (Some dramatic license here, sure. But not much more than your average "based on true events" script.)

      In 1957, Frank Rosenblatt built a physical neural network machine called the Perceptron. It used variable resistors and reconfigurable wiring to simulate brain-like learning. Each resistor had a motor to adjust weights, allowing the system to "learn" from input data. Hook it up to a fridge-sized video camera (20x20 resolution), train it overnight, and it could recognize objects. Pretty wild for the time.

      Rosenblatt was a showman—loud, charismatic, and convinced intelligent machines were just around the corner.

      Marvin Minsky, a jealous academic peer of Frank's, favored a different approach to AI: Expert Systems. He published a book (Perceptrons, 1969, with Seymour Papert) which all but killed research into neural nets. Marvin pointed out that no neural net one layer deep could solve the XOR problem.

      While the book's findings and mathematical proof were correct, they were based on incorrect assumptions (that the Perceptron only used one layer and that algorithms like backpropagation did not exist).
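
      The XOR point is easy to see concretely: no single linear threshold unit can separate XOR's inputs, but two layers with hand-picked weights can. A standard textbook construction, sketched in Python:

        def step(x):
            return 1 if x >= 0 else 0

        # Hidden layer computes OR and NAND; the output unit ANDs them.
        def xor(a, b):
            h1 = step(a + b - 0.5)      # OR(a, b)
            h2 = step(-a - b + 1.5)     # NAND(a, b)
            return step(h1 + h2 - 1.5)  # AND(h1, h2) == XOR(a, b)

        for a in (0, 1):
            for b in (0, 1):
                print(a, b, xor(a, b))  # prints 0, 1, 1, 0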

      As a result, a lot of academic AI funding was directed towards Expert Systems. The flagship of this was the MYCIN project. Essentially, it was a system to find the correct antibiotic based on the exact bacteria a patient was infected with. The system thus had knowledge about thousands and thousands of different diseases with their associated symptoms. At the time, many different antibiotics existed, and using the wrong one for a given disease could be fatal to the patient.

      When the system was finally ready for use... after six years (!), the pharmaceutical industry had developed “broad-spectrum antibiotics,” which did not require any of the detailed analysis MYCIN was developed for.

      The period of suppressing Neural Net research is now referred to as (one of) the winter(s) of AI.

      --------

      As I said, that is the fictional treatment. In reality, the facts, motivations, and behavior of the characters are a lot more nuanced.

      • Animats 36 minutes ago

        Not that wrong.

        I went through Stanford CS when those guys were in charge. It was starting to become clear that the emperor had no clothes, but most of the CS faculty was unwilling to admit it. It was really discouraging. Peak hype was in "The fifth generation: artificial intelligence and Japan's computer challenge to the world" (1983), by Feigenbaum. (Japan at one point in the 1980s had an AI program which attempted to build hardware to run Prolog fast.)

        Trying to use expert systems for medicine lent an appearance of importance to something that might work for auto repair manuals. It's mostly a mechanization of trouble-shooting charts. It's not totally useless, but you get out pretty much what you carefully put in.
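
        The trouble-shooting-chart nature is easy to show: strip away the shell and an expert system is forward chaining over if-then rules, like this toy Python sketch (hypothetical rules):

          # If all of a rule's conditions are in working memory, assert its
          # conclusion; repeat until nothing new fires.
          RULES = [
              ({"engine_wont_crank"}, "check_battery"),
              ({"engine_cranks", "no_start"}, "check_fuel"),
          ]

          def forward_chain(facts):
              facts = set(facts)
              changed = True
              while changed:
                  changed = False
                  for conditions, conclusion in RULES:
                      if conditions <= facts and conclusion not in facts:
                          facts.add(conclusion)  # you get out what you put in
                          changed = True
              return facts

          print(forward_chain({"engine_cranks", "no_start"}))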

        • mamp 3 minutes ago

          To be fair, performance wasn't the problem for rules, Bayesian networks, or statistical models (compared to existing practice). De Dombal showed in 1972 that a simple Bayes model was better than most ED physicians at triaging abdominal pain.
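
          That kind of model is tiny by modern standards; a naive Bayes classifier over symptoms fits in a few lines (illustrative, made-up numbers, not de Dombal's data):

            import math

            PRIORS = {"appendicitis": 0.2, "nonspecific_pain": 0.8}
            LIKELIHOODS = {  # P(symptom | diagnosis), invented for the sketch
                "appendicitis":     {"rlq_pain": 0.8, "nausea": 0.7},
                "nonspecific_pain": {"rlq_pain": 0.2, "nausea": 0.4},
            }

            def posterior(symptoms):
                log_scores = {}
                for dx, prior in PRIORS.items():
                    log_p = math.log(prior)
                    for s in symptoms:
                        log_p += math.log(LIKELIHOODS[dx].get(s, 0.5))
                    log_scores[dx] = log_p
                total = sum(math.exp(v) for v in log_scores.values())
                return {dx: math.exp(v) / total
                        for dx, v in log_scores.items()}

            print(posterior({"rlq_pain", "nausea"}))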

          The main barrier to scaling was workflow integration due to lack of electronic data, and if it was available, interoperability (as it is today). The other barriers were problems with maintenance and performance monitoring, which are still issues today in healthcare and other industries.

          I do agree the 5th Generation project never made sense, but as you point out, they had developed hardware to accelerate Prolog, wanted to show it off, and overused the tech. Hmmm, sounds familiar...

        • mamp 18 minutes ago

          Don’t attribute to jealousy what can be adequately explained by vanishing gradients.

          BTW, the ad hoc treatment of uncertainty in MYCIN (certainty factors) motivated the work on Bayesian networks.