• hubraumhugo 5 minutes ago

    Has anyone seen AI agents working in production at scale? It doesn't matter whether you're using Swarm, LangChain, or any other orchestration framework if the underlying issue is that AI agents are too slow, too expensive, and too unreliable. I wrote about AI agent hype vs. reality[0] a while ago, and I don't think that has changed yet.

    [0] https://www.kadoa.com/blog/ai-agents-hype-vs-reality

    • sebnun 41 minutes ago

      I immediately thought of Docker Swarm. Naming things is one of the hardest problems in computer science.

    • antfarm 6 hours ago

      There used to be another open-source agent framework by the same name, but it was for multi-agent simulations. For a moment I thought there was a new wave of interest in a deeper understanding of complex systems by means of modelling.

      https://en.wikipedia.org/wiki/Swarm_(simulation)

      https://www.santafe.edu/research/results/working-papers/the-...

      • NelsonMinar 2 hours ago

        Hey, I wrote that! But it was nearly 30 years ago, it's OK for someone else to use the same name.

        Fun fact: Swarm was one of the very few non-NeXT/Apple uses of Objective C. We used the GNU Objective C runtime. Dynamic typing was a huge help for multiagent programming compared to C++'s static typing and lack of runtime introspection. (Again, nearly 30 years ago. Things are different now.)

        • edbaskerville an hour ago

          Hey, thanks for writing the original Swarm! Also thought of that immediately when I saw the headline.

          I enjoyed using it around 2002; I got introduced via Rick Riolo at the University of Michigan Center for the Study of Complex Systems. It was a bit of a gateway drug for me from software into modeling, particularly since I was already doing OS X/Cocoa stuff in Objective-C.

          A lot of scientific modelers start with differential equations, but coming from object-oriented software ABMs made a lot more sense to me, and learning both approaches in parallel was really helpful in thinking about scale, dimensionality, representation, etc. in the modeling process, as ODEs and complex ABMs—often pathologically complex—represent end points of a continuum.

          Tangentially, in one of Rick's classes we read about perceptrons, and at one point the conversation turned to, hey, would it be possible to just dump all the text of the Internet into a neural net? And here we are.

        • mnky9800n 6 hours ago

          I believe there is a new wave of interest in a deeper understanding of complex systems through modelling and connecting it with machine learning. I organized a conference on exploring system dynamics with AI; you can watch most of the lectures here:

          https://youtube.com/playlist?list=PL6zSfYNSRHalAsgIjHHsttpYf...

          The idea was to think about it from different directions including academia, industry, and education.

          Nobody presented multi-agent simulations, but I agree with you that it is a very interesting way of thinking about things. There was a talk on high-dimensional systems modelled with networks, but the speaker didn't want their talk published online.

          Anyway, I'm happy to chat more about these topics. I'm obsessed with understanding complexity using AI, modelling, and other methods.

          • patcon 2 hours ago

            This looks rad! But you should title the videos with the topic and the speaker's name, and if you must include the conference name, put it at the end :)

            As-is, it's hard to skim the playlist, and likely terrible for organic search on Google or YouTube <3

            • llm_trw 5 hours ago

              An AI conference that isn't bullshit hype? Will wonders never cease?

              > Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things.

              To answer your question: I did build a simulation of how a multi-model agent swarm - agents with different capabilities and run times - would impact end-user wait time, based on arbitrary message-passing graphs.

              After playing with it for an afternoon I realized I was basically doing a very wasteful Markov chain enumeration algorithm and wrote one up accordingly.
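              A minimal sketch of what such a simulation can look like, with the agent graph, runtimes, and routing probabilities all invented for illustration. Sampling random walks over the graph and averaging is exactly the "wasteful Markov chain enumeration" described above:

```python
import random

# Hypothetical agent graph: every value here is made up for illustration.
# Each agent has a mean runtime (seconds) and a distribution over which
# agent (or the end user) receives its output next.
AGENTS = {
    "router":   {"runtime": 0.5, "next": [("coder", 0.6), ("writer", 0.4)]},
    "coder":    {"runtime": 4.0, "next": [("reviewer", 0.7), ("user", 0.3)]},
    "writer":   {"runtime": 2.0, "next": [("user", 1.0)]},
    "reviewer": {"runtime": 1.5, "next": [("coder", 0.2), ("user", 0.8)]},
}

def sample_wait(start="router"):
    """Walk the chain once, summing runtimes until the user gets a reply."""
    node, total = start, 0.0
    while node != "user":
        total += AGENTS[node]["runtime"]
        targets, weights = zip(*AGENTS[node]["next"])
        node = random.choices(targets, weights=weights)[0]
    return total

def expected_wait(trials=10_000):
    """Monte Carlo estimate; enumerating chain paths directly does the same job."""
    return sum(sample_wait() for _ in range(trials)) / trials

print(f"estimated end-user wait: {expected_wait():.2f}s")
```

              Solving the chain analytically (one linear equation per agent) replaces the sampling loop entirely, which is presumably the rewrite mentioned above.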

          • segmondy 3 hours ago

              There's absolutely nothing new in this framework that you won't find in a dozen other agent frameworks on GitHub.

            • croes 2 hours ago

              Simulating progress.

            • ac130kz 5 hours ago

              Looks kinda poorly written: not a single async call present, print debugging, deepcopy all over the place. Such a shame that there's nothing to replace Langchain with other than writing it all from the ground up yourself.

              • dartos 5 hours ago

                I hold no love for OpenAI, but to be fair (and balanced), they put this right at the start of their readme.

                > Swarm is currently an experimental sample framework intended to explore ergonomic interfaces for multi-agent systems. It is not intended to be used in production, and therefore has no official support. (This also means we will not be reviewing PRs or issues!)

                It’s literally not meant to replace anything.

                IMO the reason there's no langchain replacement is that everything langchain does is so darn easy to do yourself; there's hardly a point in taking on another dependency.

                Though griptape.ai also exists.
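                To make "easy to do yourself" concrete, here is roughly the loop such frameworks wrap. The model call is stubbed out with canned replies; `fake_llm`, the tool, and the message shapes are all invented for illustration, not any framework's actual API:

```python
def fake_llm(messages, tools):
    """Stand-in for a real chat-completions call (invented, no network).
    First turn it requests a tool; once a tool result is in the history
    it answers in plain text."""
    if any(m["role"] == "tool" for m in messages):
        return {"content": "It's 18C in Paris.", "tool_call": None}
    return {"content": None,
            "tool_call": {"name": "get_weather", "arguments": {"city": "Paris"}}}

# A "tool registry" is just a dict of callables.
TOOLS = {"get_weather": lambda city: f"18C and cloudy in {city}"}

def run(user_msg, llm=fake_llm, max_turns=5):
    """The whole agent loop: call model, dispatch tools, repeat until text."""
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):
        reply = llm(messages, TOOLS)
        if reply["tool_call"] is None:
            return reply["content"]
        call = reply["tool_call"]
        messages.append({"role": "tool",
                         "content": TOOLS[call["name"]](**call["arguments"])})
    raise RuntimeError("agent did not finish")

print(run("What's the weather in Paris?"))
```

                Swap `fake_llm` for a real API client and that's most of an agent framework.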

                • CharlieDigital 3 hours ago

                      > Such a shame that there's nothing to replace Langchain with other than writing it all from the ground up yourself.
                  
                  Check out Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel

                  Supports .NET, Java, and Python. Lots of sample code[0] and support for agents[1] including a detailed guide[2].

                  We use it at our startup (the .NET version). It was initially quite unstable in the early days because of frequent breaking changes, but it has stabilized (for the most part). Note: the official docs may still be trailing, but the code samples in the repo and unit tests are up to date.

                  Highly recommended.

                  [0] https://github.com/microsoft/semantic-kernel/tree/main/pytho...

                  [1] https://github.com/microsoft/semantic-kernel/tree/main/pytho...

                  [2] https://github.com/microsoft/semantic-kernel/tree/main/pytho...

                  • arnaudsm 3 hours ago

                    OpenAI's code quality leaves something to be desired, which is surprising considering how well compensated their engineers are.

                    Their recent realtime demo had so many race conditions, function calling didn't even work, and the patch suggested by the community hasn't been merged for a week.

                    https://github.com/openai/openai-realtime-api-beta/issues/14

                    • croes 2 hours ago

                      Why do they need engineers if they have GPT?

                      Do they use their own product?

                    • d4rkp4ttern 5 hours ago

                      You can have a look at Langroid, an agent-oriented LLM framework from CMU/UW-Madison researchers (I am the lead dev). We are seeing companies use it in production in preference to the other libs mentioned here.

                      https://github.com/langroid/langroid

                      Among many other things, we have a mature tools implementation, especially tools for orchestration (for addressing messages, controlling task flow, etc.), and recently added XML-based tools that are especially useful when you want an LLM to return code via tools -- this is much more reliable than returning code in JSON-based tools.
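                      A toy illustration of why that difference matters (the tag names here are invented for the example, not Langroid's actual schema): code inside JSON must be string-escaped, which models frequently botch, while an XML-style envelope carries the code verbatim and parses with a simple span extraction.

```python
import json
import re

snippet = 'def greet(name):\n    print("hi", name)\n'

# JSON tool call: the code must be string-escaped (newlines, quotes),
# which is exactly what models tend to get wrong in long snippets.
json_payload = json.dumps({"tool": "write_code", "code": snippet})

# XML-style tool call (invented tags): the code rides along verbatim.
xml_payload = f"<write_code>\n<code>\n{snippet}</code>\n</write_code>"

# Recovering the code from the XML form is a simple span extraction.
recovered = re.search(r"<code>\n(.*?)</code>", xml_payload, re.DOTALL).group(1)
assert recovered == snippet
print(xml_payload)
```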

                      • alchemist1e9 5 hours ago

                        Take a look at txtai as an alternative: a more flexible and more professional framework for this problem space.

                      • Quizzical4230 4 hours ago
                        • thawab an hour ago

                          This dude has issues. The top comment on the reddit post in /r/MachineLearning:

                          > Yes, basically. Delete any kyegomez link on sight. He namesquats recent papers for the clout, though the code never actually runs, much less replicates the paper results. We've had problems in /r/mlscaling with people unwittingly linking his garbage - we haven't bothered to set up an Automod rule, though.

                          [0] https://github.com/princeton-nlp/tree-of-thought-llm/issues/...

                          [1] https://x.com/ShunyuYao12/status/1663946702754021383

                          • seanhunter an hour ago

                            I would be pretty astonished if the complainer manages to get the trademark they think they have on "swarms" enforced. People have been using the word "swarm" in connection with simulations of various kinds for as long as I have been interested in simulations. I think I first heard the word in relation to something done by the Santa Fe Institute in the 80s, if memory serves correctly - it's been a long time.[1]

                            Most likely outcome is if they try to actually pursue this they lose their "trademark" and the costs drive them out of business.

                            [1] I didn't misremember https://www.swarm.org/wiki/Swarm:Software_main_page

                          • mnk47 17 hours ago

                            edit: They've added a cookbook article at https://cookbook.openai.com/examples/orchestrating_agents

                            It's MIT licensed.

                            • keeeba 6 hours ago

                              Thanks for linking - I know this is pedantic but one might think OpenAI’s models could make their content free of basic errors quite easily?

                              “Conretely, let's define a routine to be a list of instructions in natural langauge (which we'll repreesnt with a system prompt), along with the tools necessary to complete them.”

                              I count 3 in one mini paragraph. Is GPT writing this and being asked to add errors, or is GPT not worth using for their own content?

                              • ukuina 2 hours ago

                                > ONLY if not satesfied, offer a refund.

                                If only we had a technology to access language expertise on demand...

                                • r2_pilot 5 hours ago

                                  Clearly they should be using Claude instead.

                              • siscia 5 hours ago

                                I am not commenting on the specific framework, as I just skimmed the readme.

                                  But I find this approach works well overall.

                                  Moreover, it is easily debuggable and testable in isolation, which is one of its biggest selling points.

                                  (If anyone is building AI products, feel free to hit me up.)

                                • thawab 5 hours ago

                                    In the examples folder they used Qdrant as a vector database; why not use OpenAI's assistants API? The point of a vendor-locked solution is to make things simpler. Is it because Qdrant is faster?

                                  • htrp 5 hours ago

                                      Qdrant is part of the OpenAI tech stack for their RAG solutions.

                                    • thawab 3 hours ago

                                        Why use it if you can do RAG with OpenAI's assistants API?

                                  • 2024user 7 hours ago

                                    What is the challenge here? Orchestration/triage to specific assistants seems straight forward.

                                    • llm_trw 6 hours ago

                                      There isn't one.

                                        The real challenge for at-scale inference is that the compute time for models is too long to keep normal API connections open, so you need a message passing system in place. This system also needs to be able to deliver large files for multi-modal models if it's not going to be obsolete in a year or two.

                                        I built a proof of concept using email, of all things, but could never get anyone to fund the real deal, which could run at larger-than-web scale.
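                                        A toy sketch of that pattern, with in-process queues standing in for a real broker (all names invented): the client gets a ticket back immediately and polls later, instead of holding an HTTP connection open for the whole inference.

```python
import queue
import threading
import time
import uuid

jobs, results = queue.Queue(), {}

def worker():
    """Stand-in for a slow model server: pull a job, 'infer', post the result."""
    while True:
        job_id, prompt = jobs.get()
        time.sleep(0.1)  # stand-in for minutes of inference
        results[job_id] = f"completion for: {prompt}"
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(prompt):
    """Client fires a job and immediately gets a ticket back."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, prompt))
    return job_id

def poll(job_id, timeout=5.0):
    """Client checks back later instead of holding a connection open."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if job_id in results:
            return results.pop(job_id)
        time.sleep(0.01)
    raise TimeoutError(job_id)

print(poll(submit("summarize this document")))
```

                                        The same shape works over email, Kafka, or any other transport; only the queue implementation changes.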

                                      • lrog 5 hours ago

                                        Why not use Temporal?

                                        An example use with AWS Bedrock: https://temporal.io/blog/amazon-bedrock-with-temporal-rock-s...

                                        • llm_trw 5 hours ago

                                          Because when you see someone try and reinvent Erlang in another language for the Nth time you know you can safely ignore them.

                                          • jatins 3 hours ago

                                            ooc how does Temporal reinvent Erlang?

                                            • TeMPOraL 3 hours ago

                                              I don't.

                                              Sorry, you mean the company.

                                        • 2024user 4 hours ago

                                          Thanks. Could something like Kafka be used?

                                          • llm_trw 3 hours ago

                                            You could use messenger pigeons if you felt like it.

                                            People really don't understand how much better LLM swarms get with more agents. I never hit a point of diminishing returns on text quality over two days of running a swarm of Llama 2 70Bs on an 8x4090 cluster during the stress test.

                                            You would need something similar to, but better than, WhatsApp to handle the firehose of data that needs to cascade between agents when you start running this at scale.

                                            • ValentinA23 2 hours ago

                                              >People really don't understand how much better LLM swarms get with more agents. I never hit a point of diminishing returns on text quality

                                              Could you elaborate, please?

                                              One use for swarms is to use multiple agents/prompts in place of a single agent with one long prompt, in order to increase performance by splitting one big task into many. It is very time-consuming though, as it requires experimenting to determine how best to divide one task into subtasks, including writing code to parse and sanitize each task's output and plug it back into the rest of the agent graph.

                                              DSPy [1] seems to target this problem space, but last time I checked it only focused on single-prompt optimization (by selecting which few-shot examples lead to the best prompt performance, for instance). Even though I have seen papers on the subject, I have yet to find a framework that tackles the problem of agent graph optimization, although research on this topic has been done [2][3][4].

                                              [1] DSPy: The framework for programming—not prompting—foundation models: https://github.com/stanfordnlp/dspy

                                              [2] TextGrad: Automatic 'Differentiation' via Text -- using large language models to backpropagate textual gradients: https://github.com/zou-group/textgrad

                                              [3] What's the Magic Word? A Control Theory of LLM Prompting: https://arxiv.org/abs/2310.04444

                                              [4] Language Agents as Optimizable Graphs: https://arxiv.org/abs/2402.16823
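                                              As a concrete toy example of that decomposition, with the model calls stubbed out: each stage pairs a prompt with a parser that sanitizes the output before it feeds the next stage. Everything here (the canned replies, the prompts, the stage boundaries) is invented for illustration.

```python
def fake_llm(prompt):
    """Canned responses standing in for two real model calls."""
    if prompt.startswith("Extract"):
        return "topics: python, queues"
    return "Summary: covers python and queues."

def parse_topics(raw):
    """Sanitize stage-1 output into a clean list before it goes downstream."""
    _, _, rest = raw.partition("topics:")
    return [t.strip() for t in rest.split(",") if t.strip()]

def stage1(text):
    return parse_topics(fake_llm(f"Extract topics from: {text}"))

def stage2(topics):
    return fake_llm(f"Summarize these topics: {', '.join(topics)}").strip()

print(stage2(stage1("a post about python and queues")))
```

                                              The optimization problem described above is then: which stage boundaries, prompts, and parsers maximize end-to-end quality.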

                                      • nsonha 8 hours ago

                                          How does this compare to AutoGen and LangGraph? As someone new to this space, I tried to look into the other two but got pretty overwhelmed. Context: building multi-agent, multi-step reasoning workflows.

                                        • fkilaiwi 5 hours ago

                                          what is context?

                                        • nobrains 10 hours ago

                                          It is a foreshadowing name...

                                          • nsonha 8 hours ago

                                            where is my llm-compose.yml

                                          • htrp 5 hours ago

                                              Does anyone else feel like these are Google-style 20% time projects from OpenAI team members looking to leave and trying to line up VC funding?

                                            • exitb 5 hours ago

                                              Doesn’t working on a venture on company time put you at an enormous disadvantage in terms of ownership?