Comments Page - Swarm, a new agent framework by OpenAI

« Back Swarm, a new agent framework by OpenAIgithub.comSubmitted by mnk47 9 months ago

hubraumhugo 9 months ago
Has anyone seen AI agents working in production at scale? It doesn't matter if you're using Swarm, langchain, or any other orchestration framework if the underlying issue is that AI agents too slow, too expensive, and too unreliable. I wrote about AI agent hype vs. reality[0] a while ago, and I don't think it has changed yet.
[0] https://www.kadoa.com/blog/ai-agents-hype-vs-reality
- fnordpiglet 9 months ago
  Yes we use agents in a human support agent facing application that has many sub agents used to summarize and analyze a lot of different models, prior support cases, knowledge base information, third party data sets, etc, to form an expert in a specific customer and their unique situation in detected potential fraud and other cases. The goal of the expert is to reduce the cognitive load of our support agent in analyzing some often complex situation with lots of information more rapidly and reliably. Because there is no right answer and the goal is error reduction not elimination it’s not necessary to have determinism, just do better than a human at understanding a lot of divergent information rapidly and answering various queries. Cost isn’t an issue because the decisions are high value. Speed isn’t an issue because the alternative is a human attempting to make sense of an enormous amount of information in many systems. It has dramatically improved our precision and recall over pure humans.
  doctorpangloss 9 months ago
  Isn’t the best customer service:
  Cost to Solve < Remaining LTV * Profit Margin
  In other words, do the details matter? If the customer leaves because you don’t take a fraudulent $10 return, but he’s worth $1,000 in the long term, that’s dumb.
  You might think that such a user doesn’t exist. Then you’d be getting the details wrong again! Example: Should ISPs disconnect users for piracy? Should Apple close your iCloud sub for pirating Apple TV? Should Amazon lose accounts for rejecting returns? Etc etc.
  A business that makes CS more details oriented is 200% the wrong solution.
  fnordpiglet 9 months ago
  The fraud we deal with is a lot more than $10.
  undefined 9 months ago
  [deleted]
  mmcwilliams 9 months ago
  Do you find that the entities committing fraud are using generative AI tools to facilitate the crimes?
  fnordpiglet 8 months ago
  They use every tool you can imagine. Most are not imaginative but many are profoundly smart. They could do anything they set their minds to and for some reason this is what they do.
- LASR 9 months ago
  The problem with agents is divergence. Very quickly, an ensemble of agents will start doing their own things and it’s impossible to get something that consistently gets to your desired state.
  There are a whole class of problems that do not require low-latency. But not having consistency makes them pretty useless.
  Frameworks don’t solve that. You’ll probably need some sort of ground-truth injection at every sub-agent level. Ie: you just need data.
  Totally agree with you. Unreliability is the thing that needs solving first.
  debo_ 9 months ago
  > The problem with agents is divergence. Very quickly, an ensemble of agents will start doing their own things and it’s impossible to get something that consistently gets to your desired state.
  Sounds like management to me.
  mycall 9 months ago
  Sticky goals and reevaluation of tasks is one way to keep the end result on track.
  How does gpt o1 solve this?
- irthomasthomas 9 months ago
  I use my own agent all day, every day. Here is one example: https://x.com/xundecidability/status/1835085853506650269
  I've been using the general agent to build specialised sub-agents. Here's an example search agent beating perplexity: https://x.com/xundecidability/status/1835059091506450493
  999900000999 9 months ago
  Do you have any code to share?
  I'm failing to see the point in the example, unless the agents can do things on multiple threads. For example let's say we have Boss Agent.
  I can ask Boss agent to organize a trip for five people to the Netherlands.
  Boss agent can ask some basic questions, about where my Friends are traveling from, and what our budget is .
  Then travel agent can go and look up how we each can get there, hotel agent can search for hotel prices, weather agent can make sure it's nice out, sightseeing agent can suggest things for us to do. And I guess correspondence agent can send out emails to my actual friends.
  If this is multi-threaded, you could get a ton of work done much faster. But if it's all running on a single thread anyway, then couldn't boss agent just switch functionality after completing each job ?
  irthomasthomas 9 months ago
  That particular task didn't need parallel agents or any of the advanced features.
  The prompt was: <prompt> Research claude pricing with caching and then review a conversation history to calculate the cost. First, search online for pricing for anthropic api with and without caching enabled for all of the models: claude-3-haiku, claude-3-opus and claude-3.5-sonnet (sonnet 3.5). Create a json file with ALL the pricing data.
  from the llm history db, fetch the response.response_json.usage for each result under conversation_id=01j7jzcbxzrspg7qz9h8xbq1ww llm_db=$(llm logs path) schema=$(sqlite3 $llm_db '.schema') example usage: { "input_tokens": 1086, "output_tokens": 1154, "cache_creation_input_tokens": 2364, "cache_read_input_tokens": 0 }
  Calculate the actual costs of each prompt by using the usage object for each response based the actual token usage cached or not. Also calculate/simulate what it would have cost if the tokens where not cached. Create interactive graphs of different kinds to show the real cost of conversation, the cache usage, and a comparison to what it would have costed without caching.
  Write to intermediary files along the way.
  Ask me if anything is unclear. </prompt>
  I just gave it your task and I'll share the results tomorrow (I'm off to bed).
- fsndz 9 months ago
  True. In the classic form of automation, reasoning is externalized into rules. In the case of AI agents, reasoning is internalized within a language model. This is a fundamental difference. The problem is that language models are not designed to reason. They are designed to predict the next most likely word. They mimic human skills but possess no general intelligence. They are not ready to function without a human in the loop. So, what are the implications of this new form of automation that AI agents represent? https://www.lycee.ai/blog/ai-agents-automation-eng
- xrd 9 months ago
  I want hear more about this. I'm playing with langroid, crew.ai, and dspy and they all layer so many abstractions on top of a shifting LLM landscape. I can't believe anyone is really using them in the way their readme goals profess.
  d4rkp4ttern 9 months ago
  Not you in particular, but I hear this common refrain that the "LLM landscape is shifting", but what exactly is shifting? Yes new models are constantly announced, but at the end of the day, interacting with the LLMs involves making calls to an API, and the OpenAI API (and perhaps Anthropic's variant) has become fairly established, and this API will obviously not change significantly any time soon.
  Given that there is (a fairly standard) API to interact with LLMs, the next question is, what abstractions and primitives help easily build applications on top of these, while giving enough flexibility for complex use cases.
  The features in Langroid have evolved in response to the requirements of various use-cases that arose while building applications for clients, or companies that have requested them.
  dimitri-vs 9 months ago
  Sonnet 3.5 and other large context models made context management approaches irrelevant and will continue to do so.
  o1 (and likely sonnet 3.5) made chain of through and other complex prompt engineering irrelevant.
  Realtime API (and others that will soon follow) will made the best VTT > LLM > TTV irrelevant.
  VLMs will likely make LLMs irrelevant. Who knows what Google has planned for Gemini 2.
  The point is building these complex agents has been proven a waste of time over and over again until, at least until we see a plateau in models. It's much easier to swap in a single API call and modify one or two prompts than to rework a convoluted agentic approach. Especially when it's very clear that the same prompts can't be reused reliably between different models.
  lmeyerov 9 months ago
  I encourage you to run evals on result quality for real b2b tasks before making these claims. Almost all of your post is measurably wrong in ways that cause customers to churn an AI product same-day.
  xrd 9 months ago
  I appreciate your comment.
  I suppose my comment is reserved more for the documentation than the actual models in the wild?
  I do worry that LLM service providers won't do any better than rest API providers in versioning their backend. Even if we specify the model in the call to the API, it feels like it will silently be upgraded behind the scenes. There are so many parameters that could be adjusted to "improve" the experience for users even if the weights don't change.
  I prefer to use open weight models when possible. But so many agentic frameworks, like this one (to be fair, I would not expect OpenAI to offer a framework that work local first), treat the local LLM experience as second class, at best.
  soco 8 months ago
  Years ago we complained about the speed with which new JavaScript frameworks were popping into existence. Today it goes one order of magnitude faster, and the quality of the outputs can only be suffering. Yes there's code but so and so, interfaces and APIs change dramatically, and the documentation is a few versions behind. Who has time to compare simply cannot do it in depth, and ideas get also dropped on the way. I don't want to call it a mess because it's too negative, to have many ideas is great but I feel we're still in the brainstorming phase.
- islewis 9 months ago
  > The underlying issue is that AI agents too slow,
  Inference speed is being rapidly optimized, especially for edge devices.
  > too expensive,
  The half-life of OpenAI's API pricing is a couple of months. While the bleeding edge model is always costly, the cost of API's are becoming rapidly available to the public.
  > and too unreliable
  Out of the 3 points raised, this is probably the most up in the air. Personally I chalk this up to sideeffects of OpenAI's rapid growth over the last few years. I think this gets solved, especially once price and latency have been figured out.
  IMO, the biggest unknown here isn't a technical one, but rather a business one- I don't think it's certain that products built on multi-agent architectures will be addressing a need for end users. Most of the talk I see in this space are by people excited by building with LLM's, not by people who are asking to pay for these products.
- theptip 9 months ago
  Frankly, what you are describing is a money-printing machine. You should expect anyone who has figured out such a thing to keep it as a trade secret, until the FOSS community figures out and publishes something comparable.
  I don’t think the tech is ready yet for other reasons, but absence of anyone publishing is not good evidence against.
- morgante 9 months ago
  Agents can work in production, but usually only when they are closer to "workflows" that are very targeted to a specific use case.
- mycall 9 months ago
  If the solution the agents create is immediately useful, then waiting a few minutes or longer for the answer is fine.
- inglor 9 months ago
  Yes I built a lot of stuff (at batch, not to respond to user queries). Mostly large scale code generation and testing tasks.
antfarm 9 months ago
There used to be another open-source agentframework by the same name, but it was for multi-agent simulations. For a moment I thought there was a new wave of interest in a deeper understanding of complex systems by means of modelling.
https://en.wikipedia.org/wiki/Swarm_(simulation)
https://www.santafe.edu/research/results/working-papers/the-...
- NelsonMinar 9 months ago
  Hey, I wrote that! But it was nearly 30 years ago, it's OK for someone else to use the same name.
  Fun fact: Swarm was one of the very few non-NeXT/Apple uses of Objective C. We used the GNU Objective C runtime. Dynamic typing was a huge help for multiagent programming compared to C++'s static typing and lack of runtime introspection. (Again, nearly 30 years ago. Things are different now.)
  edbaskerville 9 months ago
  Hey, thanks for writing the original Swarm! Also thought of that immediately when I saw the headline.
  I enjoyed using it around 2002, got introduced via Rick Riolo at the the University of Michigan Center for the Study of Complex Systems. It was a bit of a gateway drug for me from software into modeling, particularly since I was already doing OS X/Cocoa stuff in Objective-C.
  A lot of scientific modelers start with differential equations, but coming from object-oriented software ABMs made a lot more sense to me, and learning both approaches in parallel was really helpful in thinking about scale, dimensionality, representation, etc. in the modeling process, as ODEs and complex ABMs—often pathologically complex—represent end points of a continuum.
  Tangentially, in one of Rick's classes we read about perceptrons, and at one point the conversation turned to, hey, would it be possible to just dump all the text of the Internet into a neural net? And here we are.
  NelsonMinar 9 months ago
  I took a graduate level class in the 1990s from some SFI luminaries. It was a great class but the dismal conclusion was "this stuff is kind of neat but not very practical, traditional optimization techniques usually work better". None of us guessed if you could scale the networks up 1 million X or more they'd become magic.
  seanhunter 9 months ago
  Hey thanks for writing the original swarm. I found your framework very inspiring when I was conducting my own personal (pretty much universally failed) experiments into making this kind of multi-agent simulation.
  darknavi 9 months ago
  > compared to C++'s static typing and lack of runtime introspection. (Again, nearly 30 years ago. Things are different now.)
  C++ has added a ton of great features since (especially C++11 onward) but run-time reflection is still sorely missed.
- mnky9800n 9 months ago
  I believe there is a new wave of interest in deeper understanding of complex systems through modelling and connecting with machine learning. I organized this conference on exploring system dynamics with AI which you can see most of the lectures here:
  https://youtube.com/playlist?list=PL6zSfYNSRHalAsgIjHHsttpYf...
  The idea was to think about it from different directions including academia, industry, and education.
  Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things. There was a talk on high dimensional systems modelled with networks but the speaker didn't want their talk published online.
  Anyways I'm happy to chat more about these topics. I'm obsessed with understanding complexity using ai, modelling, and other methods.
  patcon 9 months ago
  This looks rad! But you should title the videos with the topic and the speakers name, and if you must include the conference name, put it at the end :)
  As-is, it's hard to skim the playlist, and likely terrible for organic search on Google or YouTube <3
  mnky9800n 9 months ago
  I agree with you. Unfortunately I'm not in charge of the videos so even though I asked them to do this they didn't. Haha.
  llm_trw 9 months ago
  An AI conference that isn't bullshit hype? Will wonders never cease?
  > Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things.
  To answer your question I did build a simulation of how a multi model agent swarm - agents have different capabilities and run times - would impact the end user wait time based on arbitrary message parsing graphs.
  After playing with it for an afternoon I realized I was basically doing a very wasteful Markov chain enumeration algorithm and wrote one up accordingly.
  mnky9800n 8 months ago
  Yeah I already have loads of people asking when the next one is for this exact reason. Haha. Well, I would love to have people help organise the next one. But I don't know yet
ac130kz 9 months ago
Looks kinda poorly written: not even a single async present, print debugging, deepcopy all over the place. Such a shame that there's nothing to replace Langchain with other than writing it all from the ground up yourself.
- dartos 9 months ago
  I hold no love for openai, but to be fair (and balanced) they put this right in the start of their readme.
  > Swarm is currently an experimental sample framework intended to explore ergonomic interfaces for multi-agent systems. It is not intended to be used in production, and therefore has no official support. (This also means we will not be reviewing PRs or issues!)
  It’s literally not meant to replace anything.
  IMO the reason there’s no langchain replacement is because everything langchain does is so darn easy to do yourself, there’s hardly a point in taking on another dependency.
  Though griptape.ai also exists.
  undefined 9 months ago
  [deleted]
- CharlieDigital 9 months ago
  > Such a shame that there's nothing to replace Langchain with other than writing it all from the ground up yourself.
  Check out Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel
  Supports .NET, Java, and Python. Lots of sample code[0] and support for agents[1] including a detailed guide[2].
  We use it at our startup (the .NET version). It was initially quite unstable in the early days because of frequent breaking changes, but it has stabilized (for the most part). Note: the official docs may still be trailing, but the code samples in the repo and unit tests are up to date.
  Highly recommended.
  [0] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
  [1] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
  [2] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
- arnaudsm 9 months ago
  OpenAI's code quality leaves to be desired, which is surprising considering how well compensated their engineers are.
  Their recent realtime demo had so many race conditions, function calling didn't even work, and the patch suggested by the community hasn't been merged for a week.
  https://github.com/openai/openai-realtime-api-beta/issues/14
  keithwhor 9 months ago
  Hey! I was responsible for developing this.
  Not speaking for OpenAI here, only myself — but this is not an official SDK — only a reference implementation. The included relay is only intended as an example. The issues here will certainly be tackled for the production release of the API :).
  I’d love to build something more full-featured here and may approach it as a side project. Feel free to ping me directly if you have ideas. @keithwhor on GitHub / X dot com.
  arnaudsm 9 months ago
  Thank you, will contact you asap, I'd be happy to help :)
  croes 9 months ago
  Why need they engineers if they have GPT?
  Do they use their own product?
- d4rkp4ttern 9 months ago
  You can have a look at Langroid, an agent-oriented LLM framework from CMU/UW-Madison researchers (I am the lead dev). We are seeing companies using it in production in preference to other libs mentioned here.
  https://github.com/langroid/langroid
  Among many other things, we have a mature tools implementation, especially tools for orchestration (for addressing messages, controlling task flow, etc) and recently added XML-based tools that are especially useful when you want an LLM to return code via tools -- this is much more reliable than returning code in JSON-based tools.
- alchemist1e9 9 months ago
  Take a look at txtai as an alternative more flexible and more professional framework for this problem space.
mnk47 9 months ago
edit: They've added a cookbook article at https://cookbook.openai.com/examples/orchestrating_agents
It's MIT licensed.
- keeeba 9 months ago
  Thanks for linking - I know this is pedantic but one might think OpenAI’s models could make their content free of basic errors quite easily?
  “Conretely, let's define a routine to be a list of instructions in natural langauge (which we'll repreesnt with a system prompt), along with the tools necessary to complete them.”
  I count 3 in one mini paragraph. Is GPT writing this and being asked to add errors, or is GPT not worth using for their own content?
  ukuina 9 months ago
  > ONLY if not satesfied, offer a refund.
  If only we had a technology to access language expertise on demand...
  r2_pilot 9 months ago
  Clearly they should be using Claude instead.
segmondy 9 months ago
There's absolutely nothing new in this framework that you won't find in a dozen other agent frameworks on github.
- croes 9 months ago
  Simulating progress.
- xrd 9 months ago
  Which ones do you suggest considering?
  adamdiy 9 months ago
  one mentioned elsewhere in thread: https://github.com/crewAIInc/crewAI
  meadhikari 9 months ago
  I have noticed that CrewAI burns too much token for anything significant
kgc 9 months ago
I feel like there's a motivation here to generate a lot of inference demand. Having multiple o1 style agents churning tokens with each other seems like a great demand driver.
sebnun 9 months ago
I immediately thought of Docker Swarm. Naming things is one of the hardest problems in computer science.
- 8f2ab37a-ed6c 9 months ago
  Or https://www.perforce.com/products/helix-swarm if you’re in the game dev world
Quizzical4230 9 months ago
Anyone see the drama here: https://github.com/openai/swarm/issues/50
- thawab 9 months ago
  This dude has issues, the reddit post in /r/MachineLearning top comment:
  > Yes, basically. Delete any kyegomez link on sight. He namesquats recent papers for the clout, though the code never actually runs, much less replicates the paper results. We've had problems in /r/mlscaling with people unwittingly linking his garbage - we haven't bothered to set up an Automod rule, though.
  [0] https://github.com/princeton-nlp/tree-of-thought-llm/issues/...
  [1] https://x.com/ShunyuYao12/status/1663946702754021383
  Quizzical4230 9 months ago
  Oh!
  What really bothers me is that this kyegomez person wasted time and energy of so many people and for what?
  thawab 9 months ago
  followers, clicks? anyone who spends a few minutes browsing his repo will know he is a fraud. Here is an example:
  https://github.com/kyegomez/AlphaFold3
  most issues are people not able to run his code. These issues are closed. The repo has 700 stars.
  Quizzical4230 9 months ago
  Saw an issue which fails at `pip install alphafold3`. If you're gonna bamboozle me, at least put in the effort to get the first step right Xp
  az226 9 months ago
  Just gonna leave this here: https://github.com/kyegomez/tree-of-thoughts/issues/78#issue...
  Also this part from the reply before editing it away:
  They get mad that my repo and code is better than their's and they published they paper, they feel entitled even though I reproduced the entire paper based on 4 phrases, dfs, bfs (search algos), generate solutions, and generate thoughts and this is it. I didn't even read the full paper when I first started to implement it. The reason they want people to unstar my repo is because they are jealous that they made a mistake by not sharing the code when they published the paper as real scientists would do. If you do not publish your code as a AI research scientists you are a heretic, as your work cannot be tried and tested. and the code works amazingly much better than theirs, I looked at their code and couldn't figure out how to run it for hours, as well as other people have reported the same. the motivations are jealously, self hatred, guilt, envy, inferiority complex, ego, and much more psychographic principles.
  Quizzical4230 9 months ago
  I have SO MANY THOUGHTS.
  But its best to leave them out for sanity :)
  Thanks for adding the unedited comment as it shines light over the newly fabricated comment.
  newman314 9 months ago
  I looked on his GH profile page. How was he able to amass over 16k GitHub stars?
  thawab 9 months ago
  New research paper drop or go viral > create a repo with AI code > post it in social media. Users star a repo to bookmark it. The few who test the code write in the issue section and get their issue closed with no replies.
  Thats why some subreddits flagged these name squatters.
  kevindamm 9 months ago
  I think a lot of people use stars as a kind of bookmark, not for recognition. It takes time to read through the code or set up a working build from a fork. I, for one, occasionally use stars to remind myself to return to a repo for a more thorough look (especially if I'm on mobile at the time).
  Also, bots.
- seanhunter 9 months ago
  I would be pretty astonished if the complainer manages to get the trademark they think they have on "swarms" enforced. People have been using the word "swarm" in connection with simulations of various kinds for as long as I have been interested in simulations (I mean I think I first heard the word swarm in connection with a simulation in relation to something done by the santa fe institute in the 80s if memory serves correctly - it's been a long time).[1]
  Most likely outcome is if they try to actually pursue this they lose their "trademark" and the costs drive them out of business.
  [1] I didn't misremember https://www.swarm.org/wiki/Swarm:Software_main_page
  sunnybeetroot 9 months ago
  You may be interested in seeing a reply by the creator in these comments: https://news.ycombinator.com/item?id=41819866
  seanhunter 9 months ago
  Yeah I saw it after posting my thing. So cool to have folks like that on this forum.
  Quizzical4230 9 months ago
  Are they trying to advertise swarm.ai?
  Bad press is still press XD
nsonha 9 months ago
how does this compare to Autogen and LangGraph? As someone new to this space, I tried to look into the other 2 but got pretty overwhelmed. Context is making multi agents, multi steps reasoning workflows
- fkilaiwi 9 months ago
  what is context?
arach 9 months ago
Worth noting there is an interesting multi-agent open source project named Swarms. When I saw this on X earlier I thought maybe the team had joined OpenAI but there's no connection between these projects
> "Swarms: The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework"
[0] https://github.com/kyegomez/swarms
[1] https://docs.swarms.world/en/latest/
ItsSaturday 9 months ago
" It is not intended to be used in production, and therefore has no official support. (This also means we will not be reviewing PRs or issues!)"
Nope this doesn't mean it at all. You decided additionaly and independent from the other statements that you do not allow collaboration at all.
Which is fine the sentence is still unlogical
2024user 9 months ago
What is the challenge here? Orchestration/triage to specific assistants seems straight forward.
- llm_trw 9 months ago
  There isn't one.
  The real challenge for at scale inference is that the compute for models is too long to keep normal API connections open and you need a message passing system in place. This system also needs to be able to deliver large files for multi-modal models if it's not going to be obsolete in a year or two.
  I build a proof of concept using email of all things but could never get anyone to fund the real deal which could run at larger than web scale.
  lrog 9 months ago
  Why not use Temporal?
  An example use with AWS Bedrock: https://temporal.io/blog/amazon-bedrock-with-temporal-rock-s...
  llm_trw 9 months ago
  Because when you see someone try and reinvent Erlang in another language for the Nth time you know you can safely ignore them.
  jatins 9 months ago
  ooc how does Temporal reinvent Erlang?
  TeMPOraL 9 months ago
  I don't.
  Sorry, you mean the company.
  2024user 9 months ago
  Thanks. Could something like Kafka be used?
  llm_trw 9 months ago
  You could use messenger pigeons if you felt like it.
  People really don't understand how much better LLM swarms get with more agents. I never hit a point of diminishing returns on text quality over two days of running a swarm of llama2 70Bs on an 8x4090 cluster during the stress test.
  You would need something similar to, but better than, whatsapp to handle the firehose of data that needs to cascade between agents when you start running this at scale.
  ValentinA23 9 months ago
  >People really don't understand how much better LLM swarms get with more agents. I never hit a point of diminishing returns on text quality
  Could you elaborate please ?
  One use for swarms is to use multiple agents/prompts in place of one single agent with one long prompt in order to increase performance by splitting one big task into many. It is very time consuming though, as it requires experimenting to determine how best to divide one task into subtasks, including writing code to parse and sanitize each task output and plug it back into the rest of the agent graph.
  Dspy [1] seems to target this problem space but last time I checked it only focused on single prompt optimization (by selecting which few shots examples lead to the best prompt performance for instance), but even though I have seen papers on the subject, I have yet to find a framework that tackles the problem of agent graph optimization although research on this topic has been done [2][3][4]
  [1]DSPy: The framework for programming—not prompting—foundation models: https://github.com/stanfordnlp/dspy
  [2]TextGrad: Automatic 'Differentiation' via Text -- using large language models to backpropagate textual gradients: https://github.com/zou-group/textgrad
  [3]What's the Magic Word? A Control Theory of LLM Prompting: https://arxiv.org/abs/2310.04444
  [4]Language Agents as Optimizable Graphs: https://arxiv.org/abs/2402.16823
  llm_trw 9 months ago
  >Could you elaborate please ?
  No.
  I've tried explaining this to supposedly smart people in both a 15 minute pitch deck and a research paper and unless they were inclined to think it from the start no amount of proof has managed to convince them.
  I figure it's just not possible to convince people, even with the proof in front of them, of how powerful the system is. The same way that we still have people arguing _right now_ that all LLMs are just auto complete on steroids.
  Veen 9 months ago
  Prove how powerful "the system" is by doing something useful or value-generating with it. Then people will believe you. Talk is cheap.
  llm_trw 9 months ago
  >Prove how useful LLMs are by doing something useful or value-generating with them. Then people will believe you. Talk is cheap.
  You after chat GPT2 was released.
  dboreham 9 months ago
  > people arguing _right now_ that all LLMs are just auto complete on steroids.
  Funny because when I learned about how LLMS worked my immediate thought was "Oh, humans are just LLMs on steroids". So auto complete on steroids squared.
  ValentinA23 9 months ago
  I'm inclined to think it from the start
  llm_trw 9 months ago
  If you care enough you can email me at omni_vision_ai@proton.me I'd be happy to talk more in a less public setting.
siscia 9 months ago
I am not commenting on the specific framework, as I just skimmed the readme.
But I find this approach working well overall.
Moreover it is easily debuggable and testable in isolation which is one of the biggest selling point.
(If anyone is building ai products feel free to hit me.)
thawab 9 months ago
In the example folder they used qdrant as a vector database, why not use openai’s assistants api? The idea for a vendor lock solution is to make things simpler. Is it because qdrant is faster?
- htrp 9 months ago
  qdrant is part of the openai tech stack for their RAG solutions
  thawab 9 months ago
  Why use it if you can do RAG with openai's assistants api?
  jeffchuber 8 months ago
  its not
undefined 9 months ago
[deleted]
htrp 9 months ago
Does anyone else feel like these are Google-style 20% time projects from the OpenAI team members looking to leave and trying to line up VC funding?
- exitb 9 months ago
  Doesn’t working on a venture on company time put you at an enormous disadvantage in terms of ownership?
  johntash 9 months ago
  Not just company time, but company resources and the company's github org.
  But yeah, I'd assume they have no ownership themselves unless they signed something explicit?
- undefined 9 months ago
  [deleted]
sidcool 9 months ago
ELI5 anyone?
AIFounder 9 months ago
[dead]
Reclaimer 9 months ago
[flagged]
- codekisser 9 months ago
  Aren't you this clown? https://news.ycombinator.com/item?id=41818961
- warkanlock 9 months ago
  A classic of self-promotion. So, with your own criteria, technically, we can say also you “copied?" the name, concept, style of this previous application developed by the Santa Fe Institute https://en.wikipedia.org/wiki/Swarm_(simulation)
- thawab 9 months ago
  Here is more context about OP:
  https://www.reddit.com/r/MachineLearning/comments/15sq2v1/d_...
- Technetium 9 months ago
  The title of this issue is: "Notorious namesquatter is threatening legal action" https://github.com/openai/swarm/issues/50
- NicolasKixely 9 months ago
  I wish I could say I'm surprised they didn't even attempt to hide it with a name change, but after the 'her' incident I'm not that surprised.
- henrysg 9 months ago
  Isn't their project called Swarm and yours called Swarms?
nobrains 9 months ago
It is a foreshadowing name...
- nsonha 9 months ago
  where is my llm-compose.yml