• elgrantomate a day ago

    I've been playing with this for the past 24 hours or so. I like the atomic containment of the LLM, and the clear separation of logic, code, and prompts.

    You have some great working examples, but some details puzzle me. For example, translate_text specifies the default language in three places: the card, the input schema, and the deck. That can't be necessary; I'll experiment, but shouldn't it be defined in just one place?

    The project's descriptive language is a bit dense for me too. I'm having a hard time figuring out how to do basic things like parameters. Say I want to constrain summarize_text to a certain length: I've tried writing instructions in the cards/decks, but the model doesn't seem to pay attention to them.

    I also want to be able to load a file, e.g. not just "translate 'hello my friend' to Italian" but "translate '/test/hello_my_friend.txt' to Italian" and have it load the contents of the file as input text. How do I do that?

    Super cool project!
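A rough sketch of the file-loading idea asked about above, done as a plain pre-processing step before the text reaches any deck. The thread does not say how (or whether) Gambit supports this, so nothing here is Gambit's API: `resolve_input` is a hypothetical helper, and the rule "treat the argument as a path if a file exists there, otherwise as literal text" is an assumption.

```python
from pathlib import Path


def resolve_input(arg: str) -> str:
    """Return file contents if `arg` names an existing file, else `arg` itself.

    Hypothetical helper: it is not part of Gambit or any other library.
    """
    candidate = Path(arg).expanduser()
    try:
        if candidate.is_file():
            # The argument points at a real file, so use its contents as input.
            return candidate.read_text(encoding="utf-8")
    except OSError:
        # E.g. the string is too long to be a valid path; treat it as text.
        pass
    # Not a file: pass the literal string through unchanged.
    return arg
```

With this in front of the model call, "translate '/test/hello_my_friend.txt' to Italian" and "translate 'hello my friend' to Italian" would go through the same code path. A real version would probably also restrict which directories may be read.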

    • Agent_Builder a day ago

      This is interesting. What resonated for me is how much effort you’re putting into making the execution graph explicit.

      In my experience, a lot of agent failures don’t come from the harness or the model choice, but from authority and assumptions leaking across steps. Once agents stay “alive” too long, it gets hard to tell which decision was made under which constraints.

      I ran into similar issues experimenting with step-gated workflows using GTWY.ai. What helped was forcing each step to declare its inputs, tools, and outputs up front, and then dropping all of that before the next step ran. The system felt less clever, but debugging and reliability improved a lot.

      Curious how you’re thinking about permission and context lifetimes as agents call other agents. That boundary is where things got subtle for me.
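The step-gating described above (each step declares its inputs, tools, and outputs, and everything else is dropped before the next step) can be sketched roughly as follows. This is not GTWY's or Gambit's implementation; `Step`, `run_step`, and `run_pipeline` are hypothetical names, and the "tools" are plain Python callables standing in for whatever a real system would expose.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    inputs: Tuple[str, ...]   # state keys the step is allowed to read
    tools: Tuple[str, ...]    # tool names the step is allowed to call
    outputs: Tuple[str, ...]  # keys the step must return, and nothing else
    run: Callable[[Dict[str, Any], Dict[str, Callable]], Dict[str, Any]]


def run_step(step: Step, state: Dict[str, Any], toolbox: Dict[str, Callable]) -> Dict[str, Any]:
    missing = [k for k in step.inputs if k not in state]
    if missing:
        raise KeyError(f"{step.name}: missing declared inputs {missing}")
    # The step sees only what it declared, never the full state or toolbox.
    scoped_inputs = {k: state[k] for k in step.inputs}
    scoped_tools = {t: toolbox[t] for t in step.tools}
    result = step.run(scoped_inputs, scoped_tools)
    if set(result) != set(step.outputs):
        raise ValueError(f"{step.name}: returned {sorted(result)}, declared {sorted(step.outputs)}")
    return result


def run_pipeline(steps: Sequence[Step], initial: Dict[str, Any], toolbox: Dict[str, Callable]) -> Dict[str, Any]:
    state = dict(initial)
    for step in steps:
        # The previous state is replaced, so only declared outputs carry over.
        state = run_step(step, state, toolbox)
    return state


# Usage with stub tools (no model calls involved).
toolbox = {"read": lambda path: f"contents of {path}", "shout": str.upper}
fetch = Step("fetch", ("path",), ("read",), ("text",),
             lambda inp, tools: {"text": tools["read"](inp["path"])})
summarize = Step("summarize", ("text",), ("shout",), ("summary",),
                 lambda inp, tools: {"summary": tools["shout"](inp["text"])})
final_state = run_pipeline([fetch, summarize], {"path": "notes.txt"}, toolbox)
```

Because each step receives a fresh dictionary, a later step cannot quietly depend on something an earlier step happened to leave behind, which is the "less clever, easier to debug" trade-off the comment describes.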

      • benban a day ago

        nice work. the idea of breaking agents into short-lived executors with explicit inputs/outputs makes a lot of sense - most failures i've seen come from agents staying alive too long and leaking assumptions across steps.

        curious how you're handling context lifetimes when agents call other agents. do you drop context between calls or is there a way to bound it? that's been the trickiest part for us.

        • iainctduncan 2 days ago

          You might want to know that Gambit is an open source Scheme implementation that has been around a very long time.

          • Agent_Builder 2 days ago

            We ran into similar reliability issues while building GTWY. What surprised us was that most failures weren’t about model quality, but about agents being allowed to run too long without clear boundaries.

            What helped was treating agents less like “always-on brains” and more like short-lived executors. Each step had an explicit goal, explicit inputs, and a defined end. Once the step finished, the agent stopped and context was rebuilt deliberately.

            Harnesses like this feel important because they shift the problem from “make the model smarter” to “make the system more predictable.” In our experience, reliability came more from reducing degrees of freedom than from adding intelligence.

            • brap 2 days ago

              This seems to be where it’s at right now: we can’t seem to make the models significantly more intelligent, so we “inject” our own intelligence into the system, in the form of good old-fashioned code.

              My philosophy is to make the LLMs do as little work as possible. Only small, simple steps. Anything that can reasonably be done in code (orchestration, tool calls, etc.) should be done in code. Basically, any time you find yourself instructing an LLM to follow a certain recipe, just break it down into multiple agents and do what you can with code.
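A minimal sketch of that split, assuming a summarization task: the recipe (chunk, summarize each chunk, assemble) lives in code, and the model is only asked to do one small thing per call. `summarize_document` and `chunk` are hypothetical names, `llm` is any callable that maps a prompt string to a reply string, and `fake_llm` is a stand-in so the example runs without a real model.

```python
from typing import Callable, List


def chunk(text: str, max_words: int) -> List[str]:
    """Split text into pieces of at most `max_words` words (done in code)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_document(text: str, llm: Callable[[str], str], max_words: int = 200) -> str:
    pieces = chunk(text, max_words)
    # The model gets one narrow task per call instead of the whole recipe.
    partials = [llm(f"Summarize in one sentence:\n{piece}") for piece in pieces]
    # Assembly is deterministic code, not something the model is asked to do.
    return "\n".join(f"- {s.strip()}" for s in partials)


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: echoes the first three words of the passage."""
    passage = prompt.split("\n", 1)[1]
    return " ".join(passage.split()[:3])


outline = summarize_document("a b c d e f g h", fake_llm, max_words=4)
```

Swapping `fake_llm` for a real client call would not change the control flow, which is the point: the loop, the limits, and the output format never depend on the model following instructions.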

              • randall 2 days ago

                i have a slightly different but related take. the models actually are getting smarter, and now the challenge becomes successfully communicating intent with them instead of simply getting them to do anything remotely useful.

                Gambit hopefully solves some of that, giving you a set of primitives and principles that make it simpler to communicate intent.

            • Trufa 2 days ago

              Is this an alternative to Mastra (https://mastra.ai/docs)?

              How would it compare?

              • randall 2 days ago

                So I look at something like Mastra (or LangChain) as agent orchestration, where you do computing tasks to line up things for an LLM to execute against.

                I look at Gambit as more of an "agent harness", meaning you're building agents that can decide what to do more than you're orchestrating pipelines.

                Basically, if we're successful, you should be able to chain agents together to accomplish things extremely simply (using markdown). Mastra, as far as I'm aware, is focused on helping people use programming languages (TypeScript) to build pipelines and workflows.

                So yes, it's an alternative, but more an alternative approach than a direct competitor, if that makes sense.

              • elgrantomate a day ago

                also, it seems like this works with OpenRouter, and perhaps OpenAI -- what about the Gemini API?

                • tomhow 2 days ago

                  [under-the-rug stub]

                  [see https://news.ycombinator.com/item?id=45988611 for explanation]

                  • franciscomello 3 days ago

                    This looks quite interesting in terms of the architecture. Seems like a fresh take on stuff like LangChain, which, at least last time I checked, sucks.

                    • randall 3 days ago

                      thx!

                    • sofdao 3 days ago

                      this is awesome

                      are things like file system baked in?

                      fan of the design of the system. looks great architecturally

                      • randall 3 days ago

                        omg thank you so much. We're working on the file system stuff, that's an easier lift for us than the initial work, so we wanted to start with the big stuff and work backward. Claude Code and Codex are obviously really great at that stuff, and we'd like to be able to support a lot of that out of the box.

                      • alberson 2 days ago

                        I’m excited to give this a spin at Agentive! Really interesting approach.

                        • pych 2 days ago

                          wow this looks cool - been meaning to dig into harness stuff, and this looks like a good starting point

                          • randall 2 days ago

                            Thx! Happy to help if you need it. :)

                          • randall 2 days ago

                            thx, i appreciate it, believe it or not. :)