Read the architecture document here.[1]
The usual problems with these things are discovery and security. Discovery is done via local WiFi broadcast. Not clear how security is done. How do you allow ad-hoc networking yet disallow hostile actors from connecting?
Great idea combining batman with libp2p! You guys have the heart in the right place :-).
Currently, your project seems to be an opinionated wrapper on top of libp2p. For this to become a proper distributed toolkit, you lack an abstraction for apps to collaborate over shared state (incl. convergence after partition). Come up with a good abstraction for that, and make it work p2p (e.g. delta-state-based CRDTs, or op-based CRDTs based on a replicated log; event sourcing ..). Tangentially related, a consensus abstraction might also be handy for some applications.
Also check out [iroh](https://github.com/n0-computer/iroh) as a potentially awesome replacement for libp2p, as well as [Actyx](https://github.com/Actyx/Actyx) as inspiration from a similar (sadly failed) project using rust-libp2p.
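To make the CRDT suggestion concrete, here is a minimal sketch of a state-based CRDT (a grow-only counter) converging after a partition. It is illustrative only: the `GCounter` class and the node names are hypothetical and not part of the project or of libp2p.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""
    node_id: str
    counts: dict = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A node only ever bumps its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Merge is commutative, associative, and idempotent,
        # so replicas can exchange state in any order, any number of times.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


# Two robots count events independently while partitioned...
a, b = GCounter("robot-a"), GCounter("robot-b")
a.increment(3)
b.increment(2)

# ...then exchange full state once the mesh heals.
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Both replicas end up with the same total without any coordinator. A delta-state variant would ship only the changed slots instead of the full `counts` map.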
Oh, and you might want to give your docs a grammar review.
Kudos for showing!
You are right. At the moment, we are an opinionated wrapper, but we take a different approach to discovery than other libp2p-based networks with our custom batman-adv-based neighbor discovery.
Abstractions for collaboration are currently in the works, and we hope to release them soon. The work on consensus has already started. Your suggestions all seem very interesting, and we'll definitely consider them. We are also currently talking to potential users so we can build handy and approachable abstractions for them.
I saw that [freenet](https://docs.freenet.org/components/contracts.html) went with CRDTs, but I think they made it too complicated. We were thinking about a graph (or wide-column) store with an engine similar to Cassandra and a frontend like (or ideally just) SurrealDB.
I remember that iroh moved away from libp2p when they dropped IPFS compatibility and moved to a self-built stack: https://www.iroh.computer/blog/a-new-direction-for-iroh When we got started, the capabilities of iroh didn't really fit our bill, but it seems like it's time to reevaluate that. As a former contributor to rust-libp2p, I never quite got the frustration with libp2p that many people have, iroh included. Many of the described problems seemed fixable, and I would have preferred that they fix them so that libp2p remained the shared base people build these things on.
I remember Actyx being a rust-libp2p user, but I wasn't aware that they failed. Do you have more info? How and why? It would be great if we could learn from them.
Grammar will be reviewed ;) thank you!
Very fun. Is this primarily a passion project or are you hoping to get corporate sponsorship & adoption?
Can you provide some insight as to why this would be preferred over an orchestration server? In this context - Would a 'mothership'/Wheel-and-spoke drone responsible for controlling the rest of the hive be considered an orchestration server?
This isn't my area of expertise but I think "Hive mind drones" tickles every engineer.
> Is this primarily a passion project or are you hoping to get corporate sponsorship & adoption?
We are in the current YC W25 batch and our vision is to build a developer framework for autonomous robotics systems from the system we already have.
> Can you provide some insight as to why this would be preferred over an orchestration server?
It heavily depends on your application: there are applications where it makes sense and others where it doesn’t. The main advantages are that you don’t need an internet connection, the system is more resilient against network outages, and most importantly, the resources on the robots, which are idle otherwise, are used. I think for hobbyists, the main upside is that it’s quick to set up: you only have to turn on the machines and it should work without having to care about networking or setting up a cloud connection.
> Would a 'mothership'/Wheel-and-spoke drone responsible for controlling the rest of the hive be considered an orchestration server?
If the mothership is static, in the sense that it doesn’t change over time, we would consider it an orchestration server. Our core services don’t need that, and we envision that most of the decentralized algorithms running on our system also won’t rely on such a central point of failure. However, there are some applications where it makes sense to have a “temporary mothership”. We are currently working on a “group” abstraction, which continuously runs a leader election to determine a “mothership” among the group (which is fault-tolerant, however, as the leader can fail at any time and the system will instantly determine another one).
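As a rough illustration of the “temporary mothership” idea, here is a toy leader election where every node applies the same rule (smallest live node ID wins) to a heartbeat-based membership view. This is only a sketch under assumptions: the `Group` class, the timeout value, and the node names are hypothetical and do not describe the actual “group” abstraction.

```python
class Group:
    """Toy membership view: the leader is the smallest live node ID."""

    def __init__(self, timeout: float = 3.0):
        self.timeout = timeout
        self.last_seen = {}  # node ID -> time of the last heartbeat

    def heartbeat(self, node_id: str, now: float) -> None:
        self.last_seen[node_id] = now

    def alive(self, now: float) -> list:
        # A node counts as alive if it sent a heartbeat within the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t <= self.timeout)

    def leader(self, now: float):
        members = self.alive(now)
        return members[0] if members else None


group = Group(timeout=3.0)
for node in ("drone-1", "drone-2", "drone-3"):
    group.heartbeat(node, now=0.0)
first = group.leader(now=1.0)   # all three nodes are alive

# drone-1 goes silent; the others keep sending heartbeats.
group.heartbeat("drone-2", now=4.0)
group.heartbeat("drone-3", now=4.0)
second = group.leader(now=5.0)  # drone-1 has timed out
print(first, second)
```

In this sketch, failover takes up to one timeout interval, and nodes with different membership views could briefly disagree on the leader. A production election would need to handle that, for example with a consensus protocol.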
> The main advantages are that you don’t need an internet connection
To that end, I'm not clear on the benefit in this model. To solve that problem I would just take a centralized framework and stick it inside an oversized drone/vehicle capable of carrying the added weight (in CPU, battery, etc.). There are several centralized models that don't require an external data connection.
> the resources on the robots, which are idle otherwise, are used
But what's the benefit of this? I don't see the use case of needing the swarm to perform lots of calculations beyond the ones required for its own navigation & communication with others. I suppose I could imagine a chain of these 'idle' drones acting as a communication relay between two separate, active hives. But the benefit there seems marginal.
> our system also don’t rely on such central point of failure
This seems like the primary upside, and it's a big one. I'm imagining a disaster or military situation where natural or human forces could be trying to disable the hive. Now instead of knocking out a single mothership ATV, each and every drone needs to be removed to fully disable it. Big advantage.
> We are just currently working on a “group” abstraction
Makes sense to me. That's the 'value add', might as well really spec that out
> leader election to determine a “mothership” among the group
This seems perfectly reasonable to me and doesn't remove the advantages of the disconnected "hive". But I do find it funny that the solution to decentralization seems to be simply having the centralization move around easily / flexibly. It's not a hive of peers, it's a hive of temporary kings.
Thanks for the feedback!
> I would just take a centralized framework and stick it inside an oversized drone/vehicle capable of carrying the added weight
Makes sense. I think there are scenarios where such “base stations” are a priori available and “shielded,” so in this case, it might make more sense to just go with a centralized system. This could also be built on top of our system, though.
> But what’s the benefit of this?
I agree that, in many cases, the return on saving costs might be marginal. However, say you have a cluster of drones equipped with computing hardware capable enough to run all algorithms themselves—why spin up a cloud instance for running a centralized version of that algorithm? It is more of an engineering-ideological point, though ;)
> But I do find it funny that the solution to decentralization seems to be simply having the centralization move around easily / flexibly. It’s not a hive of peers, it’s a hive of temporary kings.
Most of our applications will not need this group leader. For example, the pubsub system does not work by aggregating and dispatching the messages at a central point (like MQTT) but employs a gossip mechanism (https://docs.libp2p.io/concepts/pubsub/overview/).
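To show the difference from a central broker, here is a simplified flooding sketch: each node forwards a message to its neighbors and drops duplicates, so the message reaches everyone without a central dispatch point. The `flood` function and the five-node `mesh` topology are made up for illustration; the real libp2p GossipSub protocol is considerably more elaborate than plain flooding.

```python
from collections import deque


def flood(neighbors: dict, origin: str) -> dict:
    """Return the hop count at which each node first receives the message."""
    received = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            if peer not in received:  # already-seen messages are dropped
                received[peer] = received[node] + 1
                queue.append(peer)
    return received


# Hypothetical mesh: no node is connected to all the others.
mesh = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
hops = flood(mesh, "a")
print(hops)
```

Node `e` is three hops from `a` and still gets the message, because `d` relays it.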
What I meant is that, in some situations, it might be more efficient (and easier to reason about) to elect a leader. For example, say you have an algorithm that needs to do a matching between neighboring nodes: each node has some data point, and the algorithm wants to compute a pairwise similarity metric and share all computed metrics back to all nodes. You could do some kind of “ring-structure” algorithm, where you have an ordering among the nodes, and each node receives data points from its predecessor, computes its own similarity against the incoming data point, and forwards the received data point to its successor. If one node fails, its predecessor in the ring switches to the failed node’s successor. This would be truly decentralized, and there is no single point of failure. However, in most cases, this approach will have a higher computation latency than just electing a temporary leader (by letting the leader compute the matchings and send them back to everyone). So someone caring about efficiency (and not resiliency) will probably want such a leader mechanism.
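The ring variant can be simulated in a few lines. This is a sketch of the compute phase only, with no failure handling and no final sharing of results; `ring_all_pairs` and the example similarity function are hypothetical names chosen for illustration.

```python
def ring_all_pairs(points: list, similarity) -> list:
    """Simulate the ring: results[i][j] is node i's similarity to node j's point."""
    n = len(points)
    results = [dict() for _ in range(n)]
    # in_flight[i] is the (owner index, data point) currently held by node i.
    in_flight = [(i, points[i]) for i in range(n)]
    for _ in range(n - 1):
        # Every node forwards what it holds to its successor (i -> i + 1),
        # so node i now holds what its predecessor held one round earlier.
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        for i, (owner, point) in enumerate(in_flight):
            results[i][owner] = similarity(points[i], point)
    return results


# Example: similarity is the negative absolute difference of two numbers.
results = ring_all_pairs([1.0, 4.0, 6.0], lambda a, b: -abs(a - b))
print(results[0])
```

With n nodes, the ring needs n - 1 sequential forwarding rounds before every node has seen every other data point, whereas a leader needs one gather step and one broadcast step. That is the latency trade-off described above.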
It's not clear what the hardware requirements for a system that can run this would be. Raspberry Pi is mentioned but it seems like an actual OS (not ESP32 for example) is a requirement.
You are right that, at the moment, the system inherently requires a 64-bit OS. We currently support Debian-based distros; it should work with other parent distributions as well, but you need to translate the installer script ;) But we definitely need to highlight this more clearly in the docs. Thanks for pointing it out!
We also don’t have a definitive hardware spec requirement yet. We’ve tested it on Raspberry Pi 3s and later models (so anything more capable than a 3 should be fine).
> not ESP32 for example
Running on ESP32 is tricky because it would require porting libp2p to an embedded target (which, as far as we know, nobody has done yet). However, we are considering support for embedded “light” nodes that run only a limited portion of the stack. It depends on the feedback we get. Do you have a use case where you’d need it to run on embedded?
This is awesome stuff, I'm going to look into getting this running on my Pis this weekend. How hard would it be to add in custom services? I like to play with decentralized algorithms such as Size Estimation and Clock Synchronization (https://jasonfantl.com/) and have always wanted to get them running on real hardware.
Awesome! From what I see, the clock synchronization can be implemented with our SDKs (mainly pub-sub).
I think the size estimation could also be implemented within the provided abstractions (mainly request-response) but might require you to keep track of neighbors. I think you could implement both algorithms by using our SDKs (none for Go yet).
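For a feel of what such a service could look like, here is a simulation of one well-known size-estimation approach, averaging gossip. It is not necessarily the algorithm from the blog and does not use the project's SDK: `estimate_size` is a hypothetical function, and each loop iteration stands in for one request-response exchange between two nodes.

```python
import random


def estimate_size(n: int, rounds: int, seed: int = 0) -> list:
    """Simulate averaging gossip; return every node's estimate of n."""
    rng = random.Random(seed)
    # One designated node starts at 1.0, all others at 0.0.
    # Pairwise averaging keeps the sum at 1.0 throughout.
    values = [1.0] + [0.0] * (n - 1)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)  # one request-response exchange
        values[i] = values[j] = (values[i] + values[j]) / 2
    # Each value converges to 1/n, so its reciprocal estimates n.
    return [1 / v if v > 0 else float("inf") for v in values]


estimates = estimate_size(n=8, rounds=500)
print([round(e, 3) for e in estimates])
```

The simulation lets any two nodes talk to each other. On a real mesh each node would only exchange values with its current neighbors, which is why keeping track of neighbors matters, and convergence would be slower.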
If you need more control or performance, beyond what we expose through our SDKs, you might need to write a custom libp2p behavior and add it to our daemon. The libp2p part is fairly involved, but I would love to help you with that. Either way I would love to help you out :)
I'm so disappointed that I've never seen your blog before. The stuff you write about is so interesting and actually addresses some issues we are facing. I just sent you an email :)
I am a neophyte in this realm, but I like what I am seeing so far. Since I want to get into robotics for my own fun, I will be looking at it more closely this weekend :D
Please do and let us know if you have any questions!
> We’d love to hear your thoughts! :)
Have you ever played any of the Horizon (Zero Dawn/Forbidden West) games? :)
Jokes aside, it looks pretty cool. What kind of hardware have you tested it with so far? Is this using WiFi only?
Actually, just very briefly at a friend’s ;)
Thank you! So far, we have tested it with Raspberry Pi 4/5. Jetson boards are on backorder. We have some Intel WiFi chips (since they support some stuff we want), and we will get around to trying them next.
The binaries were also tested on x86 machinery.
In general, I'm not too worried about hardware support since batman-adv is quite widely deployed on a diverse set of hardware and the rest is hardware agnostic.
I've been thinking about building a little tiny SLAM robot to have something to drive around the house when I'm out of town (I don't want always on cameras everywhere but having a camera that can move around sounds useful). The ideas here are awesome and I'm looking forward to the tutorials being more fleshed out.
Yeah, SLAM seems also like a natural showcase for us. I am just working on a decentralised collaborative SLAM package on top of our system, where multiple robots can drive around and continuously merge their maps without a coordination server, using the Mesh integration and PubSub system. Should be out in about a week.
Now that sounds absolutely fascinating. I'll look forward to that
this is so cool, congrats on launching. what kind of biz model are you guys going after?
We will most likely go with an open-core model. The main part will stay open source (the Core OS extension is under GPL3, and everything SDK-related is MIT).
For paid features, we have several ideas: a hosted management plane to configure and control the swarm (with company RBAC integration) when one of the nodes is connected to the internet; advanced security (currently, there is no access management or authentication); sophisticated orchestration primitives; and LoRa connectivity (to scale the mesh radius to miles).
Appreciate your feedback on this!