• zitterbewegung 2 minutes ago

    This is a comparison of apples to oranges. LangChain has an order of magnitude more examples, integrations, and features, and it also rewrote its whole architecture to try to make the chaining more understandable. I don't see enough documentation in this pipeline to understand how to migrate my app to it. I also realize it would take me at least a week just to migrate my own app to LangChain's rewrite.

    LangChain is used because it was a first mover; that first-mover status is also its Achilles heel. It is not used for speed at all.

    • zozbot234 36 minutes ago

      Am I the only one who thinks a Swift IDE project should be called Taylor?

      • giancarlostoro 5 minutes ago

        Sure, but this is a Rust project for building LLMs called Swiftide, not a Swift IDE...

        https://swiftide.rs/what-is-swiftide/

        • Svoka 33 minutes ago

          I would name it Tailor

        • pjmlp 2 hours ago

          Most of the Python libraries are, in any case, bindings to native libraries.

          Any other ecosystem can plug into the same underlying native libraries, or even call them directly when it is the same language.

          In a way, the performance pressure on the Python world is quite interesting; otherwise the CPython folks would never have reconsidered their stance on performance.

          • OptionOfT an hour ago

            Most of these native libraries' output isn't 1:1 mappable to Python. Depending on the data, you need to write native data wrappers or, worse, marshal the data into managed memory. The overhead can be high.

            It gets worse because Python doesn't expose memory management to you. Initially that is an advantage, but later on it causes bloat.

            Python is an incredibly easy interface over these native libraries, but it has a lot of runtime costs.
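
            A toy sketch of that marshaling cost, using only the standard library (the "native" buffer is simulated with `array`; a real C library would hand back a raw buffer):

```python
import array

# Simulate a buffer of native int32 data, as a C library might return it.
native_buf = array.array("i", range(1_000)).tobytes()

# Marshaling: copy every element out into newly allocated Python int objects.
copied = list(array.array("i", native_buf))

# Zero-copy: a memoryview reinterprets the same bytes without copying.
view = memoryview(native_buf).cast("i")

assert copied[42] == view[42] == 42
```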

            • nicce 20 minutes ago

              > Python is an incredibly easy interface over these native libraries, but has a lot of runtime costs.

              It also means that many people use Python without understanding which part of the code is actually fast. They mix Python code with wrappers around native libraries, and sometimes the Python code slows down the overall work substantially without anyone knowing where the fault lies. E.g. using Python math mixed with NumPy bindings when they could do it all in NumPy alone.
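
              A minimal sketch of that trap, assuming NumPy is installed: the same reduction done with a Python-level loop over the array versus one vectorized call that stays in native code:

```python
import numpy as np

x = np.arange(1_000_000, dtype=np.float64)

# Slow: the loop crosses the interpreter boundary once per element,
# boxing each value into a Python-level float scalar.
total_slow = 0.0
for v in x:
    total_slow += v

# Fast: one call; the loop runs inside NumPy's compiled code.
total_fast = x.sum()

# Both sums are exact here (all partial sums fit in a double).
assert total_slow == total_fast == 499999500000.0
```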

              • __coaxialcabal 37 minutes ago

                Have you had any success using LLMs to rewrite Python to rust?

                • pjmlp an hour ago

                  Yet another reason to use native compiled languages with bindings to the same C and C++ libraries.

                  If using C++20 onwards, it is relatively easy to have similar high-level abstractions; one just needs to let go of the C-isms that many insist on using.

                  Here Rust clearly has an advantage, in that it doesn't allow copy-pasting of C-like code.

                  Naturally D and Swift, with their safety and C++ interop, would be options as well.

                • oersted an hour ago

                  Indeed, but Python is used to orchestrate all these lower-level libraries. If you have Python on top, you often want to call these libraries in a loop or, more often, within parallelized multi-stage pipelines.

                  Overhead and parallelization limitations become a serious issue then. Frameworks like PySpark take your Python code and are able to distribute it better, but it's still (relatively) very slow and clunky. Or they can limit what you can do to a natively implemented DSL (often SQL, a DataFrame API, or an API to define DAGs and execute them within a native engine), but you can't do much serious data work without UDFs, where again Python comes in. There are tricks, but you can never really avoid the limitations of the Python interpreter.
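
                  A rough sketch in plain Python (not actual PySpark) of why per-row UDFs hurt: the per-row `pickle` round-trip stands in for an engine shipping each row to a Python worker and back, which is the boundary a native DSL avoids:

```python
import pickle

rows = [{"id": i, "value": i * 0.5} for i in range(10_000)]

# "Native DSL" path: one batch pass over the data.
batch_total = sum(r["value"] for r in rows)

# UDF-style path: every row is serialized to the worker, handed to a
# Python function, and the result is serialized back, row by row.
def udf(row):
    return row["value"]

udf_total = 0.0
for r in rows:
    wire = pickle.dumps(r)                 # engine -> Python worker
    result = udf(pickle.loads(wire))       # one Python call per row
    udf_total += pickle.loads(pickle.dumps(result))  # worker -> engine

assert batch_total == udf_total
```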

                • lmeyerov an hour ago

                  At least for Louie.ai, where analysts do intensive analytics tasks, like pulling Splunk/Databricks/Neo4j data, wrangling it in some runtime, visualizing it, etc., we have ups and downs:

                  On the plus side, it means our backend handles small/mid datasets well. Apache Arrow adoption in analytics packages is strong, so zero-copy and columnar flows over many rows are normal. Pushing that to the GPU or another process is great.

                  OTOH, one of our greatest issues is the GIL. It shows up a bit in single-user code, especially when doing divide-and-conquer flows for a user, but the bigger issue is concurrent users. We would like the memory-sharing benefits of threads, but because of the GIL we end up wanting the isolation benefits of multiprocess. A bit same-but-different: we stream results to the browser as agents progress through your investigation, and that has not been as smooth as it was for us in other languages.

                  And moving to multiprocess is no panacea. E.g., a local embedding engine is expensive to run per worker because modern models have high RAM needs, which biases us toward a local inference server for what would otherwise be a local call.

                  Interesting times!
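
                  A small sketch of that tradeoff in plain Python: threads share memory for free (all workers append to one list), but the GIL serializes CPU-bound work like this, which is what pushes designs toward multiprocess isolation despite the lost sharing:

```python
import threading

results = []              # shared directly across threads, no marshaling
lock = threading.Lock()

def worker(n):
    # Pure-Python CPU work: the GIL lets only one thread run it at a time,
    # so these threads interleave rather than execute in parallel.
    total = sum(range(n))
    with lock:
        results.append(total)

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert results == [49995000] * 4
```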

                  • serjester an hour ago

                    I'm surprised they don't talk about the business side of this: did they have users complaining about the speed? At the end of the day, they only increased performance by 50%.

                    These kinds of optimizations seem awesome once you have a somewhat mature product, but you really have to wonder whether this is the best use of a startup's very limited bandwidth.

                    • godelski an hour ago

                        > At the end of day they only increased performance by 50%.
                      
                        > only 50%.
                      
                      I'm sorry... what?! That's a lot of improvement, and it will save you a lot of money. Even 10% increases are quite large!

                      Think about it this way: if you have a task that takes an hour and you turn that into 59 minutes and 59 seconds, it might seem like nothing (about 0.03%). But now consider you have a million users: that's a million seconds saved, or about 278 hours! This can save you money, since you are often paying by the hour in one way or another (even if you own the system, your energy has a cost, and a dynamic one). If this is a task run frequently, you're saving a lot of time in aggregate, despite not much per person. But even for a single person this is helpful if more devs do it. Death by a thousand cuts.

                      But in the specific case, if a task takes an hour and you save 50%, your task takes 30 minutes. Maybe the task here took only a few minutes, but people will be chaining these together quite a lot.
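
                      The arithmetic above, spelled out:

```python
# One second shaved off an hour-long task, across a million users.
users = 1_000_000
seconds_saved_per_task = 1
total_hours = users * seconds_saved_per_task / 3600
print(round(total_hours))   # roughly 278 hours in aggregate

# And the actual case here: 50% off an hour-long task.
task_minutes = 60
after = task_minutes * (1 - 0.50)
print(after)                # 30.0 minutes
```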

                      • lpapez an hour ago

                        Maybe these optimizations benefit the two users who do the operation three times a year.

                        In such an extreme case no amount of optimization work would be profitable.

                        So the parent comment asks a very valid question: how much total time was saved by this, and who asked for it to be saved (paying or free-tier customers, for example)?

                        People who see the business side of things rightfully fear the word "optimization": it's often not the best use of limited development resources, especially in an early-stage product still under development.

                        • sroussey 11 minutes ago

                          I do wish that when people write about optimization, they would then multiply the improvement by usage, or something similar.

                          Another way is to show CPU usage over a fleet of servers before and after: reshuffle the workload onto fewer servers and use the number of servers no longer needed as the metric.

                          The number of servers has direct costs, as well as indirect ones, so you can even derive a dollar value; more so if you have a growth rate.
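
                          A sketch of that metric with made-up numbers (the fleet size, utilizations, and server cost are all assumptions; only the shape of the calculation matters):

```python
# Fleet-wide average CPU before and after the optimization (hypothetical).
servers_before = 40
avg_cpu_before = 0.60
avg_cpu_after = 0.40

# Servers needed to carry the same load at the old utilization target.
servers_after = round(servers_before * avg_cpu_after / avg_cpu_before)

cost_per_server_month = 200.0   # assumed all-in monthly cost per server
monthly_savings = (servers_before - servers_after) * cost_per_server_month

print(servers_after, monthly_savings)   # 27 servers, 2600.0/month saved
```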

                    • sandGorgon 39 minutes ago

                      this is very cool!

                      we built something for our internal consumption (and it is now used in quite a few places in India).

                      Edgechains is declarative (jsonnet-based), so chains + prompts are declarative. And we built a Wasm compiler (in Rust, based on WasmEdge).

                      https://github.com/arakoodev/EdgeChains/actions/runs/1039197...

                      • satvikpendem an hour ago

                        I was asking the same question; it turns out mistral.rs [0] has pretty good abstractions that let you avoid depending on and packaging llama.cpp for every platform.

                        [0] https://github.com/EricLBuehler/mistral.rs

                        • bborud an hour ago

                          It would be helpful to move to a compiled language with a decent toolchain. Rust and Go are good candidates.

                          • RcouF1uZ4gsC an hour ago

                            Why not use C++?

                            For the most part, these aren't security critical components.

                            You already have a massive amount of code you can use like say llama.cpp

                            You get the performance that you do with Rust.

                            Compared to Python, in addition to performance, you also get a much easier deployment story.

                            • oersted an hour ago

                              If you already have substantial experience with C++, this could be a good option. But I'd say that nowadays learning to use Rust *well* is much easier than learning to use C++ *well*. And the ecosystem, even if it's a lot less mature, I'd say is already better in Rust for these use cases.

                              Indeed, here security (safety in general) is a secondary concern and is not the main reason for choosing Rust, although it is welcome. It's just that Rust has everything C++ gives you, but in a more modern and ergonomic package. Again, I can see how someone already steeped in C/C++ for years might not feel that, and reasonably so. But I think I can fairly safely say that Rust is just "a better C++" from the perspective of someone starting from scratch now.

                              • Philpax 10 minutes ago

                                Why use C++? What's the benefit over Rust here?

                                • IshKebab 30 minutes ago

                                  Rust is much better than C++ overall and far easier to debug (C++ is prone to very difficult to debug memory errors which don't happen in Rust).

                                  The main reasons to use C++ these days are compatibility with existing code (C++ and Rust are a bit of a pain to mix), and if a big dependency is C++ (e.g. Qt).

                                • dmezzetti an hour ago

                                  I've covered this before in articles such as this: https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...

                                   You can make anything performant if you know the right buttons to push. While Rust makes that easy in some ways, Rust is also a difficult language for many developers to work with. There is a tradeoff.

                                   I'd also say LangChain's primary goal isn't performance; it's convenience and functionality coverage.

                                  • swyx an hour ago

                                     i mean, LLM-based or not has nothing to do with it; this is a standard optimization story, scripting language vs. systems language.

                                    • godelski an hour ago

                                       Shhhh, let this one go. So many people don't get optimization and why it is needed that I'll take anything we can get. Hell, I routinely see people saying no one needs to know C because Python calls C in "the backend" (who the fuck writes "the backend" then?). The more people who learn some HPC and parallelism, the better.

                                      • pjmlp an hour ago

                                         Even better if they would learn about those amazing managed languages where we can introspect the machine code generated by their dynamic compilers.

                                    • zie1ony an hour ago

                                      DSPy is in Python, so it must be Python. Sorry bro :P