Refrag: Rethinking RAG Based Decoding
arxiv.org
Submitted by datadrivenangel a day ago
  • datadrivenangel a day ago

    Am I misunderstanding this, or is it basically just taking the RAG results, doing a vector search over them, and passing only some of them to the context window?

    Also, why do these AI papers never give speedup times in human time units?