I'm not sure determinism alone is sufficient for proper attribution.
This presumes "chunks" are the source. But it's not easy to identify the propositions that form the source of some knowledge. In the best case, you are looking for an association and find it in a sentence you've semantically parsed, but that's rarely the case, particularly for medical histories.
That said, deterministic accuracy might not matter if you can provide enough context, particularly for further exploration. But that's not really "chunks".
So it's unclear to me that tracing probability clouds back to chunks of text will work better than semantic search.
It's all grey, isn't it? Vanilla RAG is a big step along the spectrum from LLM towards search; DQ is perhaps another small step. I'm no expert in search, but I've read that those systems are coming from the other direction, so perhaps they'll meet in the middle.
There are three "lookups" in a system with DQ: (1) the original top-k chunk extraction (in the minimalist implementation, that's unchanged from vanilla RAG, just a vector-embedding match), (2) the LLM call, which takes its pick from (1), and (3) the call-back deterministic lookup after the LLM has written its answer.
(3) is much more bounded, because it's only working with those top-k chunks, at least for today's context-constrained systems.
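To make the three lookups concrete, here is a rough Python sketch. Everything in it is an assumption for illustration rather than a real implementation: `embed` is a toy bag-of-words stand-in for a vector embedding, `fake_llm` is a stub in place of an actual model call, and the `[[chunk_id]]` placeholder convention is just one hypothetical way the LLM could point back at a chunk.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for a vector embedding: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k_chunks(query, chunks, k):
    # Lookup (1): similarity match over all chunks, as in vanilla RAG.
    q = embed(query)
    ranked = sorted(chunks.items(),
                    key=lambda item: cosine(q, embed(item[1])),
                    reverse=True)
    return dict(ranked[:k])

def fake_llm(query, candidates):
    # Lookup (2): hypothetical stub for the LLM call. It "picks" the first
    # candidate and cites it by id instead of writing out the quote itself.
    chunk_id = next(iter(candidates))
    return f"According to the record: [[{chunk_id}]]"

def resolve_quotes(answer, candidates):
    # Lookup (3): deterministic call-back. Each placeholder is replaced with
    # the verbatim chunk text, and only ids from the top-k set are accepted.
    def substitute(match):
        chunk_id = match.group(1)
        if chunk_id not in candidates:
            return "[unverified reference]"
        return '"' + candidates[chunk_id] + '"'
    return re.sub(r"\[\[(\w+)\]\]", substitute, answer)

chunks = {
    "c1": "Patient reports penicillin allergy since childhood.",
    "c2": "Blood pressure was 120/80 at the last visit.",
    "c3": "Family history of type 2 diabetes.",
}
query = "penicillin allergy"
candidates = top_k_chunks(query, chunks, k=2)
answer = fake_llm(query, candidates)
final = resolve_quotes(answer, candidates)
print(final)
```

Note that step (3) only searches `candidates`, which is why it is so much more bounded than step (1): a reference to anything outside the top-k set is flagged rather than resolved.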
In any case, another way to think of DQ is a "band-aid" that can sit on top of that, essentially a "UX feature", until the underlying systems improve enough.
I also agree about the importance of chunk size. It has "non-linear" effects on UX.