I strongly agree this is the direction the author is looking for. RAG is one approach, but if the query doesn't match the right documents, you're screwed. And RAG setups often retrieve with a different, much simpler embedding model than the LLM itself.

I think the harpercarrol link is a pretty good one, but it basically just feeds the documents in for completion, which isn't a good approach. The dataset needs to represent how you want to use the model.

This one might also be helpful https://www.deeplearning.ai/short-courses/finetuning-large-l...

Honestly surprised how almost everyone is saying to use RAG (on its own). One strong benefit of RAG is that the data can change, but it has lots of failure modes.

People often use hybrid search (fuzzy matching or BM25, etc., alongside embedding search), which I suppose is still RAG.
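
For anyone who hasn't seen hybrid search, here's a rough sketch of the idea: rank the documents once with BM25 and once with a vector similarity, then merge the two rankings with reciprocal rank fusion. Everything here is an assumption for illustration. The function names are made up, the "embedding" is a character-trigram stand-in so the snippet runs without a model, and RRF is just one common way to combine the two rankings.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """BM25 score of every document for the query (Lucene-style IDF, never negative)."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def trigram_vector(text):
    """Stand-in for a real embedding model: character trigram counts."""
    s = f"  {text.lower()}  "
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(scores):
    """Document indices sorted best-first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hybrid_search(query, docs, k=60):
    """Fuse the lexical and vector rankings with reciprocal rank fusion."""
    lexical = rank(bm25_scores(query, docs))
    query_vec = trigram_vector(query)
    semantic = rank([cosine(query_vec, trigram_vector(d)) for d in docs])
    fused = Counter()
    for ranking in (lexical, semantic):
        for position, doc_id in enumerate(ranking):
            fused[doc_id] += 1.0 / (k + position + 1)
    return [doc_id for doc_id, _ in fused.most_common()]
```

In a real setup you'd swap trigram_vector for your actual embedding model. The point of fusing on ranks rather than raw scores is that BM25 and cosine scores aren't on the same scale.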

But fine-tuning models to be better at RAG is valuable as well, since it increases accuracy.

https://ragntune.com/blog/Fine-tuning-an-LLM-to-be-good-at-R...

Ideally, I'd try both: fine-tune on the documents (create a question/answer dataset with GPT-4) and also instruction fine-tune it for RAG.
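
To make "both" concrete, here's a minimal sketch of how the two kinds of training records could be built from the same question/answer pairs. It assumes the pairs have already been generated (by GPT-4 or otherwise) and that each one records which document it came from. The prompt layout, the "prompt"/"completion" field names, and the build_examples / to_jsonl helpers are all hypothetical, so adapt them to whatever your fine-tuning tooling expects.

```python
import json
import random


def build_examples(qa_pairs, documents, n_distractors=1, seed=0):
    """Turn each Q/A pair into a closed-book record and a RAG-style record."""
    rng = random.Random(seed)
    examples = []
    for pair in qa_pairs:
        # Closed-book: teaches the model the document content directly.
        examples.append({
            "prompt": f"Question: {pair['question']}\nAnswer:",
            "completion": " " + pair["answer"],
        })
        # RAG-style: the source document plus random distractors, shuffled,
        # so the model learns to pick the answer out of retrieved context.
        others = [i for i in range(len(documents)) if i != pair["doc_id"]]
        picked = rng.sample(others, min(n_distractors, len(others)))
        context_ids = picked + [pair["doc_id"]]
        rng.shuffle(context_ids)
        context = "\n\n".join(
            f"[{n + 1}] {documents[i]}" for n, i in enumerate(context_ids)
        )
        examples.append({
            "prompt": f"Context:\n{context}\n\nQuestion: {pair['question']}\nAnswer:",
            "completion": " " + pair["answer"],
        })
    return examples


def to_jsonl(examples):
    """One JSON object per line, the format most fine-tuning tools accept."""
    return "\n".join(json.dumps(e) for e in examples)
```

The distractors matter: if the context only ever contains the right document, the model never practices ignoring irrelevant retrievals, which is one of the failure modes you hit at inference time.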