In the first part, we discussed how to converse with a model to obtain a dialogue based on information it was not trained on. In the second part, we looked at how to create our indexes and store them. In this final part, through a practical case, we will cover how to build a complete RAG system and start a dialogue, giving users the chance to converse with our model enriched with our own data.
As a reminder, all the information is in the documentation available here.
Therefore, you do not really ne...
In the first part, we discussed how to interact with a model to obtain a dialogue based on information it was not trained on. In short, you add the desired information to the context. But what if you want to use an entire knowledge base? That would be far too much information to fit into the context. Instead, we need to put all the information we wish to provide to users into a database. We break our content into paragraphs and apply a vector index, which converts each piece of text into a numerical vector (an embedding) that captures its meaning.
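To make this concrete, here is a minimal sketch of that indexing step. It assumes the sentence-transformers library and an illustrative model name (all-MiniLM-L6-v2); the sample documents and the in-memory "index" are placeholders for whatever storage you chose in part two.

```python
# A minimal sketch of the indexing step, assuming sentence-transformers;
# the chunks and model choice here are illustrative, not prescriptive.
from sentence_transformers import SentenceTransformer

# Break the content into paragraph-sized chunks.
documents = [
    "Our product ships with a REST API for managing orders.",
    "Authentication is handled via per-user API keys.",
]

# Convert each chunk into a numerical vector (embedding).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(documents)  # shape: (n_chunks, 384)

# Store each chunk alongside its vector; a real setup would use the
# vector database from part two instead of a Python list.
index = list(zip(documents, embeddings))
```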
As in the first two parts of this series, we base our exploration on this tutorial to implement a RAG process. But what is RAG? RAG, or "Retrieval-Augmented Generation," is an advanced technique in artificial intelligence, specifically in natural language processing (NLP), which enriches the text generation process by incorporating an information retrieval phase. This hybrid method combines the power of deep learning-based language models with the efficiency of information retrieval systems.
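Here is a minimal sketch of the retrieval phase just described, reusing the `model` and `index` from the previous snippet. Cosine similarity and a top-k cutoff are common, illustrative choices rather than the tutorial's exact method.

```python
# A sketch of RAG retrieval: find the chunks closest to the question,
# then inject them into the prompt before generation.
import numpy as np

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks whose embeddings are closest to the question."""
    q = model.encode([question])[0]
    scores = [
        np.dot(q, emb) / (np.linalg.norm(q) * np.linalg.norm(emb))
        for _, emb in index
    ]
    best = np.argsort(scores)[::-1][:k]
    return [index[i][0] for i in best]

def build_prompt(question: str) -> str:
    """Augment the prompt with the retrieved passages (the 'A' in RAG)."""
    context = "\n".join(retrieve(question))
    return (
        "Use the following context to answer.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The generation step then simply feeds this augmented prompt to the language model, which is where Llama CPP comes in.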
Llama CPP is a tool designed to run language models directly in C/C++. It is specially optimized for Apple Silicon processors through ARM NEON and the Accelerate framework, and it also offers AVX2 compatibility for x86 architectures. Running mainly on the CPU, Llama CPP integrates 4-bit quantization, further enhancing its efficiency. Its advantage is that it allows a language model to run directly on a personal computer.
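As a sketch of what local generation looks like, the snippet below uses llama-cpp-python, the Python bindings for Llama CPP. The GGUF file path is a hypothetical example and depends on the quantized model you download; the prompt reuses `build_prompt` from the previous snippet.

```python
# A minimal sketch of local generation with llama-cpp-python;
# the model path below is a hypothetical example.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # 4-bit quantized weights
    n_ctx=2048,  # context window size in tokens
)

# Feed the retrieval-augmented prompt to the model running on the CPU.
output = llm(build_prompt("How do I authenticate against the API?"), max_tokens=128)
print(output["choices"][0]["text"])
```

Because the quantized weights fit in ordinary RAM and inference runs on the CPU, this whole loop works on a laptop without a dedicated GPU.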