Breaking Down What Retrieval Augmented Generation Really Is

The artificial intelligence market is perhaps the boom of the current decade. From a market value of $184bn in 2024, it’s projected to surpass $826bn by 2030. That 2024 figure is almost double the $93bn of four years earlier, when retrieval augmented generation (RAG) became a reality in 2020.

It seems highly likely that the sector’s dramatic climb is a result of RAG’s seemingly limitless potential. But what is retrieval augmented generation? Let’s take a deeper dive. 

Generative AI 101

GenAI, put simply, is the use of artificial intelligence to produce new content. Pose a query to a GenAI package and it will draw on the data it has access to in order to formulate a response, using neural networks to identify structures and patterns within that data to craft new and original content. Most text-based GenAI packages are built on large language models, or LLMs. The model is trained on vast amounts of data so that it has a huge resource to draw from.

Until relatively recently, a common approach to generative AI, particularly for images, was the generative adversarial network. The GAN model involves pitting two neural networks against each other – a generator that produces the content and a discriminator, or tester, that judges whether a given piece of content is real or generated. The networks are trained together and grow with each other, allowing the generator to produce more convincing output and the discriminator to become more accurate in spotting generated content. LLMs themselves are typically built on a different design, the transformer, and learn by predicting the next word in a sequence.

However, the LLM architecture has limitations, the most obvious being the content it has access to. While ChatGPT is one of our top AI development tools for 2024, as recently as October 2023 its training data stopped at January 2022. This meant that for queries about recent or real-time events – asking it to plot a graph for a stock price over the last decade, say – it could be somewhat limited. So, enter RAG.

Go fetch

Retrieval augmented generation expands the boundaries of what an LLM can do, in theory without limit. It uses powerful search algorithms to retrieve data from external sources, feeds that data into the LLM with the correct context, and allows the AI system to produce more accurate, and certainly more timely, results. It can capture information in real time, and allows the LLM to analyze the new results when devising a response, rather than drawing only on the data the system was initially trained on or last updated with. This has massive implications in fields such as medical care and law, where an LLM not employing RAG may not be up to speed on the latest medical developments (either on a macro level, or in the case of an individual’s medical history) or on new legal rulings and precedents.
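To make the retrieve-then-generate idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions rather than how any particular product works: the documents are a hard-coded, made-up list, the "search algorithm" is a simple word-overlap (cosine similarity) score rather than the embedding search a production system would more likely use, and the names `DOCUMENTS`, `retrieve` and `build_prompt` are hypothetical. The final call to an actual LLM is left out; the sketch stops at the augmented prompt that would be sent to it.

```python
import math
import re
from collections import Counter

# Hypothetical external documents. In a real system these would come from a
# database, knowledge base, or web search rather than a hard-coded list.
DOCUMENTS = [
    "The central bank raised interest rates by 0.25 points this morning.",
    "A new appeals court ruling changes how software patents are assessed.",
    "Heavy rain is forecast for the coast over the weekend.",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return up to top_k documents that best match the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words at all with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Augment the user's question with the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


question = "What did the court ruling say about software patents?"
passages = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, passages)
print(prompt)  # this augmented prompt is what would be passed to the LLM
```

The point of the design is that the model itself is untouched: keeping the answers current is a matter of updating the document store, not retraining the LLM.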

It also means that the LLM can keep working with entirely new retrieved data without being retrained, rather than being limited to what was in a fixed training set. Being able to search a much wider range of resources means RAG-enabled systems can reduce response bias and cut down on hallucinations, where the output is fabricated or misleading. RAG systems will typically search a variety of sources outside the LLM – databases, knowledge bases and the wider internet. RAG-enabled models can also provide sources for their responses to the end user making the query, thereby increasing transparency and (ideally!) boosting the credibility of the answer in the eyes of the questioner.
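One way a RAG-enabled system might pass sources back to the user is to number each retrieved passage in the prompt and then append the matching source list to the answer. The sketch below assumes exactly that. The `Passage` class, the source labels and `stub_llm` (a stand-in that returns a canned reply instead of calling a real model) are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    source: str  # hypothetical label for where the passage came from


def build_prompt(question, passages):
    """Number each passage so the model can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered context and cite it like [1].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def stub_llm(prompt):
    """Stand-in for a real LLM call; returns a canned answer with a citation."""
    return "Interest rates rose by 0.25 points [1]."


def answer_with_sources(question, passages, llm=stub_llm):
    """Generate an answer and append the numbered source list to it."""
    answer = llm(build_prompt(question, passages))
    sources = "\n".join(
        f"[{i}] {p.source}" for i, p in enumerate(passages, start=1)
    )
    return f"{answer}\n\nSources:\n{sources}"


retrieved = [
    Passage(
        "The central bank raised interest rates by 0.25 points.",
        "Hypothetical source: central bank bulletin",
    ),
    Passage(
        "Heavy rain is forecast for the coast.",
        "Hypothetical source: weather service feed",
    ),
]

result = answer_with_sources("What happened to interest rates?", retrieved)
print(result)
```

Because the numbers in the answer line up with the numbers in the source list, the reader can check each claim against where it came from.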

RAG isn’t foolproof – because it relies on external knowledge, an erroneous or biased source may produce erroneous or biased results. The principle of garbage in, garbage out still applies. With RAG, in theory, being able to access the entire internet, one could argue that it throws up issues around the dominance of particular contexts and viewpoints on the web, but that’s another article in itself! For applications where facts can be sourced to the minute, in everything from traffic planning to weather forecasting, RAG widens the scope of GenAI so much that it’s little wonder McKinsey are forecasting that GenAI could add up to $4.4tn to the global economy on an annual basis.
