AI Tech Circle

Hey Reader!

When we talk about large language model (LLM) implementations in a business context, you will often hear the term Retrieval-Augmented Generation (RAG). It is presented as a magic wand for the many scenarios where you need to rely on your own data while using generative AI. The pitch is that RAG brings your business data and the LLM together so that you get the desired outputs.

So I thought I would go through the fundamentals of RAG, purely for understanding and clarity. In the 2020 paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," Meta introduced a retrieval-augmented generation framework to give LLMs access to information beyond their training data. RAG allows LLMs to build on a specialized body of knowledge to answer questions more accurately.

Retrieval-augmented generation (RAG) in Large Language Models (LLMs) enhances the model’s ability to generate responses by dynamically retrieving relevant information from a large dataset or database at the time of the query. This approach combines the generative power of LLMs with the specificity and accuracy provided by external data sources, enabling the model to produce more accurate, detailed, and contextually relevant outputs.

How RAG Works:

Query Processing: When a query or prompt is received, the RAG system interprets the request.
Data Retrieval: It then searches a connected database or knowledge base (for example, a collection of PDFs) to find information relevant to the query.
Content Generation: The retrieved information is fed into the LLM, which uses this context to generate a more informed and accurate response.
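
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the knowledge base entries are made up, the retrieval is a simple word-overlap score (real systems typically use embeddings and a vector store), and `generate` is a hypothetical placeholder for whatever LLM call you would use.

```python
import re
from collections import Counter

# Hypothetical knowledge base; in practice these would be chunks
# extracted from your own documents (PDFs, wiki pages, and so on).
KNOWLEDGE_BASE = [
    "Employees accrue 20 days of paid leave per year.",
    "Expense reports must be submitted within 30 days of purchase.",
    "The office is closed on public holidays.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, document):
    """Count the word occurrences shared by the query and the document."""
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    return sum((q & d).values())

def retrieve(query, documents, top_k=1):
    """Step 2: return the top_k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]

def build_prompt(query, passages):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

def generate(prompt):
    """Step 3: placeholder for a real LLM call (assumption, not a real API)."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"

def answer(query):
    """Run the whole pipeline: interpret, retrieve, then generate."""
    passages = retrieve(query, KNOWLEDGE_BASE)
    return generate(build_prompt(query, passages))
```

Calling `answer("How many days of paid leave do employees get?")` retrieves the paid-leave entry and passes it to the placeholder model together with the question.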

Example:

Suppose you are using a RAG-enhanced LLM for a medical information system. A user asks, “What are the latest treatment options for type 2 diabetes?”

Interpretation: The RAG system interprets the query to understand that it needs information on recent diabetes treatments.
Retrieval: It queries the connected medical database or sources of medical information stored in its knowledge base, retrieving articles, studies, and guidelines related to the latest treatment options for type 2 diabetes.
Generation: The LLM, now equipped with the latest retrieved information, generates a response summarizing the current treatment options, perhaps mentioning new drugs, lifestyle modification strategies, and the latest findings from recent studies.
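
To make the "latest" part of this example concrete, here is a small sketch of how retrieved passages might be filtered by publication year and turned into a prompt with numbered sources. Everything in it is an assumption for illustration: the `Passage` fields, the year cutoff, and the placeholder source names and excerpts, which stand in for real medical content.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str  # hypothetical identifier of the article or guideline
    year: int    # publication year
    text: str    # retrieved excerpt

def select_recent(passages, min_year, limit=3):
    """Keep passages published in or after min_year, newest first."""
    recent = [p for p in passages if p.year >= min_year]
    return sorted(recent, key=lambda p: p.year, reverse=True)[:limit]

def format_prompt(question, passages):
    """Number each source so the model can cite it in its answer."""
    lines = [
        f"[{i}] ({p.source}, {p.year}) {p.text}"
        for i, p in enumerate(passages, 1)
    ]
    return (
        "Use only the numbered sources below and cite them by number.\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )

# Placeholder retrieval results; not real medical data.
retrieved = [
    Passage("placeholder-guideline", 2023, "<guideline excerpt>"),
    Passage("placeholder-study", 2019, "<older study excerpt>"),
    Passage("placeholder-review", 2024, "<review excerpt>"),
]

question = "What are the latest treatment options for type 2 diabetes?"
prompt = format_prompt(question, select_recent(retrieved, min_year=2022))
```

The resulting prompt lists the 2024 and 2023 sources and drops the 2019 one, so the model is steered toward the most recent material it was given.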

Without RAG, an LLM would have to rely solely on the information it was trained on, which might be outdated or lack the specific details found in newly published research. RAG helps keep the model's output current and informed by the most relevant available data, which can significantly improve the quality and utility of the response.

What are the use cases for Retrieval-Augmented Generation (RAG)?

Question-Answering Chatbots: When chatbots are integrated with large language models (LLMs), they can generate more precise answers on their own by drawing on company documents and knowledge bases. This approach is used mainly to enhance customer support and automate website responses, adding business context and data so that inquiries are answered quickly and issues are resolved efficiently.
Enhanced Search Capabilities: When combined with search engines, LLMs can enrich search outcomes with generated responses, improving the accuracy of informational queries. This advancement makes it simpler for users to locate the necessary information for their tasks.
Data Query Engines: Using company data as context for LLMs lets employees get answers to their questions with little effort. This application is handy for finding information in documents from divisions such as HR, Finance, Procurement, and Legal, for example answers to questions about company policies, benefits, and compliance standards.

These use cases demonstrate the versatility and potential of RAG to transform information retrieval and interaction within organizations. Next week, I will go through the technical aspects of RAG and how it works.

Weekly News & Updates...

This week's unveiling of new AI tools and products drives the technology revolution forward.

Aya, the open-source multilingual LLM from Cohere, is available on Kaggle, so go to Kaggle and start exploring.
Gemma, Google's family of open language models, is now available in the KerasNLP collection.
Gemini Business from Google will be available in the Google Workspace apps.
The EU’s AI Act and How Companies Can Achieve Compliance

The Cloud: the backbone of the AI revolution

Building Open Models Responsibly in the Gemini Era
Comprehensive tactics for optimizing large language models for your application
Streamline diarization using AI as an assistive technology: ZOO Digital’s story
Artistry With Adobe: Creator Esteban Toro Delivers Inspirational Master Class Powered by AI and RTX

Favorite Tip Of The Week:

Here's my favorite resource of the week.

Responsible Generative AI Toolkit from Google: This toolkit provides resources to apply best practices for the responsible use of open models

Potential of AI

Experiment: Figma to Replit Plugin: This experimental plugin turns static designs into responsive React components. Export the generated code to Replit to share an instantly-deployable React app.

Things to Know

Stability AI has released an early preview of Stable Diffusion 3, a text-to-image model with significantly improved performance in multi-subject prompts, image quality, and spelling abilities.

The Opportunity...

Podcast:

This week's Open Tech Talks episode 126 is "Web3 Unveiled: Revolutionizing Digital Engagement with Viktoriia Miracle"

Apple | Spotify | Google Podcast

Courses to attend:

Let's build the GPT Tokenizer, a new video from Andrej Karpathy
MIT 6.S192: Deep Learning for Art, Aesthetics, and Creativity

Events:

WebSummit, Qatar, Feb 26-29, 202
Nvidia GTC AI Conference and Expo, March 18–21, San Jose, CA and Virtual
KubeCon + CloudNativeCon Europe March 19-22 | Paris, France
GISEC Global, 23-24 April, Dubai, UAE

Tech and Tools...

Gemma in PyTorch: PyTorch implementation of Gemma models
SoraWebui is an open-source project that simplifies video creation by allowing users to generate videos online with OpenAI's Sora model using text
ChatGPT + Enterprise data with Azure OpenAI and AI Search

Data Sets...

fastMRI Dataset from NYU School of Medicine and NYU Langone Health
ROSE: A Retinal OCT-Angiography Vessel SEgmentation Dataset

Other Technology News

Want to stay on the cutting edge?

Here's what else is happening in Information Technology you should know about:

Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis, as reported by The Verge
Cyberattacks are the No. 1 worry for business leaders—and AI may be able to help, as reported by Fortune

That's it!

As always, thanks for reading.

Hit reply and let me know what you found most helpful this week - I'd love to hear from you!

Until next week,

Kashif Manzoor

The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.

Dubai, UAE

Build Your business specific LLMs Using RAG
