Adapt existing LangChain tutorials to get a basic Vertex AI RAG setup
Since ChatGPT burst onto the world a year ago, Retrieval Augmented Generation (RAG) has become a common technique for getting the most out of LLMs. With RAG, you can constrain LLMs to answer questions from a defined set of documents. It’s as easy as using LangChain to stitch together a solution that chunks up the documents you want to ground the responses in, generates embeddings for the chunks, stores the chunks and their embeddings in a vector store, uses the vector store to find the chunks that match your input prompt, and then uses those matching chunks to build a final prompt to send to the LLM. Simple, right? Well, maybe not so simple the first time.
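To make that flow concrete, here is a minimal sketch of such a pipeline in LangChain, roughly as the tutorials set it up. It is illustrative only: the WebBaseLoader, the Chroma vector store, the placeholder URL, and the particular packages (langchain-openai, langchain-community, langchain-text-splitters) are my assumptions, and exact import paths vary between LangChain versions.

```python
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.chains import RetrievalQA

# 1. Load and chunk the documents that should ground the answers.
docs = WebBaseLoader("https://example.com/my-docs").load()  # placeholder URL
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# 2. Embed the chunks and store chunks + embeddings in a vector store.
vector_store = Chroma.from_documents(chunks, OpenAIEmbeddings())

# 3. Retrieve the chunks that match the input prompt and have the LLM
#    answer from them.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-3.5-turbo"),
    retriever=vector_store.as_retriever(),
)
print(qa.invoke({"query": "What do these documents say about pricing?"}))
```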
The good news is that the LangChain documentation includes two easy-to-follow tutorials that show you how to:
The bad news is that both of these tutorials assume you are using an OpenAI LLM. To create a Hello World RAG application on Vertex AI, I adapted these two tutorials. In this article I describe how I adapted them to get a simple RAG example that uses Vertex AI rather than OpenAI for the LLM and embedding portions of the application.
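As a preview, the heart of the adaptation is swapping the OpenAI chat model and embedding classes for their Vertex AI counterparts; everything else in the pipeline (the splitter, the vector store, the retrieval chain) can stay the same. The sketch below assumes the langchain-openai and langchain-google-vertexai integration packages and the gemini-pro / textembedding-gecko model names; your LangChain version and model choices may differ.

```python
# The LangChain tutorials build the chain around OpenAI classes:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(model="gpt-3.5-turbo")
embeddings = OpenAIEmbeddings()

# The Vertex AI adaptation swaps in the equivalent classes from the
# langchain-google-vertexai package (an assumption; older LangChain
# versions expose them under langchain.chat_models / langchain.embeddings):
from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings

llm = ChatVertexAI(model_name="gemini-pro")
embeddings = VertexAIEmbeddings(model_name="textembedding-gecko")
```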