Developing a Custom Chatbot Using Function Calling and Retrieval Augmented Generation (RAG)


Hello everyone,

For the last seven months, I've been working with generative AI, large language models (LLMs), and Retrieval Augmented Generation (RAG). During this time, I've come across several features that I believe don't get the attention they deserve and are often underused.

One feature that particularly interests me is the combination of OpenAI's function calling with Retrieval Augmented Generation (RAG). Today, I'll explain how to create a custom chatbot that can answer questions specific to your data while also handling general inquiries, much like ChatGPT does.


Let me begin by explaining what function calling and RAG are.

Function Calling: This feature lets you describe the structure of functions to the model and receive back which function should be executed, along with its arguments. Function calling helps you retrieve relevant data from various sources or APIs, integrate with other systems or tools, generate structured outputs from your prompts, and perform other tasks.

Retrieval Augmented Generation (RAG): A technique that enhances a large language model's output by having the model consult an authoritative knowledge base beyond its original training data before generating a response.
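To make the idea concrete, here is a minimal, self-contained sketch of the RAG pattern: retrieve relevant text, then prepend it to the prompt before generation. The `tiny_kb` list, `retrieve` helper, and word-overlap scoring are illustrative stand-ins, not the vector-store setup used later in this article.

```python
# Toy knowledge base standing in for a real vector store.
tiny_kb = [
    "Our office is open Monday to Friday, 9am to 5pm.",
    "The support email is support@example.com.",
]

def retrieve(query, kb):
    # Toy relevance: keep documents that share at least one word with the query.
    words = set(query.lower().split())
    return [doc for doc in kb if words & set(doc.lower().split())]

def build_augmented_prompt(query, kb):
    # The retrieved context is stuffed into the prompt before generation.
    context = "\n".join(retrieve(query, kb))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_augmented_prompt("office hours", tiny_kb))
```

In a real system, the word-overlap scoring is replaced by embedding similarity against a vector store, which is exactly what the steps below set up.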


Now, let me outline the steps to follow in order to develop such a chatbot:

Step 1: Establish a connection with your external knowledge base. If your knowledge base is stored in a vector store, define the necessary parameters for connecting to it.

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.azuresearch import AzureSearch

vector_store_address: str = "The endpoint of the vector store where your knowledge base is stored"
vector_store_password: str = "The access key of the vector store where your knowledge base is stored"
model: str = "Your embeddings model name"

embeddings: OpenAIEmbeddings = OpenAIEmbeddings(
    deployment="Enter your embeddings model deployment",
    model=model,
    openai_api_base="Your OpenAI API base",
    openai_api_type="Your OpenAI API type",
    openai_api_key="Your OpenAI API key",
)


index_name: str = "Your knowledge base index name, depending on where it is stored"
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=vector_store_address,
    azure_search_key=vector_store_password,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)

Step 2: Create a function that retrieves the relevant documents from the designated knowledge base based on the user query. Setting k=5 means this function retrieves the five most relevant documents; adjust this value to your needs. The score_threshold filters out any document whose relevance score falls below 0.6.

def get_documents_from_document_db(prompt):
    # Return the top-k documents, each paired with its relevance score.
    docs = vector_store.similarity_search_with_relevance_scores(
        query=prompt,
        k=5,
        score_threshold=0.6,
    )
    return docs
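One detail worth knowing: `similarity_search_with_relevance_scores` returns a list of `(Document, score)` tuples, and the chat API expects string content in messages, so the results must be serialised before being fed back to the model. The helper below is a sketch of that conversion; the `Document` dataclass here is a stand-in for LangChain's class, just so the example runs on its own.

```python
from dataclasses import dataclass

# Stand-in for langchain's Document class, for illustration only.
@dataclass
class Document:
    page_content: str

def docs_to_text(results):
    # Flatten (Document, score) pairs into one plain-text block,
    # since message content passed back to the model must be a string.
    return "\n\n".join(
        f"[score={score:.2f}] {doc.page_content}" for doc, score in results
    )

sample = [
    (Document("Policy A applies to all staff."), 0.82),
    (Document("Policy B covers contractors."), 0.71),
]
print(docs_to_text(sample))
```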

Step 3: Next, define a function schema to pass to your LLM, informing it when to access data from the knowledge base. A function schema consists of three main fields: name, description, and parameters. The description plays a crucial role in helping the model understand when and how to invoke the function, so describe its purpose clearly. Also provide descriptions for any parameters that might not be self-explanatory to the model.

functions = [
    {
        "name": "get_documents_from_document_db",
        "description": "Retrieves documents from an external document database based on the user's prompt. Only retrieve documents relevant to the user prompt.",
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "The last user prompt, as a string",
                },
            },
            "required": ["prompt"],
        },
    },
]
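When the model decides to call this function, its reply carries a `function_call` field instead of plain content. The snippet below shows the general shape of such a reply (the field values are made up for demonstration) and the parsing step it requires: `arguments` arrives as a JSON string, not a dict.

```python
import json

# Illustrative shape of an assistant message that requests a function call.
# The values here are invented for demonstration.
response_message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_documents_from_document_db",
        "arguments": '{"prompt": "What is the refund policy?"}',
    },
}

# `arguments` is a JSON string, so it must be parsed before the named
# function can be invoked with keyword arguments.
args = json.loads(response_message["function_call"]["arguments"])
print(args["prompt"])
```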

Step 4: Define the call to your Large Language Model (LLM). Note that the functions list from Step 3 must be passed here, otherwise the model cannot know the function exists.

import openai

def openai_answers(message_list):
    completion = openai.ChatCompletion.create(
        engine="gpt4",
        messages=message_list,
        functions=functions,
        function_call="auto",
        temperature=1,
    )
    return completion["choices"][0]["message"]

Step 5: Finally, create a system message that instructs the model when to activate function calling and when not to. I've set up a while loop to simulate the user interface of a chatbot, but you can also test with a single input without the loop.

When the if condition is true, the user input relates to your knowledge base, so function calling is triggered. The follow-up call packages the retrieved documents (up to five here) into a clean response in the defined tone (you can specify a tone instruction in the system message). If the condition is false, the bot answers the query without consulting the knowledge base; this happens when the query is general and unrelated to your data.

message_list = [{"role": "system", "content": """Role: You are equipped with a function to pull documents from the knowledge base. Adhere strictly to these guidelines:\n\n- ONLY activate the function when a question is EXPLICITLY about the organization. If there's any doubt, ASK the user to confirm if they're referencing the organization before calling the function.\n- Upon retrieving a document, FILTER out unrelated details. Only present information that DIRECTLY answers the user's question.\n- When a question's relation to the organization is AMBIGUOUS, you MUST seek clarity. Avoid assumptions.\n- For all other topics outside of the organization, lean on your pre-existing knowledge. Keep responses concise and to the point. Avoid verbosity."""}]

import json

while True:
    prompt = input("Prompt: ")
    if prompt == 'END':
        break
    message_list.append({"role": "user", "content": prompt})
    response_message = openai_answers(message_list)
    response_message = response_message.to_dict()
    if response_message.get("function_call"):
        response_message['function_call'] = response_message['function_call'].to_dict()
        available_functions = {
            "get_documents_from_document_db": get_documents_from_document_db,
        }
        function_name = response_message["function_call"]["name"]
        function_to_call = available_functions[function_name]
        function_args = json.loads(response_message["function_call"]["arguments"])
        function_response = function_to_call(
            prompt=function_args.get("prompt")
        )
        response_message['content'] = None

        message_list.append(response_message)
        message_list.append(
            {
                "role": "function",
                "name": function_name,
                # Function-role content must be a string, so serialise
                # the retrieved (document, score) tuples.
                "content": str(function_response),
            }
        )
        second_response = openai.ChatCompletion.create(
            engine="gpt4",
            messages=message_list,
            temperature=1.0,
        )
        gpt_response = second_response.choices[0].message['content']
        print('\nGPT RESPONSE:')
        print(gpt_response)
        message_list.append({"role": "assistant", "content": gpt_response})
    else:
        # No function call was requested, so the first response already
        # contains the model's answer; no second API call is needed.
        gpt_response = response_message['content']
        print(gpt_response)
        message_list.append({"role": "assistant", "content": gpt_response})


More articles by Muhammad Shumyl Akbar
