Demystifying Generative AI with Databricks Data Intelligence Platform (Part 2)

Mohammed Arif

Published Nov 4, 2024

In Continuation with "Demystifying Generative AI with Databricks Data Intelligence Platform" Part-1 which can be accessed here (https://lnkd.in/g49wPrjb), here is the second and the final part.

AI Systems

Compound AI Systems - An AI system that has multiple interacting components usually independent of the framework or language model to be used
LangChain - A composition software framework that helps to build and manage multi-stage reasoning AI systems using large language models, consists of components for building chains and agents, integrations with other tools and off-the-shelf implementations for common tasks

Components of LangChain

Prompt - A structured text input designed to communicate a specific task ort query to a language model, guiding it to produce the desired output
Chain - A sequence of automated actions or components that process a user's query and produce a model's output
Retriever - An interface thatreturns relevant documents or information based on an unstructured query, often used in conjunction with indexed data to enhance search and retrieval capabilities
Tool - A functionality or resource that an agent can activate, such as APIs, databases or custom functions, to perform specific tasks

LLamaIndex - A data framework that enhances the capabilities of LLMs by structuring and indexing data to make it easily consumable. Components include - Models, Prompts, Indexing & storing, Querying and Agents

Haystack - An open-source Python framework for building custom applications with LLMs, focusing on document retrieval, text generation, and summarization. Components include - Generators, Retrievers, Document stores and Pipelines

Databricks Foundation Model API - An API that supports accessing and querying state-of-the-art open generative AI models, it has a pay-per-token pricing model for low-throughput applications and provisioned throughput for high-throughput. Some of the supported models include DBRX Instruct, Meta Llama 3 8/70B, Mixtral-8x7B Instruct, BGE Large (English) etc.

DBRX - A new source open LLM by Databricks, it has 2 versions - DBRX Base and DBRX Instruct
DBRX Base - A pre-trained model which functions like a smart auto-complete
DBRX Instruct - A fine-tuned model designed to answer questions and follow instructions, built on top of DBRX by performing further training on domain-specific data and fine-tuning for instruction-following

Agents

Agent - An application that can execute complex tasks by using a language model to define a sequence of actions to take, the sequences are query-dependent chosen dynamically by the LLMs

Components of an Agent

Task - The user request through prompt to be solved
LLM (Brain) -The central coordination module that manages the core logic and behavioral characteristics of an agent, it is the brain of the agent.
Tools - External resources that the agent uses to accomplish the tasks at hand
Memory & Planning - Components for planning and executing the future actions

Agent Reasoning - Cognitive process by which agents draw logical conclusions and make decisions autonomously mirroring aspects of human cognitive abilities

Recommended by LinkedIn

The Rise of Generative AI: What It Means for Data…

Walter Shields 1 year ago

2 Key Changes that Unlocked Huge Scale in our Machine…

Zarif Aziz 3 years ago

Synerise open-sourcing Cleora AI framework for…

Jaroslaw Krolewski 5 years ago

Agent reasoning Design patterns

ReAct - Reason + Act - Enable models to generate verbal reasoning traces and actions. Main states used in ReAct agents are Thought (reflect on the given problem and previous actions taken), Act (choosing correct tool and input format) and Observe (evaluate the result of the action and generate next thought)
Tool use & Function Calling - Agents interact with external tools and APIs to perform specific tasks, decides which tools to use and when/how to use them
Planning - Able to dynamically adjust their goals and plans based on changing conditions
Multi-Agent Collaboration - Involves several agents working collaboratively, each handling different aspects of the task, allows modularization, specialized in solving specific business problems

LangChain Agents - Provides a structure for building agents that can use tools to interact with the world

AutoGPT - Provides tools to build AI agents

AutoGen - A framework that enables applications with multiple agents that can communicate with each other

Transformers Agents - Provides a natural language API for interacting with transformers

Multi-Modal AI - Models with inputs or outputs that include data types beyond text, it can include images, audios and videos

Multi-Modal Retrieval - Embeds all modalities (data types) in the same vector space (e.g. CLIP) OR into different vector spaces
Multi-Modal Generator - Enables generating responses in multiple formats (e.g. generating a story with images, GPT-4V)

LLM Security

Guardrails - A prompt injection risk mitigation technique where additional guidance is provided to LLMs to control it's responses
DASF (Data and AI Security Framework) - A security approach by organizing AI security problem with a component based approach, 12 AI system components and 55 associated risks have been identified as of now (e.g. Catalog, Algorithm, Evaluation, Model management, Operations, Platform)
Databricks as Security - Databricks meets the AI security needs by having a in-built tight security framework involving components such as Unity Catalog and Mosaic AI

LLM Evaluation

Loss (Evaluation Metrics) - Measures the difference between predictions and the truth, measures when training LLMs by how well they predict the next token
Perplexity - It's the model's confidence in it's predictions i.e. measuring if the model was surprised that its prediction was correct? Low perlexity = High confidence and High perlexity = Low confidence. A sharp peek in the LLMs probability distribution reflects a low perplexity
Toxicity - Measures the harmfulness of the responses generated by the LLMs, it identifies and flags harmful offensive or inappropriate language. Low toxicity = Low harm. It uses a pre-trained hate speech classification model
BLEU (BiLingual Evaluation Understudy) - Compares translated output to a reference, comparing n-gram similarities between the output and reference.
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) - Compares summarized output to a reference, comparing n-gram similarities between the output and reference.
Benchmarking - Comparing models against standard evaluation datasets, LLMs are evaluated on large reference datasets (e.g. Stanford Q&A for Q&A evaluation) and your own data too.
LLM-as-a-Judge - Ask an existing LLM to do the evaluation for you. Uses few-shot examples with human-provided scores for more guidance, it also provides more specific instructions of what good looks like. The evaluation scale is more specific and component-based rubric. Often it utilizes prompt engineering templating.
Offline Evaluation - Evaluating LLMs and it's components in static and non-prod environments i.e. before deployment to prod.
Online Evaluation - Real-time evaluation of LLMs after it has been deployed in prod.

LLM Deployment

Model Flavor - A standard format for packaging machine learning models with additional metadata such as signature, input example etc. e.g. MLflow LangChain flavor, Open AI flavor, HuggingFace flavor, PyTorch, Python Function etc.
Unity Catalog Model Registry - A centralized model store with full-fledged model lifecycle management with versioning, it can deploy and organize models, manage ACLs, stores full model lineage, taggings and annotations.
Gen AI Model deployment - Process of integrating an AI model into a production environment, making it accessible for end-users or other systems to generate predictions or completions. Deployment strategies can be either batch, streaming, real-time or embedded.
TensorRT - A Tensorflow-friendly SDK from NVIDIA for high performing batch interface on GPUs
vLLM - A Transformer-friendly library for memory-efficient inference on GPUs
Ray on Spark - A pythonic distributed computing primitive for parallelizing and scaling python applications, can be used on AWS, Azure or GCP
Databricks Model Serving - A databricks native production grade model serving framework with high availability and low latency, accelerated deployments with Lakehouse-Unified serving and simplified deployment through UI or API
Inference Tables - Databricks delta tables used for monitoring and debugging deployed models, each request-response is appended to the table in UC. It can perform diagnostics and debugging of suspicious inferences, it can also create a dataset of mislabeled data to be re-labeled.
Databricks Lakehouse Monitoring - A monitoring tool for automated insights and out-of-the-box metrics on data and ML pipelines. It is a fully managed infrastructure, frictionless with easy setup and provides a unified solution for data and models for holistic understanding.
MLOps - Set of processes and automation for managing fata, code and models to improve peroformance, stability and long-term efficiency of ML systems. MLOps = DataOps + DevOps + ModelOps
LLMOps - MLOps for GenAI applications and environments with code management across different environments as well as data/system component management. Areas under LLMOps (as well as MLOps) include Dev patterns, Packaging, Serving, API Governance, Cost & Performance and Human Feedback.

#GenerativeAI #Databricks #MosaicAI #MLOps #MachineLearning #DataEngineering #TechInnovation

To view or add a comment, sign in

Demystifying Generative AI with Databricks Data Intelligence Platform (Part 2)

Mohammed Arif

AI Systems

Components of LangChain

Agents

Components of an Agent

Recommended by LinkedIn

Agent reasoning Design patterns

LLM Security

LLM Evaluation

LLM Deployment

More articles by Mohammed Arif

Others also viewed

How to Use Synthetic and Simulated Data Effectively

Five trends in data platform in 2025

Data Alchemy. Crafting AI with Synthetic Data Generation

Snowflake Cortex Launch: What’s in store for AI Engineers?

Artificial Intelligence specialists don’t exist

The Advent of AI - Excited? Or Scared?

I learned the concepts of RAG and CAG using the LLM approach

Lundi, le Quatorze Juillet: Devs using AI code more slowly; a framework for inference using unstructured data; some cool French startups

Advancing Feature Engineering for Structured Data Beyond Generative AI

Building Production-Grade LLM Judges in Databricks (How We Operationalized AI Evaluation)

How Llms Process Language

How LLMs Generate Data-Rich Predictions

Using LLMs as Microservices in Application Development

Llama Model Results on Advanced AI Tasks

How Large Language Models Create Text Responses

Explore content categories

AI Systems

Components of LangChain

Agents

Components of an Agent

Recommended by LinkedIn

Agent reasoning Design patterns

LLM Security

LLM Evaluation

LLM Deployment

More articles by Mohammed Arif

Demystifying Generative AI with Databricks Data Intelligence Platform (Part-1)

NO_MERGE_ROW error in EIM Merge process

Organization, Internal Division, Account and Partner in Siebel

Others also viewed

How to Use Synthetic and Simulated Data Effectively

Five trends in data platform in 2025

Data Alchemy. Crafting AI with Synthetic Data Generation

Snowflake Cortex Launch: What’s in store for AI Engineers?

Artificial Intelligence specialists don’t exist

The Advent of AI - Excited? Or Scared?

I learned the concepts of RAG and CAG using the LLM approach

Lundi, le Quatorze Juillet: Devs using AI code more slowly; a framework for inference using unstructured data; some cool French startups

Advancing Feature Engineering for Structured Data Beyond Generative AI

Building Production-Grade LLM Judges in Databricks (How We Operationalized AI Evaluation)

Similar topics

How Llms Process Language

How LLMs Generate Data-Rich Predictions

Using LLMs as Microservices in Application Development

Llama Model Results on Advanced AI Tasks

How Large Language Models Create Text Responses

Explore content categories