NextGen EDA Workloads on AWS ParallelCluster with an LLM-Powered Conversational Agent
One of the most exciting advancements in chip design over the last decade is the application of artificial intelligence (AI). Today, with the rapid development of agentic AI, we see an opportunity to give application engineers context-aware visibility into the underlying HPC infrastructure, covering performance, troubleshooting, and cost, through simple natural language.
This post describes generic steps to create an AI-powered solution for managing Electronic Design Automation (EDA) workloads in semiconductor design by integrating Amazon Bedrock AgentCore with AWS ParallelCluster through a custom Model Context Protocol (MCP) server. The solution could also be applied to other HPC workloads like Computational Fluid Dynamics, Life Sciences and Genomics.
MCP (Model Context Protocol) server is an open-standard interface that allows AI models to securely connect with local or remote data sources and tools, such as your files, databases, or APIs. It works as a standardized "plug-and-play" connector, enabling an AI to move beyond simple chat and actively interact with your technical environment to perform tasks or retrieve live information.
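Under the hood, MCP clients and servers exchange JSON-RPC 2.0 messages. As a rough sketch, a client asking the server to run one of its tools (here the describe_cluster tool exposed later in this post; the argument name is illustrative) would send something like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "describe_cluster",
    "arguments": { "cluster_name": "eda-parallelcluster" }
  }
}
```

The server replies with a JSON-RPC result containing the tool's output, which the AI model then folds back into the conversation.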
AWS ParallelCluster is an open-source cluster management tool that makes it easy to deploy and manage high-performance computing (HPC) clusters on AWS. It brings cloud advantages such as elasticity and fast setup to deliver optimal performance for massive EDA workloads.
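To make the ParallelCluster side concrete, here is a minimal cluster configuration sketch for an EDA-oriented setup: a Slurm scheduler, a memory-optimized compute queue that scales to zero when idle, and FSx for Lustre shared storage. Instance types, subnet IDs, and names are placeholders, not recommendations:

```yaml
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.2xlarge
  Networking:
    SubnetId: subnet-xxxxxxxx
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: eda-queue
      ComputeResources:
        - Name: mem-optimized
          InstanceType: r6i.8xlarge   # memory-heavy EDA jobs
          MinCount: 0                  # scale to zero when idle
          MaxCount: 32
      Networking:
        SubnetIds:
          - subnet-xxxxxxxx
SharedStorage:
  - MountDir: /fsx
    Name: eda-fsx
    StorageType: FsxLustre
    FsxLustreSettings:
      StorageCapacity: 1200
```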
Amazon Bedrock AgentCore provides a specialized, serverless platform to deploy and operate AI agents that can securely execute code, retain memory across interactions, and connect with internal enterprise systems.
Solution Overview – Reference Architecture
Generic Steps
These are the steps we will take for the solution:
Prerequisites
You should have the following prerequisites:
Step 1- Install ParallelCluster
Please follow the instructions for installing and configuring ParallelCluster here: Setting up AWS ParallelCluster.
Step 2- Roles and Policies
You will need to create a new role and attach both managed and custom policies.
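As one possible starting point, a custom policy for this solution might allow read-only cluster inspection plus invocation of the EDA Lambda function. The exact actions depend on which tools you expose; the statement IDs and function name pattern below are illustrative only:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DescribeClusterResources",
      "Effect": "Allow",
      "Action": [
        "cloudformation:DescribeStacks",
        "ec2:DescribeInstances",
        "pricing:GetProducts"
      ],
      "Resource": "*"
    },
    {
      "Sid": "InvokeEdaLambda",
      "Effect": "Allow",
      "Action": "lambda:InvokeFunction",
      "Resource": "arn:aws:lambda:*:*:function:eda-*"
    }
  ]
}
```

Scope the resources down to your actual cluster stack and function ARNs before using anything like this in production.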
Step 3- Lambda Function (code partial)
import json
import boto3
import logging
from typing import Dict, Any

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    """
    AgentCore-compatible Lambda handler for EDA workflows
    """
    try:
        logger.info(f"Received event: {json.dumps(event)}")

        # Extract input from AgentCore event format
        user_input = event.get('input', {}).get('text', '')
        session_id = event.get('sessionId', 'default')

        # Handle empty input
        if not user_input:
            user_input = "status"  # Default to status check

        logger.info(f"Processing EDA request: {user_input}")

        # Process EDA request
        response = process_eda_request(user_input, event)

        # Ensure response has required fields
        if not response or 'message' not in response:
            response = {
                'message': '🤖 EDA ParallelCluster Assistant Ready\n\nI can help you with cluster management, job submission, and cost optimization.',
                'metadata': {'action': 'default_response'}
            }

        # Return AgentCore-compatible response
        result = {
            'statusCode': 200,
            'body': {
                'response': response['message'],
                'sessionId': session_id,
                'metadata': response.get('metadata', {})
            }
        }
        logger.info(f"Returning response: {json.dumps(result)}")
        return result

    except Exception as e:
        logger.error(f"Error processing request: {str(e)}", exc_info=True)
        return {
            'statusCode': 500,
            'body': {
                'error': f"Failed to process EDA request: {str(e)}",
                'sessionId': event.get('sessionId', 'default'),
                'response': f"❌ Error: {str(e)}"
            }
        }

def process_eda_request(user_input: str, context: Dict[str, Any]) -> Dict[str, Any]:
    """Process EDA-specific requests"""
    # Analyze request intent
    if any(keyword in user_input.lower() for keyword in ['create', 'cluster', 'setup']):
        return handle_cluster_creation(user_input)
    elif any(keyword in user_input.lower() for keyword in ['submit', 'job', 'run']):
        return handle_job_submission(user_input)
    elif any(keyword in user_input.lower() for keyword in ['cost', 'optimize', 'savings']):
        return handle_cost_optimization(user_input)
    elif any(keyword in user_input.lower() for keyword in ['status', 'monitor', 'check']):
        return handle_cluster_status(user_input)
    else:
        return handle_general_query(user_input)
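The handler above assumes an AgentCore-style event shaped like {'input': {'text': ...}, 'sessionId': ...}. The keyword routing can be sketched as a standalone snippet; the intent labels and sample event are illustrative, and real handlers would call ParallelCluster APIs:

```python
# Standalone sketch of the keyword-based intent routing used by process_eda_request.
# Intent labels are illustrative; the handler maps each to a dedicated function.
INTENTS = [
    ('cluster_creation', ['create', 'cluster', 'setup']),
    ('job_submission', ['submit', 'job', 'run']),
    ('cost_optimization', ['cost', 'optimize', 'savings']),
    ('cluster_status', ['status', 'monitor', 'check']),
]

def classify_intent(user_input: str) -> str:
    """Return the first intent whose keywords match, else 'general_query'."""
    text = user_input.lower()
    for intent, keywords in INTENTS:
        if any(k in text for k in keywords):
            return intent
    return 'general_query'

# Sample AgentCore-style event (shape assumed from the handler above)
event = {'input': {'text': 'Estimate savings for this quarter'}, 'sessionId': 'demo'}
print(classify_intent(event['input']['text']))  # cost_optimization
```

Note that match order matters with this scheme: "submit job to the cluster" hits the 'cluster' keyword first and routes to cluster creation, which is one reason to eventually replace keyword matching with the LLM's own tool selection via MCP.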
Step 4- Create Agent in Bedrock AgentCore
You can use the AWS console or the AWS CLI; at the time of writing, the CLI had a bug on Windows, so I switched to the console.
Step 5- MCP Server
Please follow the instructions for creating the MCP server here: Setting up AWS MCP Server.
{
  "parallelcluster-mcp-server": {
    "command": "python",
    "args": [
      "-m", "parallelcluster_mcp.server"
    ],
    "cwd": "parallelcluster-mcp-server",
    "env": {
      "AWS_PROFILE": "default",
      "FASTMCP_LOG_LEVEL": "ERROR",
      "CLUSTER_NAME": "eda-parallelcluster",
      "PYTHONPATH": "parallelcluster-mcp-server/src"
    },
    "disabled": false,
    "autoApprove": [
      "create_eda_cluster",
      "describe_cluster",
      "list_clusters",
      "submit_eda_job",
      "get_eda_cluster_metrics",
      "optimize_eda_cluster_config",
      "estimate_eda_costs",
      "monitor_license_usage"
    ]
  }
}
Final Product: Testing from MCP Client
After connecting the MCP client to the MCP server, we tested several prompts covering cluster status queries, cost estimates, and cluster creation (prompt functionality: workflow analysis, cost estimation, script generation, AI integration).
Future Work
Conclusion
The solution enables conversational, LLM-powered management of high-performance computing clusters optimized for memory-intensive EDA tools, featuring job scheduling, FSx high-performance storage, cost optimization recommendations, and real-time monitoring of cluster metrics and license usage. By combining enterprise-grade AI orchestration with specialized AWS HPC infrastructure, this project delivers a platform that lets semiconductor engineers interact with EDA workflows in natural language while receiving optimization suggestions for performance, cost, and license efficiency.