Faster Data Transformation - SDTM Engine
To realize our goal of accelerating SDTM transformation, we developed the SDTM Engine—a solution deeply integrated with the internal ecosystem. This engine is designed to unify fragmented workflows, automate complex mappings, and ensure regulatory compliance with speed and precision.
AI Powered - Step-By-Step - Therapeutic Area Agent
Description
As an orchestration step in SDTM production, we need to determine which therapeutic area applies to a study.
I used Microsoft 365 Copilot for this step.
Prompt:
{
Identify the therapeutic area for the study described in the uploaded protocol document. Use the provided therapeutic area reference list to match the study indication to the correct therapeutic area. Return the result in the following JSON format:
{
"therapeutic_area": "<Matched therapeutic area from reference list>",
"study_indication": "<Brief description of the disease or condition being studied>",
"supporting_evidence": [
{
"source": "<File name>",
"details": "<Quote or summary from the document that supports the therapeutic area assignment>"
}
],
"matched_therapeutic_area_from_reference": "<Exact match from therapeutic_areas.json>",
"reference_file": "<Name of the reference file used>"
}
If the therapeutic area cannot be confidently determined from the provided documents, or if multiple areas are equally plausible, return the following structure to flag the need for human review:
{
"therapeutic_area": "Undetermined",
"study_indication": "<Brief description of the disease or condition being studied>",
"supporting_evidence": [
{
"source": "<File name>",
"details": "<Quote or summary from the document that highlights ambiguity or multiple possible matches>"
}
],
"matched_therapeutic_area_from_reference": [],
"reference_file": "<Name of the reference file used>",
"human_review_required": true,
"notes": "Therapeutic area could not be confidently assigned. Please review the study indication and supporting evidence manually."
}
Input:
Raw text of the protocol, extracted by Copilot with a simple prompt.
List of sponsor therapeutic areas in JSON format.
Output:
Positive Case
{
"therapeutic_area": "Oncology",
"study_indication": "Advanced, metastatic, or recurrent endometrial cancer (measurable stage III/IVA, stage IVB, or recurrent disease).",
"supporting_evidence": [
{
"source": "Protocol_raw_text.txt",
"details": "Protocol title: “A PHASE III RANDOMIZED, PLACEBO‑CONTROLLED STUDY OF XXX IN ADDITION TO XXX AND XXX FOR MEASURABLE XXXXX CANCER.” [1]"
},
{
"source": "Protocol_raw_text.txt",
"details": "Regulatory correspondence routed to the FDA Center for Drug Evaluation and Research, Office of Oncologic Diseases; division listed as “XXX” [1]"
}
],
"matched_therapeutic_area_from_reference": "Oncology",
"reference_file": "therapeutic_areas.json", ...
}
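Before a positive response like the one above is passed downstream, its structure can be checked against the fields the prompt asks for. The following is a minimal Python sketch, not part of the SDTM Engine itself: the function name find_problems and the sample values are hypothetical, and only the key names come from the prompt.

```python
# Keys the prompt requests in a confident (positive) response.
REQUIRED_KEYS = {
    "therapeutic_area",
    "study_indication",
    "supporting_evidence",
    "matched_therapeutic_area_from_reference",
    "reference_file",
}


def find_problems(response: dict) -> list[str]:
    """Return structural problems; an empty list means the shape looks right."""
    missing = sorted(REQUIRED_KEYS - set(response))
    problems = [f"missing key: {key}" for key in missing]

    evidence = response.get("supporting_evidence")
    if not isinstance(evidence, list) or not evidence:
        problems.append("supporting_evidence must be a non-empty list")
    else:
        for i, item in enumerate(evidence):
            # Each evidence entry should carry a source and a quote or summary.
            if not isinstance(item, dict) or not {"source", "details"}.issubset(item):
                problems.append(f"supporting_evidence[{i}] needs 'source' and 'details'")
    return problems


# Hypothetical sample shaped like the positive case.
sample = {
    "therapeutic_area": "Oncology",
    "study_indication": "Endometrial cancer",
    "supporting_evidence": [
        {"source": "Protocol_raw_text.txt", "details": "Protocol title ..."}
    ],
    "matched_therapeutic_area_from_reference": "Oncology",
    "reference_file": "therapeutic_areas.json",
}
print(find_problems(sample))  # prints []
```

A check of this kind only confirms the shape of the response; it says nothing about whether the assigned therapeutic area is correct.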
Negative Case
{
"therapeutic_area": "Undetermined",
"study_indication": "Unable to extract study indication due to missing or unreadable content.",
"supporting_evidence": [
{
"source": "Protocol_raw_text.txt",
"details": "No retrievable content found in the protocol document. Retrieval error occurred during processing."
}
],
"matched_therapeutic_area_from_reference": [],
"reference_file": "therapeutic_areas.json",
"human_review_required": true,
"notes": "Therapeutic area could not be confidently assigned. Please review the study indication and supporting evidence manually."
}
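The negative case is what lets an orchestration step decide whether to continue automatically or hand the study to a person. Below is a small Python sketch of that decision, assuming the raw model output arrives as a string. The function name needs_human_review is hypothetical, and treating unparseable output as a review case is an assumption rather than something the prompt specifies.

```python
import json


def needs_human_review(raw_output: str) -> bool:
    """Return True when the response should go to a person, not downstream."""
    try:
        response = json.loads(raw_output)
    except json.JSONDecodeError:
        # Assumption: output that is not valid JSON is also sent to review.
        return True
    if not isinstance(response, dict):
        return True
    # Either signal from the prompt's fallback structure triggers review.
    return (
        response.get("human_review_required") is True
        or response.get("therapeutic_area") == "Undetermined"
    )


negative = '{"therapeutic_area": "Undetermined", "human_review_required": true}'
positive = '{"therapeutic_area": "Oncology"}'
print(needs_human_review(negative))  # prints True
print(needs_human_review(positive))  # prints False
```

Checking both the flag and the "Undetermined" value is deliberately redundant, so a response that carries only one of the two signals is still routed to review.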
Reference list of therapeutic areas:
{
"THERAPEUTIC_AREA": [
"None",
"Oncology",
"Cardiovascular",
"Immunology",
"Neuroscience",
"Antiviral",
"Vaccines",
"Ophthalmology",
"Endocrinology",
"Antibacterial/Antifungal",
"Special Projects"
]
}
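Because the prompt asks for an exact match from the reference list, the returned value can be compared against that list verbatim. The sketch below is one way to do that in Python. The THERAPEUTIC_AREA key and the list values come from the reference above, while the function names load_therapeutic_areas and is_exact_reference_match are hypothetical.

```python
import json

# The reference list shown above, embedded as text for a self-contained example.
REFERENCE_JSON = """
{
  "THERAPEUTIC_AREA": [
    "None", "Oncology", "Cardiovascular", "Immunology", "Neuroscience",
    "Antiviral", "Vaccines", "Ophthalmology", "Endocrinology",
    "Antibacterial/Antifungal", "Special Projects"
  ]
}
"""


def load_therapeutic_areas(text: str) -> list[str]:
    """Parse the reference JSON and return the list of allowed areas."""
    return json.loads(text)["THERAPEUTIC_AREA"]


def is_exact_reference_match(response: dict, areas: list[str]) -> bool:
    """True only if both area fields agree and appear verbatim in the list."""
    area = response.get("therapeutic_area")
    matched = response.get("matched_therapeutic_area_from_reference")
    return isinstance(area, str) and area == matched and area in areas


areas = load_therapeutic_areas(REFERENCE_JSON)
ok = {
    "therapeutic_area": "Oncology",
    "matched_therapeutic_area_from_reference": "Oncology",
}
print(is_exact_reference_match(ok, areas))  # prints True
```

The comparison is case-sensitive on purpose: a value such as "oncology" or "Oncology/Hematology" is not in the list and would fail the check instead of being silently accepted.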