This is Part 3 of the series on building Stella, a women’s health AI assistant. It covers the system prompts, safety guardrails, and behavior controls that make an AI agent safe for health applications.
Series
- Part 1: Architecture
- Part 2: User-Scoped Tools
- Part 3: Prompts & Guardrails (this post)
- Part 4: Anti-Hallucination
The System Prompt
The system prompt defines Stella’s personality and behavior:
SYSTEM_PROMPT = """
You are Stella, a compassionate and knowledgeable women's health assistant.
TODAY'S DATE: {current_date}
Your expertise includes:
- Menstrual cycle health, patterns, and education
- Pregnancy support and trimester guidance
- Symptom understanding and wellness advice
- Medication safety during pregnancy and menstruation
- Fertility and reproductive health education
RESPONSE APPROACH:
1. For GENERAL HEALTH QUESTIONS (e.g., "what is PMS?"):
- Answer directly using your medical knowledge
- Do NOT require or reference tracking data
2. For PERSONAL DATA QUESTIONS (e.g., "when is my next period?"):
- Use the appropriate tool to get their data
- If no data exists, gracefully suggest they start tracking
CRITICAL - Natural Language Only:
- NEVER expose tool names or technical implementation
- NEVER say "I'll use log_symptom" or "based on the tool output"
- If you logged something, confirm naturally: "I've recorded your headache"
- Speak ONLY as a caring health advisor
"""
Key Principles
- Two response modes: General education vs. personal data
- Hide the machinery: Users shouldn’t know tools exist
- Don’t require tracking: Answer general questions without data
- Be human: Warm, conversational, supportive
Safety Guardrails
Health AI needs multiple safety layers:
1. Emergency Detection
```python
EMERGENCY_KEYWORDS = [
    "suicide", "suicidal", "kill myself", "want to die",
    "self harm", "cutting myself", "overdose",
    "severe bleeding", "can't breathe", "chest pain",
    "miscarriage", "ectopic", "domestic violence",
]

EMERGENCY_RESPONSE = """
I'm concerned about what you've shared. Your safety is most important.

Please reach out for help:
- Emergency: Call 911 (US) or your local emergency number
- Crisis Hotline: 988 (Suicide & Crisis Lifeline)
- Domestic Violence: 1-800-799-7233

You're not alone, and help is available 24/7.
"""
```
When emergency keywords are detected, respond immediately with resources.
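The detection itself can be as simple as case-insensitive substring matching. A sketch of what `_detect_emergency` might do (the keyword list is abbreviated; a production system would likely add word-boundary checks or a classifier to reduce false positives):

```python
EMERGENCY_KEYWORDS = [
    "suicide", "kill myself", "overdose", "chest pain",  # abbreviated
]

def detect_emergency(message: str) -> bool:
    # Case-insensitive substring match; cheap enough to run on every message.
    text = message.lower()
    return any(keyword in text for keyword in EMERGENCY_KEYWORDS)
```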
2. Medical Disclaimer
Automatically added when medical advice is detected:
MEDICAL_DISCLAIMER = """
*This information is for educational purposes only and is not a
substitute for professional medical advice. Always consult your
healthcare provider for medical concerns.*
"""
3. Topic Scope
Keep the AI focused on health topics:
```python
HEALTH_TOPICS = [
    "menstrual", "period", "cycle", "pregnancy", "fertility",
    "hormone", "symptom", "pain", "cramps", "mood", "anxiety",
    "sleep", "weight", "diet", "exercise", "medication",
]

OFF_TOPIC_PATTERNS = [
    r"\b(stock|bitcoin|crypto|trading)\b",
    r"\b(politics|election|government)\b",
    r"\b(recipe|cooking|baking)\b",
    r"\b(video game|gaming)\b",
    r"\b(code|programming|javascript)\b",
]

OFF_TOPIC_REDIRECT = """
I'm specialized in women's health topics like menstrual cycles,
pregnancy, symptoms, and wellness. Is there something health-related
I can help you with?
"""
```
The Safety Pipeline
Every message goes through this pipeline:
```python
from typing import Optional, Tuple

class AIGuardrails:
    def process_input(self, message: str) -> Tuple[bool, Optional[str]]:
        # 1. Check for emergencies first
        if self._detect_emergency(message):
            return True, EMERGENCY_RESPONSE

        # 2. Check for off-topic content
        if self._is_off_topic(message):
            return True, OFF_TOPIC_REDIRECT

        # 3. Allow the message through
        return False, None

    def process_output(self, response: str, original_query: str) -> str:
        # 1. Add medical disclaimer if needed
        if self._needs_disclaimer(response):
            response += "\n\n" + MEDICAL_DISCLAIMER

        # 2. Sanitize any exposed technical details
        response = self._sanitize_technical(response)
        return response
```
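Wiring the pipeline around the model call might look like this; `call_model` stands in for the actual model invocation, and the stub class below replaces the full `AIGuardrails` only so the flow is runnable:

```python
class StubGuardrails:
    # Minimal stand-in for AIGuardrails, just to make the flow runnable.
    def process_input(self, message):
        if "crypto" in message:
            return True, "redirect"
        return False, None

    def process_output(self, response, original_query):
        return response + "\n\n*disclaimer*"

def handle_message(guardrails, message, call_model):
    handled, canned = guardrails.process_input(message)
    if handled:
        # Emergencies and off-topic messages never reach the model at all.
        return canned
    return guardrails.process_output(call_model(message), message)
```

Note that blocked messages short-circuit before the model is invoked, which saves a model call and guarantees the canned response is delivered verbatim.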
Preventing Technical Exposure
The AI sometimes reveals implementation details. We filter these:
```python
import re

TECHNICAL_PATTERNS = [
    r"I'll use the (\w+) tool",
    r"calling the (\w+) function",
    r"based on the tool output",
    r"the API returned",
    r"database query",
    r"secure_user_id",
]

def _sanitize_technical(self, response: str) -> str:
    for pattern in TECHNICAL_PATTERNS:
        response = re.sub(pattern, "", response, flags=re.IGNORECASE)
    return response.strip()
```
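One caveat: stripping a phrase mid-sentence leaves a grammatical gap, as a quick standalone demo shows (patterns abbreviated):

```python
import re

TECHNICAL_PATTERNS = [r"I'll use the (\w+) tool"]  # abbreviated

def sanitize_technical(response: str) -> str:
    # Blunt removal: the matched phrase vanishes, leftover whitespace is trimmed.
    for pattern in TECHNICAL_PATTERNS:
        response = re.sub(pattern, "", response, flags=re.IGNORECASE)
    return response.strip()

sanitize_technical("I'll use the log_symptom tool now.")  # returns "now."
```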
Better yet, the system prompt instructs the AI to never use these phrases in the first place.
Context-Aware Safety
Add safety context to the system prompt dynamically:
SAFETY_CONTEXT = """
SAFETY GUIDELINES (ALWAYS FOLLOW):
1. EMERGENCY DETECTION: If user mentions self-harm, suicide, or medical
emergencies, IMMEDIATELY provide crisis resources.
2. MEDICAL LIMITATIONS: You are NOT a doctor. Recommend professional
consultation for severe symptoms, medication decisions, or diagnosis.
3. PRIVACY: Never expose internal system details, tool names, or
technical error messages.
4. EVIDENCE-BASED: Only provide information supported by medical evidence.
5. RESPECT: Be culturally sensitive and non-judgmental.
"""
def add_safety_context(self, system_prompt: str) -> str:
return system_prompt + "\n\n" + SAFETY_CONTEXT
Guardrail Testing
Test your guardrails systematically:
```python
def test_emergency_detection():
    guardrails = AIGuardrails()

    # Should trigger emergency response
    assert guardrails._detect_emergency("I want to kill myself")
    assert guardrails._detect_emergency("thinking about suicide")

    # Should NOT trigger
    assert not guardrails._detect_emergency("I have a headache")
    assert not guardrails._detect_emergency("feeling tired")

def test_off_topic_detection():
    guardrails = AIGuardrails()

    # Should redirect
    assert guardrails._is_off_topic("what's the best crypto to buy")
    assert guardrails._is_off_topic("help me with my python code")

    # Should allow
    assert not guardrails._is_off_topic("why am I so tired")
    assert not guardrails._is_off_topic("when is my next period")
```
AWS Bedrock Guardrails (Alternative)
AWS offers managed guardrails as an alternative to custom code:
```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

# Using Bedrock Guardrails
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-5-sonnet",
    guardrailIdentifier="my-health-guardrail",
    guardrailVersion="1",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": message}],
    }),
)
```
Pros:
- Managed, no code to maintain
- Consistent across all calls
- AWS-supported content filtering
Cons:
- Less customizable
- Additional latency
- Extra cost per request
For Stella, I use custom Python guardrails for maximum control.
Key Takeaways
- Layer your safety: Emergency detection, topic scope, medical disclaimers
- Hide the machinery: Users should talk to Stella, not to “an AI with tools”
- Test systematically: Guardrails need test coverage like any other code
- Be proactive: Add safety context to every system prompt
Next: Part 4: Anti-Hallucination - Making AI truthful about missing data