IDAgents Developer committed on
Commit
7730f73
·
1 Parent(s): 702faa5

Add comprehensive orchestrator test case with complex multi-specialist scenario

Files changed (1)
  1. ORCHESTRATOR_TEST_CASE.md +336 -0
ORCHESTRATOR_TEST_CASE.md ADDED
@@ -0,0 +1,336 @@
1
+ # 🎼 Comprehensive Orchestrator Test Case
2
+
3
+ ## Test Scenario: Complex Multi-Drug Resistant Infection
4
+
5
+ This test case demonstrates the orchestrator's ability to coordinate multiple specialist agents to solve a complex infectious disease case requiring expertise in multiple domains.
6
+
7
+ ---
8
+
9
+ ## 🎯 Setup: Create Your Agent Team
10
+
11
+ ### Step 1: Create Specialist Subagents
12
+
13
+ Create these **4 specialist agents** first (they will be the orchestrator's team):
14
+
15
+ #### Agent 1: Stewardship Specialist
16
+ ```
17
+ Agent Type: 🎯 Specialist
18
+ Agent Name: Stewardship Expert
19
+ Mission: Expert in antimicrobial stewardship, antibiotic selection, de-escalation strategies, and optimizing duration of therapy
20
+ Skills:
21
+ ✓ recommend_deescalation
22
+ ✓ alert_prolonged_antibiotic_use
23
+ ✓ search_pubmed (if you want literature support)
24
+ ```
25
+
26
+ #### Agent 2: ID Diagnostician
27
+ ```
28
+ Agent Type: 🎯 Specialist
29
+ Agent Name: ID Diagnostician
30
+ Mission: Expert in infectious disease diagnosis, differential diagnosis, culture interpretation, and diagnostic workup
31
+ Skills:
32
+ ✓ differential_diagnosis
33
+ ✓ generate_ddx_mermaid
34
+ ✓ search_pubmed
35
+ ```
36
+
37
+ #### Agent 3: ICU Consultant
38
+ ```
39
+ Agent Type: 🎯 Specialist
40
+ Agent Name: ICU Sepsis Consultant
41
+ Mission: Expert in critical care infectious diseases, sepsis management, hemodynamic support, and ICU-specific considerations
42
+ Skills:
43
+ ✓ Any relevant tools you want
44
+ ✓ search_pubmed
45
+ ```
46
+
47
+ #### Agent 4: Infection Prevention Specialist
48
+ ```
49
+ Agent Type: 🎯 Specialist
50
+ Agent Name: IPC Specialist
51
+ Mission: Expert in infection prevention and control, isolation precautions, outbreak management, and transmission-based precautions
52
+ Skills:
53
+ ✓ ipc_reporting (if available)
54
+ ✓ Any other relevant tools
55
+ ```
56
+
57
+ ### Step 2: Create the Orchestrator
58
+ ```
59
+ Agent Type: 🎼 Orchestrator
60
+ Agent Name: ID Maestro
61
+ Mission: Coordinate multiple ID specialists to provide comprehensive infectious disease consultation for complex cases requiring multidisciplinary expertise
62
+ Skills:
63
+ (Orchestrators have access to all your other agents automatically)
64
+ API Key: Your OpenAI API key
65
+ ```
66
+
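The team described in Steps 1 and 2 can also be written down as plain data, which makes it easy to sanity-check before a test run. The following is a minimal sketch only: `AgentSpec`, `TEAM`, `ORCHESTRATOR`, and `validate_team` are hypothetical names introduced for illustration and are not part of the application. The agent names and skill identifiers are copied from the setup blocks above, and the mission strings are shortened.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical record mirroring the fields in the setup blocks."""
    agent_type: str
    name: str
    mission: str
    skills: Tuple[str, ...] = ()


# The four specialists from Step 1 (missions abbreviated).
TEAM = (
    AgentSpec("Specialist", "Stewardship Expert",
              "Antimicrobial stewardship and de-escalation",
              ("recommend_deescalation", "alert_prolonged_antibiotic_use", "search_pubmed")),
    AgentSpec("Specialist", "ID Diagnostician",
              "Infectious disease diagnosis and workup",
              ("differential_diagnosis", "generate_ddx_mermaid", "search_pubmed")),
    AgentSpec("Specialist", "ICU Sepsis Consultant",
              "Critical care infectious diseases and sepsis",
              ("search_pubmed",)),
    AgentSpec("Specialist", "IPC Specialist",
              "Infection prevention and control",
              ("ipc_reporting",)),
)

# The orchestrator from Step 2; it lists no skills of its own.
ORCHESTRATOR = AgentSpec("Orchestrator", "ID Maestro",
                         "Coordinate multiple ID specialists")


def validate_team(team: Tuple[AgentSpec, ...], orchestrator: AgentSpec) -> List[str]:
    """Return a list of human-readable problems; an empty list means the team looks consistent."""
    problems = []
    names = [agent.name for agent in team]
    if len(set(names)) != len(names):
        problems.append("duplicate agent names")
    for agent in team:
        if agent.agent_type != "Specialist":
            problems.append(f"{agent.name} is not a Specialist")
        if not agent.skills:
            problems.append(f"{agent.name} has no skills")
    if orchestrator.agent_type != "Orchestrator":
        problems.append(f"{orchestrator.name} is not an Orchestrator")
    return problems


print(validate_team(TEAM, ORCHESTRATOR))
```

Keeping the specification in one place like this also gives you a reference list of exact agent names to compare against when troubleshooting.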
67
+ ---
68
+
69
+ ## 📋 Test Case: Complex Septic Patient
70
+
71
+ ### Patient Presentation:
72
+
73
+ Copy and paste this detailed case to the **ID Maestro** orchestrator:
74
+
75
+ ```
76
+ I need help with a complex case:
77
+
78
+ Patient: 68-year-old male
79
+ PMH: Type 2 diabetes (A1C 9.2%), ESRD on hemodialysis (MWF), recent hospitalization 3 weeks ago for MRSA bacteremia from infected dialysis catheter
80
+
81
+ Current Presentation:
82
+ - Day 3 ICU admission for septic shock
83
+ - Fever 39.8°C, HR 118, BP 85/50 on norepinephrine 0.15 mcg/kg/min
84
+ - WBC 24,000 with 18% bands, procalcitonin 8.5
85
+ - Lactate 4.2 → 2.8 after 4L crystalloid
86
+
87
+ Cultures:
88
+ - Blood cultures (×2 sets): Gram-positive cocci in clusters at 14 hours
89
+ - Preliminary: MRSA (same strain as 3 weeks ago)
90
+ - Sensitivities pending (results in 24 hours)
91
+ - Urine culture: Negative
92
+ - CXR: Right lower lobe infiltrate
93
+
94
+ Current Antibiotics (started 72 hours ago):
95
+ - Vancomycin 1g IV q12h (trough pending)
96
+ - Piperacillin-tazobactam 3.375g IV q6h
97
+
98
+ Additional Info:
99
+ - New tunneled dialysis catheter placed 2 weeks ago
100
+ - Patient on contact precautions
101
+ - Last vancomycin trough (from previous admission): 18 mcg/mL
102
+ - CrCl: Not applicable (on dialysis)
103
+ - Patient improving clinically: BP improving, lactate trending down
104
+
105
+ Questions:
106
+ 1. Is current antibiotic coverage appropriate?
107
+ 2. Should we de-escalate or change therapy?
108
+ 3. What's the optimal duration?
109
+ 4. Any diagnostic workup needed?
110
+ 5. Are isolation precautions correct?
111
+ 6. What are the key stewardship considerations?
112
+ ```
113
+
114
+ ---
115
+
116
+ ## ✅ Expected Orchestrator Behavior
117
+
118
+ ### Phase 1: Planning
119
+ The orchestrator should:
120
+ 1. **Analyze the request** and identify multiple distinct tasks
121
+ 2. **Create an execution plan** listing which agents to invoke:
122
+    - ID Diagnostician (for differential diagnosis and culture interpretation)
122
+    - Stewardship Expert (for antibiotic optimization and de-escalation)
123
+    - ICU Sepsis Consultant (for critical care considerations)
124
+    - IPC Specialist (for isolation precautions)
126
+ 3. **Display the plan** with numbered steps
127
+ 4. **Wait for your confirmation** ("proceed")
128
+
129
+ ### Phase 2: Execution (after you say "proceed")
130
+ The orchestrator should:
131
+ 1. **Invoke each specialist agent** sequentially or in parallel
132
+ 2. **Show progress**: "🚀 Invoking [Agent Name]..."
133
+ 3. **Collect responses** from each specialist
134
+ 4. **Display intermediate results** as they come in
135
+
136
+ ### Phase 3: Synthesis
137
+ The orchestrator should:
138
+ 1. **Synthesize all specialist inputs** into a comprehensive recommendation
139
+ 2. **Address all 6 questions** from the original query
140
+ 3. **Highlight agreements** between specialists
141
+ 4. **Resolve conflicts** if specialists disagree
142
+ 5. **Provide prioritized recommendations**
143
+ 6. **Include specific details** from each specialty area
144
+
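The three phases above (plan, wait for "proceed", execute, synthesize) can be sketched as a small control loop. This is an illustrative sketch under stated assumptions, not the orchestrator's actual implementation: `make_plan`, `run_orchestration`, and the callable-based specialist interface are hypothetical, the specialists are invoked sequentially, and the synthesis step is reduced to concatenating responses.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical interface: a specialist takes the case text and returns its answer.
Specialist = Callable[[str], str]


def make_plan(team: Dict[str, Specialist]) -> List[str]:
    """Phase 1: build a numbered execution plan, one step per specialist."""
    return [f"{i}. Invoke {name}" for i, name in enumerate(team, start=1)]


def run_orchestration(case: str,
                      team: Dict[str, Specialist],
                      confirm: Callable[[List[str]], str]) -> dict:
    """Run plan -> confirmation -> execution -> synthesis and return all artifacts."""
    plan = make_plan(team)

    # Phase 1 ends by showing the plan and waiting for the user's reply.
    if confirm(plan).strip().lower() != "proceed":
        return {"plan": plan, "responses": {}, "synthesis": None}

    # Phase 2: invoke each specialist in turn; a failing agent must not abort the run.
    responses: Dict[str, str] = {}
    for name, agent in team.items():
        print(f"Invoking {name}...")
        try:
            responses[name] = agent(case)
        except Exception as exc:  # handle agent errors gracefully
            responses[name] = f"[error: {exc}]"

    # Phase 3: a stand-in for synthesis that keeps every specialist's contribution visible.
    synthesis: Optional[str] = "\n".join(
        f"{name}: {text}" for name, text in responses.items()
    )
    return {"plan": plan, "responses": responses, "synthesis": synthesis}
```

A quick way to exercise the sketch is to pass stub specialists and a `confirm` callback that returns `"proceed"`; returning anything else should leave `responses` empty, which mirrors the "plan created but agents not invoked" symptom in the troubleshooting section.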
145
+ ---
146
+
147
+ ## 📊 What to Look For
148
+
149
+ ### ✅ Good Orchestrator Performance:
150
+
151
+ **Planning Phase:**
152
+ - [ ] Identifies 4-6 distinct tasks in the case
153
+ - [ ] Plans to invoke 3-4 specialist agents
154
+ - [ ] Explains rationale for each agent selection
155
+ - [ ] Asks for confirmation before proceeding
156
+
157
+ **Execution Phase:**
158
+ - [ ] Shows "Invoking [Agent Name]" messages
159
+ - [ ] Displays each agent's response as it arrives
160
+ - [ ] Maintains context between agent calls
161
+ - [ ] Handles any agent errors gracefully
162
+
163
+ **Synthesis Phase:**
164
+ - [ ] Comprehensive final answer addressing all questions
165
+ - [ ] Specific recommendations from each specialist:
166
+   - **Diagnostician**: Culture interpretation, workup recommendations
167
+   - **Stewardship**: De-escalation plan, duration recommendations
168
+   - **ICU Consultant**: Hemodynamic considerations, monitoring
169
+   - **IPC**: Isolation precautions, transmission prevention
170
+ - [ ] Prioritized action items (urgent vs routine)
171
+ - [ ] Clinical reasoning and evidence-based rationale
172
+
173
+ ### ❌ Signs of Issues:
174
+
175
+ - Orchestrator doesn't invoke any subagents (just gives generic answer)
176
+ - Shows execution plan but doesn't actually call the agents
177
+ - Can't find subagents (error: "agent not found")
178
+ - Synthesis doesn't incorporate subagent responses
179
+ - Generic response without specialty-specific details
180
+
181
+ ---
182
+
183
+ ## 🔍 Detailed Expected Outputs
184
+
185
+ ### From ID Diagnostician:
186
+ ```
187
+ Expected content:
188
+ - MRSA bacteremia recurrence vs new infection
189
+ - Differential diagnosis for persistent bacteremia
190
+ - Recommendations for:
191
+ * Echo to rule out endocarditis
192
+ * Consider removing/replacing dialysis catheter
193
+ * Imaging for metastatic foci
194
+ - Discussion of complicated vs uncomplicated bacteremia
195
+ ```
196
+
197
+ ### From Stewardship Expert:
198
+ ```
199
+ Expected content:
200
+ - Vancomycin optimization (check trough, AUC/MIC)
201
+ - Piperacillin-tazobactam: likely unnecessary (can de-escalate)
202
+ - Alternative options: daptomycin, ceftaroline, linezolid
203
+ - Duration: at least 14 days for uncomplicated bacteremia; 4-6 weeks if complicated (e.g., endocarditis)
204
+ - Monitoring: Weekly vancomycin troughs, renal function
205
+ - De-escalation timeline: After sensitivities available
206
+ ```
207
+
208
+ ### From ICU Sepsis Consultant:
209
+ ```
210
+ Expected content:
211
+ - Hemodynamic status: Improving (decreasing vasopressor needs)
212
+ - Source control: Consider catheter removal
213
+ - Fluid resuscitation: Adequate (lactate improving)
214
+ - Monitoring: Daily blood cultures until clearance
215
+ - ICU-specific considerations: Dialysis timing with antibiotics
216
+ - Prognosis: Good if source controlled
217
+ ```
218
+
219
+ ### From IPC Specialist:
220
+ ```
221
+ Expected content:
222
+ - Contact precautions: Appropriate for MRSA
223
+ - Isolation duration: Until cultures negative
224
+ - Staff education: Hand hygiene, PPE compliance
225
+ - Cohorting considerations
226
+ - Decolonization protocols (if recurrent MRSA)
227
+ - Environmental cleaning protocols
228
+ ```
229
+
230
+ ### Orchestrator's Synthesis:
231
+ ```
232
+ Expected structure:
233
+ 1. Summary of case (MRSA bacteremia, improving septic shock)
234
+ 2. Answers to each question with specialist input:
235
+ Q1: Coverage - vanc appropriate, pip-tazo can stop
236
+ Q2: De-escalation - stop pip-tazo, optimize vanc dosing
237
+ Q3: Duration - 14+ days depending on source control
238
+ Q4: Workup - echo, consider catheter removal
239
+ Q5: Isolation - contact precautions correct
240
+ Q6: Stewardship - multiple opportunities to optimize
241
+ 3. Prioritized action plan:
242
+ - Urgent (today): Echo, check vanc trough, stop pip-tazo
243
+ - Within 24h: Review sensitivities, daily blood cultures
244
+ - Ongoing: Monitor clinical response, consider catheter removal
245
+ 4. Key takeaways and follow-up plan
246
+ ```
247
+
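Whether the synthesis actually addresses all six questions can be checked mechanically if the output follows the `Q1:` ... `Q6:` labelling shown in the expected structure. The helper below is a minimal sketch that assumes exactly that labelling; `missing_questions` is a hypothetical name, and a real synthesis that answers the questions in free prose would need a manual review instead.

```python
import re
from typing import List


def missing_questions(synthesis: str, total: int = 6) -> List[int]:
    """Return the question numbers (1..total) that have no 'Qn:' label in the text."""
    found = {int(n) for n in re.findall(r"\bQ(\d+)\s*:", synthesis)}
    return [n for n in range(1, total + 1) if n not in found]


example = """
Q1: Coverage - vanc appropriate, pip-tazo can stop
Q2: De-escalation - stop pip-tazo, optimize vanc dosing
Q4: Workup - echo, consider catheter removal
Q5: Isolation - contact precautions correct
Q6: Stewardship - multiple opportunities to optimize
"""
print(missing_questions(example))  # Q3 (duration) is absent from this example
```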
248
+ ---
249
+
250
+ ## 🧪 Alternative Test Cases
251
+
252
+ ### Quick Test (Simpler):
253
+ ```
254
+ "Patient with pneumonia needing antibiotic choice and stewardship guidance.
255
+ How should I treat and what's the optimal duration?"
256
+ ```
257
+
258
+ ### Complex Multi-System Test:
259
+ ```
260
+ "78F with UTI, pneumonia, and C. diff. Multiple antibiotics on board.
261
+ Need help with antibiotic optimization, infection prevention, and diagnostic workup."
262
+ ```
263
+
264
+ ### Outbreak Scenario:
265
+ ```
266
+ "3 patients in ICU with carbapenem-resistant Enterobacterales.
267
+ Need infection control measures, treatment options, and stewardship guidance."
268
+ ```
269
+
270
+ ---
271
+
272
+ ## 📝 Testing Checklist
273
+
274
+ ### Before Testing:
275
+ - [ ] All 4 specialist agents created
276
+ - [ ] Orchestrator agent created
277
+ - [ ] NCBI_EMAIL and API keys set (for PubMed searches)
278
+ - [ ] Browser ready in builder panel
279
+
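The configuration item in the checklist above can be verified with a few lines of Python before starting. This is a sketch under assumptions: `NCBI_EMAIL` is the variable named in the checklist, while `OPENAI_API_KEY` is an assumed variable name (this guide only says an OpenAI API key is required, not how it is stored), and `missing_settings` is a hypothetical helper.

```python
import os
from typing import List, Mapping, Optional, Sequence


def missing_settings(required: Sequence[str] = ("NCBI_EMAIL", "OPENAI_API_KEY"),
                     env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```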
280
+ ### During Test:
281
+ - [ ] Paste complete case to orchestrator
282
+ - [ ] Wait for execution plan
283
+ - [ ] Type "proceed" to start execution
284
+ - [ ] Watch for agent invocation messages
285
+ - [ ] Note any errors or missing agents
286
+
287
+ ### After Test:
288
+ - [ ] Review final synthesis
289
+ - [ ] Verify all questions answered
290
+ - [ ] Check if recommendations are actionable
291
+ - [ ] Confirm specialty-specific details included
292
+ - [ ] Test passed: ✅ or needs debugging: ❌
293
+
294
+ ---
295
+
296
+ ## 🐛 Troubleshooting
297
+
298
+ ### Problem: Orchestrator doesn't invoke subagents
299
+ **Solution**: Verify you created the subagents first and they're visible in your "Active Agents" list
300
+
301
+ ### Problem: "Agent not found" errors
302
+ **Solution**: Check agent names match exactly (case-sensitive). Recreate agents if needed.
303
+
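Because agent names must match exactly, a near miss such as different capitalization or a trailing space is enough to cause an "agent not found" error. The sketch below compares the names the test expects against whatever names are registered and reports near matches. `find_name_mismatches` is a hypothetical helper, and how you obtain the list of registered names from the application is left as an assumption.

```python
from typing import Dict, Iterable, Optional


def find_name_mismatches(expected: Iterable[str],
                         registered: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each expected name without an exact match to its closest registered
    spelling (ignoring case and surrounding whitespace), or None if absent."""
    registered = list(registered)
    by_folded = {name.strip().casefold(): name for name in registered}
    report: Dict[str, Optional[str]] = {}
    for name in expected:
        if name in registered:
            continue  # exact, case-sensitive match: nothing to report
        report[name] = by_folded.get(name.strip().casefold())
    return report


expected = ["Stewardship Expert", "ID Diagnostician", "ICU Sepsis Consultant", "IPC Specialist"]
registered = ["Stewardship Expert", "id diagnostician", "IPC Specialist "]
for name, near in find_name_mismatches(expected, registered).items():
    print(f"{name!r}: " + (f"registered as {near!r}" if near else "not registered"))
```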
304
+ ### Problem: Generic response without specialist details
305
+ **Solution**: The orchestrator might not have access to the subagents. Verify that per-user isolation is working correctly.
306
+
307
+ ### Problem: Execution plan created but agents not invoked
308
+ **Solution**: Make sure to type "proceed" after the plan is shown
309
+
310
+ ### Problem: Only one agent invoked instead of multiple
311
+ **Solution**: Case might not be complex enough. Use the detailed test case above.
312
+
313
+ ---
314
+
315
+ ## 🎓 Learning Objectives
316
+
317
+ This test demonstrates:
318
+ 1. ✅ **Multi-agent coordination** - Orchestrator managing 4 specialists
319
+ 2. ✅ **Complex reasoning** - Breaking down a multifaceted clinical case
320
+ 3. ✅ **Information synthesis** - Combining multiple expert opinions
321
+ 4. ✅ **Conflict resolution** - Handling differing recommendations
322
+ 5. ✅ **Prioritization** - Urgent vs routine actions
323
+ 6. ✅ **Comprehensive coverage** - Addressing all aspects of care
324
+ 7. ✅ **Clinical applicability** - Actionable recommendations
325
+
326
+ ---
327
+
328
+ ## 📚 Documentation Reference
329
+
330
+ - **Orchestrator Setup**: See `FIX_ORCHESTRATOR_SUBAGENTS.md`
331
+ - **Agent Isolation**: See `AGENT_ISOLATION_COMPLETE.md`
332
+ - **Testing Guide**: See `TEST_AGENT_ISOLATION.md`
333
+
334
+ ---
335
+
336
+ **Ready to test?** Start by creating your 4 specialist agents, then the orchestrator, then paste the complex case! 🎼🔬