# Speaker Name Conflict Resolution - Enhancement

## Date
October 1, 2025

## Overview
Enhancement to handle conflicts between user-edited speaker names and automatically detected names, allowing users to intentionally clear names to enable automatic detection to fill them again.

## Current Behavior Analysis

### Existing Merge Logic

**Location:** `frontend/app.js:handleSpeakerNameDetection()` (lines ~1030-1037)

```javascript
// Merge detected names with existing user-edited names (preserve user edits)
const mergedNames = { ...speakerNames };
if (state.speakerNames) {
    Object.entries(state.speakerNames).forEach(([speakerId, info]) => {
        if (info.confidence === 'user') {
            // Preserve user-edited names
            mergedNames[speakerId] = info;
        }
    });
}
```

**Current Logic:**
- ✅ Preserves ALL user-edited names (confidence === 'user')
- ✅ Prevents auto-detection from overwriting user edits
- ❌ Does NOT allow user to "reset" a name to enable auto-detection

### Existing Edit Logic

**Location:** `frontend/app.js:startSpeakerEdit()` (lines ~665-680)

```javascript
const finishEdit = (save = true) => {
    const newName = input.value.trim();
    if (save && newName) {
        // Update state with user-edited name
        state.speakerNames[speakerId] = {
            name: newName,
            confidence: 'user',
            reason: 'User edited'
        };
        // Force re-render...
    } else {
        // Restore original name (doesn't remove from state)
        const originalName = state.speakerNames?.[speakerId]?.name || `Speaker ${speakerId + 1}`;
        speakerTag.textContent = originalName;
    }
};
```

**Current Logic:**
- ✅ Saves non-empty names as user-edited
- ❌ Empty input (after trim) restores original name, doesn't clear it
- ❌ No way to "clear" a user-edited name to allow auto-detection

## Problem Statement (Bug 2.4.3)

### User Story

**As a user**, I want to:
1. Manually edit speaker names when I know them
2. Have my edits protected from auto-detection override
3. **Clear/reset a name to allow auto-detection to try again**
4. Have auto-detection only fill empty speaker names

### Current Limitations

#### Scenario 1: User Cannot Clear Name

```
1. User edits: "Speaker 1" → "John"
   state.speakerNames[0] = { name: "John", confidence: "user" }

2. User realizes this is wrong, tries to clear it
   - Edits tag, deletes all text, presses Enter
   - Expected: Name cleared, tag shows "Speaker 1"
   - Actual: Name restored to "John" (not cleared)

3. User clicks "Detect Speaker Names"
   - Expected: Auto-detection fills "Speaker 1"
   - Actual: "John" preserved (auto-detection blocked)

4. User is stuck with wrong name!
```

#### Scenario 2: No Way to Reset

```
User workflow:
1. Manually name speakers: 0="Alice", 1="Bob"
2. Later realize they're wrong
3. Want auto-detection to try again
4. No way to "reset" to allow auto-detection

Current workaround: None (would need to reload page or manually correct)
```

## Solution Design

### Design Principles

1. **Explicit Intent:** Empty input should signal "clear this name"
2. **Selective Override:** Auto-detection should only fill empty names
3. **User Control:** User can always override auto-detection
4. **Clear Reset Path:** User can clear name to enable auto-detection

### State Management Strategy

#### Three States for Speaker Names

```javascript
state.speakerNames[speakerId] =
    // State 1: User-Edited (Protected)
    { name: "John", confidence: "user", reason: "User edited" }

    // State 2: Auto-Detected (Overridable)
    { name: "Alice", confidence: "high", reason: "Self-introduction" }

    // State 3: Cleared (Allows Auto-Detection)
    undefined  // Speaker removed from state.speakerNames
```

**Key Decision:** When user clears a name, **remove it from `state.speakerNames`** entirely.
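The three states can be told apart with a small lookup. The following is a minimal sketch, not code from `app.js`: the helper names `speakerNameState` and `displayName` and the `exampleNames` object are hypothetical, introduced only for illustration. It assumes, as described above, that a cleared speaker has no entry at all and that the default label is `Speaker N` with a 1-based number.

```javascript
// Hypothetical helper (not part of app.js): classify one speaker entry
// into the three states described above.
function speakerNameState(speakerNames, speakerId) {
    const info = speakerNames?.[speakerId];
    if (!info) return 'cleared';               // State 3: no entry in the map
    return info.confidence === 'user'
        ? 'user-edited'                        // State 1: protected from detection
        : 'auto-detected';                     // State 2: may be overridden
}

// Hypothetical helper: resolve the label shown in the UI, falling back to
// the default "Speaker N" (1-based) when no name is stored.
function displayName(speakerNames, speakerId) {
    return speakerNames?.[speakerId]?.name || `Speaker ${Number(speakerId) + 1}`;
}

// Illustrative data only; speaker 2 deliberately has no entry.
const exampleNames = {
    0: { name: 'John', confidence: 'user', reason: 'User edited' },
    1: { name: 'Alice', confidence: 'high', reason: 'Self-introduction' }
};

for (const id of [0, 1, 2]) {
    console.log(id, speakerNameState(exampleNames, id), displayName(exampleNames, id));
}
```

Treating "no entry" as the cleared state means no extra flag is needed: the same check that picks the default label also tells the merge step that the speaker is open for auto-detection.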
### Logic Flow

#### Flow 1: Manual Edit (Non-Empty)

```
User edits tag: "" or "Speaker 1" → "John"
    ↓
Input validation: newName.trim() !== ""
    ↓
state.speakerNames[0] = { name: "John", confidence: "user" }
    ↓
Re-render UI: All tags show "John"
    ↓
Auto-detection: Skips speaker 0 (user-edited)
```

#### Flow 2: Manual Clear (Empty)

```
User edits tag: "John" → "" (empty)
    ↓
Input validation: newName.trim() === ""
    ↓
delete state.speakerNames[0]  // Remove from state
    ↓
Re-render UI: Tags show "Speaker 1" (default)
    ↓
Auto-detection: Can fill speaker 0 (no longer user-edited)
```

#### Flow 3: Auto-Detection Merge

```
Auto-detection returns: { 0: {name: "Alice", ...}, 1: {name: "Bob", ...} }
    ↓
Merge logic checks each speaker:
  - Speaker 0: state.speakerNames[0] exists?
      → YES (confidence="user"): Keep "John", skip "Alice"
      → NO: Use "Alice"
  - Speaker 1: state.speakerNames[1] exists?
      → YES (confidence="user"): Keep user name
      → NO: Use "Bob"
    ↓
Re-render UI with merged names
```

## Implementation

### Change 1: Update Manual Edit Logic

**File:** `frontend/app.js:startSpeakerEdit()` (lines ~665-680)

```javascript
const finishEdit = (save = true) => {
    const newName = input.value.trim();

    if (save) {
        if (newName) {
            // Non-empty name: Save as user-edited
            if (!state.speakerNames) state.speakerNames = {};
            state.speakerNames[speakerId] = {
                name: newName,
                confidence: 'user',
                reason: 'User edited'
            };
        } else {
            // Empty name: Clear from state (allow auto-detection)
            if (state.speakerNames && state.speakerNames[speakerId]) {
                delete state.speakerNames[speakerId];
            }
        }

        // Force re-render to update all UI elements
        renderTranscript(true);
        renderTimelineSegments();
        renderDiarizationStats();
    } else {
        // Cancel edit: Restore current name
        const originalName = state.speakerNames?.[speakerId]?.name || `Speaker ${speakerId + 1}`;
        speakerTag.textContent = originalName;
        speakerTag.classList.add('editable-speaker');
    }
};
```

**Key Changes:**
- Added `else` branch for empty input
- `delete
state.speakerNames[speakerId]` removes from state
- Triggers re-render even for empty input (to show default name)
- Cancel (Escape) still restores original name without clearing

### Change 2: Enhance Merge Logic (Already Correct!)

**File:** `frontend/app.js:handleSpeakerNameDetection()` (lines ~1030-1037)

```javascript
// Current code is already correct!
const mergedNames = { ...speakerNames };
if (state.speakerNames) {
    Object.entries(state.speakerNames).forEach(([speakerId, info]) => {
        if (info.confidence === 'user') {
            // Preserve user-edited names
            mergedNames[speakerId] = info;
        }
    });
}
```

**Why it's correct:**
- If speaker cleared: the key has been deleted, so `state.speakerNames[speakerId]` is undefined
- `Object.entries()` only visits keys that still exist, so the loop never sees the cleared speaker
- Auto-detected name fills the gap ✓

**No changes needed here!**

## Behavior Examples

### Example 1: Clear and Auto-Detect

**Initial State:**

```javascript
state.speakerNames = {
    0: { name: "John", confidence: "user" },
    1: { name: "Alice", confidence: "user" }
}

Transcript:
[John] Hello everyone...
[Alice] Hi there...
[John] Today we'll discuss...
```

**User Action 1:** Clear Speaker 0

```
User clicks "John" tag, deletes all text, presses Enter

state.speakerNames = {
    1: { name: "Alice", confidence: "user" }
}
// Speaker 0 removed from state

Transcript:
[Speaker 1] Hello everyone...       ← Shows default
[Alice] Hi there...                 ← User-edited preserved
[Speaker 1] Today we'll discuss...  ← Shows default
```

**User Action 2:** Click "Detect Speaker Names"

```
Auto-detection returns:
{ 0: {name: "Dr. Smith", confidence: "high"} }

Merge logic:
- Speaker 0: No user edit → Use "Dr. Smith" ✓
- Speaker 1: User edited → Keep "Alice" ✓

state.speakerNames = {
    0: { name: "Dr. Smith", confidence: "high" },
    1: { name: "Alice", confidence: "user" }
}

Transcript:
[Dr. Smith] Hello everyone...       ← Auto-detected
[Alice] Hi there...                 ← User-edited preserved
[Dr. Smith] Today we'll discuss...  ← Auto-detected
```

### Example 2: Edit, Clear, Edit Again

**Initial State:**

```javascript
state.speakerNames = {}

Transcript:
[Speaker 1] Hello...
[Speaker 2] Hi...
```

**Step 1:** User edits

```
Edit Speaker 1 → "Wrong Name"

state.speakerNames = {
    0: { name: "Wrong Name", confidence: "user" }
}

Transcript:
[Wrong Name] Hello...
```

**Step 2:** User realizes mistake, clears

```
Edit "Wrong Name" → "" (empty)

state.speakerNames = {}  // Speaker 0 removed

Transcript:
[Speaker 1] Hello...  ← Back to default
```

**Step 3:** User edits correctly

```
Edit Speaker 1 → "Correct Name"

state.speakerNames = {
    0: { name: "Correct Name", confidence: "user" }
}

Transcript:
[Correct Name] Hello...
```

### Example 3: Cancel vs Clear

**Scenario A: Cancel Edit (Escape key)**

```
Tag shows "John"
User clicks tag, deletes text, presses Escape
→ Input cancelled, "John" restored
→ state.speakerNames unchanged
```

**Scenario B: Clear Edit (Enter key)**

```
Tag shows "John"
User clicks tag, deletes text, presses Enter
→ Edit saved with empty value
→ state.speakerNames[speakerId] deleted
→ Tag shows "Speaker 1" (default)
```

## Edge Cases

### Edge Case 1: Clear Non-Existent Name

```
Speaker has default name "Speaker 1" (not in state)
User clicks tag, clears (empty input), presses Enter

Check: state.speakerNames[0] exists?
→ NO: Nothing to delete
→ Result: No error, shows "Speaker 1" (unchanged)
```

### Edge Case 2: Clear During Transcription

```
Live transcription in progress
User clears speaker name

Result:
- Name cleared from state ✓
- Re-render triggered ✓
- New utterances show default name ✓
- Incremental rendering preserved ✓
```

### Edge Case 3: Clear All Names

```
User clears all speaker names
state.speakerNames = {}

Auto-detection:
- All speakers available for detection ✓
- Can fill all empty names ✓
```

### Edge Case 4: Whitespace-Only Input

```
User enters "   " (spaces only)

Validation: "   ".trim() === ""
→ Treated as empty input
→ Name cleared from state ✓
```

## Testing Scenarios

### ✅ Test 1: Clear User-Edited Name
1. Edit "Speaker 1" → "John"
2. Verify: Tag shows "John"
3. Edit "John" → "" (empty)
4. Verify: Tag shows "Speaker 1"
5. Verify: `state.speakerNames[0]` is undefined

### ✅ Test 2: Auto-Detection After Clear
1. Edit "Speaker 1" → "Wrong"
2. Click "Detect Speaker Names"
3. Verify: "Wrong" preserved (not overwritten)
4. Clear "Wrong" → ""
5. Click "Detect Speaker Names" again
6. Verify: Auto-detected name appears

### ✅ Test 3: Cancel Does Not Clear
1. Tag shows "John"
2. Click tag, delete text
3. Press Escape
4. Verify: Tag shows "John" (restored)
5. Verify: `state.speakerNames[0]` unchanged

### ✅ Test 4: Empty Edit Triggers Re-Render
1. Tag shows "John"
2. Clear name → ""
3. Verify: All tags for speaker 0 show "Speaker 1"
4. Verify: Timeline segments updated
5. Verify: Stats panel updated

### ✅ Test 5: Clear and Re-Edit
1. Edit "Speaker 1" → "First"
2. Clear "First" → ""
3. Edit "Speaker 1" → "Second"
4. Verify: All tags show "Second"
5. Verify: Protected from auto-detection

### ✅ Test 6: Whitespace Handling
1. Edit "Speaker 1" → "   " (spaces)
2. Press Enter
3. Verify: Treated as empty, name cleared
4. Verify: Tag shows "Speaker 1"

## User Experience Improvements

### Visual Feedback

Consider adding visual hints:

```css
/* Indicate clearable/editable state */
.speaker-tag.editable-speaker {
    cursor: text;
    border-style: dashed; /* Hint: editable */
}

.speaker-tag.editable-speaker:hover::after {
    content: " ✎"; /* Pencil icon */
    opacity: 0.5;
}
```

### Tooltip Enhancement

```javascript
// In createUtteranceElement()
if (speakerInfo?.confidence === 'user') {
    speakerTag.title = 'User-edited name (click to edit or clear)';
} else if (speakerInfo?.confidence === 'high') {
    speakerTag.title = 'Auto-detected name (click to override)';
} else {
    speakerTag.title = 'Click to edit speaker name';
}
```

### Clear Button (Optional)

Add explicit "Clear" button in edit mode:

```html
```

## Implementation Checklist

- [x] Analyze current merge logic
- [x] Design clear/reset mechanism
- [x] Document three speaker name states
- [ ] Implement empty input handling in `startSpeakerEdit()`
- [ ] Test manual clear functionality
- [ ] Test auto-detection after clear
- [ ] Test cancel vs clear behavior
- [ ] Verify timeline and stats panel sync
- [ ] Update documentation
- [ ] Commit changes

## Files to Modify

### `/home/luigi/VoxSum/frontend/app.js`
- **Function:** `startSpeakerEdit()` (lines ~665-690)
- **Change:** Add `else` branch for empty input to delete from state
- **Impact:** ~10 lines modified

### No Other Files Required
- Merge logic already correct (no changes needed)
- UI rendering already supports undefined names (shows default)

## Performance Considerations

- **Delete operation:** O(1) - fast
- **Re-render trigger:** Same as edit (~10-50ms)
- **Memory:** Reduces state size (removes cleared entries)

## Backward Compatibility

- ✅ Existing user-edited names preserved
- ✅ Auto-detection logic unchanged
- ✅ Default name display unchanged
- ✅ No breaking changes to API

## Conclusion

The enhancement allows users to intentionally clear speaker names to enable auto-detection, providing a
clear "reset" path while maintaining protection for user edits. The implementation is simple (one `else` branch), robust, and maintains all existing functionality.
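
The edit and merge rules described in this document can be exercised together without a browser. The sketch below is an assumption-laden illustration, not the code in `app.js`: `applyEdit` and `mergeDetectedNames` are hypothetical pure-function stand-ins for the state updates in `finishEdit()` and `handleSpeakerNameDetection()`, with all DOM access and re-render calls left out, and the `detected` object is made-up sample data. It walks through the "Auto-Detection After Clear" scenario.

```javascript
// Hypothetical stand-in for the state update inside finishEdit():
// a non-empty name is stored as user-edited, an empty or
// whitespace-only name removes the entry.
function applyEdit(speakerNames, speakerId, rawInput) {
    const next = { ...speakerNames };
    const newName = rawInput.trim();
    if (newName) {
        next[speakerId] = { name: newName, confidence: 'user', reason: 'User edited' };
    } else {
        delete next[speakerId];   // cleared: speaker is open for auto-detection
    }
    return next;
}

// Hypothetical stand-in for the merge in handleSpeakerNameDetection():
// start from the detected names, then re-apply every user-edited entry.
function mergeDetectedNames(current, detected) {
    const merged = { ...detected };
    Object.entries(current || {}).forEach(([speakerId, info]) => {
        if (info.confidence === 'user') {
            merged[speakerId] = info;
        }
    });
    return merged;
}

// Made-up detection result, used for illustration only.
const detected = { 0: { name: 'Dr. Smith', confidence: 'high' } };

// Steps 1-3: a user edit survives detection.
let names = applyEdit({}, 0, 'Wrong');
names = mergeDetectedNames(names, detected);
const beforeClear = names[0].name;

// Steps 4-6: clearing (whitespace-only input) lets detection fill the name.
names = applyEdit(names, 0, '   ');
const wasCleared = names[0] === undefined;
names = mergeDetectedNames(names, detected);
const afterClear = names[0].name;

console.log(beforeClear, wasCleared, afterClear); // Wrong true Dr. Smith
```

Keeping both steps as pure functions over a plain object is a design choice of this sketch only; it makes the scenario easy to check in isolation, while the real handlers also need to trigger the re-render calls listed under Change 1.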