# Speaker Name Conflict Resolution - Enhancement
## Date
October 1, 2025
## Overview
This enhancement resolves conflicts between user-edited and automatically detected speaker names. Users can intentionally clear a name so that automatic detection can fill it in again.
## Current Behavior Analysis
### Existing Merge Logic
**Location:** `frontend/app.js:handleSpeakerNameDetection()` (lines ~1030-1037)
```javascript
// Merge detected names with existing user-edited names (preserve user edits)
const mergedNames = { ...speakerNames };
if (state.speakerNames) {
  Object.entries(state.speakerNames).forEach(([speakerId, info]) => {
    if (info.confidence === 'user') {
      // Preserve user-edited names
      mergedNames[speakerId] = info;
    }
  });
}
```
**Current Logic:**
- ✅ Preserves ALL user-edited names (confidence === 'user')
- ✅ Prevents auto-detection from overwriting user edits
- ❌ Does NOT allow user to "reset" a name to enable auto-detection
### Existing Edit Logic
**Location:** `frontend/app.js:startSpeakerEdit()` (lines ~665-680)
```javascript
const finishEdit = (save = true) => {
  const newName = input.value.trim();
  if (save && newName) {
    // Update state with user-edited name
    state.speakerNames[speakerId] = {
      name: newName,
      confidence: 'user',
      reason: 'User edited'
    };
    // Force re-render...
  } else {
    // Restore original name (doesn't remove from state)
    const originalName = state.speakerNames?.[speakerId]?.name
      || `Speaker ${speakerId + 1}`;
    speakerTag.textContent = originalName;
  }
};
```
**Current Logic:**
- ✅ Saves non-empty names as user-edited
- ❌ Empty input (after trim) restores original name, doesn't clear it
- ❌ No way to "clear" a user-edited name to allow auto-detection
## Problem Statement (Bug 2.4.3)
### User Story
**As a user**, I want to:
1. Manually edit speaker names when I know them
2. Have my edits protected from auto-detection override
3. **Clear/reset a name to allow auto-detection to try again**
4. Have auto-detection only fill empty speaker names
### Current Limitations
#### Scenario 1: User Cannot Clear Name
```
1. User edits: "Speaker 1" → "John"
state.speakerNames[0] = { name: "John", confidence: "user" }
2. User realizes this is wrong, tries to clear it
- Edits tag, deletes all text, presses Enter
- Expected: Name cleared, tag shows "Speaker 1"
- Actual: Name restored to "John" (not cleared)
3. User clicks "Detect Speaker Names"
- Expected: Auto-detection fills "Speaker 1"
- Actual: "John" preserved (auto-detection blocked)
4. User is stuck with wrong name!
```
#### Scenario 2: No Way to Reset
```
User workflow:
1. Manually name speakers: 0="Alice", 1="Bob"
2. Later realize they're wrong
3. Want auto-detection to try again
4. No way to "reset" to allow auto-detection
Current workaround: none, short of reloading the page or correcting each name by hand
```
## Solution Design
### Design Principles
1. **Explicit Intent:** Empty input should signal "clear this name"
2. **Selective Override:** Auto-detection fills empty names and may replace earlier auto-detected names, but never overwrites user-edited ones
3. **User Control:** User can always override auto-detection
4. **Clear Reset Path:** User can clear name to enable auto-detection
### State Management Strategy
#### Three States for Speaker Names
```javascript
// State 1: User-Edited (Protected)
state.speakerNames[0] = { name: "John", confidence: "user", reason: "User edited" };

// State 2: Auto-Detected (Overridable)
state.speakerNames[1] = { name: "Alice", confidence: "high", reason: "Self-introduction" };

// State 3: Cleared (Allows Auto-Detection)
delete state.speakerNames[2]; // Speaker removed from state.speakerNames (lookup yields undefined)
```
**Key Decision:** When user clears a name, **remove it from `state.speakerNames`** entirely.
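The three states can be told apart with a small lookup. The sketch below is illustrative only: `speakerState` and `displayName` are hypothetical helpers, not functions from `app.js`, and the fallback label follows the `Speaker ${speakerId + 1}` convention used in `finishEdit`.

```javascript
// Hypothetical helpers (not part of app.js): classify a speaker entry
// and compute the label its tag would show.
function speakerState(speakerNames, speakerId) {
  const info = speakerNames?.[speakerId];
  if (!info) return 'cleared';      // State 3: no entry at all
  return info.confidence === 'user'
    ? 'user-edited'                 // State 1: protected
    : 'auto-detected';              // State 2: overridable
}

function displayName(speakerNames, speakerId) {
  // Default labels are 1-based: speaker 0 is shown as "Speaker 1".
  return speakerNames?.[speakerId]?.name || `Speaker ${Number(speakerId) + 1}`;
}

const names = {
  0: { name: 'John', confidence: 'user', reason: 'User edited' },
  1: { name: 'Alice', confidence: 'high', reason: 'Self-introduction' },
};

for (const id of [0, 1, 2]) {
  console.log(id, speakerState(names, id), displayName(names, id));
}
```

Because a cleared speaker has no entry, no extra "cleared" flag is needed in state; absence of the key is the signal.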
### Logic Flow
#### Flow 1: Manual Edit (Non-Empty)
```
User edits tag: "" or "Speaker 1" → "John"
↓
Input validation: newName.trim() !== ""
↓
state.speakerNames[0] = { name: "John", confidence: "user" }
↓
Re-render UI: All tags show "John"
↓
Auto-detection: Skips speaker 0 (user-edited)
```
#### Flow 2: Manual Clear (Empty)
```
User edits tag: "John" → "" (empty)
↓
Input validation: newName.trim() === ""
↓
delete state.speakerNames[0] // Remove from state
↓
Re-render UI: Tags show "Speaker 1" (default)
↓
Auto-detection: Can fill speaker 0 (no longer user-edited)
```
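Flows 1 and 2 can be summarized as one state transition. The following is a minimal sketch, assuming a pure helper named `applySpeakerEdit` (hypothetical; the real code mutates `state.speakerNames` in place inside `finishEdit`).

```javascript
// Hypothetical pure helper mirroring Flows 1 and 2. It returns a new
// object instead of mutating its input, unlike the in-place code in app.js.
function applySpeakerEdit(speakerNames, speakerId, rawInput) {
  const next = { ...(speakerNames || {}) };
  const newName = rawInput.trim();
  if (newName) {
    // Flow 1: non-empty input becomes a protected, user-edited name.
    next[speakerId] = { name: newName, confidence: 'user', reason: 'User edited' };
  } else {
    // Flow 2: empty or whitespace-only input removes the entry.
    // Deleting a key that does not exist is a harmless no-op.
    delete next[speakerId];
  }
  return next;
}

let current = {};
current = applySpeakerEdit(current, 0, 'John'); // Flow 1: speaker 0 is user-edited
current = applySpeakerEdit(current, 0, '   ');  // Flow 2: speaker 0 is cleared again
console.log(Object.keys(current).length);       // no entries remain
```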
#### Flow 3: Auto-Detection Merge
```
Auto-detection returns: { 0: {name: "Alice", ...}, 1: {name: "Bob", ...} }
↓
Merge logic checks each speaker:
- Speaker 0: state.speakerNames[0] has confidence="user"?
  → YES: Keep "John", skip "Alice"
  → NO (cleared or auto-detected): Use "Alice"
- Speaker 1: state.speakerNames[1] has confidence="user"?
  → YES: Keep user name
  → NO (cleared or auto-detected): Use "Bob"
↓
Re-render UI with merged names
```
## Implementation
### Change 1: Update Manual Edit Logic
**File:** `frontend/app.js:startSpeakerEdit()` (lines ~665-680)
```javascript
const finishEdit = (save = true) => {
  const newName = input.value.trim();
  if (save) {
    if (newName) {
      // Non-empty name: Save as user-edited
      if (!state.speakerNames) state.speakerNames = {};
      state.speakerNames[speakerId] = {
        name: newName,
        confidence: 'user',
        reason: 'User edited'
      };
    } else {
      // Empty name: Clear from state (allow auto-detection)
      if (state.speakerNames && state.speakerNames[speakerId]) {
        delete state.speakerNames[speakerId];
      }
    }
    // Force re-render to update all UI elements
    renderTranscript(true);
    renderTimelineSegments();
    renderDiarizationStats();
  } else {
    // Cancel edit: Restore current name
    const originalName = state.speakerNames?.[speakerId]?.name
      || `Speaker ${speakerId + 1}`;
    speakerTag.textContent = originalName;
    speakerTag.classList.add('editable-speaker');
  }
};
```
**Key Changes:**
- Added `else` branch for empty input
- `delete state.speakerNames[speakerId]` removes from state
- Triggers re-render even for empty input (to show default name)
- Cancel (Escape) still restores original name without clearing
### Change 2: Enhance Merge Logic (Already Correct!)
**File:** `frontend/app.js:handleSpeakerNameDetection()` (lines ~1030-1037)
```javascript
// Current code is already correct!
const mergedNames = { ...speakerNames };
if (state.speakerNames) {
  Object.entries(state.speakerNames).forEach(([speakerId, info]) => {
    if (info.confidence === 'user') {
      // Preserve user-edited names
      mergedNames[speakerId] = info;
    }
  });
}
```
**Why it's correct:**
- If the speaker was cleared, its key is no longer present in `state.speakerNames`
- `Object.entries()` therefore never yields that speaker, so nothing is copied over the detected name
- Auto-detected name fills the gap ✓
**No changes needed here!**
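To illustrate why the unchanged merge is sufficient, the sketch below extracts the same rule into a standalone function and runs it before and after a clear. The function name `mergeSpeakerNames` and its parameters are assumptions for illustration; in `app.js` the logic lives inline in `handleSpeakerNameDetection()`.

```javascript
// Same merge rule as the snippet above, wrapped in a hypothetical
// standalone function so it can be run in isolation.
function mergeSpeakerNames(detected, existing) {
  const merged = { ...detected };
  for (const [speakerId, info] of Object.entries(existing || {})) {
    if (info.confidence === 'user') {
      merged[speakerId] = info; // user edits always win
    }
  }
  return merged;
}

const existing = {
  0: { name: 'John', confidence: 'user' },
  1: { name: 'Alice', confidence: 'user' },
};
const detected = { 0: { name: 'Dr. Smith', confidence: 'high' } };

// Before the clear: speaker 0 keeps the user-edited name.
const before = mergeSpeakerNames(detected, existing);

// The user clears speaker 0, which removes the key from state.
delete existing[0];

// After the clear: the detected name fills the gap, speaker 1 is untouched.
const after = mergeSpeakerNames(detected, existing);
console.log(before[0].name, after[0].name, after[1].name);
```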
## Behavior Examples
### Example 1: Clear and Auto-Detect
**Initial State:**
```javascript
state.speakerNames = {
0: { name: "John", confidence: "user" },
1: { name: "Alice", confidence: "user" }
}
Transcript:
[John] Hello everyone...
[Alice] Hi there...
[John] Today we'll discuss...
```
**User Action 1:** Clear Speaker 0
```
User clicks "John" tag, deletes all text, presses Enter
state.speakerNames = {
1: { name: "Alice", confidence: "user" }
}
// Speaker 0 removed from state
Transcript:
[Speaker 1] Hello everyone... ← Shows default
[Alice] Hi there... ← User-edited preserved
[Speaker 1] Today we'll discuss... ← Shows default
```
**User Action 2:** Click "Detect Speaker Names"
```
Auto-detection returns:
{ 0: {name: "Dr. Smith", confidence: "high"} }
Merge logic:
- Speaker 0: No user edit → Use "Dr. Smith" ✓
- Speaker 1: User edited → Keep "Alice" ✓
state.speakerNames = {
0: { name: "Dr. Smith", confidence: "high" },
1: { name: "Alice", confidence: "user" }
}
Transcript:
[Dr. Smith] Hello everyone... ← Auto-detected
[Alice] Hi there... ← User-edited preserved
[Dr. Smith] Today we'll discuss... ← Auto-detected
```
### Example 2: Edit, Clear, Edit Again
**Initial State:**
```javascript
state.speakerNames = {}
Transcript:
[Speaker 1] Hello...
[Speaker 2] Hi...
```
**Step 1:** User edits
```
Edit Speaker 1 → "Wrong Name"
state.speakerNames = {
0: { name: "Wrong Name", confidence: "user" }
}
Transcript:
[Wrong Name] Hello...
```
**Step 2:** User realizes mistake, clears
```
Edit "Wrong Name" → "" (empty)
state.speakerNames = {} // Speaker 0 removed
Transcript:
[Speaker 1] Hello... ← Back to default
```
**Step 3:** User edits correctly
```
Edit Speaker 1 → "Correct Name"
state.speakerNames = {
0: { name: "Correct Name", confidence: "user" }
}
Transcript:
[Correct Name] Hello...
```
### Example 3: Cancel vs Clear
**Scenario A: Cancel Edit (Escape key)**
```
Tag shows "John"
User clicks tag, deletes text, presses Escape
→ Input cancelled, "John" restored
→ state.speakerNames unchanged
```
**Scenario B: Clear Edit (Enter key)**
```
Tag shows "John"
User clicks tag, deletes text, presses Enter
→ Edit saved with empty value
→ state.speakerNames[speakerId] deleted
→ Tag shows "Speaker 1" (default)
```
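The difference between the two scenarios comes down to which flag reaches `finishEdit`. A minimal sketch of that key handling follows, assuming Enter maps to `finishEdit(true)` and Escape to `finishEdit(false)`; `saveFlagForKey` and `onEditKeydown` are hypothetical names, and the actual listener in `startSpeakerEdit()` may be structured differently.

```javascript
// Hypothetical mapping from a pressed key to the finishEdit() argument.
// Returns true (save, which clears when the input is empty),
// false (cancel), or null (keep editing).
function saveFlagForKey(key) {
  if (key === 'Enter') return true;
  if (key === 'Escape') return false;
  return null;
}

// Assumed wiring: `finishEdit` is the function defined in startSpeakerEdit().
function onEditKeydown(event, finishEdit) {
  const save = saveFlagForKey(event.key);
  if (save === null) return; // ordinary typing: do nothing
  event.preventDefault();
  finishEdit(save);
}
```

In the browser this would be attached with something like `input.addEventListener('keydown', (e) => onEditKeydown(e, finishEdit))`.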
## Edge Cases
### Edge Case 1: Clear Non-Existent Name
```
Speaker has default name "Speaker 1" (not in state)
User clicks tag, clears (empty input), presses Enter
Check: state.speakerNames[0] exists?
→ NO: Nothing to delete
→ Result: No error, shows "Speaker 1" (unchanged)
```
### Edge Case 2: Clear During Transcription
```
Live transcription in progress
User clears speaker name
Result:
- Name cleared from state ✓
- Re-render triggered ✓
- New utterances show default name ✓
- Incremental rendering preserved ✓
```
### Edge Case 3: Clear All Names
```
User clears all speaker names
state.speakerNames = {}
Auto-detection:
- All speakers available for detection ✓
- Can fill all empty names ✓
```
### Edge Case 4: Whitespace-Only Input
```
User enters " " (spaces only)
Validation: " ".trim() === ""
→ Treated as empty input
→ Name cleared from state ✓
```
## Testing Scenarios
### ✅ Test 1: Clear User-Edited Name
1. Edit "Speaker 1" → "John"
2. Verify: Tag shows "John"
3. Edit "John" → "" (empty)
4. Verify: Tag shows "Speaker 1"
5. Verify: `state.speakerNames[0]` is undefined
### ✅ Test 2: Auto-Detection After Clear
1. Edit "Speaker 1" → "Wrong"
2. Click "Detect Speaker Names"
3. Verify: "Wrong" preserved (not overwritten)
4. Clear "Wrong" → ""
5. Click "Detect Speaker Names" again
6. Verify: Auto-detected name appears
### ✅ Test 3: Cancel Does Not Clear
1. Tag shows "John"
2. Click tag, delete text
3. Press Escape
4. Verify: Tag shows "John" (restored)
5. Verify: `state.speakerNames[0]` unchanged
### ✅ Test 4: Empty Edit Triggers Re-Render
1. Tag shows "John"
2. Clear name → ""
3. Verify: All tags for speaker 0 show "Speaker 1"
4. Verify: Timeline segments updated
5. Verify: Stats panel updated
### ✅ Test 5: Clear and Re-Edit
1. Edit "Speaker 1" → "First"
2. Clear "First" → ""
3. Edit "Speaker 1" → "Second"
4. Verify: All tags show "Second"
5. Verify: Protected from auto-detection
### ✅ Test 6: Whitespace Handling
1. Edit "Speaker 1" → " " (spaces)
2. Press Enter
3. Verify: Treated as empty, name cleared
4. Verify: Tag shows "Speaker 1"
## User Experience Improvements
### Visual Feedback
Consider adding visual hints:
```css
/* Indicate clearable/editable state */
.speaker-tag.editable-speaker {
  cursor: text;
  border-style: dashed; /* Hint: editable */
}

.speaker-tag.editable-speaker:hover::after {
  content: " ✎"; /* Pencil icon */
  opacity: 0.5;
}
```
### Tooltip Enhancement
```javascript
// In createUtteranceElement()
if (speakerInfo?.confidence === 'user') {
  speakerTag.title = 'User-edited name (click to edit or clear)';
} else if (speakerInfo?.confidence === 'high') {
  speakerTag.title = 'Auto-detected name (click to override)';
} else {
  speakerTag.title = 'Click to edit speaker name';
}
```
### Clear Button (Optional)
Optionally, add an explicit "Clear" button in edit mode so users do not have to discover that submitting an empty name resets it.
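One way such a button could be wired is sketched here. `makeClearHandler` is a hypothetical helper, and `input` and `finishEdit` are assumed to be the edit field and the function from `startSpeakerEdit()`; they are passed in as parameters so the sketch has no DOM dependency.

```javascript
// Hypothetical helper: builds the click handler for a "Clear" button.
// It behaves exactly like the user deleting all text and pressing Enter.
function makeClearHandler(input, finishEdit) {
  return (event) => {
    event?.preventDefault?.(); // tolerate being called without an event
    input.value = '';          // same as deleting all text
    finishEdit(true);          // save path: an empty name removes the entry
  };
}
```

In the browser the handler would be attached with `clearButton.addEventListener('click', makeClearHandler(input, finishEdit))`, where `clearButton` is whatever element the edit UI provides. Reusing `finishEdit(true)` keeps a single code path for clearing, so the button and the empty-input gesture cannot drift apart.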
## Implementation Checklist
- [x] Analyze current merge logic
- [x] Design clear/reset mechanism
- [x] Document three speaker name states
- [ ] Implement empty input handling in `startSpeakerEdit()`
- [ ] Test manual clear functionality
- [ ] Test auto-detection after clear
- [ ] Test cancel vs clear behavior
- [ ] Verify timeline and stats panel sync
- [ ] Update documentation
- [ ] Commit changes
## Files to Modify
### `/home/luigi/VoxSum/frontend/app.js`
- **Function:** `startSpeakerEdit()` (lines ~665-690)
- **Change:** Add `else` branch for empty input to delete from state
- **Impact:** ~10 lines modified
### No Other Files Required
- Merge logic already correct (no changes needed)
- UI rendering already supports undefined names (shows default)
## Performance Considerations
- **Delete operation:** O(1) - fast
- **Re-render trigger:** Same as edit (~10-50ms)
- **Memory:** Reduces state size (removes cleared entries)
## Backward Compatibility
- ✅ Existing user-edited names preserved
- ✅ Auto-detection logic unchanged
- ✅ Default name display unchanged
- ✅ No breaking changes to API
## Conclusion
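The enhancement lets users intentionally clear a speaker name so that auto-detection can fill it again, providing a clear "reset" path while keeping user edits protected. The implementation is small (one `else` branch in `finishEdit`). The only behavior that changes is saving an empty name, which now clears the entry instead of restoring the previous name; the merge logic, cancel behavior, and default-name rendering stay as they are.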