# State Persistence Bug - Analysis and Fix

## Date
October 1, 2025

## Overview
Speaker names, the summary, and the title from the first audio file persist and are displayed incorrectly after a different audio source is loaded (file upload, YouTube, or podcast).

## Problem Statement (Bug 2.4.4)

### User Story
**As a user**, when I:
1. Load an audio file and transcribe it
2. Edit/detect speaker names (e.g., "Alice", "Bob")
3. Generate summary and title
4. Load a DIFFERENT audio file (upload, YouTube, podcast)
5. **Expected:** Clean slate - no speaker names, no summary, no title
6. **Actual:** Previous speaker names, summary, and title still visible

### Visual Example

**Scenario:**
```
Step 1: Load "podcast_interview.mp3"
  - Transcribe → 2 speakers detected
  - Edit names: Speaker 0 = "Alice", Speaker 1 = "Bob"
  - Generate summary: "Interview about AI..."
  - Title: "AI Discussion with Alice"

Step 2: Load "meeting_recording.mp3" (different audio)
  - Audio player shows new file ✓
  - Transcript: EMPTY (not yet transcribed) ✓
  - Speaker names: Still shows "Alice", "Bob" from previous file ✗
  - Summary: Still shows "Interview about AI..." ✗
  - Title: Still shows "AI Discussion with Alice" ✗

Step 3: Transcribe new audio
  - New transcript appears with 3 speakers
  - Tags show: "Alice", "Bob", "Speaker 3" (mixed old/new!) ✗
  - Summary: Still old summary ✗
```

### Impact
- **Confusion:** Users see speaker names from different audio files
- **Data Integrity:** Mixed data from multiple sessions
- **Trust Issue:** Users can't trust the displayed information
- **UX Problem:** Must manually clear/reset before each new file

## Root Cause Analysis

### Current State Management

**State Object:**
```javascript
const state = {
  config: { moonshine: {}, sensevoice: {}, llms: {} },
  backend: 'sensevoice',
  utterances: [],
  diarizedUtterances: null,
  diarizationStats: null,
  speakerNames: {},        // ❌ NOT reset when source changes
  summary: '',             // ❌ NOT reset when source changes
  title: '',               // ❌ NOT reset when source changes
  audioUrl: null,
  sourcePath: null,
  uploadedFile: null,
  transcribing: false,
  summarizing: false,
  detectingSpeakerNames: false,
  transcriptionController: null,
  summaryController: null,
};
```

### Existing Reset Function

**Location:** `frontend/app.js:resetTranscriptionState()` (lines 265-273)

```javascript
function resetTranscriptionState() {
  state.utterances = [];
  state.diarizedUtterances = null;
  state.diarizationStats = null;
  activeUtteranceIndex = -1;
  elements.transcriptList.innerHTML = '';
  elements.utteranceCount.textContent = '';
  elements.diarizationPanel.classList.add('hidden');
  // ❌ MISSING: state.speakerNames = {};
  // ❌ MISSING: state.summary = '';
  // ❌ MISSING: state.title = '';
  // ❌ MISSING: Clear summary/title UI elements
}
```

**Called only by:** `handleTranscription()` (line 302)

### Source Change Functions

#### Function 1: `handleFileUpload()` (lines 1119-1127)
```javascript
function handleFileUpload(event) {
  const file = event.target.files?.[0];
  if (!file) return;
  state.uploadedFile = file;
  state.audioUrl = null;
  const objectUrl = URL.createObjectURL(file);
  elements.audioPlayer.src = objectUrl;
  setStatus(`Loaded ${file.name}`, 'info');
  // ❌ MISSING: No call to reset state
}
```

#### Function 2: `handleYoutubeFetch()` (lines 1129-1147)
```javascript
async function handleYoutubeFetch() {
  // ... fetch logic ...
  state.audioUrl = data.audioUrl;
  state.uploadedFile = null;
  elements.audioPlayer.src = data.audioUrl;
  setStatus('YouTube audio ready', 'success');
  // ❌ MISSING: No call to reset state
}
```

#### Function 3: `downloadEpisode()` (lines 1226-1258)
```javascript
async function downloadEpisode(audioUrl, title, triggerButton = null) {
  // ... download logic ...
  state.audioUrl = data.audioUrl;
  state.uploadedFile = null;
  elements.audioPlayer.src = data.audioUrl;
  setStatus('Episode ready', 'success');
  // ❌ MISSING: No call to reset state
}
```

### Why It Happens

**Problem Flow:**
```
1. User loads Audio A
   → state.speakerNames, summary, title are empty

2. User transcribes Audio A
   → resetTranscriptionState() called (clears transcript, but NOT speaker names)
   → Transcription creates new utterances
   → state.speakerNames gets populated

3. User edits speaker names, generates summary
   → state.speakerNames = { 0: "Alice", 1: "Bob" }
   → state.summary = "Interview..."
   → state.title = "AI Discussion"

4. User loads Audio B (via upload, YouTube, or podcast)
   → handleFileUpload/handleYoutubeFetch/downloadEpisode called
   → Audio player source changed ✓
   → state.audioUrl/uploadedFile updated ✓
   → BUT state.speakerNames, summary, title NOT cleared ✗

5. User transcribes Audio B
   → resetTranscriptionState() called
   → Clears utterances, diarization stats ✓
   → BUT does NOT clear speakerNames, summary, title ✗
   → New transcription with old speaker names appears!
```
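
The flow above can be reproduced without a browser. The following is a minimal sketch, assuming a stripped-down state object and an assumed display rule (a stored name wins, otherwise a 1-based "Speaker N" label). `demoState`, `resetTranscriptOnly()`, `loadSource()`, and `speakerLabel()` are hypothetical stand-ins for illustration, not code from `app.js`.

```javascript
// Minimal model of the bug. Only the fields relevant to the flow are kept;
// every name in this block is a hypothetical stand-in, not app.js code.
const demoState = { utterances: [], speakerNames: {}, summary: '', title: '', audioUrl: null };

// Mirrors the pre-fix reset: transcript data only.
function resetTranscriptOnly() {
  demoState.utterances = [];
}

// Mirrors the pre-fix source handlers: swap the audio, reset nothing.
function loadSource(url) {
  demoState.audioUrl = url;
}

// Assumed display rule: use a stored name if present, else "Speaker N" (1-based).
function speakerLabel(speakerId) {
  return demoState.speakerNames[speakerId] ?? `Speaker ${speakerId + 1}`;
}

// Audio A: transcribe, rename speakers, summarize.
loadSource('podcast_interview.mp3');
resetTranscriptOnly();
demoState.utterances = [{ speaker: 0 }, { speaker: 1 }];
demoState.speakerNames = { 0: 'Alice', 1: 'Bob' };
demoState.summary = 'Interview about AI...';

// Audio B: three speakers, but the old names are still in state.
loadSource('meeting_recording.mp3');
resetTranscriptOnly();
demoState.utterances = [{ speaker: 0 }, { speaker: 1 }, { speaker: 2 }];

const labelsAfterSwitch = demoState.utterances.map((u) => speakerLabel(u.speaker));
console.log(labelsAfterSwitch); // stale names mixed with a default label
console.log(demoState.summary); // summary from Audio A is still present
```

Under these assumptions the third speaker gets a default label while the first two inherit names from the previous file, which matches the mixed tags described in the visual example.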

## Solution Design

### Design Principles
1. **Complete Reset:** Clear ALL session-specific data when source changes
2. **Clear Intent:** Reset should happen immediately when new source loaded
3. **Separation of Concerns:** 
   - Transcription reset: Clear transcription-related data
   - Session reset: Clear ALL session data including summary, title, speaker names
4. **Consistent Behavior:** Same reset logic for all source types (upload, YouTube, podcast)

### Two-Level Reset Strategy

#### Level 1: Reset Transcription Data (Existing)
**When:** Before starting new transcription  
**What:** Utterances, diarization stats, transcript UI

```javascript
function resetTranscriptionState() {
  state.utterances = [];
  state.diarizedUtterances = null;
  state.diarizationStats = null;
  activeUtteranceIndex = -1;
  elements.transcriptList.innerHTML = '';
  elements.utteranceCount.textContent = '';
  elements.diarizationPanel.classList.add('hidden');
}
```

#### Level 2: Reset Complete Session (NEW)
**When:** When new audio source is loaded  
**What:** Everything from Level 1 + speaker names + summary + title

```javascript
function resetCompleteSession() {
  // Level 1: reset transcription data
  resetTranscriptionState();

  // Level 2: reset the remaining session data (speaker names, summary, title)
  state.speakerNames = {};
  state.summary = '';
  state.title = '';
  elements.summaryOutput.innerHTML = '';
  elements.titleOutput.textContent = '';

  // Re-render the timeline; it will be empty because utterances were cleared
  renderTimelineSegments();

  // Optional: hide the detect-speaker-names button
  elements.detectSpeakerNamesBtn.classList.add('hidden');
}
```
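
The root cause was a reset function that fell out of step with the state object. One possible way to reduce that risk is to define the session-scoped defaults in a single place and derive the reset from them. This is only a sketch of that idea: `createSessionDefaults()` and `resetSessionFields()` are hypothetical helpers that do not exist in `app.js`, and the DOM clearing shown in the functions above would still be needed alongside them.

```javascript
// Sketch only: createSessionDefaults() and resetSessionFields() are
// hypothetical helpers, not functions that exist in app.js.
function createSessionDefaults() {
  // A fresh object on each call, so arrays and objects are never shared
  // between sessions.
  return {
    utterances: [],
    diarizedUtterances: null,
    diarizationStats: null,
    speakerNames: {},
    summary: '',
    title: '',
  };
}

// Overwrites only the session-scoped keys; config, backend, and other
// long-lived fields on the target are left untouched.
function resetSessionFields(target) {
  return Object.assign(target, createSessionDefaults());
}

const sketchState = {
  backend: 'sensevoice',
  utterances: [{ speaker: 0, text: 'hello' }],
  diarizedUtterances: null,
  diarizationStats: null,
  speakerNames: { 0: 'Alice', 1: 'Bob' },
  summary: 'Interview about AI...',
  title: 'AI Discussion with Alice',
};

resetSessionFields(sketchState);
console.log(sketchState.speakerNames, sketchState.summary, sketchState.backend);
```

With this shape, adding a new session field means adding it to the defaults once, rather than remembering to clear it in every reset path.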

## Implementation

### Change 1: Create `resetCompleteSession()` Function

**File:** `frontend/app.js` (after `resetTranscriptionState()`)

```javascript
function resetCompleteSession() {
  // Reset transcription data
  resetTranscriptionState();
  
  // Reset speaker names
  state.speakerNames = {};
  
  // Reset summary and title
  state.summary = '';
  state.title = '';
  
  // Clear summary and title UI
  elements.summaryOutput.innerHTML = '';
  elements.titleOutput.textContent = '';
  
  // Reset timeline visualization
  renderTimelineSegments();
  
  // Hide speaker name detection button
  elements.detectSpeakerNamesBtn.classList.add('hidden');
  
  // Reset status (each caller below overwrites this with its own message)
  setStatus('Ready for new transcription', 'info');
}
```

### Change 2: Call Reset on File Upload

**File:** `frontend/app.js:handleFileUpload()` (lines ~1119-1127)

```javascript
function handleFileUpload(event) {
  const file = event.target.files?.[0];
  if (!file) return;
  
  // Reset complete session when new file loaded
  resetCompleteSession();
  
  state.uploadedFile = file;
  state.audioUrl = null;
  const objectUrl = URL.createObjectURL(file);
  elements.audioPlayer.src = objectUrl;
  setStatus(`Loaded ${file.name}`, 'info');
}
```

### Change 3: Call Reset on YouTube Fetch

**File:** `frontend/app.js:handleYoutubeFetch()` (lines ~1129-1147)

```javascript
async function handleYoutubeFetch() {
  if (!elements.youtubeUrl.value.trim()) return;
  setStatus('Downloading audio from YouTube...', 'info');
  try {
    const res = await fetch('/api/youtube/fetch', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ url: elements.youtubeUrl.value.trim() }),
    });
    if (!res.ok) throw new Error('YouTube download failed');
    const data = await res.json();
    
    // Reset complete session when new YouTube audio loaded
    resetCompleteSession();
    
    state.audioUrl = data.audioUrl;
    state.uploadedFile = null;
    elements.audioPlayer.src = data.audioUrl;
    setStatus('YouTube audio ready', 'success');
  } catch (err) {
    console.error(err);
    setStatus(err.message, 'error');
  }
}
```

### Change 4: Call Reset on Podcast Episode Download

**File:** `frontend/app.js:downloadEpisode()` (lines ~1226-1258)

```javascript
async function downloadEpisode(audioUrl, title, triggerButton = null) {
  setStatus('Downloading episode...', 'info');
  let originalLabel = null;
  if (triggerButton) {
    originalLabel = triggerButton.innerHTML;
    triggerButton.disabled = true;
    triggerButton.classList.add('loading');
    triggerButton.textContent = 'Downloading…';
  }
  try {
    const res = await fetch('/api/podcast/download', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ audioUrl, title }),
    });
    if (!res.ok) throw new Error('Episode download failed');
    const data = await res.json();
    
    // Reset complete session when new episode loaded
    resetCompleteSession();
    
    state.audioUrl = data.audioUrl;
    state.uploadedFile = null;
    elements.audioPlayer.src = data.audioUrl;
    setStatus('Episode ready', 'success');
    // ... rest of the function
  } catch (err) {
    // ... error handling
  }
}
```
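
Changes 2 to 4 repeat the same sequence: reset, update `state.audioUrl` and `state.uploadedFile`, then point the player at the new source. A hedged sketch of how that sequence could be centralized follows. `applyNewSource()` is a hypothetical helper, the state, player, and reset callback are passed in so the example runs outside the browser, and `/media/example.mp3` is a placeholder URL.

```javascript
// Sketch only: applyNewSource() is a hypothetical helper, not app.js code.
function applyNewSource(state, audioPlayer, resetSession, source) {
  // Clear everything tied to the previous audio before touching the new one.
  resetSession();

  // Exactly one of uploadedFile / audioUrl is set, matching the handlers above.
  state.uploadedFile = source.file ?? null;
  state.audioUrl = source.file ? null : (source.audioUrl ?? null);
  audioPlayer.src = source.playbackUrl;
}

// Stand-ins for the real state object, <audio> element, and reset function.
const fakeState = {
  speakerNames: { 0: 'Alice' },
  summary: 'old summary',
  title: 'old title',
  audioUrl: null,
  uploadedFile: null,
};
const fakePlayer = { src: '' };
const fakeReset = () => {
  fakeState.speakerNames = {};
  fakeState.summary = '';
  fakeState.title = '';
};

// YouTube or podcast case: the server-provided URL is also the playback URL.
applyNewSource(fakeState, fakePlayer, fakeReset, {
  audioUrl: '/media/example.mp3',
  playbackUrl: '/media/example.mp3',
});
console.log(fakeState, fakePlayer.src);
```

For the upload case, the caller would pass the `File` as `source.file` and the object URL as `source.playbackUrl`. Whether this indirection is worth it for three call sites is a judgment call; the inline calls shown in Changes 2 to 4 achieve the same behavior.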

## Behavior After Fix

### Example Scenario

**Step 1: Load and Process First Audio**
```
1. Upload "interview.mp3"
   → resetCompleteSession() called
   → Clean slate: no utterances, speaker names, summary, title

2. Transcribe
   → resetTranscriptionState() called (redundant but harmless)
   → Transcript appears, 2 speakers detected

3. Edit speaker names
   → state.speakerNames = { 0: "Alice", 1: "Bob" }

4. Generate summary
   → state.summary = "Interview about AI..."
   → state.title = "AI Discussion"
```

**Step 2: Load Different Audio**
```
1. Upload "meeting.mp3"
   → resetCompleteSession() called ✓
   → state.speakerNames = {} (cleared) ✓
   → state.summary = '' (cleared) ✓
   → state.title = '' (cleared) ✓
   → Summary UI cleared ✓
   → Title UI cleared ✓
   → Timeline cleared ✓
   → Status: "Loaded meeting.mp3"

2. Transcribe
   → Fresh transcript with 3 speakers
   → Speaker tags show: "Speaker 1", "Speaker 2", "Speaker 3" ✓
   → No contamination from previous audio ✓
```

**Step 3: Generate New Summary**
```
1. Click "Generate Summary"
   → New summary generated for current audio ✓
   → No old summary to replace (already cleared on load) ✓
   → New title generated ✓
```

## Edge Cases

### Edge Case 1: Upload Same File Twice
```
1. Upload "audio.mp3"
   → resetCompleteSession() called
2. Transcribe and edit
3. Upload same "audio.mp3" again
   → resetCompleteSession() called (data cleared)
   → User must transcribe again
   
Decision: Acceptable - user explicitly chose to reload
```

### Edge Case 2: Change Source During Transcription
```
1. Start transcription of "audio1.mp3"
2. Mid-transcription, upload "audio2.mp3"
   → resetCompleteSession() called
   → Partial transcription cleared
   → New audio loaded

Decision: Acceptable - user action indicates intent to switch
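Note: Transcription abort handling already exists, but resetCompleteSession()
as written does not invoke it. If the in-flight request is not aborted,
its remaining results can still arrive after the reset.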
```
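
For this edge case, the session reset may also need to cancel in-flight work so that late results from the old audio do not repopulate the cleared state. The sketch below assumes that `state.transcriptionController` and `state.summaryController` hold `AbortController` instances (or `null`), as their names suggest. `abortInFlightWork()` is a hypothetical helper, not an existing function in `app.js`.

```javascript
// Sketch only: abortInFlightWork() is a hypothetical helper. It assumes the
// two controller fields hold AbortController instances or null.
function abortInFlightWork(state) {
  for (const key of ['transcriptionController', 'summaryController']) {
    const controller = state[key];
    if (controller) {
      controller.abort(); // signals anything listening on controller.signal
      state[key] = null;
    }
  }
  state.transcribing = false;
  state.summarizing = false;
}

// Stand-in state with a transcription in progress.
const inFlight = new AbortController();
const busyState = {
  transcriptionController: inFlight,
  summaryController: null,
  transcribing: true,
  summarizing: false,
};

abortInFlightWork(busyState);
console.log(inFlight.signal.aborted, busyState.transcriptionController);
```

If the existing abort path already clears the `transcribing` and `summarizing` flags (for example in a `finally` block), the last two assignments would be redundant and could be dropped.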

### Edge Case 3: YouTube Fetch While Audio Playing
```
1. Upload file, play audio
2. Fetch YouTube audio
   → resetCompleteSession() called
   → Audio player source changed
   → Playback stops (normal behavior)

Decision: Acceptable - expected behavior when changing source
```

### Edge Case 4: Multiple Podcast Episodes in Sequence
```
1. Download episode 1
   → resetCompleteSession()
2. Transcribe episode 1
3. Download episode 2
   → resetCompleteSession() (episode 1 data cleared)
4. Transcribe episode 2

Decision: Correct behavior - each episode is independent
```

## UI Elements to Reset

### Complete Checklist

**State Variables:**
- [x] `state.utterances` (via resetTranscriptionState)
- [x] `state.diarizedUtterances` (via resetTranscriptionState)
- [x] `state.diarizationStats` (via resetTranscriptionState)
- [x] `state.speakerNames` (NEW)
- [x] `state.summary` (NEW)
- [x] `state.title` (NEW)
- [x] `activeUtteranceIndex` (via resetTranscriptionState)

**DOM Elements:**
- [x] `elements.transcriptList` (via resetTranscriptionState)
- [x] `elements.utteranceCount` (via resetTranscriptionState)
- [x] `elements.diarizationPanel` (via resetTranscriptionState)
- [x] `elements.diarizationMetrics` (via renderDiarizationStats after reset)
- [x] `elements.speakerBreakdown` (via renderDiarizationStats after reset)
- [x] `elements.summaryOutput` (NEW)
- [x] `elements.titleOutput` (NEW)
- [x] `elements.timelineSegments` (via renderTimelineSegments)
- [x] `elements.detectSpeakerNamesBtn` visibility (NEW)
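
The state half of this checklist can be checked mechanically. The following is a minimal sketch of such a check, assuming the field names from the state object shown earlier. `findStaleSessionFields()` is a hypothetical test helper and does not cover the DOM elements.

```javascript
// Sketch only: findStaleSessionFields() is a hypothetical test helper.
// It returns the names of session fields that are not in their reset state.
function findStaleSessionFields(state) {
  const isReset = {
    utterances: (v) => Array.isArray(v) && v.length === 0,
    diarizedUtterances: (v) => v === null,
    diarizationStats: (v) => v === null,
    speakerNames: (v) => v !== null && typeof v === 'object' && Object.keys(v).length === 0,
    summary: (v) => v === '',
    title: (v) => v === '',
  };
  return Object.keys(isReset).filter((key) => !isReset[key](state[key]));
}

// A state that still carries data from a previous file.
const staleState = {
  utterances: [],
  diarizedUtterances: null,
  diarizationStats: null,
  speakerNames: { 0: 'Alice' },
  summary: 'Interview about AI...',
  title: '',
};

const staleFields = findStaleSessionFields(staleState);
console.log(staleFields); // [ 'speakerNames', 'summary' ]
```

Calling the helper right after a source change and expecting an empty array would catch a regression of this bug for the state variables listed above.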

## Testing Scenarios

### ✅ Test 1: Upload → Edit → Upload New
1. Upload "audio1.mp3"
2. Transcribe, edit speaker names to "Alice", "Bob"
3. Generate summary "Summary 1"
4. Upload "audio2.mp3"
5. **Verify:** Speaker names cleared, summary cleared, title cleared
6. Transcribe
7. **Verify:** Speaker tags show "Speaker 1", "Speaker 2" (not Alice/Bob)

### ✅ Test 2: YouTube → Summary → Podcast
1. Fetch YouTube audio
2. Transcribe, generate summary
3. Download podcast episode
4. **Verify:** YouTube summary cleared
5. Transcribe podcast
6. **Verify:** Independent transcript and summary

### ✅ Test 3: Podcast → Names → YouTube
1. Download podcast
2. Transcribe, detect speaker names
3. Fetch YouTube audio
4. **Verify:** Podcast speaker names cleared
5. Transcribe YouTube
6. **Verify:** No podcast names visible

### ✅ Test 4: Rapid Source Changes
1. Upload file
2. Immediately fetch YouTube (before transcription)
3. **Verify:** File data cleared, YouTube ready
4. Immediately download podcast
5. **Verify:** YouTube data cleared, podcast ready

### ✅ Test 5: Same Source Reload
1. Upload "audio.mp3", transcribe, edit
2. Upload same "audio.mp3" again
3. **Verify:** Previous edits cleared (fresh start)

### ✅ Test 6: Timeline Visualization
1. Upload audio, transcribe (timeline segments appear)
2. Upload different audio
3. **Verify:** Timeline segments cleared (empty)
4. Transcribe new audio
5. **Verify:** New timeline segments appear

## Performance Considerations
- **resetCompleteSession():** Constant-time state assignments, plus DOM clearing whose cost scales with the number of rendered transcript and timeline nodes
- **Called only on source change:** Infrequent user action
- **Impact:** Negligible (<1ms)

## Backward Compatibility
- ✅ Existing `resetTranscriptionState()` unchanged
- ✅ New function adds capability, doesn't break existing code
- ✅ No API changes required
- ✅ No breaking changes to user workflow

## Implementation Checklist

- [ ] Create `resetCompleteSession()` function
- [ ] Update `handleFileUpload()` to call reset
- [ ] Update `handleYoutubeFetch()` to call reset
- [ ] Update `downloadEpisode()` to call reset
- [ ] Test all source change scenarios
- [ ] Verify UI elements cleared
- [ ] Verify no data contamination between sessions
- [ ] Update documentation
- [ ] Commit changes

## Related Bugs
- Bug 2.4.1: Manual speaker name propagation (Fixed)
- Bug 2.4.2: Auto-detection UI update (Fixed)
- Bug 2.4.3: Clear name to enable detection (Fixed)
- Bug 2.4.4: State persistence across audio files (This bug)

## Files to Modify

### `/home/luigi/VoxSum/frontend/app.js`
- **New Function:** `resetCompleteSession()` (after line 273)
- **Modify:** `handleFileUpload()` (line ~1122)
- **Modify:** `handleYoutubeFetch()` (line ~1141)
- **Modify:** `downloadEpisode()` (line ~1239)
- **Impact:** ~40 lines added/modified

## Conclusion
The bug is caused by incomplete state reset when audio sources change. The solution is to create a comprehensive `resetCompleteSession()` function that clears ALL session data (transcription, speaker names, summary, title) and call it whenever a new audio source is loaded (file upload, YouTube, podcast). This ensures a clean slate for each audio file and prevents data contamination between sessions.