🎵 Custom Audio Player with Visual Timeline
Overview
Complete replacement of the native HTML5 audio player with a custom-built player featuring:
- ✅ Full-width responsive design
- ✅ Visual timeline with utterance segments
- ✅ Color-coded speaker segments (when diarization is enabled)
- ✅ Enhanced controls (play/pause, volume, seek)
- ✅ Keyboard shortcuts
- ✅ All existing functionality preserved
Features
1. Responsive Full-Width Player ✅
The new player automatically fills the available width, providing a better visual experience on all screen sizes.
.audio-player-panel {
  width: 100%;
}
.custom-audio-player {
  width: 100%;
}
2. Visual Timeline with Utterance Segments ✅
Each utterance is visualized as a colored segment in the timeline:
- Position: Exact start/end time as percentage of total duration
- Width: Duration of the utterance
- Hover: Shows speaker name and text preview
- Click: Seeks to that utterance
function renderTimelineSegments() {
  // audio.duration is NaN until the metadata has loaded
  if (!Number.isFinite(audio.duration)) return;
  state.utterances.forEach((utt, index) => {
    const startPercent = (utt.start / audio.duration) * 100;
    const endPercent = (utt.end / audio.duration) * 100;
    // Create visual segment...
  });
}
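The percentage arithmetic can be isolated in a small pure function, which makes it easier to guard against an unknown duration and out-of-range timestamps. This is a minimal sketch: `segmentGeometry` and its return shape are hypothetical names, not part of the existing code.

```javascript
// Hypothetical helper: converts an utterance's start/end (in seconds) into
// CSS percentages for an absolutely positioned timeline segment.
function segmentGeometry(utt, duration) {
  // audio.duration is NaN before metadata loads and Infinity for some streams.
  if (!Number.isFinite(duration) || duration <= 0) return null;

  // Clamp to the track so bad timestamps cannot overflow the timeline.
  const start = Math.min(Math.max(utt.start, 0), duration);
  const end = Math.min(Math.max(utt.end, start), duration);

  return {
    leftPercent: (start / duration) * 100,
    widthPercent: ((end - start) / duration) * 100,
  };
}
```

A caller would skip the segment when the result is `null`, and otherwise assign the two values (with a `%` suffix) to `segment.style.left` and `segment.style.width`.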
3. Speaker Color-Coding ✅
When speaker diarization is enabled, each speaker gets a unique color:
- Speaker 0: Red (#ef4444)
- Speaker 1: Blue (#3b82f6)
- Speaker 2: Green (#10b981)
- Speaker 3: Amber (#f59e0b)
- Speaker 4: Purple (#8b5cf6)
- Speaker 5: Pink (#ec4899)
- Speaker 6: Teal (#14b8a6)
- Speaker 7: Orange (#f97316)
- Speaker 8: Cyan (#06b6d4)
- Speaker 9: Lime (#84cc16)
.speaker-0 { background-color: #ef4444; }
.speaker-1 { background-color: #3b82f6; }
/* ... etc ... */
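Mapping a speaker index to one of these classes could look like the sketch below. Two things here are assumptions rather than documented behavior: the `speaker-default` fallback class name, and the choice to reuse the palette from the start when there are more than ten speakers.

```javascript
// Number of .speaker-N classes defined in the stylesheet (Speaker 0 to 9).
const SPEAKER_COLOR_COUNT = 10;

// Hypothetical helper: returns the CSS class for a speaker index, or a
// fallback class when no speaker is known (for example, diarization is off).
function speakerClass(speaker) {
  // "speaker-default" is an assumed class name for the default color.
  if (!Number.isInteger(speaker) || speaker < 0) return "speaker-default";
  // Assumption: speakers beyond the tenth wrap around the palette.
  return `speaker-${speaker % SPEAKER_COLOR_COUNT}`;
}
```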
4. Active Segment Highlighting ✅
The currently playing utterance segment is highlighted in the timeline:
- Higher opacity
- Inner shadow effect
- Synchronized with transcript highlighting
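function updateActiveSegment() {
  const currentIndex = findActiveUtterance(audio.currentTime);
  // Clear the previous highlight before applying the new one
  document.querySelector('.timeline-segment.active')?.classList.remove('active');
  const activeSegment = document.querySelector(`.timeline-segment[data-index="${currentIndex}"]`);
  // No segment matches while playback is between utterances
  activeSegment?.classList.add('active');
}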
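The lookup behind `findActiveUtterance` is not shown here. One plausible approach, assuming utterances are sorted by start time and do not overlap, is a binary search. The function name and two-argument signature below are hypothetical and differ from the call above.

```javascript
// Hypothetical lookup: returns the index of the utterance whose
// [start, end) interval contains `time`, or -1 if none does.
// Assumes `utterances` is sorted by start time and non-overlapping.
function findActiveUtteranceIndex(utterances, time) {
  let lo = 0;
  let hi = utterances.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const utt = utterances[mid];
    if (time < utt.start) hi = mid - 1;
    else if (time >= utt.end) lo = mid + 1;
    else return mid;
  }
  return -1; // playback is in a gap between utterances
}
```

With around four `timeupdate` events per second a linear scan would also be fine for typical transcripts; the binary search mainly helps with very long recordings.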
5. Enhanced Controls ✅
Play/Pause Button:
- Circular gradient button
- Smooth icon transition
- Hover effects with glow
Timeline:
- Click anywhere to seek
- Drag handle to seek
- Visual progress bar
- Segments overlay
Volume Control:
- Mute/unmute button with dynamic icon
- Slider for precise control
- Smooth animations
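The interaction between the mute button and the slider can be modeled as a small piece of state, sketched below. All names and the three icons are assumptions, and the sketch assumes the slider is configured to produce values between 0 and 1.

```javascript
// Placeholder icons; the real player may use different glyphs or SVGs.
const ICON_MUTED = "🔇";
const ICON_LOW = "🔉";
const ICON_HIGH = "🔊";

// `state` is a plain object: { volume: number in 0..1, muted: boolean }.
function toggleMute(state) {
  return { ...state, muted: !state.muted };
}

// Moving the slider sets the volume; any non-zero value also unmutes.
function setVolume(state, value) {
  const parsed = Number(value);
  if (Number.isNaN(parsed)) return state; // ignore unparsable input
  const volume = Math.min(Math.max(parsed, 0), 1);
  return { volume, muted: volume === 0 ? state.muted : false };
}

// Chooses the icon shown on the mute button for the given state.
function volumeIcon(state) {
  if (state.muted || state.volume === 0) return ICON_MUTED;
  return state.volume < 0.5 ? ICON_LOW : ICON_HIGH;
}
```

After each change, the resulting state would be copied to the element through its standard `audio.volume` and `audio.muted` properties.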
Time Displays:
- Current time (left)
- Total duration (right)
- Tabular numbers for consistent width
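The time labels need a seconds-to-text formatter. A minimal sketch follows; `formatTime` is a hypothetical name, and the `h:mm:ss` form for audio longer than an hour is an assumption based on the `0:00` placeholder in the markup.

```javascript
// Hypothetical formatter for the current-time and duration labels.
function formatTime(seconds) {
  // Fall back to the placeholder used in the markup for unknown values.
  if (!Number.isFinite(seconds) || seconds < 0) return "0:00";
  const total = Math.floor(seconds);
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = String(total % 60).padStart(2, "0");
  // Only show the hours field when it is needed.
  return h > 0 ? `${h}:${String(m).padStart(2, "0")}:${s}` : `${m}:${s}`;
}
```

The consistent width mentioned above is usually achieved in CSS with `font-variant-numeric: tabular-nums` rather than in the formatter.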
6. Keyboard Shortcuts ✅
- Space: Play/Pause
- Arrow Left: Rewind 5 seconds
- Arrow Right: Forward 5 seconds
- Only active when not typing in input/textarea
document.addEventListener('keydown', (e) => {
  if (e.target.tagName === 'INPUT' || e.target.tagName === 'TEXTAREA') return;
  if (e.code === 'Space') {
    e.preventDefault(); // keep Space from scrolling the page
    audio.paused ? audio.play() : audio.pause();
  }
  // ... arrow keys ...
});
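The elided arrow-key branch can be kept testable by computing the target time in a pure helper. This is a sketch under assumptions: `seekTargetForKey` and `SEEK_STEP_SECONDS` are hypothetical names, and only the 5-second step comes from the list above.

```javascript
const SEEK_STEP_SECONDS = 5; // step size stated in the shortcut list

// Hypothetical helper: returns the new playback time for an arrow key,
// or null if the key is not a seek shortcut.
function seekTargetForKey(code, currentTime, duration) {
  let delta;
  if (code === "ArrowLeft") delta = -SEEK_STEP_SECONDS;
  else if (code === "ArrowRight") delta = SEEK_STEP_SECONDS;
  else return null;

  // If the duration is not yet known, only clamp at zero.
  const max = Number.isFinite(duration) ? duration : Infinity;
  return Math.min(Math.max(currentTime + delta, 0), max);
}
```

Inside the `keydown` handler, the result of `seekTargetForKey(e.code, audio.currentTime, audio.duration)` would be assigned to `audio.currentTime` whenever it is not `null`.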
Technical Implementation
HTML Structure
<section class="panel audio-player-panel">
  <h2>Audio Player</h2>
  <div class="custom-audio-player">
    <audio id="audio-player" preload="auto"></audio>
    <div class="player-controls">
      <!-- Play/Pause Button -->
      <button id="play-pause-btn">
        <span class="play-icon">▶</span>
        <span class="pause-icon hidden">⏸</span>
      </button>
      <!-- Current Time -->
      <span id="current-time">0:00</span>
      <!-- Timeline Container -->
      <div class="timeline-container">
        <canvas id="waveform-canvas"></canvas>
        <div id="timeline-bar">
          <div id="timeline-progress"></div>
          <div id="timeline-segments"></div>
          <div id="timeline-handle"></div>
        </div>
      </div>
      <!-- Duration -->
      <span id="duration-time">0:00</span>
      <!-- Volume Control -->
      <div class="volume-control">
        <button id="volume-btn">🔊</button>
        <input id="volume-slider" type="range" />
      </div>
    </div>
  </div>
</section>
CSS Styling
Responsive Layout:
.player-controls {
  display: flex;
  align-items: center;
  gap: 1rem;
}
.timeline-container {
  flex: 1; /* Takes all available space */
  height: 48px;
}
@media (max-width: 1100px) {
  .player-controls {
    flex-wrap: wrap;
  }
  .timeline-container {
    width: 100%;
    flex-basis: 100%; /* Full width on mobile */
  }
}
Timeline Segments:
.timeline-segment {
  position: absolute;
  height: 100%;
  opacity: 0.4;
  transition: opacity 0.2s ease;
}
.timeline-segment.active {
  opacity: 0.8;
  box-shadow: inset 0 0 10px rgba(255, 255, 255, 0.2);
}
JavaScript Functions
1. Initialization:
function initCustomAudioPlayer() {
  // Set up event listeners for:
  // - Play/Pause
  // - Timeline seeking (click & drag)
  // - Volume control
  // - Keyboard shortcuts
  // - Time updates
}
2. Timeline Rendering:
function renderTimelineSegments() {
  // Clear existing segments
  // For each utterance:
  //   - Calculate position as percentage
  //   - Apply speaker color
  //   - Add tooltip with preview
  //   - Make clickable for seeking
}
3. Position Updates:
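function updateTimelinePosition() {
  // Skip until the duration is known (NaN before metadata loads)
  if (!Number.isFinite(audio.duration) || audio.duration === 0) return;
  const percent = (audio.currentTime / audio.duration) * 100;
  timelineProgress.style.width = `${percent}%`;
  timelineHandle.style.left = `${percent}%`;
}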
4. Seeking:
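function seekToPosition(e) {
  const rect = timelineBar.getBoundingClientRect();
  // Clamp to [0, 1] so dragging past either end stays within the track
  const percent = Math.min(Math.max((e.clientX - rect.left) / rect.width, 0), 1);
  // Assigning a non-finite currentTime throws, so wait for a known duration
  if (!Number.isFinite(audio.duration)) return;
  audio.currentTime = percent * audio.duration;
}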
Integration with Existing Features
1. Bidirectional Synchronization ✅
Player → Transcript:
// Already working via timeupdate event
audio.addEventListener('timeupdate', () => {
  updateActiveUtterance();
  updateActiveSegment(); // NEW: Also update timeline
});
Transcript → Player:
// Click on utterance still works
// Click on timeline segment ALSO works now
segment.addEventListener('click', () => {
  seekToTime(utt.start);
});
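2. Drag-to-Seek ✅
Drag-to-seek is preserved on the custom timeline:
- Native progress bar: Removed (replaced by the custom timeline)
- Custom timeline: Click and drag supported
let isDragging = false;
timelineBar.addEventListener('mousedown', (e) => {
  isDragging = true;
  seekToPosition(e);
});
document.addEventListener('mousemove', (e) => {
  if (isDragging) seekToPosition(e);
});
// End the drag on release, even when the pointer is outside the timeline
document.addEventListener('mouseup', () => {
  isDragging = false;
});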
3. Incremental Rendering ✅
Timeline segments are re-rendered whenever the transcript changes:
function renderTranscript() {
  // ... existing logic ...
  // NEW: Update timeline after transcript changes
  renderTimelineSegments();
}
Visual Design
Color Palette
Player Background: rgba(15, 23, 42, 0.5) - Semi-transparent dark
Timeline Base: rgba(15, 23, 42, 0.6) - Darker for contrast
Progress: linear-gradient(90deg, rgba(56, 189, 248, 0.3), rgba(129, 140, 248, 0.3)) - Blue gradient
Handle: #38bdf8 - Bright cyan
Active Segment: opacity: 0.8 + inner shadow
Gradients
Play/Pause Button:
background: linear-gradient(135deg, #38bdf8 0%, #818cf8 100%);
Hover Effects:
box-shadow: 0 0 20px rgba(56, 189, 248, 0.4);
Performance Considerations
1. DOM Manipulation
Segments created once per utterance:
- Uses DocumentFragment for batch insertion
- Only re-renders when utterances change
- Not updated on every timeupdate (too expensive)
Active segment update:
- Only changes CSS class (cheap)
- No DOM manipulation during playback
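One way to honor "only re-renders when utterances change" is to compare a cheap signature of the utterance list before rebuilding the segments. The sketch below is illustrative: `utteranceSignature` and `createRenderGate` are hypothetical helpers, and the `speaker` field name is assumed.

```javascript
// Builds a string that changes whenever timing or speaker data changes.
function utteranceSignature(utterances) {
  return utterances
    .map((u) => `${u.start}-${u.end}-${u.speaker ?? ""}`)
    .join("|");
}

// Wraps a render function so it only runs when the signature differs
// from the one seen on the previous call.
function createRenderGate(render) {
  let lastSignature = null;
  return (utterances) => {
    const signature = utteranceSignature(utterances);
    if (signature === lastSignature) return false; // nothing changed
    lastSignature = signature;
    render(utterances);
    return true;
  };
}
```

Because the segment tooltips show a text preview, a real implementation would probably also fold the utterance text (or an edit counter) into the signature so that edits refresh the tooltips.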
2. Event Listeners
Throttling not needed:
- timeupdate typically fires about 4 times per second (throttled by the browser)
- Segment updates use a simple class toggle
- No performance issues observed
3. Responsive Behavior
CSS-based responsive:
- No JavaScript media queries
- Pure CSS flexbox
- Smooth transitions
Browser Compatibility
| Feature | Support |
|---|---|
| HTML5 Audio | ✅ All modern browsers |
| Flexbox Layout | ✅ All modern browsers |
| CSS Gradients | ✅ All modern browsers |
| input[type="range"] | ✅ All modern browsers |
| DocumentFragment | ✅ All modern browsers |
| Keyboard Events | ✅ All modern browsers |
Future Enhancements (Optional)
1. Waveform Visualization
The canvas element is already in the markup but is not used yet. A waveform renderer could be added:
function drawWaveform() {
  // Analyze audio buffer
  // Draw waveform on canvas
  // Update on window resize
}
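The "analyze audio buffer" step usually means reducing the raw samples to one peak per horizontal pixel. A minimal sketch of that reduction follows; `computePeaks` is a hypothetical helper, and the samples are assumed to be a `Float32Array` such as the one returned by `AudioBuffer.getChannelData(0)` after decoding the file with the Web Audio API.

```javascript
// Hypothetical helper: reduces raw samples (-1..1) to one peak per bucket,
// suitable for drawing one vertical bar per canvas pixel column.
function computePeaks(samples, bucketCount) {
  const peaks = new Float32Array(bucketCount);
  const bucketSize = samples.length / bucketCount;
  for (let i = 0; i < bucketCount; i++) {
    const from = Math.floor(i * bucketSize);
    // Every bucket covers at least one sample index.
    const to = Math.max(from + 1, Math.floor((i + 1) * bucketSize));
    let peak = 0;
    for (let j = from; j < to && j < samples.length; j++) {
      peak = Math.max(peak, Math.abs(samples[j]));
    }
    peaks[i] = peak;
  }
  return peaks;
}
```

Drawing then reduces to one `fillRect` call per peak on the canvas 2D context, with the peaks recomputed when the canvas width changes.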
2. Playback Speed Control
<select id="playback-rate">
  <option value="0.5">0.5x</option>
  <option value="1" selected>1x</option>
  <option value="1.5">1.5x</option>
  <option value="2">2x</option>
</select>
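Wiring this select to the player only requires setting the standard `playbackRate` property. The sketch below validates the value first; `applyPlaybackRate` and `ALLOWED_RATES` are hypothetical names.

```javascript
// Rates offered by the <select> above.
const ALLOWED_RATES = [0.5, 1, 1.5, 2];

// Hypothetical helper: applies a rate from the select's string value to an
// audio element (or any object with a playbackRate property).
function applyPlaybackRate(audio, value) {
  const rate = Number(value);
  if (!ALLOWED_RATES.includes(rate)) return false; // ignore unexpected values
  audio.playbackRate = rate;
  return true;
}
```

In the page this would be called from a `change` listener on the `#playback-rate` element, passing `e.target.value`.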
3. Loop/Repeat Utterance
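function loopUtterance(index) {
  const utt = state.utterances[index];
  const onTimeUpdate = () => {
    if (audio.currentTime >= utt.end) {
      audio.currentTime = utt.start;
    }
  };
  audio.addEventListener('timeupdate', onTimeUpdate);
  // Return a function that stops the loop, so listeners do not pile up
  return () => audio.removeEventListener('timeupdate', onTimeUpdate);
}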
4. Bookmark/Marker System
Allow users to add markers at specific times for later reference.
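As a starting point, markers could live in a small in-memory list kept sorted by time. Everything in this sketch is hypothetical, including the `createBookmarks` name and the `{ time, label }` shape; persistence and rendering on the timeline are left out.

```javascript
// Hypothetical in-memory bookmark list, kept sorted by time.
function createBookmarks() {
  const marks = [];
  return {
    // Adds a marker and returns its index in the sorted list.
    add(time, label = "") {
      let i = marks.findIndex((m) => m.time > time);
      if (i === -1) i = marks.length;
      marks.splice(i, 0, { time, label });
      return i;
    },
    // First marker strictly after `time`, or null if there is none.
    nextAfter(time) {
      return marks.find((m) => m.time > time) ?? null;
    },
    // Returns a copy so callers cannot disturb the sort order.
    list() {
      return marks.slice();
    },
  };
}
```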
Testing Checklist
Functionality Tests
- ✅ Play/Pause button works
- ✅ Timeline click seeks correctly
- ✅ Timeline drag seeks correctly
- ✅ Volume slider works
- ✅ Mute button toggles correctly
- ✅ Time displays update
- ✅ Segments render with correct positions
- ✅ Speaker colors applied correctly
- ✅ Active segment highlights correctly
- ✅ Clicking segment seeks to utterance
- ✅ Keyboard shortcuts work
- ✅ Transcript sync still works
- ✅ Click-to-seek from transcript works
Responsive Tests
- ✅ Full width on desktop
- ✅ Timeline wraps on mobile
- ✅ Controls remain usable on small screens
- ✅ Touch events work on mobile
Edge Cases
- ✅ No utterances: Timeline empty
- ✅ Many utterances (100+): Performance OK
- ✅ Long audio (1+ hour): Segments visible
- ✅ Short utterances (<1s): Still clickable
- ✅ No diarization: Segments use default color
Summary
What Changed
| Component | Before | After |
|---|---|---|
| Player Width | Default (varies) | Full width (100%) |
| Timeline | Native progress bar | Custom visual timeline |
| Utterance Visualization | None | Color-coded segments |
| Speaker Colors | None | 10 unique colors |
| Controls | Native HTML5 | Custom styled |
| Keyboard Support | None | Space, Arrows |
| Mobile Support | Basic | Optimized responsive |
What Stayed the Same
✅ All existing features preserved:
- Bidirectional sync player ↔ transcript
- Drag-to-seek functionality
- Click utterance to seek
- Edit functionality
- Real-time highlighting
New Capabilities
🆕 Timeline segments visualization
🆕 Speaker color-coding
🆕 Click segments to seek
🆕 Keyboard shortcuts
🆕 Enhanced UX with animations
🆕 Responsive full-width layout
Files Modified
frontend/index.html
- Replaced native <audio controls> with custom player structure
- Added timeline container with canvas and segments
frontend/styles.css
- Added ~250 lines of custom player styling
- Responsive media queries
- Speaker color classes
- Smooth animations
frontend/app.js
- Added initCustomAudioPlayer() function
- Added renderTimelineSegments() function
- Added updateActiveSegment() function
- Added seekToPosition() helper
- Updated renderTranscript() to update timeline
- Updated initAudioInteractions() to sync timeline
Result
🎉 A modern, feature-rich audio player that provides visual feedback about the audio structure while maintaining all existing functionality!
The timeline gives users an instant overview of:
- Where utterances are located
- Which parts have which speakers
- Current playback position
- Easy navigation by clicking segments
Perfect for long-form audio with multiple speakers! 🎙️