audio-difficulty / README.md

PRamoneda

probando

ff9ad90 6 months ago

	---
	title: Audio Difficulty Estimator
	emoji: 🎹
	colorFrom: purple
	colorTo: pink
	sdk: gradio
	sdk_version: "4.26.0"
	python_version: 3.10.13
	app_file: app.py
	pinned: false
	tags:
	- music
	- audio
	- piano
	- difficulty-estimation
	short_description: Estimate piano difficulty from audio
	hardware: "a100-large"
	---

	# 🎼 Music Difficulty Estimator

	This Gradio app estimates the difficulty of a piano piece from an uploaded MP3 audio file, an uploaded MP4 video, or a YouTube link. It uses pretrained models to transcribe the audio to MIDI and to predict difficulty from three input representations:

	- CQT-based representation
	- Piano roll representation
	- Multimodal embeddings
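
To make the piano roll representation concrete, here is a minimal sketch using only the standard library. The note format `(pitch, start, end)`, the function name `notes_to_piano_roll`, and the rate of 10 frames per second are illustrative assumptions, not necessarily what the Space's models consume.

```python
from typing import List, Tuple

# (MIDI pitch 0-127, onset in seconds, offset in seconds); hypothetical note format.
Note = Tuple[int, float, float]


def notes_to_piano_roll(notes: List[Note], fps: int = 10) -> List[List[int]]:
    """Return a 128 x T binary matrix: roll[pitch][frame] is 1 while the note sounds."""
    if not notes:
        return [[] for _ in range(128)]
    n_frames = max(1, round(max(end for _, _, end in notes) * fps))
    roll = [[0] * n_frames for _ in range(128)]
    for pitch, start, end in notes:
        first = int(round(start * fps))
        last = max(first + 1, int(round(end * fps)))  # each note covers at least one frame
        for frame in range(first, min(last, n_frames)):
            roll[pitch][frame] = 1
    return roll


# Middle C for one second, then the E above it for half a second.
roll = notes_to_piano_roll([(60, 0.0, 1.0), (64, 1.0, 1.5)])
```

Each row is one MIDI pitch and each column one time frame, so chords show up as several rows active in the same column.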

	## 🛠 How it works

	1. Upload an MP3 or MP4 file, or paste a YouTube link.
	2. The audio is transcribed to MIDI using a piano transcription model.
	3. Three difficulty models, one per representation (CQT, piano roll, multimodal embeddings), each produce a prediction.
	4. The extracted MP3 and the generated MIDI are available for playback.
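
The steps above can be sketched as a small orchestration function. This is only an outline of the data flow: `estimate_difficulty`, the `transcribe` callable, and the model callables are hypothetical stand-ins supplied by the caller, not the app's actual code.

```python
from typing import Callable, Dict


def estimate_difficulty(
    audio_path: str,
    transcribe: Callable[[str], str],
    models: Dict[str, Callable[[str, str], float]],
) -> Dict[str, object]:
    """Transcribe the audio, run every difficulty model, and collect the results.

    `transcribe` maps an audio path to a MIDI path; each model maps
    (audio_path, midi_path) to a score. Both are placeholders.
    """
    midi_path = transcribe(audio_path)
    scores = {name: model(audio_path, midi_path) for name, model in models.items()}
    return {"audio": audio_path, "midi": midi_path, "scores": scores}


# Dummy stages that only illustrate the flow; they are not real models.
result = estimate_difficulty(
    "piece.mp3",
    transcribe=lambda path: path.rsplit(".", 1)[0] + ".mid",
    models={
        "cqt": lambda audio, midi: 1.0,
        "piano_roll": lambda audio, midi: 2.0,
        "multimodal": lambda audio, midi: 3.0,
    },
)
```

Passing the stages in as callables keeps the flow testable without loading any checkpoint.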

	## 📦 Model loading

	All models are stored separately in the [pramoneda/audio](https://huggingface.co/pramoneda/audio) model repository and are downloaded at runtime via `huggingface_hub`.
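
As a rough sketch of what downloading a checkpoint at runtime can look like, the helper below wraps `hf_hub_download` from `huggingface_hub`. The helper name `fetch_checkpoint`, its `downloader` parameter, and the filename in the usage note are assumptions made for illustration; the real checkpoint names are whatever the model repository contains.

```python
from typing import Callable, Optional

REPO_ID = "pramoneda/audio"  # model repository named in this README


def fetch_checkpoint(filename: str, downloader: Optional[Callable[..., str]] = None) -> str:
    """Return a local path to `filename` from the model repo, downloading it if needed."""
    if downloader is None:
        # Imported lazily so this module loads even without huggingface_hub installed.
        from huggingface_hub import hf_hub_download

        downloader = hf_hub_download
    return downloader(repo_id=REPO_ID, filename=filename)
```

A call such as `fetch_checkpoint("model.ckpt")` (hypothetical filename) would return a path inside the local Hugging Face cache, so repeated calls reuse the file already downloaded.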

	## 📁 Input formats

	- MP3 audio
	- MP4 video (audio extracted automatically)
	- YouTube links
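
One way to route these three input kinds is a small classifier like the one below. It is a sketch under assumptions: the function name `classify_input` and the list of accepted YouTube hostnames are illustrative, and the app itself may detect inputs differently.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hostnames treated as YouTube links in this sketch (an assumption, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_input(source: str) -> str:
    """Return 'youtube', 'mp3', or 'mp4'; raise ValueError for anything else."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        if parsed.hostname in YOUTUBE_HOSTS:
            return "youtube"
        raise ValueError(f"unsupported URL: {source}")
    suffix = PurePosixPath(source).suffix.lower()
    if suffix in (".mp3", ".mp4"):
        return suffix[1:]  # drop the leading dot
    raise ValueError(f"unsupported input: {source}")
```

For example, `classify_input("recital.MP3")` returns `"mp3"`, while a `.wav` path raises `ValueError`.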

	## ✨ Built with

	- `gradio` for the interface
	- `pydub` and `yt_dlp` for audio processing
	- `huggingface_hub` to load model checkpoints
	- `ffmpeg-python` for format conversion
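
For the format-conversion step, the sketch below shows how an MP4's audio track could be converted to MP3. Note that it calls the `ffmpeg` command-line tool through `subprocess` rather than going through `ffmpeg-python`, and the function names are hypothetical; it assumes an `ffmpeg` binary built with the `libmp3lame` encoder is on the `PATH`.

```python
import subprocess
from typing import List


def mp4_to_mp3_command(video_path: str, audio_path: str) -> List[str]:
    """Build the ffmpeg argument list that extracts the audio track as MP3."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", video_path,          # input video
        "-vn",                     # drop the video stream
        "-codec:a", "libmp3lame",  # encode audio as MP3
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(mp4_to_mp3_command(video_path, audio_path), check=True)
```

Keeping the command builder separate from the `subprocess.run` call makes the arguments easy to inspect without running ffmpeg.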

	## 🔗 Related

	- [Model repo: pramoneda/audio](https://huggingface.co/pramoneda/audio)
	- [More projects by pramoneda](https://huggingface.co/pramoneda)

	---