| title: Voxtral | |
| emoji: ⚡ | |
| colorFrom: gray | |
| colorTo: green | |
| sdk: gradio | |
| sdk_version: 5.38.0 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: Chat and transcribe audio files with AI, powered by Voxtral. | |
| # Voxtral Pro Interface | |
| <div align="center"> | |
| <a href="https://huggingface.co/spaces/hasanbasbunar/Voxtral"></a> | |
| </div> | |
| <p align="center"> | |
| An advanced, feature-rich Gradio UI to explore the full power of Mistral AI's multimodal model, `voxtral`. | |
| </p> | |
| <p align="center"> | |
| <img src="image.png" alt="Voxtral Pro Demo" width="80%"> | |
| </p> | |
| <p align="center"> | |
| <img src="image-1.png" alt="Voxtral Pro Demo" width="80%"> | |
| </p> | |
| ## 🚀 About The Project | |
| Voxtral Pro was created to explore and showcase the full range of capabilities of Mistral AI's powerful multimodal model, `voxtral`. This application goes beyond a simple chat interface to provide a comprehensive toolkit for interacting with audio and text, demonstrating features like high-quality transcription, multi-turn multimodal conversation, and agent-like tool use. | |
| This project serves as a practical example of how to build robust, user-friendly, and production-ready applications on top of state-of-the-art foundation models. | |
| ## ✨ Key Features | |
| * **🎙️ High-Quality Transcription:** Transcribe large audio files with exceptional accuracy using the Mistral API. | |
| * **📄 SRT Subtitle Generation:** Automatically generate and export `.srt` subtitle files with precise segment timestamps, perfect for content creators. | |
| * **💬 Multimodal Chat:** Hold multi-turn conversations that combine text and audio inputs in the same exchange. | |
| * **🤖 Tool Use / Function Calling:** Demonstrates the model's ability to call external functions to retrieve information (e.g., getting city data), showcasing its agent-like capabilities. | |
| * **🔐 Secure API Key Handling:** Your Mistral API key is stored securely in your browser's session storage and is never exposed or saved elsewhere. | |
| * **🎨 Modern UI:** A clean, responsive, and aesthetically pleasing interface built with Gradio. | |
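To make the SRT export feature more concrete, here is a minimal sketch of how timestamped segments can be rendered as `.srt` cues. It is not taken from `app.py`: the function names and the segment shape (dicts with `start`, `end`, and `text` keys, times in seconds) are assumptions chosen for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines.

    Assumed (hypothetical) segment shape:
    {"start": float, "end": float, "text": str}
    """
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo_segments = [
        {"start": 0.0, "end": 2.5, "text": "Hello and welcome."},
        {"start": 2.5, "end": 4.0, "text": "Let's get started."},
    ]
    print(segments_to_srt(demo_segments))
```

Each cue consists of a sequence number, a `start --> end` line with a comma before the milliseconds, and the subtitle text, which is the layout most subtitle players expect.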
| ## 🛠️ Tech Stack | |
| This project is built with a modern, asynchronous Python stack: | |
| * **Backend:** [Python](https://www.python.org/) | |
| * **Web Framework:** [Gradio](https://www.gradio.app/) | |
| * **API Client:** [httpx](https://www.python-httpx.org/) with `asyncio` for non-blocking API calls. | |
| * **Deployment:** [Hugging Face Spaces](https://huggingface.co/spaces) | |
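As a rough illustration of what "non-blocking" means here, the sketch below runs two awaited calls concurrently with `asyncio.gather`. To keep it runnable without a network or an API key, `fake_api_call` is a hypothetical stand-in that only sleeps; in the real app an awaited `httpx` request would take its place.

```python
import asyncio
import time


async def fake_api_call(name: str, delay: float) -> str:
    """Hypothetical stand-in for an awaited HTTP request (no network involved)."""
    await asyncio.sleep(delay)  # yields to the event loop while "waiting"
    return f"{name}: done"


async def run_concurrently() -> list:
    # gather starts both coroutines together and returns results in argument order.
    return await asyncio.gather(
        fake_api_call("transcription", 0.2),
        fake_api_call("chat", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_concurrently())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Because both waits overlap, the total time should be close to the longest single call rather than the sum of the two, which is what keeps the UI responsive while a request is in flight.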
| ## 🏁 Getting Started | |
| Follow these instructions to get a local copy up and running. | |
| ### Prerequisites | |
| * Python 3.9+ | |
| * Git | |
| ### Installation & Configuration | |
| 1. **Clone the repository:** | |
| git clone https://huggingface.co/spaces/hasanbasbunar/Voxtral && cd Voxtral | |
| 2. **Create and activate a virtual environment:** | |
| ```sh | |
| python3 -m venv .venv | |
| source .venv/bin/activate | |
| ``` | |
| 3. **Install dependencies:** | |
| ```sh | |
| pip install -r requirements.txt | |
| ``` | |
| 4. **Configure your API Key:** | |
| Create a file named `.env` in the root of the project and add your Mistral API key: | |
| ``` | |
| MISTRAL_API_KEY="your_api_key_here" | |
| ``` | |
| *Alternatively, you can enter the key directly in the UI instead of using a `.env` file.* | |
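For readers curious how such a key lookup might work, here is a standard-library sketch. It is an assumption, not the code in `app.py` (which may well rely on a package such as `python-dotenv`): `read_env_file` and `resolve_api_key` are hypothetical helpers, and the precedence shown (UI value, then process environment, then `.env` file) is only one reasonable choice.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines; skips blanks, comments, and malformed lines."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, as in MISTRAL_API_KEY="your_api_key_here".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(ui_key: Optional[str] = None, env_path: str = ".env") -> Optional[str]:
    """Assumed precedence: key typed in the UI, then environment, then .env file."""
    if ui_key and ui_key.strip():
        return ui_key.strip()
    return os.environ.get("MISTRAL_API_KEY") or read_env_file(env_path).get("MISTRAL_API_KEY")


if __name__ == "__main__":
    key = resolve_api_key()
    print("API key found" if key else "No API key configured")
```

Returning `None` when nothing is configured lets the caller decide whether to prompt for the key in the UI instead of failing at startup.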
| ### Running the Application | |
| 1. **Launch the app:** | |
| ```sh | |
| python app.py | |
| ``` | |
| 2. Open your browser and navigate to `http://127.0.0.1:7860`. | |
| ## 🚢 Deployment | |
| This app is designed to be easily deployed. It is currently live on [Hugging Face Spaces](https://huggingface.co/spaces/hasanbasbunar/Voxtral). | |
| To deploy your own version, you can use any platform that supports Python applications. For a production environment, pass `debug=False` to `demo.launch()` in `app.py`. | |
| Example for platforms that use a `PORT` environment variable: | |
| ```python | |
| # in app.py | |
| import os | |
| # Use the platform-provided PORT if set; otherwise fall back to 7860. | |
| demo.launch(server_port=int(os.environ.get("PORT", 7860)), debug=False) |