microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition
Transformers
Safetensors
24 languages
phi4mm
text-generation
nlp
code
audio
speech-summarization
speech-translation
visual-question-answering
phi-4-multimodal
phi
phi-4-mini
custom_code
arxiv:2503.01743
arxiv:2407.13833
License: mit
28148ed
Phi-4-multimodal-instruct
933 MB
14 contributors
History: 28 commits
cyrilvallez
HF Staff
Upload folder using huggingface_hub
28148ed
verified
10 months ago
Name | Size | Last commit | Updated
examples | - | Add examples | 11 months ago
figures | - | Added model files | 11 months ago
.gitattributes | 1.61 kB | added technical report | 11 months ago
CODE_OF_CONDUCT.md | 444 Bytes | Added model files | 11 months ago
LICENSE | 1.14 kB | Added model files | 11 months ago
README.md | 65.1 kB | Update readme | 10 months ago
SECURITY.md | 2.66 kB | Added model files | 11 months ago
SUPPORT.md | 1.24 kB | Added model files | 11 months ago
adapter_config.json | 742 Bytes | Upload folder using huggingface_hub | 10 months ago
adapter_model.safetensors | 923 MB (xet) | Upload folder using huggingface_hub | 10 months ago
configuration_phi4mm.py | 11 kB | Added model files | 11 months ago
merges.txt | 2.42 MB | Added model files | 11 months ago
modeling_phi4mm.py | 116 kB | fixes the assertion error when num_beams > 1 (#42) | 10 months ago
phi_4_mm.tech_report.02252025.pdf | 5.3 MB (xet) | added technical report | 11 months ago
processing_phi4mm.py | 32.8 kB | Added model files | 11 months ago
sample_finetune_speech.py | 16.7 kB | Fix bug with safe suffix removal (#34) | 10 months ago
sample_finetune_vision.py | 19.6 kB | Added model files | 11 months ago
sample_inference_phi4mm.py | 10.5 kB | Added model files | 11 months ago
speech_conformer_encoder.py | 111 kB | Added model files | 11 months ago
vision_siglip_navit.py | 78.2 kB | Added model files | 11 months ago