ensembleai/resnet50nd-grayscale

	---
	license: apache-2.0
	language:
	- en
	---
	# FaceNet Triplet ResNet Model (Grayscale, 112x112, Mobile-friendly)

	This repository provides a FaceNet-style triplet embedding model using ResNet backbones, optimized for mobile and edge devices:
	- Input: Grayscale images, replicated across `3` identical channels
	- Resolution: 112x112 pixels
	- Output: Embeddings suitable for face recognition and verification

	## Model Details

	- Architecture: ResNet50 with NdLinear
	- Embedding Dimension: 512
	- Input: 3x112x112 grayscale images (NCHW format)
	- Exported weights: `model.safetensors`
	- Config: `config.json`

	## Usage

	### 1. Clone or Download Files

	Download/copy the `models/` directory and dependencies (`ndlinear.py`, etc.) to your project.

	### 2. Install requirements

	```bash
	pip install torch safetensors
	```

	### 3. Load the model

	```python
	from models.resnet import Resnet50Triplet # or your chosen variant

	model = Resnet50Triplet.from_pretrained(".", safe_serialization=True)
	model.eval()
	```

	### 4. Use for Face Recognition

	Obtain a face embedding from an input image, and compare embeddings (e.g., with cosine similarity) to recognize or verify identities.

	```python
	import torch

	# Example: a batch of one 112x112 grayscale image, replicated to 3 identical channels
	gray = torch.randn(1, 1, 112, 112) # (batch_size, 1, height, width)
	images = gray.repeat(1, 3, 1, 1) # (batch_size, channels, height, width)

	with torch.no_grad():
	    embedding = model(images) # embedding output suitable for face recognition
	print(embedding.shape) # (batch_size, embedding_dim), i.e. (1, 512)
	```

	To perform recognition or verification, compare the embedding against a database of known face embeddings using distance/similarity metrics.
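
The comparison step described above can be sketched as follows. This is a minimal illustration, not part of the repository: `best_match` is a hypothetical helper, the `0.5` threshold is an arbitrary placeholder that would need tuning on your own data, and random tensors stand in for real 512-dimensional embeddings.

```python
import torch
import torch.nn.functional as F


def best_match(query, gallery, threshold=0.5):
    """Return (index, score) of the closest gallery row, or (None, score) if below threshold.

    query:   (1, D) embedding of the face to identify
    gallery: (N, D) embeddings of known faces
    threshold: hypothetical cut-off on cosine similarity, not a tuned value
    """
    q = F.normalize(query, dim=-1)    # (1, D) unit vector
    g = F.normalize(gallery, dim=-1)  # (N, D) unit vectors
    scores = (q @ g.T).squeeze(0)     # cosine similarity to each gallery entry
    score, index = scores.max(dim=0)
    if score.item() < threshold:
        return None, score.item()
    return index.item(), score.item()


# Random stand-ins for embeddings produced by the model.
torch.manual_seed(0)
gallery = torch.randn(5, 512)
query = gallery[2:3] + 0.1 * torch.randn(1, 512)  # noisy copy of entry 2
idx, score = best_match(query, gallery)
print(idx, round(score, 3))
```

For one-to-one verification, the same function can be called with a gallery that holds a single reference embedding; a result of `None` then means the two faces were not judged to be the same identity under the chosen threshold.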

	## Files

	- `model.safetensors` - Model weights
	- `config.json` - Loader configuration
	- `models/` - Model definition files
	- `README.md` - This file

	## Notes

	- The model is optimized for running on edge/mobile devices (reduced input size, grayscale input for lower computational load).
	- Make sure your preprocessing pipeline produces 112x112 images with three identical grayscale channels.
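
One way to meet the preprocessing requirement using only `torch` is sketched below. `to_model_input` is a hypothetical helper, not part of this repository, and scaling pixel values to `[0, 1]` is an assumption: this card does not state the normalization used in training, so match whatever the training pipeline applied.

```python
import torch
import torch.nn.functional as F


def to_model_input(gray, size=112):
    """Turn one (H, W) grayscale image into a (1, 3, size, size) float tensor."""
    if gray.dim() != 2:
        raise ValueError("expected a single-channel image of shape (H, W)")
    x = gray.float()[None, None]  # (1, 1, H, W), the layout F.interpolate expects
    x = F.interpolate(x, size=(size, size), mode="bilinear", align_corners=False)
    x = x / 255.0  # assumption: uint8-range input scaled to [0, 1]
    return x.repeat(1, 3, 1, 1)  # copy the single channel three times


# Stand-in for a decoded, cropped face image (uint8, 160x120).
face = torch.randint(0, 256, (160, 120), dtype=torch.uint8)
batch = to_model_input(face)
print(batch.shape)
```

The resulting `batch` has the NCHW layout listed under Model Details and can be passed to the model in place of the random tensor used in the usage example.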

	## Credits

	- Backbone based on [PyTorch torchvision ResNet](https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet50.html)
	- Architecture inspired by [Facenet PyTorch](https://github.com/timesler/facenet-pytorch)

	---

	For contributions or issues, open a discussion or pull request.