While foundation models such as DINOv2 offer remarkable generalization for zero-shot image segmentation, they are computationally demanding and often produce coarse, noisy embeddings. Here, we introduce a lightweight approach for zero-shot segmentation of mitochondria in electron microscopy (EM) images by distilling DINOv2 features into a compact convolutional network specialized for the target dataset. The distilled model reduces the number of parameters by over 99.4% and achieves 27x faster inference than DINOv2, while preserving the semantic richness of the embeddings. When integrated into the DINOSim framework, the distilled features yield smoother embeddings and substantially improve segmentation accuracy on two popular public EM datasets. This strategy enables efficient and accurate zero-shot segmentation in microscopy and paves the way for future exploration of high-resolution intermediate features for fine-grained structure analysis.
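
As a rough illustration of the feature-distillation idea described above, the following toy sketch trains a small student to reproduce the embeddings of a frozen teacher. It is not the method used in this work: the teacher here is a hypothetical fixed linear map standing in for DINOv2, the student is a single linear layer rather than a compact convolutional network, and the mean-squared-error objective is an assumption, since the abstract does not state the loss. All function names are illustrative.

```python
import random

def teacher_features(x):
    # Hypothetical frozen teacher: a fixed linear map from 3 inputs to a
    # 2-d embedding. It stands in for DINOv2 features and is never updated.
    W = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def student_features(weights, x):
    # Student: one trainable linear layer (a toy placeholder for a compact CNN).
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mse(a, b):
    # Mean squared error between two embeddings (assumed distillation loss).
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)

def distill(steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    weights = [[0.0] * 3 for _ in range(2)]
    # Unlabeled inputs from the "target dataset"; no segmentation masks needed.
    data = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(64)]

    def avg_loss():
        return sum(
            mse(student_features(weights, x), teacher_features(x)) for x in data
        ) / len(data)

    initial = avg_loss()
    for step in range(steps):
        x = data[step % len(data)]
        target = teacher_features(x)          # teacher output is the label
        pred = student_features(weights, x)
        for i in range(2):
            err = pred[i] - target[i]
            for j in range(3):
                # Gradient of the 2-d MSE w.r.t. weights[i][j] is err * x[j].
                weights[i][j] -= lr * err * x[j]
    return weights, initial, avg_loss()

if __name__ == "__main__":
    _, before, after = distill()
    print(f"distillation loss before: {before:.4f}, after: {after:.2e}")
```

The point of the sketch is only that the teacher's embeddings serve as regression targets, so the student can be fitted on unlabeled images from the target dataset; in practice the student would be a convolutional network and the teacher outputs would be DINOv2 patch features.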