🆕 Define `DeepFeatureExtractor` #963

shaneahmed · 2025-10-22T09:22:01Z

🚀 Summary

This PR introduces an updated DeepFeatureExtractor engine to the TIAToolbox framework. It enables extraction of deep CNN features from whole slide images (WSIs) or image patches for downstream tasks such as clustering, visualization, or training other models. The PR also includes a command-line interface (CLI) for this engine, along with comprehensive tests.

✨ Key Features

New Engine: DeepFeatureExtractor
- Extracts intermediate CNN features from WSIs and patches.
- Outputs features and coordinates in Zarr or dict format.
- Supports memory-aware caching for large-scale processing.
CLI Integration
- Adds deep-feature-extractor command to TIAToolbox CLI.
- Supports input/output paths, model selection, batch size, device, and more.
Unit Tests
- Covers patch-based and WSI-based inference.
- Tests multi-GPU support and CLI functionality.
- Validates compatibility with CNNBackbone and TimmBackbone models.
Codebase Integration
- Registers the engine and CLI in __init__.py files.
- Updates CLI and model registries to include the new extractor.

codecov · 2025-10-22T09:39:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.92%. Comparing base (c535eab) to head (5c07200).

Additional details and impacted files

@@                    Coverage Diff                     @@
##           dev-define-engines-abc     #963      +/-   ##
==========================================================
+ Coverage                   94.72%   94.92%   +0.20%     
==========================================================
  Files                          73       75       +2     
  Lines                        9234     9344     +110     
  Branches                     1208     1214       +6     
==========================================================
+ Hits                         8747     8870     +123     
+ Misses                        452      439      -13     
  Partials                       35       35

tests/engines/test_feature_extractor.py

Results are inconsistent as the model is redefined on a different device.

Copilot

Pull Request Overview

This PR introduces the DeepFeatureExtractor engine to TIAToolbox, enabling extraction of deep CNN feature representations from whole slide images (WSIs) and image patches for downstream tasks like clustering and visualization.

Key Changes

New DeepFeatureExtractor class extending SemanticSegmentor for feature extraction from WSIs and patches
CLI integration with deep-feature-extractor command supporting various configuration options
Comprehensive test suite covering patch-based inference, WSI processing, multi-GPU support, and CLI functionality

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 11 comments.

Show a summary per file

File	Description
tiatoolbox/models/engine/deep_feature_extractor.py	Core engine implementation with memory-aware caching and Zarr output support
tiatoolbox/models/engine/__init__.py	Registration of new deep_feature_extractor module
tiatoolbox/models/__init__.py	Export of DeepFeatureExtractor class
tiatoolbox/cli/deep_feature_extractor.py	CLI interface for the feature extractor with parameter handling
tiatoolbox/cli/__init__.py	Registration of deep_feature_extractor CLI command
tiatoolbox/models/engine/semantic_segmentor.py	Fixed docstring to correctly reference SemanticSegmentor instead of PatchPredictor
tests/engines/test_feature_extractor.py	Comprehensive tests for patch/WSI inference, multi-GPU, and CLI
tests/engines/test_semantic_segmentor.py	Updated test docstring for clarity

shaneahmed · 2025-11-14T11:29:14Z

tiatoolbox/models/engine/deep_feature_extractor.py

+        # Default Memory threshold percentage is 80.
+        memory_threshold = kwargs.get("memory_threshold", 80)
+        vm = psutil.virtual_memory()
+        keys = ["probabilities", "coordinates"]


Suggested change

keys = ["probabilities", "coordinates"]

keys = ["features", "coordinates"]

Using features in inference will require major change in the base class.
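For context on the diff hunk above, here is a minimal sketch of how a percentage-based memory threshold check could work. Only the default of 80 and the call to `psutil.virtual_memory()` come from the diff; the helper name `should_spill_to_disk`, the injectable `used_percent` argument, and the strict `>` comparison are assumptions made for illustration.

```python
from __future__ import annotations


def should_spill_to_disk(
    used_percent: float | None = None,
    memory_threshold: float = 80,
) -> bool:
    """Return True when system memory use exceeds the threshold (in percent).

    `used_percent` can be injected (hypothetical convenience for testing).
    When omitted, the value is read from psutil, mirroring the
    `psutil.virtual_memory()` call in the diff.
    """
    if used_percent is None:
        import psutil  # third-party; only needed when no value is injected

        used_percent = psutil.virtual_memory().percent
    # Assumption: strictly greater than the threshold triggers caching.
    return used_percent > memory_threshold
```

With a check like this, the engine could keep results in memory while usage is low and switch to a disk-backed store once the threshold is crossed.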

measty

I've found a few issues that would need to be addressed before merging.

measty · 2025-11-07T15:59:13Z

tiatoolbox/models/engine/deep_feature_extractor.py

+            )
+
+        raw_predictions["coordinates"] = da.concatenate(coordinates, axis=0)
+        raw_predictions["probabilities"] = da.concatenate(probabilities, axis=0)


There are various references to probabilities throughout this function, but the feature extractor isn't doing anything with probabilities, so the name is a bit misleading.

Either change it to something more appropriate in here, or maybe change it in the base SemanticSegmentor to something more generic?
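One low-impact way to address the naming, sketched here under the assumption that the base class keeps writing a "probabilities" key internally, is to rename the key only at the point where results are handed back to the user. The function name `expose_features` is hypothetical and not taken from the PR.

```python
def expose_features(raw_predictions: dict) -> dict:
    """Return a copy of the output with "probabilities" renamed to "features".

    Assumption: the base class still produces "probabilities"; only the
    dictionary returned to the caller is renamed, so the base class does
    not need to change.
    """
    output = dict(raw_predictions)  # shallow copy; the input is left untouched
    if "probabilities" in output:
        output["features"] = output.pop("probabilities")
    return output
```

This keeps the change local to the feature extractor, at the cost of the two classes using different names for the same array.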

measty · 2025-11-28T03:22:19Z

tiatoolbox/models/engine/deep_feature_extractor.py

+        """
+        super().__init__(
+            model=model,
+            batch_size=batch_size,


When a string is passed as the model, it should really look in the lists of architectures available for TimmBackbone and CNNBackbone (in architecture.vanilla.py) and build the appropriate model. At the moment it looks in the models defined in pretrainedModel.yaml, which are patch prediction/segmentation models, and those don't work here.

We may even want to consolidate these, so that there is a single unified way of getting models, whatever type they are.

Ideally, the feature extractor should work with both TimmBackbone("efficientnet_b0", pretrained=True) and "efficientnet_b0" options.
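A rough sketch of the string-or-instance resolution described above. Everything here is hypothetical: the two registries are small stand-ins for the real architecture lists, and the builder callables stand in for constructing a CNNBackbone or TimmBackbone, so this shows the lookup order only, not the toolbox's actual API.

```python
from __future__ import annotations

from collections.abc import Callable

# Hypothetical stand-ins for the architecture lists of CNNBackbone and
# TimmBackbone; the real lists live in the toolbox itself.
CNN_BACKBONES = {"resnet50"}
TIMM_BACKBONES = {"efficientnet_b0"}


def resolve_model(
    model: str | object,
    build_cnn: Callable[[str], object],
    build_timm: Callable[[str], object],
) -> object:
    """Return a model instance from either a ready-made model or a name."""
    if not isinstance(model, str):
        return model  # already an instance: pass it through unchanged
    if model in CNN_BACKBONES:
        return build_cnn(model)
    if model in TIMM_BACKBONES:
        return build_timm(model)
    msg = f"{model!r} is not a known feature-extraction backbone"
    raise ValueError(msg)
```

Failing loudly on unknown names would also stop a segmentation model name from being silently accepted by the feature extractor.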

measty · 2025-11-28T03:37:30Z

tiatoolbox/models/engine/deep_feature_extractor.py

+        save_dir: os.PathLike | Path | None = None,
+        overwrite: bool = False,
+        output_type: str = "dict",
+        **kwargs: Unpack[SemanticSegmentorRunParams],


This has some usability issues. As a user (and especially trying to put myself into the shoes of someone relatively new to tiatoolbox) I would just want to specify a model, a patch size, a resolution, some data to run it on, and somewhere to save. It should be intuitive how to provide those.

In practice, some of those things are hidden away: I either have to go off into IOSegmentorConfig and figure out how to make one of those, or hunt around in SemanticSegmentorRunParams to find out how to specify them.

Explicitly specify input_resolutions and patch_input_shape instead of wrapping them into SemanticSegmentorRunParams.
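To illustrate the suggestion, here is a minimal sketch of a helper that takes the two arguments explicitly, validates them, and folds them into the existing keyword bundle. The helper name `build_run_kwargs` and its validation rules are assumptions for illustration; the `{"units": ..., "resolution": ...}` entry shape is assumed from how resolutions are commonly written in the toolbox.

```python
from __future__ import annotations


def build_run_kwargs(
    patch_input_shape: tuple[int, int],
    input_resolutions: list[dict],
    **kwargs: object,
) -> dict:
    """Merge explicit, discoverable arguments into the remaining run kwargs."""
    if len(patch_input_shape) != 2 or any(int(s) <= 0 for s in patch_input_shape):
        msg = "patch_input_shape must be two positive integers (height, width)"
        raise ValueError(msg)
    if not input_resolutions:
        msg = "input_resolutions must contain at least one entry"
        raise ValueError(msg)
    merged = dict(kwargs)  # copy so the caller's kwargs are not mutated
    merged["patch_input_shape"] = tuple(int(s) for s in patch_input_shape)
    merged["input_resolutions"] = [dict(r) for r in input_resolutions]
    return merged
```

A signature along these lines puts the patch size and resolution in the function's own documentation, so a new user does not have to look inside the config classes first.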

measty · 2025-11-28T04:26:23Z

tests/engines/test_feature_extractor.py

+    runner = CliRunner()
+    models_wsi_result = runner.invoke(
+        cli.main,
+        [


This is not a correct test.

What this is actually doing:
As the default model is fcn-tissue-mask, it runs that within the DeepFeatureExtractor engine and saves the patches that come out of it (the probability maps for tissue/non-tissue) in a weird zarr array that has shape (512*num_patches, 512, 2).

In practice, the deep-feature-extractor CLI isn't usable because of the issue I pointed out in one of my other comments about providing the model as a string; you can't pass the name of a valid CNNBackbone or TimmBackbone in there.
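A shape check along the following lines would have caught the problem described above. It is only a sketch: the helper name is hypothetical, and it assumes a feature extractor should emit one flat feature vector per patch, with one coordinate row per patch.

```python
def looks_like_patch_features(
    features_shape: tuple,
    coordinates_shape: tuple,
) -> bool:
    """Return True when the shapes match one feature vector per patch.

    Assumption: features are 2-D (num_patches, feature_dim) and the first
    axis of the coordinates array counts the same patches.
    """
    if len(features_shape) != 2:
        return False  # e.g. stacked probability maps are 3-D
    if len(coordinates_shape) < 1:
        return False
    return features_shape[0] == coordinates_shape[0]
```

For example, a stacked probability-map output of shape (512*num_patches, 512, 2) fails the check, while an array with one row of features per patch passes.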

measty · 2025-11-28T09:53:10Z

tiatoolbox/models/engine/deep_feature_extractor.py

+"""Deep Feature Extraction Engine for Digital Pathology.
+
+This module defines the `DeepFeatureExtractor` class, which extends
+`SemanticSegmentor` to extract intermediate CNN feature representations


It isn't just CNN features (most foundation models are transformers).

measty · 2025-11-28T09:54:18Z

tiatoolbox/models/engine/deep_feature_extractor.py

+
+
+class DeepFeatureExtractor(SemanticSegmentor):
+    r"""Generic CNN-based feature extractor for digital pathology images.


Remove "CNN-based".

measty · 2025-11-28T11:06:13Z

tiatoolbox/models/engine/deep_feature_extractor.py

+Example:
+--------
+>>> from tiatoolbox.models.engine.deep_feature_extractor import DeepFeatureExtractor
+>>> extractor = DeepFeatureExtractor(model="resnet50-kather100k")


This example wouldn't work, as resnet50-kather100k isn't a feature extraction model (and it wouldn't run, as patch_output_shape isn't defined in its ioconfig).

shaneahmed · 2025-11-28T11:18:27Z

tiatoolbox/models/engine/deep_feature_extractor.py

+"""Deep Feature Extraction Engine for Digital Pathology.
+
+This module defines the `DeepFeatureExtractor` class, which extends
+`SemanticSegmentor` to extract intermediate CNN feature representations


Suggested change

`SemanticSegmentor` to extract intermediate CNN feature representations

`SemanticSegmentor` to extract intermediate feature representations

shaneahmed · 2025-11-28T11:19:51Z

tiatoolbox/models/engine/deep_feature_extractor.py

+
+
+class DeepFeatureExtractor(SemanticSegmentor):
+    r"""Generic CNN-based feature extractor for digital pathology images.


Suggested change

r"""Generic CNN-based feature extractor for digital pathology images.

r"""Generic deep feature extractor for digital pathology images.

shaneahmed · 2025-11-28T11:39:35Z

tiatoolbox/models/engine/deep_feature_extractor.py

+                Whether to overwrite existing output files. Default is False.
+            output_type (str):
+                Desired output format. Must be "zarr" or "dict".
+            **kwargs (SemanticSegmentorRunParams):


Expand the docstring for all the options.

🆕 Define DeepFeatureExtractor

cd368bd

shaneahmed self-assigned this Oct 22, 2025

shaneahmed added this to the Release v2.0.0 milestone Oct 22, 2025

shaneahmed added the enhancement New feature or request label Oct 22, 2025

shaneahmed added 6 commits November 5, 2025 14:16

🔥 Remove incorrect docstring

f14fdaa

Merge branch 'dev-define-engines-abc' into dev-define-DeepFeatureExtr…

3fa9136

…actor

Merge remote-tracking branch 'origin/dev-define-DeepFeatureExtractor'…

d3d0650

… into dev-define-DeepFeatureExtractor

🧪 Initial implementation

aa4c812

🧪 Initial implementation

4df8ea4

✅ Add tests for DeepFeatuureExtractor

8460a2d

shaneahmed commented Nov 7, 2025

tests/engines/test_feature_extractor.py

shaneahmed added 2 commits November 7, 2025 15:40

🐛 Fix error due to inconsistent results

c9f0e59

Results are inconsistent as the model is redefined on a different device.

✅ Add tests for coverage and update docstrings.

35c964b

shaneahmed requested review from Jiaqi-Lv, YijieZhu15, adamshephard, behnazelhaminia, eshasadia, gozdeg, measty and mostafajahanifar November 7, 2025 16:44

shaneahmed added 5 commits November 10, 2025 16:57

✅ Add cache support for large WSIs.

4b6df14

✅ Add support for dict output.

998ddcb

[skip ci] 📝 Update docstring

38f84fb

✨ Add command line interface to deep feature extractor

3ab5f68

✅ Improve coverage

227e317

Jiaqi-Lv requested a review from Copilot November 12, 2025 12:17

Copilot started reviewing on behalf of Jiaqi-Lv November 12, 2025 12:18

Copilot finished reviewing on behalf of Jiaqi-Lv November 12, 2025 12:20

Copilot AI reviewed Nov 12, 2025

shaneahmed added 2 commits November 12, 2025 13:58

🐛 Address Co-Pilot suggestions.

62cfe01

🐛 Fix test assertion

4e62d4a

shaneahmed commented Nov 14, 2025

shaneahmed added 2 commits November 14, 2025 12:30

📝 Use features instead of probabilities in the ouptut.

6c3b821

Using features in inference will require major change in the base class.

Merge branch 'dev-define-engines-abc' into dev-define-DeepFeatureExtr…

5c07200

…actor

measty requested changes Nov 28, 2025

measty reviewed Nov 28, 2025

shaneahmed commented Nov 28, 2025

3 participants
