-
Notifications
You must be signed in to change notification settings - Fork 98
Add support for multi-modal search #1167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds an experimental-features API path and two Client methods (GET/PATCH), extends RestEmbedder with indexing/search fragment fields, adds unit and end-to-end multimodal tests and dataset, and a multimodal code sample; minor docs formatting tweaks. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Tests
participant Client as SDK_Client
participant HTTP as HTTP_Layer
participant API as Meilisearch_API
rect rgb(220,235,255)
Note right of Client: Experimental features flow
Test->>Client: get_experimental_features()
Client->>HTTP: GET /experimental-features
HTTP->>API: GET /experimental-features
API-->>HTTP: 200 { features dict }
HTTP-->>Client: dict
Client-->>Test: dict
end
rect rgb(235,255,220)
Note right of Client: Update features
Test->>Client: update_experimental_features({multimodal: true})
Client->>HTTP: PATCH /experimental-features {body}
HTTP->>API: PATCH /experimental-features
API-->>HTTP: 200 { updated dict }
HTTP-->>Client: dict
Client-->>Test: dict
end
sequenceDiagram
participant Test as Multimodal_Tests
participant HTTP as HTTP_Layer
participant API as Meilisearch_API
participant Embedder as External_Embedder
Note over Test,API: Multimodal setup & search flow
Test->>HTTP: PATCH /embedders {fragments config}
HTTP->>API: PATCH /embedders
API-->>HTTP: 202 { task_id }
HTTP-->>Test: task_id
Test->>API: poll task until completed
API-->>Test: task done
Test->>HTTP: POST /indexes/{uid}/documents {docs}
HTTP->>API: POST /indexes/{uid}/documents
API-->>HTTP: enqueued
Note right of Embedder: external embedding generation
Embedder-->>API: embeddings ready
API-->>Test: embeddings available
Test->>HTTP: POST /indexes/{uid}/search {query + media}
HTTP->>API: search executed using multimodal embeddings
API-->>HTTP: results
HTTP-->>Test: results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
meilisearch/client.py (4)
1000-1002: Enhance docstring to match established patterns.The new methods should follow the comprehensive docstring format used throughout this file, including Parameters, Returns, and Raises sections. This improves maintainability and API documentation consistency.
Apply this diff to improve the docstring and type hint:
- def get_experimental_features(self) -> dict: - """Get current experimental features settings.""" + def get_experimental_features(self) -> Dict[str, Any]: + """Get current experimental features settings. + + Returns + ------- + features: + Dictionary containing the current experimental features configuration. + + Raises + ------ + MeilisearchApiError + An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://www.meilisearch.com/docs/reference/errors/error_codes#meilisearch-errors + """ return self.http.get(self.config.paths.experimental_features)
1004-1006: Enhance docstring and add parameter type hint.The method should include a type hint for the
featuresparameter and follow the comprehensive docstring format used throughout this file.Apply this diff:
- def update_experimental_features(self, features: dict) -> dict: - """Update experimental features settings.""" + def update_experimental_features(self, features: Mapping[str, Any]) -> Dict[str, Any]: + """Update experimental features settings. + + Parameters + ---------- + features: + Dictionary containing experimental features to enable or disable (ex: {"multimodal": True}). + + Returns + ------- + features: + Dictionary containing the updated experimental features configuration. + + Raises + ------ + MeilisearchApiError + An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://www.meilisearch.com/docs/reference/errors/error_codes#meilisearch-errors + """ return self.http.patch(self.config.paths.experimental_features, body=features)
1008-1010: Enhance docstring to match established patterns.The convenience method should include comprehensive documentation following the file's conventions.
Apply this diff:
- def enable_multimodal(self) -> dict: - """Enable multimodal experimental feature.""" + def enable_multimodal(self) -> Dict[str, Any]: + """Enable the multimodal experimental feature. + + Returns + ------- + features: + Dictionary containing the updated experimental features configuration. + + Raises + ------ + MeilisearchApiError + An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://www.meilisearch.com/docs/reference/errors/error_codes#meilisearch-errors + """ return self.update_experimental_features({"multimodal": True})
1012-1014: Enhance docstring to match established patterns.The convenience method should include comprehensive documentation following the file's conventions.
Apply this diff:
- def disable_multimodal(self) -> dict: - """Disable multimodal experimental feature.""" + def disable_multimodal(self) -> Dict[str, Any]: + """Disable the multimodal experimental feature. + + Returns + ------- + features: + Dictionary containing the updated experimental features configuration. + + Raises + ------ + MeilisearchApiError + An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://www.meilisearch.com/docs/reference/errors/error_codes#meilisearch-errors + """ return self.update_experimental_features({"multimodal": False})tests/client/test_experimental_features.py (1)
4-55: Consider adding test cleanup and error handling tests.The tests modify global experimental features state but don't restore the original configuration afterward. This could affect other tests. Additionally, error handling paths are not covered.
Consider these improvements:
- Add cleanup to restore original state:
import pytest @pytest.fixture def cleanup_experimental_features(client): """Fixture to restore experimental features after test.""" original_features = client.get_experimental_features() yield client.update_experimental_features(original_features) def test_enable_multimodal(client, cleanup_experimental_features): """Test enabling multimodal experimental feature.""" response = client.enable_multimodal() assert isinstance(response, dict) assert response.get("multimodal") is True # Verify it's enabled features = client.get_experimental_features() assert features.get("multimodal") is True
- Add error handling tests:
def test_update_experimental_features_with_invalid_feature(client): """Test updating with an invalid feature name.""" # This behavior depends on the Meilisearch API - it may ignore unknown keys # or return an error. Verify the expected behavior. response = client.update_experimental_features({"invalid_feature": True}) assert isinstance(response, dict)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.code-samples.meilisearch.yaml(1 hunks)meilisearch/client.py(1 hunks)meilisearch/config.py(1 hunks)meilisearch/models/embedders.py(2 hunks)tests/client/test_experimental_features.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/client/test_experimental_features.py (3)
tests/conftest.py (1)
client(15-16)meilisearch/client.py (4)
get_experimental_features(1000-1002)update_experimental_features(1004-1006)enable_multimodal(1008-1010)disable_multimodal(1012-1014)meilisearch/_httprequests.py (1)
get(95-96)
meilisearch/client.py (1)
meilisearch/_httprequests.py (2)
get(95-96)patch(110-118)
🔇 Additional comments (3)
.code-samples.meilisearch.yaml (1)
766-779: Code sample effectively demonstrates multimodal search.The new
search_parameter_reference_media_1example correctly illustrates hybrid search with media parameters. The syntax is accurate, formatting is consistent with surrounding examples, and the parameter combination (empty query with media and hybrid settings) properly showcases this feature.Minor observations:
- The embedder name
"voyage"assumes it's a configured embedder; if not documented elsewhere, a clarifying comment could help users unfamiliar with available embedders.- The example is otherwise clear and complete.
meilisearch/config.py (1)
50-50: LGTM!The new path constant follows the established pattern and naming convention consistently with other paths in the class.
meilisearch/models/embedders.py (1)
170-173: LGTM!The new optional fields for multi-modal search are well-documented and appropriately typed. Using
Noneas the default value maintains backward compatibility while extending the RestEmbedder API for fragment configuration.Also applies to: 192-193
Strift
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this contribution!
085e398 to
f70bd20
Compare
955a7c9 to
97fb11e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/client/test_experimental_features.py (1)
78-79: Strengthen assertions to validate exact fragment keys.Using
>= 1only verifies that at least one fragment exists, but doesn't confirm that both expected fragments (textanddescription) are present. Consider asserting the exact keys as done intest_profile_picture_and_title_fragments.Apply this diff to strengthen the assertions:
- assert len(e.indexing_fragments) >= 1 - assert len(e.search_fragments) >= 1 + expected_keys = {"text", "description"} + assert set(e.indexing_fragments.keys()) == expected_keys + assert set(e.search_fragments.keys()) == expected_keys
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.code-samples.meilisearch.yaml(2 hunks)meilisearch/client.py(1 hunks)meilisearch/config.py(1 hunks)meilisearch/models/embedders.py(2 hunks)tests/client/test_experimental_features.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- meilisearch/config.py
- meilisearch/models/embedders.py
- .code-samples.meilisearch.yaml
- meilisearch/client.py
🧰 Additional context used
🪛 GitHub Actions: Tests
tests/client/test_experimental_features.py
[warning] 24-24: pylint: Unused argument 'multimodal_enabled' (unused-argument)
[warning] 52-52: pylint: Unused argument 'multimodal_enabled' (unused-argument)
[warning] 82-82: pylint: Unused argument 'multimodal_enabled' (unused-argument)
[warning] 125-125: pylint: Catching too general exception Exception (broad-exception-caught)
[warning] 129-129: pylint: Unused argument 'multimodal_enabled' (unused-argument)
[warning] 163-163: pylint: Unused argument 'multimodal_enabled' (unused-argument)
[warning] 9-9: pylint: Unused import pytest (unused-import)
🪛 Ruff (0.14.5)
tests/client/test_experimental_features.py
24-24: Unused function argument: multimodal_enabled
(ARG001)
52-52: Unused function argument: multimodal_enabled
(ARG001)
82-82: Unused function argument: multimodal_enabled
(ARG001)
125-126: try-except-pass detected, consider logging the exception
(S110)
125-125: Do not catch blind exception: Exception
(BLE001)
129-129: Unused function argument: multimodal_enabled
(ARG001)
163-163: Unused function argument: multimodal_enabled
(ARG001)
🔇 Additional comments (2)
tests/client/test_experimental_features.py (2)
16-22: LGTM!The helper function properly encapsulates the update-wait-verify pattern, promoting code reuse across tests.
163-193: LGTM!This test demonstrates a realistic multimodal use case and uses strong assertions with exact key matching via set comparison.
Strift
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for the follow-up.
This PR is still lacking tests related to the multi-modal search. I would add three test cases:
- multi-modal search with a text query
- multi-modal search with an image
- multi-modal search with both text and image
I recommend that you take a look at how we implemented this in the JavaScript SDK. I wrote similar test cases in tests/multi_modal_search.test.ts.
meilisearch/models/embedders.py
Outdated
| indexing_fragments: Optional[Dict[str, Dict[str, str]]] = None | ||
| search_fragments: Optional[Dict[str, Dict[str, str]]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type definition Dict[str, Dict[str, str]] for indexing_fragments and search_fragments seems too restrictive.
These fragments often contain complex nested structures (like lists of objects) wrapped in a value key.
For example, a valid fragment configuration looks like this:
"textAndPoster": {
"value": {
"content": [
{"type": "text", "text": "..."},
{"type": "image_url", "image_url": "..."}
]
}
}Would the Optional[Dict[str, Dict[str, Any]]] type be more fitting?
|
sure i will look into this today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
tests/client/test_multimodal.py (5)
146-150: Add specific exception handling and optional logging.The pipeline and static analysis flag overly broad exception handling. Consider catching
MeilisearchApiError(or a more specific exception) and optionally logging for debugging during test failures.+from meilisearch.errors import MeilisearchApiError +import logging + +logger = logging.getLogger(__name__)Then update the exception handling:
if index.uid != INDEX_UID: try: task = client.index(index.uid).delete() client.wait_for_task(task.task_uid) - except Exception: - pass + except MeilisearchApiError as e: + logger.debug("Failed to delete index %s: %s", index.uid, e)
161-165: Catch specific exception when checking for index existence.Similar to the above, use a more specific exception type. For index-not-found scenarios,
MeilisearchApiErrorwith a specific error code is expected.try: task = client.index(INDEX_UID).delete() client.wait_for_task(task.task_uid) - except Exception: + except MeilisearchApiError: pass # Index doesn't exist, which is fine
198-202: Use a custom or more specific exception type.Raising generic
Exceptionmakes it harder to handle errors specifically. Consider usingpytest.fail()for test setup failures or define a custom exception.if task.status != "succeeded": error_msg = f"Embedder setup failed: status={task.status}" if task.error: error_msg += f", error={task.error}" - raise Exception(error_msg) + pytest.fail(error_msg)
215-219: Usepytest.fail()for test setup failures.Same as above—use
pytest.fail()instead of raising a genericException.if task.status != "succeeded": error_msg = f"Document indexing failed: status={task.status}" if task.error: error_msg += f", error={task.error}" - raise Exception(error_msg) + pytest.fail(error_msg)
278-281: Clarify query parameter usage.The first positional argument is
Nonebut"q": queryis also passed in the options dict. This appears redundant. Consider removing one to clarify intent.response = self.search_client.index(INDEX_UID).search( - None, + query, { - "q": query, "media": {Or if the intent is to rely solely on the options dict:
response = self.search_client.index(INDEX_UID).search( None, { - "q": query, "media": {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.code-samples.meilisearch.yaml(1 hunks)meilisearch/client.py(1 hunks)meilisearch/models/embedders.py(2 hunks)tests/client/test_experimental_features.py(1 hunks)tests/client/test_multimodal.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/client/test_experimental_features.py
- meilisearch/client.py
- meilisearch/models/embedders.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/client/test_multimodal.py (3)
meilisearch/client.py (5)
Client(39-1150)get_indexes(128-159)index(227-243)update_experimental_features(1130-1150)get_index(185-204)tests/conftest.py (1)
client(15-16)meilisearch/index.py (1)
search(282-319)
🪛 GitHub Actions: Tests
tests/client/test_multimodal.py
[warning] 149-149: W0718: Catching too general exception Exception (broad-exception-caught)
[warning] 164-164: W0718: Catching too general exception Exception (broad-exception-caught)
[warning] 181-181: W3101: Missing timeout argument for method 'requests.patch' can cause your program to hang indefinitely (missing-timeout)
[warning] 202-202: W0719: Raising too general exception: Exception (broad-exception-raised)
[warning] 219-219: W0719: Raising too general exception: Exception (broad-exception-raised)
[error] 235-235: E1101: Instance of 'TestMultimodalSearch' has no 'search_client' member (no-member)
[error] 257-257: E1101: Instance of 'TestMultimodalSearch' has no 'search_client' member (no-member)
[error] 278-278: E1101: Instance of 'TestMultimodalSearch' has no 'search_client' member (no-member)
🪛 Ruff (0.14.5)
tests/client/test_multimodal.py
149-150: try-except-pass detected, consider logging the exception
(S110)
149-149: Do not catch blind exception: Exception
(BLE001)
164-165: try-except-pass detected, consider logging the exception
(S110)
164-164: Do not catch blind exception: Exception
(BLE001)
181-181: Probable use of requests call without timeout
(S113)
202-202: Create your own exception
(TRY002)
219-219: Create your own exception
(TRY002)
🔇 Additional comments (2)
.code-samples.meilisearch.yaml (1)
770-788: LGTM!The code sample demonstrates multimodal search with text and image inputs correctly, including the
mediapayload withtextAndPosterfragment and thehybridembedder configuration. This aligns with the multimodal feature being added.tests/client/test_multimodal.py (1)
227-230: I'll help you verify the pytest class attribute assignment pattern. Let me start by examining the test file and understanding the fixture setup.
<function_calls>
#!/bin/bashExamine the test file around the fixture and class definition
cd /tmp && git clone https://github.com/meilisearch/meilisearch-python.git repo 2>&1 | head -5
cd repo && git log --oneline -1
</function_calls>Now let me look at the specific test file:
<function_calls>
#!/bin/bashCheck the test_multimodal.py file structure
wc -l tests/client/test_multimodal.py
</function_calls><function_calls>
#!/bin/bashRead the fixture setup and class definition
head -250 tests/client/test_multimodal.py | tail -100
</function_calls><function_calls>
#!/bin/bashLook at the beginning of the file for imports and fixture definition
head -50 tests/client/test_multimodal.py
</function_calls><function_calls>
#!/bin/bashSearch for the setup_index fixture and TestMultimodalSearch class
sed -n '200,280p' tests/client/test_multimodal.py
</function_calls><function_calls>
#!/bin/bashSearch for all request.cls patterns in the test directory
rg -n "request.cls." tests/
</function_calls><function_calls>
#!/bin/bashCheck where self.search_client is actually used in the test methods
rg -n "self.search_client" tests/client/test_multimodal.py -A 2 -B 2
</function_calls><function_calls>
#!/bin/bashLook for type hints in the test class to see if they're already defined
sed -n '220,250p' tests/client/test_multimodal.py
</function_calls><function_calls>
#!/bin/bashGet the full class definition to see all attributes and methods
sed -n '220,350p' tests/client/test_multimodal.py
</function_calls><function_calls>
#!/bin/bashSearch for how other test classes handle this pattern
rg -n "class Test" tests/client/ -A 5 | head -60
</function_calls><function_calls>
#!/bin/bashCheck if there are any pylint disable comments elsewhere in test files
rg -n "pylint.*disable" tests/
</function_calls><function_calls>
#!/bin/bashCheck the imports to understand what Client and Index types are
sed -n '1,30p' tests/client/test_multimodal.py
</function_calls>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/client/test_multimodal.py (1)
148-153: Consider narrowing the exception handling or adding logging.The broad
Exceptioncatch silently swallows all errors. While acceptable for test cleanup, consider either:
- Removing the generic
Exceptionand keeping onlyMeilisearchApiError, or- Adding minimal logging to aid debugging if cleanup fails unexpectedly.
- except (MeilisearchApiError, Exception): # pylint: disable=broad-exception-caught - # Ignore errors when deleting indexes (may not exist) - pass + except MeilisearchApiError: + # Index may not exist or already deleted + pass
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/client/test_multimodal.py(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Tests
tests/client/test_multimodal.py
[error] 226-226: Pylint E1101: Instance of 'TestMultimodalSearch' has no 'search_client' member (no-member). Occurred during 'pipenv run pylint meilisearch tests'.
[error] 248-248: Pylint E1101: Instance of 'TestMultimodalSearch' has no 'search_client' member (no-member). Occurred during 'pipenv run pylint meilisearch tests'.
[error] 269-269: Pylint E1101: Instance of 'TestMultimodalSearch' has no 'search_client' member (no-member). Occurred during 'pipenv run pylint meilisearch tests'.
🪛 Ruff (0.14.5)
tests/client/test_multimodal.py
151-153: try-except-pass detected, consider logging the exception
(S110)
151-151: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (4)
tests/client/test_multimodal.py (4)
28-35: LGTM!The helper function is clean and appropriate for loading test fixture images.
38-129: LGTM!The embedder configuration is well-structured and appropriately mirrors the JS test configuration for multimodal embeddings with text and image fragment types.
155-221: LGTM!The fixture properly handles index setup with appropriate timeouts and documents the rationale for using raw HTTP requests. The 5-minute timeout for embedding generation is reasonable given external API calls.
264-288: Verify expected result for mixed text+image query.The test searches for "a futuristic movie" with a Yoda image but asserts the result is "Captain Marvel". This seems potentially incorrect—wouldn't a Star Wars film be expected given the Yoda image?
Please verify this is the intended behavior. If the test dataset doesn't include Star Wars movies, this assertion makes sense; otherwise, the expected result may need adjustment.
| # ---------------- Tests ---------------- | ||
| @pytest.mark.skipif(not VOYAGE_API_KEY, reason="Voyage API key not set") | ||
| class TestMultimodalSearch: | ||
| """Multi-modal search tests""" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declare class-level attributes to fix Pylint no-member errors.
The pipeline is failing with Pylint E1101 because search_client, client, and index are dynamically assigned via request.cls in the fixture but not declared at the class level. The inline # pylint: disable=no-member comments on methods are not preventing the errors.
Declare these as class attributes with type annotations to satisfy Pylint:
@pytest.mark.skipif(not VOYAGE_API_KEY, reason="Voyage API key not set")
class TestMultimodalSearch:
"""Multi-modal search tests"""
+
+ # Class attributes populated by setup_index fixture
+ client: Client
+ index: object # meilisearch.index.Index
+ search_client: Client
@pytest.fixture(autouse=True)
def clear_indexes(self, client):After adding these declarations, you can also remove the # pylint: disable=no-member comments from the test methods (lines 223, 242, 264).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # ---------------- Tests ---------------- | |
| @pytest.mark.skipif(not VOYAGE_API_KEY, reason="Voyage API key not set") | |
| class TestMultimodalSearch: | |
| """Multi-modal search tests""" | |
| # ---------------- Tests ---------------- | |
| @pytest.mark.skipif(not VOYAGE_API_KEY, reason="Voyage API key not set") | |
| class TestMultimodalSearch: | |
| """Multi-modal search tests""" | |
| # Class attributes populated by setup_index fixture | |
| client: Client | |
| index: object # meilisearch.index.Index | |
| search_client: Client | |
| @pytest.fixture(autouse=True) | |
| def clear_indexes(self, client): |
🤖 Prompt for AI Agents
In tests/client/test_multimodal.py around lines 132 to 136, declare class-level
attributes for search_client, client, and index with appropriate type
annotations on the TestMultimodalSearch class to satisfy Pylint E1101
(no-member) since those attributes are assigned via request.cls in fixtures; add
lines like "search_client: <appropriate-type>", "client: <appropriate-type>",
and "index: <appropriate-type>" at the top of the class, and then remove the
now-unnecessary "# pylint: disable=no-member" comments from the test methods at
lines 223, 242, and 264.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/client/test_multimodal.py (1)
132-136: Declare class-level attributes to satisfy Pylint.The previous review suggested declaring class attributes with type annotations instead of using
# pylint: disable=no-member. This would improve type safety and IDE support:@pytest.mark.skipif(not VOYAGE_API_KEY, reason="Voyage API key not set") -class TestMultimodalSearch: # pylint: disable=no-member +class TestMultimodalSearch: """Multi-modal search tests""" + + # Class attributes populated by setup_index fixture + client: Client + index: object # meilisearch.index.Index + search_client: Client
🧹 Nitpick comments (1)
tests/client/test_multimodal.py (1)
148-153: Simplify exception handling and consider logging.Catching both
MeilisearchApiErrorandExceptionis redundant sinceExceptionencompasses all exceptions. Static analysis flags this as problematic (BLE001, S110). Consider:try: task = client.index(index.uid).delete() client.wait_for_task(task.task_uid) - except (MeilisearchApiError, Exception): # pylint: disable=broad-exception-caught - # Ignore errors when deleting indexes (may not exist) - pass + except Exception: # pylint: disable=broad-exception-caught + # Index may not exist or be in use; safe to ignore during cleanup + continueAlternatively, if you only expect
MeilisearchApiError, catch that specifically and let unexpected exceptions propagate.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/client/test_multimodal.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/client/test_multimodal.py (2)
meilisearch/client.py (3)
Client(39-1150)index(227-243)update_experimental_features(1130-1150)tests/conftest.py (1)
client(15-16)
🪛 Ruff (0.14.5)
tests/client/test_multimodal.py
151-153: try-except-pass detected, consider logging the exception
(S110)
151-151: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (6)
tests/client/test_multimodal.py (6)
1-25: LGTM!Imports, constants, and path construction are appropriate. The test gating via
VOYAGE_API_KEYensures tests only run when the API key is available.
28-35: LGTM!The helper function correctly loads and base64-encodes fixture images.
38-129: LGTM!The embedder configuration correctly defines the Voyage multimodal API structure with indexing and search fragments.
155-222: LGTM!The setup fixture is well-structured with appropriate error handling, timeouts, and verification. The 5-minute timeout for embedding generation appropriately accounts for external API latency.
265-289: Consider the assertion's robustness.The test combines "a futuristic movie" text with a Master Yoda image but expects "Captain Marvel" as the top hit. This assertion may be fragile if embedding model behavior changes or if the semantic relationship shifts.
Consider either:
- Asserting that a Star Wars related movie appears in the results (given the Yoda image)
- Using
assert response["hits"][0]["title"] in ["Captain Marvel", "Star Wars", ...]for more resilient assertions- Adding a comment explaining why "Captain Marvel" is the expected result
Otherwise, the test structure and multimodal query configuration look correct.
224-263: LGTM!Both test methods have clear semantic connections between the query and expected results. The pattern of using
self.search_client.index(INDEX_UID).search()with hybrid embedder configuration is consistent and correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/client/test_multimodal.py (2)
154-156: Consider logging exceptions in cleanup code.While catching broad exceptions in cleanup is sometimes necessary, consider logging them or being more specific to avoid silently hiding unexpected errors.
- except (MeilisearchApiError, Exception): # pylint: disable=broad-exception-caught - # Ignore errors when deleting indexes (may not exist) - pass + except MeilisearchApiError: + # Index doesn't exist or already deleted, which is fine + pass + except Exception as e: # pylint: disable=broad-exception-caught + # Log unexpected errors but continue cleanup + print(f"Warning: Unexpected error during cleanup: {e}")
266-290: Consider consistent parameter passing for clarity.The test works correctly, but passing
Noneas the query parameter while including"q": queryin opt_params (lines 272-274) is inconsistent withtest_text_querywhich passes the query directly as the first parameter. While both approaches work due to dict merging, using the same pattern would improve clarity.response = self.search_client.index(INDEX_UID).search( - None, + query, { - "q": query, "media": {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
tests/fixtures/master-yoda.jpegis excluded by!**/*.jpeg
📒 Files selected for processing (1)
tests/client/test_multimodal.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/client/test_multimodal.py (3)
meilisearch/client.py (1)
index(227-243)meilisearch/_httprequests.py (2)
patch(110-118)get(95-96)meilisearch/index.py (3)
add_documents(451-483)get_stats(262-279)search(282-319)
🪛 Ruff (0.14.5)
tests/client/test_multimodal.py
154-156: try-except-pass detected, consider logging the exception
(S110)
154-154: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (5)
tests/client/test_multimodal.py (5)
1-36: LGTM! Clean setup and helper function.The imports, environment variable handling, path resolution, and base64 encoding helper are all implemented correctly.
38-129: LGTM! Well-structured embedder configuration.The complex nested configuration for Voyage AI multimodal embeddings is properly structured with separate indexing and search fragments for text, poster, and combined modalities.
158-223: LGTM! Comprehensive test setup with proper timeout.The fixture correctly enables the experimental feature, configures the embedder, and handles the long-running embedding generation with appropriate timeouts. The timeout on the HTTP request (line 194) properly addresses the previous review feedback.
225-242: LGTM! Text-based search test is well-structured.The test correctly demonstrates text-based multimodal search with the hybrid embedder configuration.
244-264: LGTM! Image-based search test is correct.The test properly demonstrates image-only search by passing
Noneas the query and using the poster URL in the media parameter.
|
@Strift You can take second look. Override clear_indexes fixture to preserve multimodal test index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the modification. This looks great 🙌
Unfortuantely, I tried running the tests locally, but had many Meilisearch errors:
{"detail":"Request to model 'voyage-multimodal-3' failed. The image URL you have provided is invalid. Please verify the URL and try again.
I think it would be safer to use a subset of the dataset, with image URLs that are known to be valid.
|
@Strift I added small dataset with valid urls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/client/test_multimodal.py (1)
266-289: Usesearch’squeryparameter instead of injecting"q"inopt_paramsIn
test_text_and_image_query, you currently passNoneas the first argument and then set"q"insideopt_params. Given the implementation ofIndex.search(body = {"q": query, **opt_params}), you can pass the text query as the first argument and drop the redundant"q"key; the resulting request body is the same but clearer and better aligned with the public API.def test_text_and_image_query(self): """Test text and image query""" query = "a futuristic movie" master_yoda_base64 = load_image_base64("master-yoda.jpeg") response = self.search_client.index(INDEX_UID).search( - None, - { - "q": query, + query, + { "media": { "textAndPoster": { "text": query, "image": { "mime": "image/jpeg", "data": master_yoda_base64,This keeps behavior identical while making the test intent more obvious.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
datasets/multimodal_movies.json(1 hunks)tests/client/test_multimodal.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/client/test_multimodal.py (3)
meilisearch/client.py (3)
Client(39-1150)index(227-243)update_experimental_features(1130-1150)tests/conftest.py (1)
client(15-16)meilisearch/index.py (3)
add_documents(451-483)get_stats(262-279)search(282-319)
🪛 Ruff (0.14.6)
tests/client/test_multimodal.py
154-156: try-except-pass detected, consider logging the exception
(S110)
154-154: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (5)
datasets/multimodal_movies.json (1)
1-9: Dataset structure looks good for multimodal testsThe JSON array is well‑formed, with consistent fields (
id,title,poster,overview,release_date) matching how the tests consume this data. No issues from a test-data perspective.tests/client/test_multimodal.py (4)
29-35: Image-to-base64 helper is clear and sufficient
load_image_base64cleanly usesPath.read_bytes()andbase64.b64encode(...).decode("utf-8"); simple, deterministic, and appropriate for test fixtures.
40-129: Embedder config matches a realistic multimodal setupThe
EMBEDDER_CONFIGclosely mirrors a real Voyage multimodal embedder: it keepsindexingFragments/searchFragments,request.inputs, andresponsemapping in raw JSON, which fits the directrequests.patchusage and avoids Pydantic validation issues. No issues spotted here.
158-225: Index setup flow for multimodal search is well-structuredThe class-scoped
setup_indexfixture correctly:
- Enables the
multimodalexperimental feature viaclient.update_experimental_features.- Recreates the test index once per class.
- Applies the complex embedder settings with a raw
requests.patch(now with atimeout), then waits for the returned task.- Adds documents and waits with extended timeouts appropriate for remote embedding generation.
- Verifies document count and exposes
search_clientviarequest.cls.This is a solid, self-contained setup for these slower E2E tests.
225-265: Query tests cover the main multimodal entry pointsThe three tests (pure text, pure image, and combined text+image) exercise the key multimodal paths with clear, deterministic assertions on the top hit titles. Combined with the dataset, this gives good end-to-end coverage of the new feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/client/test_multimodal.py (2)
140-157: Index cleanup override behaves as intendedOverriding the global
clear_indexesfixture to keepINDEX_UIDwhile deleting all others matches the described intent and avoids silently swallowing unexpected errors by narrowing toMeilisearchApiError. If you ever need richer diagnostics, consider swapping
158-225: Setup fixture is robust; consider asserting presence oftaskUidThe overall setup flow (enable experimental feature, recreate index, PATCH settings with a timeout, wait for tasks, add docs, assert stats, and expose
search_client) looks solid and well‑bounded with timeouts.One small robustness improvement: guard against a missing
taskUidin the settings PATCH response so failures are easier to diagnose than whateverwait_for_taskdoes withNone:- task_data = response.json() - task_uid = task_data.get("taskUid") - client.wait_for_task(task_uid, timeout_in_ms=60_000) + task_data = response.json() + task_uid = task_data.get("taskUid") + if task_uid is None: + raise AssertionError(f"Expected 'taskUid' in settings update response, got: {task_data}") + client.wait_for_task(task_uid, timeout_in_ms=60_000)This keeps the happy path unchanged while surfacing malformed responses clearly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/client/test_multimodal.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/client/test_multimodal.py (3)
meilisearch/client.py (4)
get_indexes(128-159)index(227-243)update_experimental_features(1130-1150)get_index(185-204)meilisearch/_httprequests.py (2)
patch(110-118)get(95-96)meilisearch/index.py (2)
add_documents(451-483)search(282-319)
🔇 Additional comments (4)
tests/client/test_multimodal.py (4)
29-35: Image-to-base64 helper is clear and correctUsing
Path.read_bytes()withbase64.b64encode(...).decode("utf-8")is straightforward and appropriate for these tests. No changes needed.
40-129: Inline EMBEDDER_CONFIG is verbose but appropriate for parity with JS testsKeeping the full multimodal embedder configuration inline here makes the test self-contained and mirrors the JS test shape, which is useful for cross-SDK parity. Given this is test-only code, the current structure is acceptable.
225-243: Text-only multimodal search test is straightforwardThe text-query test is clear: it exercises semantic search via the configured embedder and asserts on the top hit title. Nothing to change here.
244-264: Image-only search test cleanly targets poster-based retrievalLooking up the Dumbo poster URL from the dataset and then issuing an image-only search with
semanticRatio=1is a good focused check of the poster fragment path. This test reads well and is consistent with the API shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/client/test_multimodal.py (1)
244-264: Consider defensive lookup for test data.Line 247 uses
next(m for m in MOVIES ...)which will raiseStopIterationif "Dumbo" is not found in the dataset. While the dataset is controlled, adding a default value would make the test more robust:- dumbo_movie = next(m for m in MOVIES if m["title"] == "Dumbo") + dumbo_movie = next((m for m in MOVIES if m["title"] == "Dumbo"), None) + assert dumbo_movie is not None, "Dumbo not found in test dataset"The image-only search pattern (passing
Noneas query) is correct for this use case.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/client/test_multimodal.py(1 hunks)
🔇 Additional comments (7)
tests/client/test_multimodal.py (7)
1-26: LGTM!The imports, environment setup, and path configurations are well-structured. Loading the dataset at module level is appropriate for test fixtures.
28-36: LGTM!The helper function correctly encodes images to base64 strings. Error handling for missing files is not critical in controlled test fixtures.
38-129: LGTM!The embedder configuration correctly defines the complex nested structures required for Voyage's multimodal API, including both indexing and search fragments with appropriate template syntax.
132-156: LGTM!The test class setup properly addresses previous review feedback:
- Class attribute
search_clientis declared (line 138)- Exception handling now catches specific
MeilisearchApiErrorwith warning output (lines 154-156)- The fixture correctly preserves the multimodal test index while cleaning up others
158-223: LGTM!The setup fixture properly addresses previous review feedback by including a timeout on the
requests.patchcall (line 194). The extended timeouts for embedding generation (5 minutes) and reduced polling interval (1 second) are well-justified by comments and appropriate for external API operations.
225-242: LGTM!The test correctly passes the query as a positional argument (line 229) rather than in the options dict, following the documented API pattern. The search configuration appropriately uses the text fragment and hybrid search parameters.
266-289: LGTM!The test correctly passes the query as a positional argument (line 272) and structures the multimodal search request properly with both text and base64-encoded image data. This addresses the previous review concern about query parameter handling.
Add support for multi-modal search
#1134
Summary by CodeRabbit
New Features
Tests
Data
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.