Improve test points: less flaky, less verbose

ccreutzi · ccreutzi · commit ca57ae3fb423 · 2025-03-18T14:07:54.000+01:00
* Make test point less brittle

`tazureChat/generateWithStructuredOutput` has somewhat frequent spurious failures of this type:

```
  ================================================================================
  Verification failed in tazureChat/generateWithStructuredOutput.
      ---------------------
      Framework Diagnostic:
      ---------------------
      OrConstraint failed.
      --> + [First Condition]:
           |   OrConstraint failed.
           |   --> + [First Condition]:
           |        |   IsEqualTo failed.
           |        |   --> StringComparator failed.
           |        |       --> The strings are not equal.
           |        |
           |        |       Actual Value:
           |        |           "western honey bee"
           |        |       Expected Value:
           |        |           "honeybee"
           |   --> OR
           |       + [Second Condition]:
           |        |   IsEqualTo failed.
           |        |   --> StringComparator failed.
           |        |       --> The strings are not equal.
           |        |
           |        |       Actual Value:
           |        |           "western honey bee"
           |        |       Expected Value:
           |        |           "honey bee"
           |       -+---------------------
      --> OR
          + [Second Condition]:
           |   IsEqualTo failed.
           |   --> StringComparator failed.
           |       --> The strings are not equal.
           |
           |       Actual Value:
           |           "western honey bee"
           |       Expected Value:
           |           "bee"
          -+---------------------
```

Rather than adding more and more valid alternatives, it is probably better to relax the requirement: accept any common name that contains "bee".
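
As a rough illustration (a sketch, not part of this change), the relaxed constraint can be checked against the three answers the old test accepted plus the one from the failure log above. The `answers` list and variable names are assumptions made up for this example; `satisfiedBy` is the query method that `matlab.unittest` constraints provide.

```matlab
import matlab.unittest.constraints.ContainsSubstring

% Answers from the old constraint plus the one seen in the failure log
answers = ["honeybee", "honey bee", "bee", "western honey bee"];
relaxed = ContainsSubstring("bee");

% accepted(k) is true when answers(k) satisfies the relaxed constraint
accepted = arrayfun(@(a) relaxed.satisfiedBy(a), answers);
```

The trade-off is that an unrelated name such as "beetle" would also pass this check; the separate `scientificName` check in the same test still has to hold in that case.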

* Make test less noisy

`texampleTests/testAnalyzeTextDataUsingParallelFunctionCallwithOllama` was missing the `evalc` wrapper that captures the example's console output so that it does not clutter the test log.
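
For readers unfamiliar with `evalc`, a minimal sketch of why it silences a script: it runs the given code and returns the text that would have been printed instead of displaying it. The `noisy` command below is a hypothetical stand-in, not the real example script.

```matlab
% Hypothetical stand-in for a chatty example script
noisy = "disp('intermediate result: 42')";

% evalc runs the code and returns the would-be console output as text,
% so nothing is displayed while the code runs
captured = evalc(noisy);
```

In the test, the returned text is simply discarded, which is what keeps the log quiet.
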
diff --git a/tests/hstructuredOutput.m b/tests/hstructuredOutput.m
@@ -10,14 +10,13 @@
     methods(Test)
         % Test methods
         function generateWithStructuredOutput(testCase)
-            import matlab.unittest.constraints.IsEqualTo
+            import matlab.unittest.constraints.ContainsSubstring
             import matlab.unittest.constraints.StartsWithSubstring
             res = generate(testCase.structuredModel,"Which animal produces honey?",...
                 ResponseFormat = struct(commonName = "dog", scientificName = "Canis familiaris"));
             testCase.assertClass(res,"struct");
             testCase.verifySize(fieldnames(res),[2,1]);
-            testCase.verifyThat(lower(res.commonName), IsEqualTo("honeybee") | IsEqualTo("honey bee")  | ...
-                IsEqualTo("bee"));
+            testCase.verifyThat(lower(res.commonName), ContainsSubstring("bee"));
             testCase.verifyThat(res.scientificName, StartsWithSubstring("Apis"));
         end
 
diff --git a/tests/texampleTests.m b/tests/texampleTests.m
@@ -77,7 +77,7 @@ function testAnalyzeTextDataUsingParallelFunctionCallwithChatGPT(testCase)
 
         function testAnalyzeTextDataUsingParallelFunctionCallwithOllama(testCase)
             testCase.startCapture("AnalyzeTextDataUsingParallelFunctionCallwithOllama");
-            AnalyzeTextDataUsingParallelFunctionCallwithOllama;
+            evalc("AnalyzeTextDataUsingParallelFunctionCallwithOllama");
         end
 
         function testCreateSimpleChatBot(testCase,ChatBotExample)