@ephamhung-oss
Contributor

No description provided.

Signed-off-by: Eric Pham-Hung <ephamhung@ephamhung-mlt.client.nvidia.com>
@ephamhung-oss ephamhung-oss marked this pull request as ready for review October 20, 2025 20:20
Signed-off-by: Eric Pham-Hung <ephamhung@nvidia.com>
Signed-off-by: Eric Pham-Hung <ephamhung@nvidia.com>
@nina-xu nina-xu left a comment

Overall it looks good! Thanks so much for putting this together.

"id": "630e3e17",
"metadata": {},
"source": [
"# 🎛️ NeMo Safe Synthesizer 101: Extrinsic Evaluation\n",
102?

"metadata": {},
"outputs": [],
"source": [
"# This script defines a scikit-learn pipeline for a classification task.\n",
For the extrinsic evaluation portion, there's a bit of code repetition. I'd suggest DRYing it up by moving the train + eval steps into a function and calling it twice: train_and_evaluate_logistic_regression(df, test_df) and train_and_evaluate_logistic_regression(synthetic_df, test_df). This also makes it very clear to a user what we are doing here.
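A rough sketch of what that helper could look like, to make the suggestion concrete. This is not the notebook's code: the `target_col` default, the fallback pipeline (a stand-in for the notebook's `full_pipeline`, which isn't visible in this diff), and the returned metric names are all assumptions, and it assumes a binary target.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def train_and_evaluate_logistic_regression(train_df, test_df, target_col="target", pipeline=None):
    """Fit a fresh copy of the pipeline on train_df and score it on test_df."""
    if pipeline is None:
        # Hypothetical stand-in for the notebook's full_pipeline.
        pipeline = Pipeline([
            ("scale", StandardScaler()),
            ("clf", LogisticRegression(max_iter=1000)),
        ])
    # clone() returns an unfitted copy, so the two runs never share fitted state.
    model = clone(pipeline)

    X_train, y_train = train_df.drop(columns=[target_col]), train_df[target_col]
    X_test, y_test = test_df.drop(columns=[target_col]), test_df[target_col]

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1 (binary target assumed)

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
    }
```

The notebook would then call it once with df and once with synthetic_df, both against the same test_df, and put the two result dicts side by side. Cloning inside the function also avoids accidentally reusing one fitted pipeline object for both runs.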

"from sklearn.metrics import classification_report, accuracy_score, roc_auc_score\n",
"\n",
"original_pipeline = full_pipeline \n",
"print(\"\\n--- Training Benchmark Model on Original Data (1000 rows) ---\")\n",
I don't think the 1000 here is accurate?
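One way to keep the message from drifting out of sync is to compute the row count instead of hardcoding it. A minimal sketch, where `training_banner` is a hypothetical helper name and the message text mirrors the print statement in the diff:

```python
def training_banner(data, name):
    """Build the training header with the actual row count of `data`."""
    # len() works for a pandas DataFrame (number of rows) as well as plain sequences.
    return f"\n--- Training Benchmark Model on {name} Data ({len(data)} rows) ---"


# Example with a plain list standing in for the training DataFrame.
print(training_banner([0, 1, 2], "Original"))
```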

Comment on lines +467 to +470
"| Accuracy | 0.9404 | 0.9278 |\n",
"| ROC AUC Score | 0.9782 | 0.9762 |\n",
"| Precision (Class 1) | 0.9626 | 0.9423 |\n",
"| Recall (Class 1) | 0.9646 | 0.9714 |\n",
These are amazing results. Out of curiosity, what was the SQS?
