Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import type { MongoDBFieldType } from '@mongodb-js/compass-generative-ai';
import { MockDataGeneratorStep } from './types';

export const StepButtonLabelMap = {
Expand All @@ -10,3 +11,72 @@ export const StepButtonLabelMap = {

export const DEFAULT_DOCUMENT_COUNT = 1000;
export const MAX_DOCUMENT_COUNT = 100000;

/**
* Map of MongoDB types to available Faker methods.
* Not all Faker methods are included here.
* More can be found in the Faker.js API: https://fakerjs.dev/api/
Copy link
Collaborator

@jcobis jcobis Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is linking to faker v10, while we're currently on v9— maybe we should just take this opportunity to bump to v10. Think we could just update in package.json: "@faker-js/faker": "^10.0.0", I think should be straightforward, only change I see on first pass would be internet.userName -> internet.username in this file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compass repo is set up to use common JS and not ESM, which is what faker v10 requires. I'm wondering if we can tackle this in a separate PR if we want to upgrade to v10 so that we can get this ticket in first.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compass repo can handle esm in application code just fine, but I think you downgraded this to 9 because that's what the model generates better, at least based on this comment #7295 (comment)

Would be nice if you document this somewhere so that the reason is known

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcobis What do you think about this? Can you document which version we want in the docs?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sounds good to me, we can stick with v9 for now

*/
export const MONGO_TYPE_TO_FAKER_METHODS: Record<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave Augment link to https://v9.fakerjs.dev/api/ and had it come up with some more additions. (If you upgrade to v10 can do the same with the new API link):

String: [
  // Current methods are good, but could add:
  'internet.password',        // For password fields
  'internet.displayName',     // For display names/usernames
  'internet.emoji',           // For emoji/reaction fields
  'system.fileName',          // For file name fields
  'system.filePath',          // For file path fields
  'system.mimeType',          // For content type fields
  'book.title',               // For title fields
  'music.songName',           // For song/media titles
  'food.dish',                // For food-related strings
  'animal.type',              // For animal/pet fields
  'vehicle.model',            // For vehicle/product models
  'hacker.phrase',            // For tech-related descriptions
  'science.chemicalElement',  // For scientific data
]

Number: [
  // Current methods are good, but could add:
  'number.binary',            // For binary numbers
  'number.octal',             // For octal numbers
  'number.hex',               // For hexadecimal numbers
  'commerce.price',           // For price fields (returns number)
  'date.weekday',             // For day numbers
  'internet.port',            // For port numbers
]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding these!

MongoDBFieldType,
Array<string>
> = {
String: [
'lorem.word',
'lorem.words',
'lorem.sentence',
'lorem.paragraph',
'person.firstName',
'person.lastName',
'person.fullName',
'person.jobTitle',
'internet.email',
'internet.url',
'internet.domainName',
'internet.userName',
'phone.number',
'location.city',
'location.country',
'location.streetAddress',
'location.zipCode',
'location.state',
'company.name',
'company.catchPhrase',
'color.human',
'commerce.productName',
'commerce.department',
'finance.accountName',
'finance.currencyCode',
'git.commitSha',
'string.uuid',
'string.alpha',
'string.alphanumeric',
],
Number: [
'number.int',
'number.float',
'finance.amount',
'location.latitude',
'location.longitude',
],
Int32: ['number.int', 'finance.amount'],
Long: ['number.int', 'number.bigInt'],
Decimal128: ['number.float', 'finance.amount'],
Boolean: ['datatype.boolean'],
Date: [
'date.recent',
'date.past',
'date.future',
'date.anytime',
'date.birthdate',
],
Timestamp: ['date.recent', 'date.past', 'date.future', 'date.anytime'],
ObjectId: ['database.mongodbObjectId'],
Binary: ['string.hexadecimal', 'string.binary'],
RegExp: ['lorem.word', 'string.alpha'],
Code: ['lorem.sentence', 'lorem.paragraph', 'git.commitMessage'],
MinKey: ['number.int'],
MaxKey: ['number.int'],
Symbol: ['lorem.word', 'string.symbol'],
DBRef: ['database.mongodbObjectId'],
};
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
import { expect } from 'chai';
import React from 'react';
import {
screen,
render,
cleanup,
waitFor,
userEvent,
} from '@mongodb-js/testing-library-compass';
import sinon from 'sinon';
import FakerMappingSelector from './faker-mapping-selector';
import { UNRECOGNIZED_FAKER_METHOD } from '../../modules/collection-tab';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Drive-by suggestion]: This made me realize, due to parallel implementation there's a slight mismatch with script-generation-utils.ts, we should change that file to use UNRECOGNIZED_FAKER_METHOD instead of "unrecognized", if you could make that tweak it'd be much appreciated, I think it's just in this one spot (and then update tests accordingly)

import { MONGO_TYPE_TO_FAKER_METHODS } from './constants';
import { MongoDBFieldTypeValues } from '@mongodb-js/compass-generative-ai';

const mockActiveJsonType = MongoDBFieldTypeValues.String;
const mockActiveFakerFunction = 'lorem.word';
const onJsonTypeSelectStub = sinon.stub();
const onFakerFunctionSelectStub = sinon.stub();

describe('FakerMappingSelector', () => {
afterEach(() => {
cleanup();
});

it('should display all MongoDB types in the dropdown', async () => {
// Check that all MongoDB types from the constant are present
const mongoTypes = Object.keys(MongoDBFieldTypeValues);

render(
<FakerMappingSelector
activeJsonType={mockActiveJsonType}
activeFakerFunction="lorem.word"
onJsonTypeSelect={onJsonTypeSelectStub}
onFakerFunctionSelect={onFakerFunctionSelectStub}
/>
);

const jsonTypeSelect = screen.getByLabelText('JSON Type');
userEvent.click(jsonTypeSelect);

for (const type of mongoTypes) {
await waitFor(() => {
expect(screen.getByRole('option', { name: type })).to.exist;
});
}
});

describe('should display faker methods for each MongoDB type', () => {
Object.entries(MONGO_TYPE_TO_FAKER_METHODS).forEach(
([mongoType, methods]) => {
Copy link
Collaborator

@mabaasit mabaasit Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: this reads better

Suggested change
Object.entries(MONGO_TYPE_TO_FAKER_METHODS).forEach(
([mongoType, methods]) => {
for (const [mongoType, methods] of Object.entries(MONGO_TYPE_TO_FAKER_METHODS)) {
it(`should display faker methods for ${mongoType}`, () => {```

it(`should display faker methods for ${mongoType}`, () => {
const firstMethod = methods[0];

render(
<FakerMappingSelector
activeJsonType={
mongoType as keyof typeof MONGO_TYPE_TO_FAKER_METHODS
}
activeFakerFunction={firstMethod}
onJsonTypeSelect={onJsonTypeSelectStub}
onFakerFunctionSelect={onFakerFunctionSelectStub}
/>
);

const fakerFunctionSelect = screen.getByLabelText('Faker Function');
userEvent.click(fakerFunctionSelect);

methods.forEach((method) => {
expect(screen.getByRole('option', { name: method })).to.exist;
});
});
}
);
});

it('should call onJsonTypeSelect when MongoDB type changes', async () => {
render(
<FakerMappingSelector
activeJsonType={mockActiveJsonType}
activeFakerFunction={mockActiveFakerFunction}
onJsonTypeSelect={onJsonTypeSelectStub}
onFakerFunctionSelect={onFakerFunctionSelectStub}
/>
);

const jsonTypeSelect = screen.getByLabelText('JSON Type');
userEvent.click(jsonTypeSelect);

const numberOption = await screen.findByRole('option', { name: 'Number' });
userEvent.click(numberOption);

expect(onJsonTypeSelectStub).to.have.been.calledOnceWith('Number');
});

it('should call onFakerFunctionSelect when faker function changes', async () => {
render(
<FakerMappingSelector
activeJsonType={mockActiveJsonType}
activeFakerFunction={mockActiveFakerFunction}
onJsonTypeSelect={onJsonTypeSelectStub}
onFakerFunctionSelect={onFakerFunctionSelectStub}
/>
);

const fakerFunctionSelect = screen.getByLabelText('Faker Function');
userEvent.click(fakerFunctionSelect);

const emailOption = await screen.findByRole('option', {
name: 'internet.email',
});
userEvent.click(emailOption);

expect(onFakerFunctionSelectStub).to.have.been.calledOnceWith(
'internet.email'
);
});

it('should show warning banner when faker method is unrecognized', () => {
render(
<FakerMappingSelector
activeJsonType={mockActiveJsonType}
activeFakerFunction={UNRECOGNIZED_FAKER_METHOD}
onJsonTypeSelect={onJsonTypeSelectStub}
onFakerFunctionSelect={onFakerFunctionSelectStub}
/>
);

expect(
screen.getByText(
/Please select a function or we will default fill this field/
)
).to.exist;
});

it('should not show warning banner when faker method is recognized', () => {
render(
<FakerMappingSelector
activeJsonType={mockActiveJsonType}
activeFakerFunction={mockActiveFakerFunction}
onJsonTypeSelect={onJsonTypeSelectStub}
onFakerFunctionSelect={onFakerFunctionSelectStub}
/>
);

expect(
screen.queryByText(
/Please select a function or we will default fill this field/
)
).to.not.exist;
});
});
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,11 @@ import {
Select,
spacing,
} from '@mongodb-js/compass-components';
import React from 'react';
import React, { useMemo } from 'react';
import { UNRECOGNIZED_FAKER_METHOD } from '../../modules/collection-tab';
import type { MongoDBFieldType } from '@mongodb-js/compass-generative-ai';
import { MongoDBFieldTypeValues } from '@mongodb-js/compass-generative-ai';
import { MONGO_TYPE_TO_FAKER_METHODS } from './constants';

const fieldMappingSelectorsStyles = css({
width: '50%',
Expand All @@ -37,6 +39,17 @@ const FakerMappingSelector = ({
onJsonTypeSelect,
onFakerFunctionSelect,
}: Props) => {
const fakerMethodOptions = useMemo(() => {
const methods =
MONGO_TYPE_TO_FAKER_METHODS[activeJsonType as MongoDBFieldType] || [];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just type activeJsonType as MongoDBFieldType so we don't have to cast

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good idea. I see that we already type onJsonTypeSelect with MongoDBFieldType


if (methods.includes(activeFakerFunction)) {
return methods;
}

return [activeFakerFunction, ...methods];
}, [activeJsonType, activeFakerFunction]);

return (
<div className={fieldMappingSelectorsStyles}>
<Body className={labelStyles}>Mapping</Body>
Expand All @@ -46,8 +59,7 @@ const FakerMappingSelector = ({
value={activeJsonType}
onChange={(value) => onJsonTypeSelect(value as MongoDBFieldType)}
>
{/* TODO(CLOUDP-344400) : Make the select input editable and render other options depending on the JSON type selected */}
{[activeJsonType].map((type) => (
{Object.values(MongoDBFieldTypeValues).map((type) => (
<Option key={type} value={type}>
{type}
</Option>
Expand All @@ -59,10 +71,9 @@ const FakerMappingSelector = ({
value={activeFakerFunction}
onChange={onFakerFunctionSelect}
>
{/* TODO(CLOUDP-344400): Make the select input editable and render other JSON types */}
{[activeFakerFunction].map((field) => (
<Option key={field} value={field}>
{field}
{fakerMethodOptions.map((method) => (
<Option key={method} value={method}>
{method}
</Option>
))}
</Select>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ import FieldSelector from './schema-field-selector';
import FakerMappingSelector from './faker-mapping-selector';
import type { FakerSchema, MockDataGeneratorState } from './types';
import type { MongoDBFieldType } from '@mongodb-js/compass-generative-ai';
import { getDefaultFakerMethod } from './script-generation-utils';
Copy link

Copilot AI Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import of getDefaultFakerMethod from ./script-generation-utils is added but this function is not defined or implemented in the visible changes. This will cause a compilation error.

Suggested change
import { getDefaultFakerMethod } from './script-generation-utils';

Copilot uses AI. Check for mistakes.

const containerStyles = css({
display: 'flex',
Expand Down Expand Up @@ -72,11 +73,15 @@ const FakerSchemaEditorContent = ({
const onJsonTypeSelect = (newJsonType: MongoDBFieldType) => {
const currentMapping = fakerSchemaFormValues[activeField];
if (currentMapping) {
// When MongoDB type changes, update the faker method to a default for that type
const defaultFakerMethod = getDefaultFakerMethod(newJsonType);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to my comment about faker args, can we persist the LLM-suggested faker method so that if the user changes the Mongo Type and then reselects the original type, we default to the LLM-suggested faker method? Like with the args, we can store in state.

Quick AI mock-up just as an example, please review ofc:

// Add this state near the other useState declarations (around line 61)
const [originalLlmData, setOriginalLlmData] = useState<Record<string, {
  mongoType: MongoDBFieldType;
  fakerMethod: string;
  fakerArgs: any[];
}>>({});

// Add this useEffect after the existing state declarations
useEffect(() => {
  const originalData = Object.fromEntries(
    Object.entries(fakerSchema).map(([field, mapping]) => [
      field,
      {
        mongoType: mapping.mongoType,
        fakerMethod: mapping.fakerMethod,
        fakerArgs: mapping.fakerArgs
      }
    ])
  );
  setOriginalLlmData(originalData);
}, [fakerSchema]);

// Update the onJsonTypeSelect function (around line 73)
const onJsonTypeSelect = (newJsonType: MongoDBFieldType) => {
  const currentMapping = fakerSchemaFormValues[activeField];
  const originalData = originalLlmData[activeField];
  
  if (currentMapping) {
    // If switching back to original type, restore original LLM data
    const shouldRestoreOriginal = originalData && 
      newJsonType === originalData.mongoType &&
      originalData.fakerMethod !== UNRECOGNIZED_FAKER_METHOD;
    
    const fakerMethod = shouldRestoreOriginal 
      ? originalData.fakerMethod 
      : getDefaultFakerMethod(newJsonType);
      
    const fakerArgs = shouldRestoreOriginal 
      ? originalData.fakerArgs 
      : [];

    setFakerSchemaFormValues({
      ...fakerSchemaFormValues,
      [activeField]: {
        ...currentMapping,
        mongoType: newJsonType,
        fakerMethod,
        fakerArgs,
      },
    });
    resetIsSchemaConfirmed();
  }
};

// Update the onFakerFunctionSelect function (around line 91)
const onFakerFunctionSelect = (newFakerFunction: string) => {
  const currentMapping = fakerSchemaFormValues[activeField];
  const originalData = originalLlmData[activeField];
  
  if (currentMapping) {
    // Restore original args if user reselects original LLM method
    const fakerArgs = (originalData?.fakerMethod === newFakerFunction) 
      ? originalData.fakerArgs 
      : currentMapping.fakerArgs;

    setFakerSchemaFormValues({
      ...fakerSchemaFormValues,
      [activeField]: {
        ...currentMapping,
        fakerMethod: newFakerFunction,
        fakerArgs,
      },
    });
    resetIsSchemaConfirmed();
  }
};

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can restore the original args if the user reselects an LLM suggested method. Since we will be restoring it back to the original args, we won't be needing state to store the original LLM mappings.

setFakerSchemaFormValues({
...fakerSchemaFormValues,
[activeField]: {
...currentMapping,
mongoType: newJsonType,
fakerMethod: defaultFakerMethod,
fakerArgs: [], // Reset args when changing type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we store these args so if the user re-selects the LLM-suggested method, they're re-displayed? I think we can store it in the state of faker-schema-editor-screen.tsx with a useEffect dependent on fakerSchema, and then in onFakerFunctionSelect restore original args if user reselects original LLM method

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick example code mock-up in this comment

},
});
resetIsSchemaConfirmed();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import {
BannerVariant,
Body,
DocumentList,
useDarkMode,
} from '@mongodb-js/compass-components';

import { usePreference } from 'compass-preferences-model/provider';
Expand All @@ -23,11 +24,14 @@ interface RawSchemaConfirmationScreenProps {
fakerSchemaGenerationStatus: MockDataGeneratorState['status'];
}

const documentContainerStyles = css({
backgroundColor: palette.gray.light3,
border: `1px solid ${palette.gray.light2}`,
borderRadius: spacing[400],
});
const getDocumentContainerStyles = (isDarkMode: boolean) =>
css({
backgroundColor: isDarkMode ? palette.gray.dark3 : palette.gray.light3,
border: `1px solid ${
isDarkMode ? palette.gray.dark2 : palette.gray.light2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome 😎

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great fix. However, with this change, css() is called on every render and it is kinda expensive wrt to performance. Within Compass, we usually have two different styles, light and dark and then in component we merge the two based on darkmode. Something like this:

Suggested change
isDarkMode ? palette.gray.dark2 : palette.gray.light2
const documentContainerStyles = css({
border: `1px solid ${palette.gray.light2}`,
...
});
const documentContainerDarkStyles = css({
border: `1px solid ${palette.gray.dark2}`,
});
...
return (
<div className={cx(documentContainerStyles, isDarkmode && documentContainerDarkStyles)} />
);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This works, but maybe we can just memoize the CSS object? That way, we don't need to have both a light and dark mode documentContainerStyles.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can do that, but this would introduce another pattern in codebase. Would be better if we stick to how its already done.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok yeah, that's the only doubt I had. I'll stick to the current pattern then to keep it simple and readable!

}`,
borderRadius: spacing[400],
});

const documentStyles = css({
padding: `${spacing[400]}px ${spacing[900]}px`,
Expand All @@ -52,6 +56,7 @@ const RawSchemaConfirmationScreen = ({
const enableSampleDocumentPassing = usePreference(
'enableGenAISampleDocumentPassing'
);
const isDarkMode = useDarkMode();

const subtitleText = enableSampleDocumentPassing
? 'Sample Documents Collected'
Expand All @@ -69,7 +74,7 @@ const RawSchemaConfirmationScreen = ({
{subtitleText}
</Body>
<Body className={descriptionStyles}>{descriptionText}</Body>
<div className={documentContainerStyles}>
<div className={getDocumentContainerStyles(!!isDarkMode)}>
<DocumentList.Document
className={documentStyles}
editable={false}
Expand Down
3 changes: 1 addition & 2 deletions packages/compass-generative-ai/src/atlas-ai-service.ts
Original file line number Diff line number Diff line change
Expand Up @@ -235,8 +235,7 @@ export interface MockDataSchemaRequest {
*/
export type MongoDBFieldType = PrimitiveSchemaType['name'];

// TODO(CLOUDP-346699): Export this from mongodb-schema
enum MongoDBFieldTypeValues {
export enum MongoDBFieldTypeValues {
String = 'String',
Number = 'Number',
Boolean = 'Boolean',
Expand Down
2 changes: 2 additions & 0 deletions packages/compass-generative-ai/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -37,3 +37,5 @@ export type {
MockDataSchemaResponse,
MongoDBFieldType,
} from './atlas-ai-service';

export { MongoDBFieldTypeValues } from './atlas-ai-service';
Loading