@gurevichdmitry gurevichdmitry commented Nov 12, 2025

Update Entity Store Performance Tooling

Summary

Adds performance testing and comparison for Entity Store, including baseline metrics, automated comparisons, and support for service and generic entities.

Features

Performance Metrics Collection

  • Collects transform stats (search, intake, processing latencies), node stats (CPU, memory), and cluster health
  • Configurable sampling interval with transform completion detection
  • Optional --noTransforms flag to skip transform-related operations for ESQL-based workflows

Baseline & Comparison System

  • create-baseline command: extracts and saves baseline metrics from log files
  • compare-metrics command: compares the current run against a baseline with configurable thresholds
  • Status types: improvement, degradation, warning, stable, insufficient, info
  • Formatted comparison reports with summary statistics
  • Supports baseline creation from runs without transform logs (for --noTransforms mode)
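To illustrate how threshold-based status categorization can work, here is a minimal TypeScript sketch. The names (classifyChange, Thresholds) and the exact comparison order are illustrative assumptions, not the PR's actual API in src/commands/utils/metrics_comparison.ts:

```typescript
// Hypothetical sketch of threshold-based status categorization.
type Status = 'improvement' | 'degradation' | 'warning' | 'stable';

interface Thresholds {
  degradationPct: number; // e.g. 35
  warningPct: number;     // e.g. 25
  improvementPct: number; // e.g. 30
}

// deltaPct > 0 means the metric got worse (e.g. higher latency).
function classifyChange(deltaPct: number, t: Thresholds): Status {
  if (deltaPct >= t.degradationPct) return 'degradation';
  if (deltaPct >= t.warningPct) return 'warning';
  if (deltaPct <= -t.improvementPct) return 'improvement';
  return 'stable';
}
```

A metric that regressed 40% against baseline would classify as degradation under the example thresholds, while a 28% regression would only produce a warning.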

Service & Generic Entity Support

  • Added service and generic entity types to performance data generation
  • Configurable entity distributions (equal, standard)
  • Automatic Asset Inventory enablement for generic entities
  • Per-entity-type metrics tracking

Reliability Improvements

  • Entity-specific sample thresholds (service: 3, others: 10)
  • Adaptive thresholds for latency calculations
  • Informational status for volatile metrics (p99, max)
  • Minimum sample count validation
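The entity-specific sample thresholds above (service: 3, others: 10) can be sketched as a simple lookup with a default. This is an illustrative sketch, not the PR's exact code; the helper name hasEnoughSamples is hypothetical:

```typescript
// Illustrative sketch: service entities need only 3 samples before a
// metric is considered comparable, all other entity types need 10.
const MIN_SAMPLES: Record<string, number> = { service: 3 };
const DEFAULT_MIN_SAMPLES = 10;

function hasEnoughSamples(entityType: string, sampleCount: number): boolean {
  const min = MIN_SAMPLES[entityType] ?? DEFAULT_MIN_SAMPLES;
  return sampleCount >= min;
}
```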

New CLI Commands

yarn start create-perf-data <name> -e <count> -l <logsPerEntity> [--distribution <type>]
yarn start upload-perf-data-interval <name> --interval <ms> --count <n> [--samplingInterval <ms>] [--noTransforms]
yarn start create-baseline <logPrefix> -e <count> -l <logsPerEntity> -n <baselineName>
yarn start compare-metrics <currentLogPrefix> -b <baseline> [--degradation <%>] [--warning <%>]

ESQL Workflow Support

  • --noTransforms option enables performance testing without entity transforms
  • Skips entity engine initialization, transform stats logging, and transform completion waiting
  • Baseline creation automatically handles missing transform logs with empty transform metrics
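The empty-metrics fallback can be sketched as follows. The log-line format, field names, and function name here are illustrative assumptions, not the tool's actual parser:

```typescript
// Hypothetical sketch: when a run used --noTransforms there are no
// transform log lines, so baseline creation substitutes empty transform
// metrics instead of failing.
interface TransformMetrics {
  searchLatenciesMs: number[];
}

const EMPTY_TRANSFORM_METRICS: TransformMetrics = { searchLatenciesMs: [] };

function extractTransformMetrics(logLines: string[]): TransformMetrics {
  const stats = logLines
    .filter((line) => line.startsWith('transform_stats '))
    .map((line) => JSON.parse(line.slice('transform_stats '.length)));
  if (stats.length === 0) return EMPTY_TRANSFORM_METRICS; // --noTransforms run
  return { searchLatenciesMs: stats.map((s) => s.searchLatencyMs) };
}
```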

Metrics Tracked

Latency (search, intake, processing), CPU, Memory, Throughput, Transform metrics, Errors, Per-entity-type breakdowns
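For the latency metrics, percentiles such as p99 are computed during baseline extraction and then treated as informational in comparisons because tail metrics are volatile. A minimal nearest-rank percentile helper, as an illustrative sketch rather than the PR's actual implementation, might look like:

```typescript
// Illustrative nearest-rank percentile helper (p in 0..100).
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const idx = Math.min(Math.max(rank - 1, 0), sorted.length - 1);
  return sorted[idx];
}
```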

@gurevichdmitry gurevichdmitry marked this pull request as draft November 12, 2025 08:20
@gurevichdmitry gurevichdmitry marked this pull request as ready for review November 25, 2025 08:47
Contributor

Copilot AI left a comment
Pull request overview

This PR adds comprehensive performance testing and comparison tooling for Entity Store, extending support beyond host and user entities to include service and generic entity types.

Key Changes:

  • Added baseline metrics extraction and comparison system for tracking performance across test runs
  • Implemented service and generic entity support with configurable distribution presets
  • Enhanced metrics collection with CPU/memory tracking and adaptive sampling intervals

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 7 comments.

Summary per file:

  • src/utils/kibana_api.ts: Extended engine initialization with Asset Inventory enablement, detailed error reporting, and support for service/generic entity types
  • src/index.ts: Added CLI commands for baseline creation, listing, and metric comparison with configurable thresholds
  • src/constants.ts: Added Kibana settings API endpoint constants
  • src/commands/utils/metrics_comparison.ts: Implemented comprehensive metrics comparison logic with status categorization and adaptive thresholds
  • src/commands/utils/baseline_metrics.ts: Created baseline metrics extraction from logs with percentile calculations and per-entity-type tracking
  • src/commands/entity_store_perf.ts: Extended performance data generation with service/generic entity types, node stats logging, and transform completion detection
  • package.json: Updated tsx dependency version
  • baselines/baseline-v1_0-standard-1-sec-logging-2025-11-25T08-35-51-081Z.json: Added example baseline metrics file


Contributor

Copilot AI left a comment

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.

Contributor

Copilot AI left a comment

Copilot reviewed 8 out of 9 changed files in this pull request and generated 6 comments.

});

// Baseline metrics commands
program
Member
Once I'm testing, when should I use the command create-baseline?

Author
You can use the create-baseline command whenever you have an execution result that you want to set as the standard for future performance comparisons.

For example, if you run upload-perf-data-interval several times and determine that the second run represents the most stable or optimal performance, you would execute create-baseline on the logs from that second run. This command generates a JSON file containing the measured data from that specific execution, which is then used as the baseline for all subsequent performance tests.

Author

This process details how to establish a performance baseline and compare subsequent runs against it.

1. Data Generation

Command: yarn start create-perf-data standard 100000 10
Action: Generates the performance data file.
Output: Creates a file named "standard" under data/entity_store_perf_data.
Contents: This file contains 1 million logs across 100,000 entities (33% hosts, 33% users, 33% generic, 1% services).

2. Initial Performance Test Run

Command: yarn start upload-perf-data-interval standard --deleteData --interval 100 --count 1 --samplingInterval 1 --noTransforms
Action: Executes the test by uploading the generated data.
Details: Performs cleanup, starts the engines, uploads the "standard" file once, and waits for 100 seconds.
Logging: Logs are sampled every second (--samplingInterval 1) with no data transforms (--noTransforms).

3. Creating the Baseline

Assumption: The run from Step 2 is the chosen stable reference run.

Command: yarn start create-baseline standard-2025-11-25T16:13 -e 100000 -l 10 -n baseline-v1_0
Action: Creates the performance baseline file.
Details: Processes logs from the specified run (standard-2025-11-25T16:13).
Output: Creates a JSON baseline file named baseline-v1_0. (The entity count -e and logs per entity -l are for informational purposes.)

4. Running the Comparison Test

Command: yarn start upload-perf-data-interval standard --deleteData --interval 100 --count 1 --samplingInterval 1 --noTransforms
Action: Executes the exact same performance run as in Step 2 (after code changes).
Result: A new set of logs is generated, with a different timestamp in the log file name.

5. Comparing Metrics

Command: yarn start compare-metrics standard-<lastruntimestamp> -e 100000 -l 10 --degradation-threshold 35 --warning-threshold 25 --improvement-threshold 30 -b baselines/baseline-v1_0
Action: Compares the latest test run against the established baseline.
Details: Compares the latest run logs against the baseline file (baselines/baseline-v1_0).
Output: Provides a report using the defined tolerances:
  - Degradation threshold: 35%
  - Warning threshold: 25%
  - Improvement threshold: 30%
