Connection Pooling, Split Tests, New Tests, Percona db added to tests #179

jaredmdobson · 2025-08-28T18:07:23Z

I'm not trying to cause problems i swear 😂 i know this is as beast and we can schedule a call to go over it or whatever is needed.

I really needed connection pooling and the test file was insanely big 😂 and so i had to 'yak shave' 🪒 to get there.

mysql:
  host: "localhost"
  port: 9307
  user: "root"
  password: "admin"
  pool_size: 3 # Reduced for tests to avoid connection exhaustion
  max_overflow: 2
  charset: "utf8mb4" # Explicit charset for MariaDB compatibility
  collation: "utf8mb4_unicode_ci" # Explicit collation for MariaDB compatibility

Also for mariadb this adds in breaking changes as you must specify the charset and the collation for the connection pool to work. So we'd need to increment a major or minor version etc.

I'm working on fixing the build.

- Tests for removal - Tests for update - README.md updated (added MySQL config and ClickHouse config settings)

* Fixed state save order (prevent skipping schema migration in case of exception) * Support for ALTER table CHANGE column

- Updated `CLAUDE.md` to clarify the usage of the test script for full suite testing. - Enhanced `run_tests.sh` to accept optional pytest parameters, allowing for more flexible test execution. - Introduced a new method in `DataTestMixin` for normalizing datetime comparisons between MySQL and ClickHouse, improving accuracy in test assertions. - Refactored integration tests to ensure proper setup and teardown for MariaDB configurations, addressing known timing issues in replication tests.

…xecution - Introduced a new `PARALLEL_TESTING.md` file detailing the implementation of parallel test execution, achieving significant runtime reduction from 60-90 minutes to 10-15 minutes. - Updated `docker-compose-tests.yaml` to optimize health check parameters for the MySQL service. - Enhanced `pytest.ini` with new markers for parallel-safe and serial-only tests. - Modified `requirements-dev.txt` to include `pytest-xdist` for enabling parallel execution. - Refactored `run_tests.sh` to support parallel execution and CI reporting, allowing for flexible test runs with various options. - Improved test isolation in `conftest.py` to ensure unique database names for each test, preventing conflicts during parallel execution. - Updated integration tests to utilize the new parallel testing framework and ensure proper database context handling.

- Adjusted health check parameters in `docker-compose-tests.yaml` for the MySQL service to improve reliability during testing. - Added `pytest-xdist` version 3.8.0 to `requirements-dev.txt` and `pyproject.toml` to support parallel test execution. - Updated `requirements-dev.txt` to ensure compatibility with the latest testing frameworks. - Refactored test setup in `conftest.py` to ensure unique database names for improved isolation in tests. - Removed obsolete integration test files to streamline the test suite and enhance maintainability.

…el execution - Updated `.gitignore` to include patterns for binlog data directories to prevent clutter. - Enhanced `CLAUDE.md` with detailed testing architecture and critical fixes for parallel execution, including database isolation and connection pooling configurations. - Modified `Dockerfile` to ensure proper permissions for the binlog directory, addressing Docker volume mount issues. - Refactored `run_tests.sh` to support intelligent parallel execution and CI reporting, optimizing test runs. - Implemented critical directory creation logic in `config.py` to ensure binlog directory writability, preventing race conditions during parallel test execution. - Updated various test files to utilize the new `IsolatedBaseReplicationTest` for improved test isolation and reliability. - Cleaned up obsolete files in the `binlog_json_parser` directory to streamline the codebase.

- Deleted `test-report.html` and `test-results.xml` as they are no longer needed. - Updated `.gitignore` to include new patterns for `test-report.html` and `test-results.xml` to prevent future clutter.

- Added `.pytest_cache/` to `.gitignore` to prevent caching files from cluttering the repository.

bakwc · 2025-09-01T15:30:13Z

Could you please split it into several PRs? Like tests separation - first PR, feature1 implementation - second PR, feature2 implementation 3d PR? Currently it's pretty hard to review so big PR.

- Resolved database timing issues by implementing a complete dynamic database isolation system, allowing tests to run safely in parallel. - Enhanced `CLAUDE.md` with detailed descriptions of the new isolation features and centralized configuration management. - Updated `docker-compose-tests.yaml` for improved MySQL service configuration, including health checks and volume management. - Refactored `run_tests.sh` to include pre-test infrastructure monitoring and support for intelligent parallel execution. - Improved test setup in `conftest.py` to ensure unique database names and streamlined cleanup processes. - Removed the obsolete `PARALLEL_TESTING.md` file and integrated its content into existing documentation. - Updated various integration tests to utilize the new isolation framework and ensure proper database context handling.

…liability - Implemented a centralized TestIdManager to resolve subprocess isolation issues, resulting in a 4x improvement in test pass rate (from 18.8% to 69.9%). - Updated CLAUDE.md to reflect the new status and improvements in test infrastructure, including detailed descriptions of recent fixes and enhancements. - Refactored run_tests.sh to streamline test execution and improve performance monitoring. - Enhanced dynamic configuration management to ensure proper isolation and prevent database context issues during parallel execution. - Migrated several integration tests to utilize the enhanced configuration framework, ensuring better reliability and consistency in test results. - Improved error handling and logging in various test files to facilitate debugging and maintainability.

…n and configuration - Refactored `run_tests.sh` to change the phase of infrastructure monitoring to post-startup and removed redundant Docker service startup command. - Updated `rules.mdc` to set `alwaysApply` to false, enhancing configuration management for test execution. - Improved code readability and organization in `converter.py` by standardizing string quotes and optimizing import statements.

… and organization - Expanded .gitignore to include additional log files, environment variables, and editor-specific directories to prevent clutter in the repository. - Updated CLAUDE.md to reflect recent changes in test infrastructure, including detailed descriptions of fixes and enhancements related to test reliability and performance. - Refactored run_tests.sh to improve performance monitoring and streamline test execution processes. - Enhanced comments and documentation throughout the codebase to clarify the purpose and functionality of various components, ensuring better maintainability.

- Marked tasks for improving source code documentation and fixing critical process startup issues as done. - Updated the status of individual failing tests to in-progress. - Refactored test runners in `conftest.py` to use `python3` and absolute paths for better compatibility in container environments. - Added debug logging in `BaseReplicationTest` to improve error handling and visibility during test execution.

- Updated `docker-compose-tests.yaml` to create a named volume for binlog data and ensure proper permissions for the binlog directory. - Improved directory creation logic in `binlog_replicator.py` and `db_replicator.py` to handle missing parent directories more robustly. - Refactored integration tests in `test_basic_process_management.py` and `test_parallel_initial_replication.py` to utilize isolated configurations for better test isolation and reliability. - Updated task status in `tasks.json` to reflect progress in fixing individual failing tests.

- Standardized string formatting across command initialization in `runner.py` for better consistency. - Enhanced the structure of the `DbReplicatorRunner` class by using multi-line arguments for improved readability. - Updated logging messages in both `runner.py` and `utils.py` to use consistent string formatting. - Improved import organization in `runner.py` and `utils.py` for better clarity and maintainability. - Added helper methods in `base_replication_test.py` to streamline replication setup and target database creation in tests.

- Added detailed documentation to the `run` method in `ProcessRunner` to clarify the importance of test isolation during pytest execution. - Implemented critical checks to ensure test ID logic only runs in testing environments, preventing unnecessary warnings in production. - Improved comments to explain the rationale behind the test isolation system and its impact on database operations during parallel test execution.

- Updated CLAUDE.md to reflect current test status: 126 passed, 47 failed, 11 skipped (68.5% pass rate). - Implemented critical fixes for process startup reliability, including increased timeouts and enhanced error diagnostics. - Improved database detection logic to handle temporary and final database transitions more effectively. - Enhanced dynamic isolation features for parallel test execution, ensuring worker-specific database management. - Removed outdated documentation files and consolidated relevant information into existing guides for clarity.

…ocesses - Marked multiple tasks as done in tasks.json, reflecting the completion of test categorization and error handling improvements. - Enhanced directory creation logic in binlog_replicator.py and db_replicator.py to ensure robust handling of parent directories, preventing startup failures. - Improved error diagnostics and logging for directory creation to facilitate better debugging during test execution. - Removed outdated and flaky tests to streamline the test suite and improve overall reliability.

- Added support for mapping the MySQL 'boolean' type to 'Bool' in the converter, improving type handling consistency.

…ltime - Added error handling for OperationalError (Error 1236) to detect binlog index file corruption. - Implemented automatic deletion of the corrupted binlog directory and clean exit for process restart. - Enhanced logging for better diagnostics during recovery attempts.

- Added a new module for handling MySQL binlog corruption (Error 1236) with automatic recovery functionality. - Integrated recovery logic into both DbReplicatorRealtime and BinlogReplicator to streamline error handling and process restart. - Updated .gitignore to exclude the binlog directory instead of files for better management.

… security and clarity - Updated DbReplicator to pass raw primary key values to mysql_api, eliminating manual quote handling for parameterized queries. - Enhanced MySQLApi to use parameterized queries for pagination, preventing SQL injection and improving query safety. - Added detailed logging for query execution and parameters to aid in debugging and error handling.

- Refactored directory creation handling to ensure robust creation of parent directories, preventing potential startup failures. - Enhanced logging for directory creation errors to provide clearer diagnostics during execution. - Cleaned up whitespace for better code readability.

…bReplicator - Enhanced the `recreate_database` method in ClickhouseApi to include retry logic for dropping and creating databases, improving robustness against concurrent operations. - Updated logging to provide clearer insights during database creation and error handling. - Modified DbReplicator to conditionally run real-time replication based on the `initial_only` flag, ensuring better control over replication processes. - Improved logging for replication completion to include execution time, aiding in performance monitoring.

- Updated the bug report for the critical replication issue, clarifying the status and latest findings regarding the infinite loop on the `api_key` table. - Improved logging in the `perform_initial_replication` method to track table processing and error handling, allowing for better diagnostics during replication. - Added exception handling to ensure that individual table failures do not halt the entire replication process, enhancing robustness. - Implemented detailed logging for worker processes, including primary key advancement tracking and iteration counts, to aid in debugging. - Enhanced SQL query logging in MySQLApi to provide better visibility into executed queries and parameters, improving overall error handling.

- Replaced print statements with logging calls in binlog_replicator.py, clickhouse_api.py, and other modules to enhance consistency and debuggability. - Improved error handling in ClickhouseApi to ensure database qualification is always required, preventing UNKNOWN_TABLE errors. - Enhanced logging in DbReplicatorInitial to track worker processes and primary key advancements, providing better diagnostics for replication issues. - Updated MySQLApi to log query results and primary key ranges for improved visibility into data operations. - Streamlined log forwarding from subprocesses to the main logger, ensuring real-time visibility of worker outputs.

- Simplified the initial replication process by removing unnecessary error handling for individual table failures, ensuring all tables are processed without interruption. - Enhanced logging to confirm successful completion of all tables during initial replication, improving visibility into the replication status. - Updated logging configuration in main.py to output to stdout for real-time visibility, addressing previous buffering issues with stderr.

bakwc added 30 commits August 18, 2024 11:22

Update README.md

b3667d5

Fixed bug in handling large string, removed unused config

7c1ab31

Support for one-time data copy

b8e96c5

Updated version

f434c9e

Updated README.md

9246636

More tests and README update

e773914

- Tests for removal - Tests for update - README.md updated (added MySQL config and ClickHouse config settings)

Fixed altering table with back-quoted columns

20d7b67

Support ALTER CHANGE (bakwc#2)

5d824c6

* Fixed state save order (prevent skipping schema migration in case of exception) * Support for ALTER table CHANGE column

Updated version

699e144

Use random server_id instead of the fixed one

93d3bfa

Skip filtered databases (bakwc#4)

78f7052

Updated version

459aaa3

Prevent binlog removal during initial replication (bakwc#5)

7c4ae07

Tables and databases filtering (bakwc#6)

c211c15

Updated version to 0.0.17

8877f7e

Handling text and blob data types, bakwc#3 (bakwc#8)

1a82499

New release

b78ed91

Write logs to file, split by database (bakwc#9)

3fd5d89

Increased batch size

48706c2

Always leave at least 5 last binlog files

d3ff89e

Write exceptions to log files

36c7e47

Added cpu_load metric to db_replicator

5c66d53

New release

96f1ee7

Description of the AWS RDS settings

05b6028

Increased timeouts for CH client

9dd093b

Settings validation (bakwc#11)

bc1ff4d

tinyint type support

0e88cbe

Update version

7ed85d8

Fixed datetime handling (bakwc#12)

94aa2f5

Updated version

a970371

jaredmdobson added 6 commits August 28, 2025 21:54

Remove obsolete test report files and update .gitignore

b80695b

- Deleted `test-report.html` and `test-results.xml` as they are no longer needed. - Updated `.gitignore` to include new patterns for `test-report.html` and `test-results.xml` to prevent future clutter.

Update .gitignore to include .pytest_cache directory

a13fe25

- Added `.pytest_cache/` to `.gitignore` to prevent caching files from cluttering the repository.

jaredmdobson added 20 commits September 2, 2025 11:48

Add log forwarding functionality to ProcessRunner for real-time logging

890aa15

Enhance type mapping in MysqlToClickhouseConverter

5821f7b

- Added support for mapping the MySQL 'boolean' type to 'Bool' in the converter, improving type handling consistency.

Better error message for enum failures

e92f134

Changes

2032e34

jaredmdobson force-pushed the master branch from 02eda87 to 3dd29e7 Compare November 5, 2025 04:21

jaredmdobson added 2 commits November 5, 2025 07:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Connection Pooling, Split Tests, New Tests, Percona db added to tests #179

Connection Pooling, Split Tests, New Tests, Percona db added to tests #179

jaredmdobson commented Aug 28, 2025 •

edited

Loading

Uh oh!

bakwc commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

11 participants

Connection Pooling, Split Tests, New Tests, Percona db added to tests #179

Are you sure you want to change the base?

Connection Pooling, Split Tests, New Tests, Percona db added to tests #179

Conversation

jaredmdobson commented Aug 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bakwc commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

11 participants

jaredmdobson commented Aug 28, 2025 •

edited

Loading