@kartalbas

Summary

This PR adds support for retrieving column names on MySQL 5.7, which does not write column names to the binary log.

What Changed

  • Added new option use_column_name_cache to enable column name fetching
  • Fetches column names from INFORMATION_SCHEMA.COLUMNS when binlog metadata is missing
  • Uses in-memory caching to avoid repeated database queries
  • Only runs when explicitly enabled (opt-in feature)

Why This Is Needed

  • MySQL 5.7 does not support the binlog_row_metadata=FULL setting, which was added in MySQL 8.0
  • Without this feature, column names are unavailable in binlog events
  • Many legacy applications still run on MySQL 5.7 and cannot upgrade due to operational constraints, infrastructure dependencies, or business requirements
  • This feature enables these applications to continue using mysql-replication without requiring a database upgrade
  • Applications need column names to properly process row change events

How It Works

  • When use_column_name_cache=True is set, the library queries INFORMATION_SCHEMA for column names
  • Results are cached to improve performance
  • Falls back gracefully if the query fails
  • Default behavior remains unchanged (feature is disabled by default)

Testing

Tested with MySQL 5.7 databases where binlog metadata is not available.

- Query INFORMATION_SCHEMA for column names when not in binlog
- Module-level cache to prevent repeated queries
- Opt-in via use_column_name_cache parameter
- Handle both dict and tuple cursor types
- Backward compatible (disabled by default)
@julien-duponchelle
Owner

Sorry, you have a failure in the tests:
pkt = <pymysql.protocol.MysqlPacket object at 0x00007f9b9fdcd2f0>

def create_binlog_packet_wrapper(pkt):
    return BinLogPacketWrapper(
        pkt,
        self.stream.table_map,
        self.stream._ctl_connection,
        self.stream.mysql_version,
        self.stream._BinLogStreamReader__use_checksum,
        self.stream._BinLogStreamReader__allowed_events_in_packet,
        self.stream._BinLogStreamReader__only_tables,
        self.stream._BinLogStreamReader__ignored_tables,
        self.stream._BinLogStreamReader__only_schemas,
        self.stream._BinLogStreamReader__ignored_schemas,
        self.stream._BinLogStreamReader__freeze_schema,
        self.stream._BinLogStreamReader__ignore_decode_errors,
        self.stream._BinLogStreamReader__verify_checksum,
        self.stream._BinLogStreamReader__optional_meta_data,
    )

E TypeError: __init__() missing 1 required positional argument: 'enable_logging'

test_basic.py:621: TypeError

(I know it's not easy to run the test :( )
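
The traceback does not show why the argument is missing. One common way to hit this error is a constructor that gains a new positional parameter ahead of an existing one, so callers that pass arguments by position come up one short. The sketch below illustrates that general pattern with hypothetical classes (WrapperAfter, WrapperCompatible) and a shortened parameter list; it is not the real BinLogPacketWrapper signature and may not be the actual cause here.

```python
class WrapperAfter:
    """Hypothetical: a new parameter is inserted before enable_logging."""

    def __init__(self, packet, optional_meta_data, use_column_name_cache,
                 enable_logging):
        self.packet = packet
        self.optional_meta_data = optional_meta_data
        self.use_column_name_cache = use_column_name_cache
        self.enable_logging = enable_logging


class WrapperCompatible:
    """Hypothetical: the new parameter goes last and has a default."""

    def __init__(self, packet, optional_meta_data, enable_logging,
                 use_column_name_cache=False):
        self.packet = packet
        self.optional_meta_data = optional_meta_data
        self.enable_logging = enable_logging
        self.use_column_name_cache = use_column_name_cache


def call_like_existing_test(wrapper_cls):
    """Call the constructor with three positional arguments, as an older caller would."""
    return wrapper_cls("pkt", False, True)
```

With WrapperAfter, call_like_existing_test raises a TypeError naming enable_logging, because the third positional value lands in the new parameter. With WrapperCompatible the same call succeeds and the new option keeps its default.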
