|
| 1 | +.. _mysql57_support: |
| 2 | + |
| 3 | +MySQL 5.7, MySQL 8.0+ and `use_column_name_cache` |
| 4 | +================================================== |
| 5 | + |
| 6 | +In MySQL 5.7 and earlier, the binary log events for row-based replication do not include column name metadata. This means that `python-mysql-replication` cannot map column values to their names directly from the binlog event. |
| 7 | + |
| 8 | +MySQL 8.0.1 introduced the `binlog_row_metadata` system variable, which controls how much table metadata is written to the binary log. Its default value, `MINIMAL`, omits column names, so the behavior matches MySQL 5.7. |
| 9 | + |
| 10 | +The Problem |
| 11 | +----------- |
| 12 | + |
| 13 | +When column metadata is not present in the binlog (as in MySQL 5.7 and earlier, or when `binlog_row_metadata` is set to `MINIMAL` in MySQL 8.0+), the `values` dictionary in a `WriteRowsEvent`, `UpdateRowsEvent`, or `DeleteRowsEvent` will contain integer keys corresponding to the column index, not the column names. |
| 14 | + |
| 15 | +For example, for a table `users` with columns `id` and `name`, an insert event might look like this: |
| 16 | + |
| 17 | +.. code-block:: python |
| 18 | + |
| 19 | +    {0: 1, 1: 'John Doe'} |
| 20 | + |
| 21 | +This makes replication logic harder to write and maintain, because your code must track the column order of every table it processes. |
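To make the burden concrete, here is a minimal sketch of the manual mapping that index-keyed rows force on the consumer. The `USERS_COLUMNS` list and the `name_values` helper are hypothetical names introduced for illustration; they are not part of `python-mysql-replication`.

```python
# Hypothetical column order for the example `users` table. It must be kept
# in sync with the table definition on the server by hand.
USERS_COLUMNS = ["id", "name"]


def name_values(values, column_names):
    """Return a copy of an index-keyed row dict re-keyed by column name."""
    return {column_names[index]: value for index, value in values.items()}


row = {0: 1, 1: "John Doe"}
named_row = name_values(row, USERS_COLUMNS)
print(named_row)
```

Every table needs its own hand-maintained list like this one, which is the bookkeeping the rest of this page aims to avoid.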
| 22 | + |
| 23 | +The Solution: `use_column_name_cache` |
| 24 | +------------------------------------- |
| 25 | + |
| 26 | +To address this, `python-mysql-replication` provides the `use_column_name_cache` parameter for the `BinLogStreamReader`. |
| 27 | + |
| 28 | +When you set `use_column_name_cache=True`, the library queries `INFORMATION_SCHEMA.COLUMNS` for a table's column names the first time it encounters an event for that table. The names are then cached in memory and reused for subsequent events on the same table, avoiding redundant queries. |
| 29 | + |
| 30 | +This allows you to receive row data with column names as keys. |
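The lookup-once-then-reuse behavior described above can be sketched as follows. This is an illustration of the caching idea only, not the library's actual implementation: `ColumnNameCache`, `fetch_columns`, and `fake_fetch` are hypothetical names, and the SQL in the comment is an assumption about what such a query might look like.

```python
class ColumnNameCache:
    """Illustrative lazy cache: one lookup per (schema, table), then reuse."""

    def __init__(self, fetch_columns):
        # fetch_columns(schema, table) -> list of column names.
        # Hypothetical callable standing in for a query such as:
        #   SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS
        #   WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s
        #   ORDER BY ORDINAL_POSITION
        self._fetch_columns = fetch_columns
        self._cache = {}

    def get(self, schema, table):
        key = (schema, table)
        if key not in self._cache:
            # First event for this table: do the (expensive) lookup once.
            self._cache[key] = self._fetch_columns(schema, table)
        return self._cache[key]


lookups = []  # records every call that reaches the "database"


def fake_fetch(schema, table):
    lookups.append((schema, table))
    return ["id", "name"]


cache = ColumnNameCache(fake_fetch)
first = cache.get("app", "users")
second = cache.get("app", "users")  # served from memory, no second lookup
```

In this sketch the fake fetcher is called once even though the table is requested twice, which mirrors the "first time it encounters an event for that table" wording above.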
| 31 | + |
| 32 | +MySQL 8.0+ with `binlog_row_metadata=FULL` |
| 33 | +------------------------------------------ |
| 34 | + |
| 35 | +In MySQL 8.0.1 and later, you can set `binlog_row_metadata` to `FULL`. When this setting is enabled, the column names are included directly in the binlog events, and `use_column_name_cache` is not necessary. |
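If you are unsure which mode a server runs in, one option is to ask it before deciding whether to enable the cache. The sketch below assumes a DB-API style cursor (for example from PyMySQL) that returns plain tuples; `binlog_has_column_names` and `FakeCursor` are hypothetical names used only for this illustration.

```python
def binlog_has_column_names(cursor):
    """Return True if the server reports binlog_row_metadata=FULL."""
    cursor.execute("SHOW GLOBAL VARIABLES LIKE 'binlog_row_metadata'")
    row = cursor.fetchone()
    # Servers that predate the variable (MySQL 5.7 and earlier) return no row.
    # Assumes a tuple cursor yielding (Variable_name, Value).
    return row is not None and str(row[1]).upper() == "FULL"


class FakeCursor:
    """Stand-in for a real database cursor so the sketch runs offline."""

    def __init__(self, row):
        self._row = row

    def execute(self, sql):
        self.last_sql = sql

    def fetchone(self):
        return self._row


full = binlog_has_column_names(FakeCursor(("binlog_row_metadata", "FULL")))
minimal = binlog_has_column_names(FakeCursor(("binlog_row_metadata", "MINIMAL")))
legacy = binlog_has_column_names(FakeCursor(None))  # variable not present
```

With a real connection, the result could feed the reader directly, for example `use_column_name_cache=not binlog_has_column_names(cursor)`.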
| 36 | + |
| 37 | +Example |
| 38 | +------- |
| 39 | + |
| 40 | +Here is how to enable the column name cache when needed: |
| 41 | + |
| 42 | +.. code-block:: python |
| 43 | + |
| 44 | +    from pymysqlreplication import BinLogStreamReader |
| 45 | +    from pymysqlreplication.row_event import WriteRowsEvent |
| 46 | +    mysql_settings = {'host': '127.0.0.1', 'port': 3306, 'user': 'root', 'passwd': ''} |
| 47 | + |
| 48 | +    # Enable the column name cache for MySQL 5.7 or MySQL 8.0+ with binlog_row_metadata=MINIMAL |
| 49 | +    stream = BinLogStreamReader( |
| 50 | +        connection_settings=mysql_settings, |
| 51 | +        server_id=100, |
| 52 | +        use_column_name_cache=True, |
| 53 | +    ) |
| 54 | + |
| 55 | +    for binlogevent in stream: |
| 56 | +        if isinstance(binlogevent, WriteRowsEvent): |
| 57 | +            # Now you can access values by column name |
| 58 | +            user_id = binlogevent.rows[0]["values"]["id"] |
| 59 | +            user_name = binlogevent.rows[0]["values"]["name"] |
| 60 | +            print(f"New user: id={user_id}, name={user_name}") |
| 61 | + |
| 62 | +    stream.close() |
| 63 | + |
| 64 | +Important Considerations |
| 65 | +------------------------ |
| 66 | + |
| 67 | +* **Performance:** Enabling `use_column_name_cache` adds one extra query to the database for each table, issued the first time that table appears in the binlog. The results are cached, so the performance impact should be minimal after the initial query for each table. |
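| 68 | +* **Permissions:** The MySQL user used for replication must be able to read the relevant rows of `INFORMATION_SCHEMA.COLUMNS`. MySQL does not grant privileges on `INFORMATION_SCHEMA` itself; a user sees only the rows for objects on which it holds some privilege, so the account needs a privilege such as `SELECT` on the replicated tables. |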
| 69 | +* **Default Behavior:** This feature is disabled by default (`use_column_name_cache=False`) to maintain backward compatibility and to avoid making extra queries unless explicitly requested. |