-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Description
Bug report
Summary
Grafting fails during initialization with error "Unexpected null for non-null column" when the source subgraph contains tables with very few rows (typically < 10 rows). This occurs because PostgreSQL does not generate histogram_bounds statistics for tables with insufficient data, but the graph-node code assumes histogram_bounds is never NULL.
Steps to Reproduce
- Start syncing subgraph that relies on graft that has some entities with very few instances (e.g., 1-3 rows). For instance,
QmPGNM9A34PYrCpiEjcuQq6TJuB9PvNcqGU1hKa7oWERpZand its graftQmVXHLm8YBTxabrA4LG3LVnXxpB8mtVJa9KdAYrCtYE6cF. - Observe the error during grafting:
Unexpected null for non-null column
Expected Behavior
Grafting should succeed regardless of the number of rows in source tables. PostgreSQL legitimately returns NULL for histogram_bounds when tables have insufficient data for statistical analysis, and the code should handle this gracefully.
Actual Behavior
Grafting fails immediately after obtaining the copy lock.
Root Cause Analysis
The Bug Location
The bug is in store/postgres/src/catalog.rs in the histogram_bounds() function:
#[derive(Queryable, QueryableByName)]
struct Bounds {
#[diesel(sql_type = Array<BigInt>)]
bounds: Vec<i64>, // ❌ Non-nullable
}
sql_query(QUERY)
.bind::<Text, _>(namespace.as_str())
.bind::<Text, _>(table.as_str())
.bind::<Text, _>(column)
.get_result::<Bounds>(conn)
.optional()
.map(|bounds| bounds.map(|b| b.bounds).unwrap_or_default())
.map_err(StoreError::from)Why It Fails
-
PostgreSQL Behavior: PostgreSQL's
pg_stats.histogram_boundsis NULL for tables with very few rows (typically < 100 rows). This is documented behavior - histograms are only created when there's enough data to make them statistically meaningful. -
Code Assumption: The
Boundsstruct definesbounds: Vec<i64>(non-nullable), but PostgreSQL can legitimately return NULL. -
Diesel Deserialization: When Diesel tries to deserialize a NULL
histogram_boundsvalue into the non-nullableVec<i64>, it fails with "Unexpected null for non-null column". -
.optional()Limitation: The.optional()call only handles missing rows (when there's nopg_statsentry), not NULL column values within existing rows.
When This Occurs
This bug is triggered when:
- The source subgraph has tables with < ~100 rows
- These tables exist in both source and destination schemas (so they will be copied)
- The table is marked for copying during grafting initialization
Example Case
In my investigation:
- Table
sgd129.initializedhad exactly 1 row - PostgreSQL correctly returned NULL for histogram_bounds:
SELECT histogram_bounds FROM pg_stats WHERE schemaname = 'sgd129' AND tablename = 'initialized' AND attname = 'vid'; -- Result: NULL
- Graph-node code failed during
VidBatcher::load()→catalog::histogram_bounds()
The Fix
Change the bounds field to be nullable and handle the None case:
#[derive(Queryable, QueryableByName)]
struct Bounds {
#[diesel(sql_type = Nullable<Array<BigInt>>)]
bounds: Option<Vec<i64>>, // ✅ Nullable
}Workaround
Until this is fixed, I was able to work around the issue by temporarily adding dummy data to the table. Grafting will not copy them, because dummy rows are added beyond graft point.
-- Add dummy rows with block$ AFTER the graft point (so they won't be copied)
INSERT INTO sgd129.small_table (block$, id, ...)
SELECT
(graft_block + 1000000) as block$,
-- ... other required fields
FROM generate_series(1, 99) s;
-- Analyze to generate histogram
ANALYZE sgd129.small_table;
-- After grafting succeeds, remove dummy data
DELETE FROM sgd129.small_table WHERE block$ > graft_block;Environment
- Graph Node version: 0.41.1
- PostgreSQL version: 18.1
- Deployment method: Docker
Relevant log output
Dec 03 11:06:39.081 INFO Obtaining copy lock (this might take a long time if another process is still copying), dst: sgd146, runner_index: 35, sgd: 146, subgraph_id: QmPGNM9A34PYrCpiEjcuQq6TJuB9PvNcqGU1hKa7oWERpZ, component: SubgraphInstanceManager
Dec 03 11:06:39.281 ERRO Failed to start subgraph, code: SubgraphStartFailure, error: store error: Unexpected null for non-null column, runner_index: 35, sgd: 146, subgraph_id: QmPGNM9A34PYrCpiEjcuQq6TJuB9PvNcqGU1hKa7oWERpZ, component: SubgraphInstanceManagerIPFS hash
QmPGNM9A34PYrCpiEjcuQq6TJuB9PvNcqGU1hKa7oWERpZ
Subgraph name or link to explorer
Some information to help us out
- Tick this box if this bug is caused by a regression found in the latest release.
- Tick this box if this bug is specific to the hosted service.
- I have searched the issue tracker to make sure this issue is not a duplicate.
OS information
None