hantmac (Contributor) commented Nov 3, 2025

Add Databend connector.

Summary by Sourcery

Introduce a new Databend connector plugin for Trino, enabling JDBC-based connectivity, query pushdown, type mapping, and metadata operations against Databend databases.

New Features:

  • Add trino-databend plugin with DatabendClient, connector module, and native SQL table function support.

Enhancements:

  • Implement Databend-specific metadata handling (schemas, tables, engine and table properties), type mappings, SQL operations, and pushdown capabilities.

Build:

  • Integrate plugin/trino-databend module into Maven pom, server provisioning, and product-tests configuration.

CI:

  • Extend GitHub Actions workflows to include Databend plugin in build and test matrix.

Documentation:

  • Update connector index and add comprehensive Sphinx documentation for the Databend connector.

Tests:

  • Add unit and integration tests using TestingDatabendServer with Testcontainers and connector behavior tests.

@cla-bot cla-bot bot added the cla-signed label Nov 3, 2025
sourcery-ai bot commented Nov 3, 2025

Reviewer's Guide

This PR introduces a new Databend connector by adding a dedicated plugin module, wiring it into the build and CI pipelines, implementing the JDBC-based client with Databend-specific dialect support, and providing comprehensive documentation and test coverage.

Class diagram for new Databend connector types

classDiagram
  class DatabendPlugin {
    +DatabendPlugin()
  }
  class DatabendClientModule {
    +setup(Binder)
    +connectionFactory(BaseJdbcConfig, CredentialProvider, OpenTelemetry)
  }
  class DatabendClient {
    +DatabendClient(...)
    +topNFunction()
    +isTopNGuaranteed(ConnectorSession)
    +implementJoin(...)
    +getTables(...)
    +quoted(...)
    +copyTableSchema(...)
    +listSchemas(Connection)
    +getTableComment(ResultSet)
    +createTableSqls(...)
    +getTableProperties(...)
    +setTableProperties(...)
    +getColumnDefinitionSql(...)
    +createSchema(...)
    +dropSchema(...)
    +renameSchema(...)
    +addColumn(...)
    +setTableComment(...)
    +setColumnType(...)
    +dropNotNullConstraint(...)
    +getTableTypes()
    +renameTable(...)
    +limitFunction()
    +isLimitGuaranteed(ConnectorSession)
    +toColumnMapping(...)
    +toWriteMapping(...)
  }
  class DatabendTableProperties {
    +DatabendTableProperties()
    +getTableProperties()
    +static getEngine(Map)
    +static getOrderBy(Map)
  }
  class DatabendConfig {
    +getConnectionTimeout()
    +setConnectionTimeout(Duration)
  }
  class DatabendEngineType {
    +getEngineType()
    <<enum>>
  }
  class DatabendUtil {
    +convertToQuotedString(Object)
    +escape(String, char)
    <<static>>
  }
  DatabendPlugin --|> JdbcPlugin
  DatabendClientModule --|> AbstractConfigurationAwareModule
  DatabendClient --|> BaseJdbcClient
  DatabendTableProperties --|> TablePropertiesProvider
  DatabendClientModule --> DatabendClient
  DatabendClient --> DatabendConfig
  DatabendClient --> DatabendTableProperties
  DatabendTableProperties --> DatabendEngineType
  DatabendClient --> DatabendUtil

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Integrate Databend plugin into build and CI | • Register artifactSet for databend in trino.xml<br>• Add trino-databend module to root pom.xml<br>• Update CI workflow to skip/include trino-databend jobs | `core/trino-server/src/main/provisio/trino.xml`<br>`.github/workflows/ci.yml`<br>`pom.xml` |
| Implement plugin wiring and configuration | • Introduce DatabendPlugin extending JdbcPlugin<br>• Add DatabendClientModule to bind JDBC client and connection factory<br>• Define DatabendConfig, DatabendEngineType and TablePropertiesProvider | `plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendPlugin.java`<br>`plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendClientModule.java`<br>`plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendConfig.java`<br>`plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendEngineType.java`<br>`plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendTableProperties.java` |
| Implement DatabendClient with Databend-specific SQL dialect | • Extend BaseJdbcClient and override metadata and DDL operations<br>• Implement custom SQL generation (LIMIT, TOP N, joins, schema/table handling)<br>• Define type and write mappings for Databend types | `plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendClient.java` |
| Add documentation for the Databend connector | • Register Databend in connector list<br>• Add detailed connector guide in docs<br>• Provide product-tests environment properties | `docs/src/main/sphinx/connector.md`<br>`docs/src/main/sphinx/connector/databend.md`<br>`testing/trino-product-tests-launcher/src/main/resources/docker/trino-product-tests/conf/environment/multinode-all/databend.properties` |
| Provide end-to-end test infrastructure and tests | • Create TestingDatabendServer with Testcontainers for Databend and MinIO<br>• Implement DatabendQueryRunner for integration tests<br>• Add connector, type-mapping, and plugin unit tests | `plugin/trino-databend/src/test/java/io/trino/plugin/databend/TestingDatabendServer.java`<br>`plugin/trino-databend/src/test/java/io/trino/plugin/databend/DatabendQueryRunner.java`<br>`plugin/trino-databend/src/test/java/io/trino/plugin/databend/TestDatabendConnectorTest.java`<br>`plugin/trino-databend/src/test/java/io/trino/plugin/databend/TestDatabendTypeMapping.java`<br>`plugin/trino-databend/src/test/java/io/trino/plugin/databend/TestDatabendPlugin.java` |
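
For reviewers unfamiliar with the LIMIT hook mentioned under "custom SQL generation", here is a minimal standalone sketch of its shape. In Trino's `BaseJdbcClient`, `limitFunction()` returns an optional function that rewrites the generated SQL. The class name `LimitPushdownSketch` and the helper `applyLimit` are hypothetical and exist only for illustration; the PR's actual `DatabendClient` implementation is not shown in this conversation and may differ.

```java
import java.util.Optional;
import java.util.function.BiFunction;

// Standalone sketch: mirrors the shape of a JDBC client's limit hook without depending on Trino.
final class LimitPushdownSketch
{
    private LimitPushdownSketch() {}

    // Returns a function that appends a LIMIT clause to an already-generated query.
    static Optional<BiFunction<String, Long, String>> limitFunction()
    {
        return Optional.of((sql, limit) -> sql + " LIMIT " + limit);
    }

    // Hypothetical helper: applies the hook when the dialect provides one,
    // otherwise returns the query unchanged.
    static String applyLimit(String sql, long limit)
    {
        return limitFunction()
                .map(function -> function.apply(sql, limit))
                .orElse(sql);
    }
}
```

Returning `Optional.empty()` from the hook would disable LIMIT pushdown for the dialect, which is why the hook is optional rather than mandatory.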

@github-actions github-actions bot added the docs label Nov 3, 2025
@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The override of setTableProperties currently emits an empty ALTER TABLE MODIFY (no-op)—either implement engine/order_by modifications or remove it to fail fast on unsupported operations.
  • getTableProperties only reads the engine field but ignores the configured order_by property; update it to return all declared tableProperties to keep metadata in sync.
  • DatabendClient duplicates a lot of JDBC pushdown/join/limit logic seen in other connectors—consider refactoring shared behavior into BaseJdbcClient utilities to reduce boilerplate.
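
To illustrate the second point, one possible way to keep the reported metadata in sync is to build the property map from every declared property rather than from `engine` alone. This is only a sketch under assumptions: `TablePropertiesSketch`, `toTableProperties`, and the `ORDER_BY_PROPERTY` constant are hypothetical names, and how the values are read from Databend metadata is not covered here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Standalone sketch: assembles all declared table properties, not just the engine.
final class TablePropertiesSketch
{
    static final String ENGINE_PROPERTY = "engine";
    // Hypothetical constant; the PR uses the literal "order_by".
    static final String ORDER_BY_PROPERTY = "order_by";

    private TablePropertiesSketch() {}

    // Includes a property only when a value is actually known for it.
    static Map<String, Object> toTableProperties(Optional<String> engine, List<String> orderBy)
    {
        Map<String, Object> properties = new HashMap<>();
        engine.ifPresent(value -> properties.put(ENGINE_PROPERTY, value));
        if (!orderBy.isEmpty()) {
            properties.put(ORDER_BY_PROPERTY, List.copyOf(orderBy));
        }
        return Map.copyOf(properties);
    }
}
```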
## Individual Comments

### Comment 1
<location> `plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendClient.java:142-151` </location>
<code_context>
+        }
+    }
+
+    @Override
+    public void setTableProperties(ConnectorSession session, JdbcTableHandle handle, Map<String, Optional<Object>> nullableProperties)
+    {
+        checkArgument(nullableProperties.values().stream().noneMatch(Optional::isEmpty), "Setting a property to null is not supported");
</code_context>

<issue_to_address>
**issue (bug_risk):** setTableProperties does not actually set any properties.

Currently, setTableProperties does not modify any table properties, which could mislead users. Please implement property modification logic or throw an exception to indicate this operation is unsupported.
</issue_to_address>
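
A minimal sketch of the fail-fast option, assuming the connector chooses not to support property changes at all. It is written against plain JDK types so it runs standalone. `SetTablePropertiesSketch` and its method signature are hypothetical; in the connector itself the exception would more likely be a `TrinoException` with the `NOT_SUPPORTED` error code.

```java
import java.util.Map;
import java.util.Optional;

// Standalone sketch: rejects property changes instead of emitting a no-op statement.
final class SetTablePropertiesSketch
{
    private SetTablePropertiesSketch() {}

    // Always throws, so callers learn immediately that the operation is unsupported.
    // UnsupportedOperationException stands in for Trino's NOT_SUPPORTED error here.
    static void setTableProperties(String tableName, Map<String, Optional<Object>> properties)
    {
        throw new UnsupportedOperationException(
                "Setting table properties " + properties.keySet() + " is not supported for table " + tableName);
    }
}
```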

### Comment 2
<location> `plugin/trino-databend/src/main/java/io/trino/plugin/databend/DatabendTableProperties.java:65-71` </location>
<code_context>
+        return (DatabendEngineType) tableProperties.get(ENGINE_PROPERTY);
+    }
+
+    public static List<String> getOrderBy(Map<String, Object> tableProperties)
+    {
+        requireNonNull(tableProperties, "tableProperties is null");
+        @SuppressWarnings("unchecked")
+        List<String> orderBy = (List<String>) tableProperties.get("order_by");
+        return orderBy;
</code_context>

<issue_to_address>
**suggestion:** getOrderBy does not handle missing or null property gracefully.

Returning null can cause runtime errors if not handled. Returning an empty list would be safer and prevent NullPointerExceptions.

```suggestion
    public static List<String> getOrderBy(Map<String, Object> tableProperties)
    {
        requireNonNull(tableProperties, "tableProperties is null");
        @SuppressWarnings("unchecked")
        List<String> orderBy = (List<String>) tableProperties.get("order_by");
        if (orderBy == null) {
            return Collections.emptyList();
        }
        return orderBy;
    }
```
</issue_to_address>

### Comment 3
<location> `plugin/trino-databend/src/test/java/io/trino/plugin/databend/TestDatabendTypeMapping.java:45-46` </location>
<code_context>
+        };
+    }
+
+    @Test
+    @Override
+    public void testShowColumns()
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests for boolean type edge cases (nulls, invalid values).

Please include test cases for null and invalid boolean values to verify correct handling by the connector.
</issue_to_address>


@hantmac hantmac force-pushed the feat/support-databend-trino branch from 12d1732 to e0ec291 November 5, 2025 15:15
@github-actions github-actions bot added the release-notes, ui, jdbc, hudi, iceberg, delta-lake, hive, bigquery, mongodb, snowflake, cassandra, blackhole, clickhouse, druid, duckdb, elasticsearch, exasol, faker, google-sheets, ignite, kafka, loki, mariadb, and memory labels Nov 5, 2025
@hantmac hantmac removed the duckdb, elasticsearch, exasol, faker, google-sheets, ignite, kafka, loki, mariadb, memory, mysql, opensearch, oracle, pinot, postgresql, prometheus, redis, redshift, singlestore, sqlserver, vertica, and lakehouse labels Nov 6, 2025
hantmac (Contributor, Author) commented Nov 7, 2025

Hi @ebyhr, sorry to bother you. I have a question: `mvn test` for the Databend connector plugin passes on my local machine, so why does it fail in GitHub CI? (Screenshot attached.)
