Skip to content

Conversation

@Bueller87
Copy link

@Bueller87 Bueller87 commented Nov 13, 2025

Summary

Adds support for deploying Cadence with PostgreSQL + Elasticsearch + Kafka for advanced visibility. This configuration enables powerful workflow search and filtering capabilities through Elasticsearch while using PostgreSQL as the primary persistence layer.

New Example Configuration: values.postgres-es7.yaml

This branch introduces a complete, production-ready example for deploying Cadence with:

  • PostgreSQL - Primary persistence for workflow state and history
  • Elasticsearch 7 - Advanced visibility with rich search capabilities
  • Kafka - Asynchronous visibility event processing for better performance
  • Kafka topic auto-provisioning - Eliminates manual setup and race conditions

Testing

  • Clean deployment with all pods healthy (no scheduling failures)
  • Kafka topics auto-provisioned with correct replication factors
  • Workflows visible in Cadence Web UI immediately after execution
  • No visibility-related errors in service logs
  • Repeatable deployments (tested multiple tear-down/redeploy cycles)

Caveats

  1. Single-broker Kafka: Configured for development/testing with replicationFactor: 1. For production, increase Kafka broker count and replication factors to 3+.
  2. Resource requirements: Deploys 9+ pods (Cadence services + PostgreSQL + Elasticsearch + Kafka). Ensure your cluster has adequate resources.
  3. Fresh deployments recommended: If upgrading from a previous configuration, a clean redeploy is recommended to ensure Kafka topics are properly provisioned.

Checklist

  • [ x] Chart.yaml: Chart version bumped
  • [ x] values.yaml: global.image.tag (should match with Chart.yaml appVersion)
  • [ x] Dependencies in Chart.yaml (check for updates with helm dependency update, aim for latest major-1 stable)
  • [ x] DCO signed
  • [ x] Test deployment locally
  • [ x] Run helm-docs to update documentation

shijiesheng and others added 11 commits November 14, 2025 10:52
)

Description of this PR
schema upgrade job should fail open on db creation to handle already exist errors
add example of cadence with postgres
Local Test

namespace and release name is still fixed unfortunately.

cd charts/cadence
helm install cadence-release . -n cadence-postgres --values examples/values.postgres.yaml

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
…dence-workflow#57)

* docs: Convert slack links to CNCF, add link to contributing guide

Signed-off-by: vpatil16@ext.uber.com <vpatil16@ext.uber.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: CosminL-DEV <82363564+CosminL-DEV@users.noreply.github.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
When using Elasticsearch for visibility, the PostgreSQL visibility
database is not needed. Removed visibilityDbname and initdb script
that were creating an unused database.

Also updated dynamic config to use es instead of es-visibility
to match Cadence's internal visibility store naming convention.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
The schema job should only create the SQL/Cassandra visibility
database when Elasticsearch is disabled. When using Elasticsearch
for advanced visibility, the SQL visibility database is not needed.

Changed condition from checking if visibilityDbname is set to
checking if ES_ENABLED is false, allowing ES-only deployments
to skip unnecessary database creation.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Changed visibilityIndex from 'cadence-visibility' to 'cadence-visibility-v1'
to match the Elasticsearch template pattern 'cadence-visibility-*'. This
ensures the template with proper keyword field mappings is applied to the
index, fixing visibility queries that require exact term matching on fields
like DomainID.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Set visibilityStore to empty string to prevent dual-write mode when using
Elasticsearch-only advanced visibility. This fixes 'Operation is not supported'
errors when Cadence tries to write to a non-existent SQL visibility database.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
The visibility store name in dynamic config should be 'es' (a special keyword
for advanced visibility), not the datastore name 'es-visibility'. This matches
the Cadence upstream configuration pattern where advancedVisibilityStore is
'es-visibility' but system.writeVisibilityStoreName is 'es'.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Removed advancedVisibilityStore from persistence config to enable Kafka-based
async visibility mode. When advancedVisibilityStore is set, Cadence tries to
use dual-write mode (SQL + ES) which fails without a SQL visibility database.
With Kafka mode, the history service writes to Kafka, and the worker service
reads from Kafka and writes to ES.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Must explicitly override the default value of 'es-visibility' with an empty
string to disable direct ES writes and enable Kafka-based async visibility.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Cadence requires at least one of visibilityStore or advancedVisibilityStore to
be set. We keep advancedVisibilityStore='es-visibility' to define the datastore,
and use dynamic config system.writeVisibilityStoreName='es' to enable Kafka-based
async visibility instead of direct ES writes.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Added Kafka provisioning configuration to pre-create required topics:
- __consumer_offsets: Internal Kafka topic for consumer offset tracking
- cadence-visibility: Main visibility events topic
- cadence-visibility-dlq: Dead letter queue for failed visibility events

This prevents race conditions where topics get auto-created with incorrect
replication factors before the broker config is fully loaded. All topics are
created with replication-factor: 1 to match our single-broker setup.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
numHistoryShards: {{ .Values.config.persistence.numHistoryShards | default 4 }}
defaultStore: {{ .Values.config.persistence.defaultStore | default "default" | quote }}
{{- if and .Values.config.persistence.visibilityStore (not .Values.config.persistence.elasticsearch.enabled) }}
{{- if .Values.config.persistence.visibilityStore }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do need this additional check (not .Values.config.persistence.elasticsearch.enabled)

Comment on lines -222 to +224
{{- if ne .Values.config.persistence.elasticsearch.enabled true }}
{{- if .Values.config.persistence.visibilityStore }}
# Visibility datastore
{{ if .Values.config.persistence.visibilityStore -}}
{{ .Values.config.persistence.visibilityStore | default "visibility" }}:
{{ .Values.config.persistence.visibilityStore }}:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no issue with the existing checkers. basically it says set vibility table in primary storage if advanced storage is not enabled. Default value of "visibility" also makes sense to minimize the visibility

Copy link
Author

@Bueller87 Bueller87 Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be confused as I don't know exactly how the code works. But I found if .Values.config.persistence.visibilityStore is set to anything. Cadence will write to that store and not the advancedVisibilityStore. So I changed the logic a little, but I should test that I didn't break the plain postgres example values.

Bueller87 and others added 2 commits November 20, 2025 13:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants