4 changes: 3 additions & 1 deletion docs/_include/card/timeseries-datashader.md
@@ -10,7 +10,9 @@ points from your backend systems to the browser's glass.
This notebook plots the venerable NYC Taxi dataset after importing it
into a CrateDB Cloud database cluster.

🚧 _Please note this notebook is a work in progress._ 🚧
```{todo}
🚧 This notebook is a work in progress. 🚧
```

{{ '{}[cloud-datashader-github]'.format(nb_github) }} {{ '{}[cloud-datashader-colab]'.format(nb_colab) }}
:::
8 changes: 6 additions & 2 deletions docs/_include/links.md
@@ -27,6 +27,7 @@
[HNSW paper]: https://arxiv.org/pdf/1603.09320
[HoloViews]: https://www.holoviews.org/
[Indexing, Columnar Storage, and Aggregations]: https://cratedb.com/product/features/indexing-columnar-storage-aggregations
[InfluxDB]: https://github.com/influxdata/influxdb
[inverted index]: https://en.wikipedia.org/wiki/Inverted_index
[JOIN]: inv:crate-reference#sql_joins
[JSON Database]: https://cratedb.com/solutions/json-database
@@ -38,9 +39,12 @@
[langchain-rag-sql-binder]: https://mybinder.org/v2/gh/crate/cratedb-examples/main?labpath=topic%2Fmachine-learning%2Flangchain%2Fcratedb-vectorstore-rag-openai-sql.ipynb
[langchain-rag-sql-colab]: https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/langchain/cratedb-vectorstore-rag-openai-sql.ipynb
[langchain-rag-sql-github]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/langchain/cratedb-vectorstore-rag-openai-sql.ipynb
[MongoDB CDC Relay]: https://cratedb-toolkit.readthedocs.io/io/mongodb/cdc.html
[MongoDB]: https://www.mongodb.com/docs/manual/
[MongoDB Atlas]: https://www.mongodb.com/docs/atlas/
[MongoDB CDC Relay]: inv:ctk:*:label#mongodb-cdc-relay
[MongoDB Change Streams]: https://www.mongodb.com/docs/manual/changeStreams/
[MongoDB Table Loader]: https://cratedb-toolkit.readthedocs.io/io/mongodb/loader.html
[MongoDB collections and databases]: https://www.mongodb.com/docs/php-library/current/databases-collections/
[MongoDB Table Loader]: inv:ctk:*:label#mongodb-loader
[Multi-model Database]: https://cratedb.com/solutions/multi-model-database
[nearest neighbor search]: https://en.wikipedia.org/wiki/Nearest_neighbor_search
[Nested Data Structure]: https://cratedb.com/product/features/nested-data-structure
2 changes: 2 additions & 0 deletions docs/conf.py
@@ -57,6 +57,8 @@
r"https://www.tableau.com/",
# Read timed out. (read timeout=15)
r"https://kubernetes.io/",
# Connection to renenyffenegger.ch timed out.
r"https://renenyffenegger.ch",
]

linkcheck_anchors_ignore_for_url += [
13 changes: 9 additions & 4 deletions docs/connect/configure.md
@@ -1,8 +1,10 @@
(connect-configure)=

# Configure

In order to connect to CrateDB, your application or driver needs to be
:::{include} /_include/links.md
:::

To connect to CrateDB, your application or driver needs to be
configured with the corresponding connection properties. Note that
different applications and drivers may accept connection properties in
different formats.

@@ -143,7 +145,10 @@ crate://crate@localhost:4200/?schema=doc
::::::


```{tip}
:::{rubric} Notes
:::

:::{div}
- CrateDB's fixed catalog name is `crate`, the default schema name is `doc`.
- CrateDB does not implement the notion of a database,
however tables can be created in different [schemas].
@@ -155,4 +160,4 @@ crate://crate@localhost:4200/?schema=doc
called `crate`, defined without a password.
- For authenticating properly, please learn about the available
[authentication] options.
```
:::
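The notes above can be made concrete: the SQLAlchemy-style URL from this
section decomposes into the discrete connection properties a driver needs.
A minimal stdlib-only sketch (the URL and defaults are taken from this page):

```python
from urllib.parse import urlsplit, parse_qs

# Decompose the SQLAlchemy-style CrateDB URL into individual
# connection properties.
url = "crate://crate@localhost:4200/?schema=doc"

parts = urlsplit(url)
properties = {
    "user": parts.username,    # default superuser "crate", defined without password
    "host": parts.hostname,    # "localhost"
    "port": parts.port,        # CrateDB HTTP port 4200
    "schema": parse_qs(parts.query)["schema"][0],  # default schema "doc"
}
print(properties)
```

Drivers that accept discrete parameters instead of a URL typically take
these same values.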
2 changes: 2 additions & 0 deletions docs/connect/index.md
@@ -85,6 +85,7 @@ Database connectivity information.
::::

::::{grid-item-card} {material-outlined}`lightbulb;2em` How to connect
:width: auto
- {ref}`connect-java`
- {ref}`connect-javascript`
- {ref}`connect-php`
@@ -105,6 +106,7 @@ CLI programs <cli>
ide
Drivers <drivers>
DataFrame libraries <df/index>
mcp/index
ORM libraries <orm>
```

File renamed without changes.
File renamed without changes.
27 changes: 13 additions & 14 deletions docs/integrate/mcp/index.md → docs/connect/mcp/index.md
@@ -1,3 +1,5 @@
(mcp)=
(connect-mcp)=
# Model Context Protocol (MCP)

```{toctree}
@@ -8,9 +10,7 @@ cratedb-mcp
Community servers <community>
```

## About

:::{rubric} Introduction
:::{rubric} About
:::

[MCP], the Model Context Protocol, is an open protocol that enables seamless
@@ -19,22 +19,14 @@ integration between LLM applications and external data sources and tools.
MCP is sometimes described as the "OpenAPI for LLMs" or as a "USB-C port for AI",
providing a uniform way to connect LLMs to resources they can use.

:::{rubric} Details
:::

The main entities of MCP are [prompts], [resources], and [tools].
The main entities of MCP are [Prompts], [Resources], and [Tools].
MCP clients call MCP servers either by invoking them as a subprocess and
communicating via Standard Input/Output (stdio), or over the network via
Server-Sent Events (sse) or HTTP Streams (streamable-http); see [transports].
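Concretely, MCP messages are JSON-RPC 2.0, and over the stdio transport each
message travels as one line of JSON on stdin/stdout. A minimal sketch of what
a client writes to a server (the `tools/list` method name follows the MCP
specification; the exchange shown is purely illustrative):

```python
import json

def jsonrpc_request(id_, method, params=None):
    """Build one JSON-RPC 2.0 message, as exchanged over MCP's
    stdio transport (one JSON object per line)."""
    msg = {"jsonrpc": "2.0", "id": id_, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

# A client asking a server to enumerate its tools.
line = jsonrpc_request(1, "tools/list")
print(line)
```

Which tools, resources, and prompts come back depends entirely on the server.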

:::{rubric} Discuss
:::{rubric} Usage
:::

To get in touch with us to discuss CrateDB and MCP, head over to GitHub at
[Model Context Protocol (MCP) @ CrateDB] or the [Community Forum].

## Usage

You can use MCP with [CrateDB] and [CrateDB Cloud], either by selecting the
**CrateDB MCP Server** suitable for Text-to-SQL and documentation retrieval,
or by using community MCP servers that are compatible with PostgreSQL databases.
@@ -66,9 +58,16 @@ GitHub Copilot, Mistral AI, OpenAI Agents SDK, VS Code, Windsurf,
and others.


[Community Forum]: https://community.cratedb.com/
:::{rubric} Discuss
:::

To get in touch with us to discuss CrateDB and MCP, please head over to
the CrateDB community forum at [Introducing the CrateDB MCP Server].


[CrateDB]: https://cratedb.com/database
[CrateDB Cloud]: https://cratedb.com/docs/cloud/
[Introducing the CrateDB MCP Server]: https://community.cratedb.com/t/introducing-the-cratedb-mcp-server/2043
[MCP]: https://modelcontextprotocol.io/
[MCP clients]: https://modelcontextprotocol.io/clients
[Model Context Protocol (MCP) @ CrateDB]: https://github.com/crate/crate-clients-tools/discussions/234
91 changes: 8 additions & 83 deletions docs/ingest/cdc/index.md
@@ -5,95 +5,20 @@
:::

:::{div}
You have a variety of options to connect and integrate with 3rd-party
CrateDB provides many options to connect and integrate with third-party
CDC applications, mostly using [CrateDB's PostgreSQL interface].

CrateDB also provides a few native adapter components that can be used
to leverage its advanced features.
CrateDB also provides native adapter components to leverage advanced
features.

This documentation section lists CDC applications and frameworks
that can be used together with CrateDB, and outlines how
to use them optimally.
Please also have a look at support for [generic ETL](#etl) solutions.
:::

(cdc-dms)=
## AWS DMS

:::{div}
[AWS Database Migration Service (AWS DMS)] is a managed migration and replication
service that helps move your database and analytics workloads between different
kinds of databases quickly, securely, and with minimal downtime and zero data
loss.

AWS DMS supports migration between 20-plus database and analytics engines,
either on premises or on Amazon EC2 instance databases. Supported data
migration sources are:
Amazon Aurora, Amazon DocumentDB, Amazon S3, IBM DB2, MariaDB, Azure SQL Database,
Microsoft SQL Server, MongoDB, MySQL, Oracle, PostgreSQL, SAP ASE.

The [AWS DMS Integration with CrateDB] uses Amazon Kinesis Data Streams as
a DMS target, combined with a CrateDB-specific downstream processor element.

CrateDB provides two ways to conduct data migrations using AWS DMS:
either standalone on your own premises, or in a fully managed
environment using services of AWS and CrateDB Cloud.
AWS DMS supports both `full-load` and `cdc` operation modes, often used in
combination with each other (`full-load-and-cdc`).
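The operation mode ends up as the migration type of a DMS replication task.
A hedged sketch of such a task definition; the task identifier is
hypothetical, while the `MigrationType` values and the selection-rule shape
follow AWS DMS conventions:

```python
import json

# Hypothetical DMS task definition. Only the MigrationType values and
# the table-mapping selection rule follow AWS DMS conventions.
dms_task = {
    "ReplicationTaskIdentifier": "nyc-taxi-to-cratedb",  # assumed name
    "MigrationType": "full-load-and-cdc",  # or "full-load" / "cdc"
    "TableMappings": json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
}
print(dms_task["MigrationType"])
```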
Please also take a look at support for {ref}`generic ETL <etl>` solutions.
:::

(cdc-kinesis)=
## AWS Kinesis
You can use Amazon Kinesis Data Streams to collect and process large streams of data
records in real time. A typical Kinesis Data Streams application reads data from a
data stream as data records.

As such, a common application is to relay DynamoDB table change stream events to a
Kinesis stream, and consume them through an adapter that writes to a consolidation database.
:::{div}
- About: [Amazon Kinesis Data Streams]
- See: [](#cdc-dynamodb)
:::

## Debezium

- {ref}`aws-dms`
- {ref}`aws-dynamodb`
- {ref}`aws-kinesis`
- {ref}`debezium`

(cdc-dynamodb)=
## DynamoDB
:::{div}
Support for loading DynamoDB tables into CrateDB (full-load), as well as
[Amazon DynamoDB Streams] and [Amazon Kinesis Data Streams],
to relay CDC events from DynamoDB into CrateDB.

- [DynamoDB Table Loader]
- [DynamoDB CDC Relay]

If you are looking into serverless replication using AWS Lambda:
- [DynamoDB CDC Relay with AWS Lambda]
- Blog: [Replicating CDC events from DynamoDB to CrateDB]
:::
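On the relay path, each Kinesis record carries a base64-encoded DynamoDB
stream event whose attribute values use DynamoDB's typed JSON. A minimal
decoding sketch; the event payload below is fabricated, and a real relay
would also handle `MODIFY`/`REMOVE` events and more attribute types:

```python
import base64
import json

def unmarshal(attr):
    """Convert one DynamoDB-typed attribute value (subset: S, N, BOOL)
    into a plain Python value."""
    (type_, value), = attr.items()
    if type_ == "S":
        return value
    if type_ == "N":
        return float(value)
    if type_ == "BOOL":
        return value
    raise ValueError(f"unhandled DynamoDB type: {type_}")

def decode_kinesis_record(data_b64):
    """Decode one Kinesis record relaying a DynamoDB stream event."""
    event = json.loads(base64.b64decode(data_b64))
    image = event["dynamodb"]["NewImage"]
    return event["eventName"], {k: unmarshal(v) for k, v in image.items()}

# A fabricated INSERT event, for illustration only.
payload = base64.b64encode(json.dumps({
    "eventName": "INSERT",
    "dynamodb": {"NewImage": {"id": {"S": "t-1"}, "fare": {"N": "12.5"}}},
}).encode())
print(decode_kinesis_record(payload))
```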

## MongoDB
:::{div}
Support for loading MongoDB collections and databases into CrateDB (full-load),
and [MongoDB Change Streams], to relay CDC events from MongoDB into CrateDB.

- [MongoDB Table Loader]
- [MongoDB CDC Relay]
:::
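A change-stream event carries the operation type, the namespace, and (for
inserts) the full document, which a relay translates into a CrateDB
statement. A sketch under those assumptions; it covers only inserts, and
does none of the identifier quoting a real relay would need:

```python
def change_to_sql(event):
    """Translate one MongoDB change-stream event into a parameterized
    CrateDB INSERT. Sketch only: inserts only, no identifier quoting."""
    if event["operationType"] != "insert":
        return None
    table = event["ns"]["coll"]
    doc = event["fullDocument"]
    cols = ", ".join(doc)
    params = ", ".join("?" for _ in doc)
    return f"INSERT INTO {table} ({cols}) VALUES ({params})", list(doc.values())

event = {  # fabricated change-stream event, for illustration
    "operationType": "insert",
    "ns": {"db": "test", "coll": "taxi"},
    "fullDocument": {"_id": "1", "fare": 12.5},
}
print(change_to_sql(event))
```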

## StreamSets

The [StreamSets Data Collector] is a lightweight and powerful engine that
allows you to build streaming, batch and change-data-capture (CDC) pipelines
that can ingest and transform data from a variety of different sources.

StreamSets Data Collector Engine makes it easy to run data pipelines from Kafka,
Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, S3, ADLS, Kafka
and more. Data Collector Engine runs on-premises or any cloud, wherever your data
lives.

- {ref}`mongodb`
- {ref}`streamsets`


[StreamSets Data Collector]: https://www.softwareag.com/en_corporate/platform/integration-apis/data-collector-engine.html