Add documentation for using DataConnectionService [SUP-523] #1361

Merged
@@ -0,0 +1,71 @@
= Using Data Connections in custom components
:description: Using the Data Connection Service gives access to the configured data connections in custom components.
Contributor

I'd expect a little more explanation of the concepts here, and/or a link to a section covering e.g. what data connections are and how they're used.

Should there be incoming links to this new section from anywhere? And are all of these sections in the TOC?

Contributor Author

Added a link, hope it works in the description

d8f5a76

Won't they appear in the data connections section in the left menu?

Contributor

Not if you don't add them to the right place in nav.adoc


{description}

== Retrieve Data Connection Service

Before working with data connections, retrieve an instance of the `DataConnectionService` by calling
https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/core/HazelcastInstance.html#getDataConnectionService()[`HazelcastInstance#getDataConnectionService()`].

To get access to the `HazelcastInstance`, you can implement `HazelcastInstanceAware` in listeners,
entry processors, tasks, and similar components.

In the Pipeline API, you can use
https://docs.hazelcast.org/docs/6.0.0-SNAPSHOT/javadoc/com/hazelcast/jet/core/ProcessorMetaSupplier.Context.html#dataConnectionService()[`ProcessorMetaSupplier.Context#dataConnectionService()`].

NOTE: The Data Connection Service is only available on the member side.

== Retrieve Configured DataConnection

Use https://docs.hazelcast.org/docs/6.0.0-SNAPSHOT/javadoc/com/hazelcast/dataconnection/DataConnectionService.html#getAndRetainDataConnection(java.lang.String,java.lang.Class)[`DataConnectionService#getAndRetainDataConnection(String, Class)`] to get an instance of a previously configured data connection. For details on how to configure a data connection, see
the xref:data-connections-configuration.adoc[Configuring Data Connections] page.

== Data Connection Scope

The data connection configuration is per-member. For example, when a data connection is created
with a maximum pool size of 10 and the cluster has 3 members, up to 30 connections
are created.
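
The arithmetic behind this example can be checked with a short sketch. The class and method names below are illustrative only and are not part of the Hazelcast API; the values are the ones used in the example.

[source,java]
----
// Illustrative only: none of these names belong to the Hazelcast API.
final class PoolSizeSketch {

    // Each member creates its own pool, so the cluster-wide upper bound is
    // the per-member maximum multiplied by the number of members.
    static int maxClusterConnections(int maxPoolSizePerMember, int memberCount) {
        return Math.multiplyExact(maxPoolSizePerMember, memberCount);
    }

    public static void main(String[] args) {
        System.out.println(maxClusterConnections(10, 3)); // prints 30
    }
}
----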

== Data Connection Sharing

A data connection is shared by default. This means that when the data connection is requested in multiple places, the same
underlying resource (e.g. JDBC pool, remote client) is used.
If you want to share the data connection configuration, but use a different instance of the underlying resource,
set `DataConnectionConfig#setShared` to `false`.

== Typical Usage

The typical steps to use a data connection are as follows:

1. Obtain the data connection from the data connection service.
2. Retrieve the underlying resource from the `DataConnection` instance. This step varies based on the specific implementation of `DataConnection` (e.g., `JdbcDataConnection` provides `getConnection()`, which returns a `java.sql.Connection`; `HazelcastDataConnection` provides `getClient()`, which returns a `HazelcastInstance`).
3. Use the resource to perform the required operations.
4. Dispose of the resource (e.g., by calling `Connection#close` or `HazelcastInstance#shutdown`).
5. Release the `DataConnection` instance (by calling `DataConnection#release()`).

Steps 2, 3, and 4 should be completed as quickly as possible to maximize the efficiency of connection pooling.

[source,java]
----
JdbcDataConnection jdbcDataConnection = instance.getDataConnectionService()
        .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class);

try (Connection connection = jdbcDataConnection.getConnection()) {
    // ... work with connection
} catch (SQLException e) {
    throw new RuntimeException("Failed to use the data connection", e);
} finally {
    // Release the data connection even if the work above fails
    jdbcDataConnection.release();
}
----

== Configuration Considerations

If the data connection is defined in the Hazelcast configuration, it remains immutable for the entire lifespan of the Hazelcast member. In this case, whether you retrieve the `DataConnection` instance once or each time before accessing the underlying resource, the result is the same.

However, if the data connection is created dynamically via SQL, it can be replaced using `CREATE OR REPLACE DATA CONNECTION` (see xref:sql.adoc). In such cases, the `DataConnection` instance stays valid until you release it, so you can keep retrieving the underlying resource from it as needed. Retrieving the `DataConnection` instance again, rather than holding on to it, lets your code adapt to changes in the data connection configuration.
Contributor

Broken link

Contributor

Still broken. I don't see a sql.adoc in the repo. I think you want create-data-connection.adoc.


For example, if you are running a batch job and want to use the same data connection throughout, request the connection at the start of the job. For a streaming job that may need updated configurations, retrieve both the data connection and the underlying resource just before use (e.g., when processing each item in the pipeline).
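
One way to picture why a retained instance stays valid after a replacement is reference counting. The following is a simplified model under that assumption, not Hazelcast code: `CountedResource` and its methods are hypothetical names invented for this sketch.

[source,java]
----
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of retain/release semantics. This is NOT Hazelcast code;
// every name in this class is hypothetical.
final class RetainReleaseSketch {

    static final class CountedResource {
        private final String config;
        // Starts at 1: the reference held by whoever registered the resource.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed;

        CountedResource(String config) {
            this.config = config;
        }

        CountedResource retain() {
            refCount.incrementAndGet();
            return this;
        }

        void release() {
            if (refCount.decrementAndGet() == 0) {
                closed = true; // last holder is gone, free the underlying resource
            }
        }

        boolean isClosed() {
            return closed;
        }

        String config() {
            return config;
        }
    }

    public static void main(String[] args) {
        CountedResource v1 = new CountedResource("v1");
        CountedResource held = v1.retain(); // a batch job retains the connection once

        v1.release();                       // the registration is replaced by a new configuration
        CountedResource v2 = new CountedResource("v2");

        System.out.println(held.config() + " closed=" + held.isClosed()); // v1 closed=false
        held.release();                     // the job finishes and releases its reference
        System.out.println(held.config() + " closed=" + held.isClosed()); // v1 closed=true
        System.out.println(v2.retain().config());                         // v2
    }
}
----

In this model, a batch job that retained `v1` keeps working with it until the job releases it, while a streaming job that retains just before each use would see `v2`.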
@@ -0,0 +1,273 @@
= Map Loader using Data Connection

:description: In this tutorial, you build a custom map loader that uses a configured data connection to load data that is not present in an IMap.

{description}

NOTE: This tutorial builds a custom implementation of MapLoader. For the most common use cases we also provide an out-of-the-box implementation xref:mapstore:configuring-a-generic-maploader.adoc[GenericMapLoader].

== Before you begin

To complete this tutorial, you need the following:

[cols="1a,1a"]
|===
|Prerequisites|Useful resources

|Java !! current java version !!
|
|Maven or Gradle
| https://maven.apache.org/install.html or https://gradle.org/install/
|Docker
|https://docs.docker.com/get-started/[Get Started on docker.com]

|===

== Step 1. Create and Populate the Database

This tutorial uses Docker to run the Postgres database.

Run the following command to start Postgres:

[source, bash]
----
docker run --name postgres --rm -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres
----

Start the `psql` client:

[source, bash]
----
docker exec -it postgres psql -U postgres
----

Create a table `my_table` and populate it with data:

[source,sql]
----
CREATE TABLE my_table(id INTEGER PRIMARY KEY, value VARCHAR(128));

INSERT INTO my_table VALUES (0, 'zero');
INSERT INTO my_table VALUES (1, 'one');
INSERT INTO my_table VALUES (2, 'two');
INSERT INTO my_table VALUES (3, 'three');
INSERT INTO my_table VALUES (4, 'four');
INSERT INTO my_table VALUES (5, 'five');
INSERT INTO my_table VALUES (6, 'six');
INSERT INTO my_table VALUES (7, 'seven');
INSERT INTO my_table VALUES (8, 'eight');
INSERT INTO my_table VALUES (9, 'nine');
----

== Step 2. Create a New Java Project

Create a blank Java project named `maploader-data-connection-example` and copy the following Maven `pom.xml` file into it:

[source,xml]
----
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>maploader-data-connection-example</artifactId>
    <version>1.0-SNAPSHOT</version>

    <name>maploader-data-connection-example</name>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.release>17</maven.compiler.release>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.hazelcast</groupId>
            <artifactId>hazelcast</artifactId>
            <version>6.0.0-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.24.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j2-impl</artifactId>
            <version>2.24.1</version>
        </dependency>
        <dependency>
            <groupId>org.postgresql</groupId>
            <artifactId>postgresql</artifactId>
            <version>42.7.4</version>
        </dependency>
    </dependencies>
</project>
----

== Step 3. Create the MapLoader
Contributor

Suggested change
== Step 2. MapLoader
== Step 3. MapLoader

"Create a MapLoader" or similar? Not sure about the terminology, but all of these steps should have "do this thing" titles, like the first two. (The final "example app" step might be an exception, in which case it shouldn't be a numbered step.)


The following map loader implements the `com.hazelcast.map.MapLoader` and `com.hazelcast.map.MapLoaderLifecycleSupport`
interfaces.

[source,java]
----
public class SimpleMapLoader implements MapLoader<Integer, String>, MapLoaderLifecycleSupport {

    private JdbcDataConnection jdbcDataConnection;

    // ...
}
----

To implement the `MapLoaderLifecycleSupport` interface we need the following methods:

[source,java]
----
    // ...

    @Override
    public void init(HazelcastInstance instance, Properties properties, String mapName) {
        // Retain the data connection for the lifetime of the map loader
        jdbcDataConnection = instance.getDataConnectionService()
                .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class);
    }

    @Override
    public void destroy() {
        // Release the data connection retained in init()
        jdbcDataConnection.release();
    }

    // ...

To implement the `MapLoader` interface we need the following methods:

[source,java]
----
    @Override
    public String load(Integer key) {
        try (Connection connection = jdbcDataConnection.getConnection();
             PreparedStatement statement = connection.prepareStatement("SELECT value FROM my_table WHERE id = ?")) {

            statement.setInt(1, key);
            // Close the result set explicitly instead of relying on the statement
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next() ? resultSet.getString("value") : null;
            }
        } catch (SQLException e) {
            throw new RuntimeException("Failed to load value for key=" + key, e);
        }
    }

    @Override
    public Map<Integer, String> loadAll(Collection<Integer> keys) {
        Map<Integer, String> resultMap = new HashMap<>();
        if (keys.isEmpty()) {
            // An empty IN list is not valid SQL, so there is nothing to query
            return resultMap;
        }

        // Construct query for batch retrieval with one placeholder per key
        StringBuilder queryBuilder = new StringBuilder("SELECT id, value FROM my_table WHERE id IN (");
        keys.forEach(key -> queryBuilder.append("?,"));
        queryBuilder.setLength(queryBuilder.length() - 1); // Remove last comma
        queryBuilder.append(")");

        try (Connection connection = jdbcDataConnection.getConnection();
             PreparedStatement statement = connection.prepareStatement(queryBuilder.toString())) {

            int index = 1;
            for (Integer key : keys) {
                statement.setInt(index++, key);
            }

            try (ResultSet resultSet = statement.executeQuery()) {
                while (resultSet.next()) {
                    resultMap.put(resultSet.getInt("id"), resultSet.getString("value"));
                }
            }
            return resultMap;
        } catch (SQLException e) {
            throw new RuntimeException("Failed to load values", e);
        }
    }

    @Override
    public Iterable<Integer> loadAllKeys() {
        List<Integer> keys = new ArrayList<>();
        try (Connection connection = jdbcDataConnection.getConnection();
             PreparedStatement statement = connection.prepareStatement("SELECT id FROM my_table");
             ResultSet resultSet = statement.executeQuery()) {

            while (resultSet.next()) {
                keys.add(resultSet.getInt("id"));
            }
            return keys;
        } catch (SQLException e) {
            throw new RuntimeException("Failed to load all keys", e);
        }
    }
----
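
The placeholder list used by `loadAll` can also be built in a single expression. The following is a minimal standalone sketch of that step; `InClauseSketch` and `buildQuery` are hypothetical names introduced for illustration, and the table and column names are the ones used in this tutorial.

[source,java]
----
import java.util.Collections;

// Illustrative helper; the class and method names are assumptions made for this sketch.
final class InClauseSketch {

    // Builds the batch query with one "?" placeholder per key.
    // Rejects non-positive counts because an empty IN list is not valid SQL.
    static String buildQuery(int keyCount) {
        if (keyCount <= 0) {
            throw new IllegalArgumentException("keyCount must be positive");
        }
        String placeholders = String.join(",", Collections.nCopies(keyCount, "?"));
        return "SELECT id, value FROM my_table WHERE id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // prints: SELECT id, value FROM my_table WHERE id IN (?,?,?)
        System.out.println(buildQuery(3));
    }
}
----

Building the placeholders this way avoids trimming a trailing comma and makes the empty-keys case explicit.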

== Step 4. MapLoader Example App

Configure the data connection:

[source,java]
----
public class MapLoaderExampleApp {
    public static void main(String[] args) {
        Config config = new Config();

        DataConnectionConfig dcc = new DataConnectionConfig("my_data_connection");
        dcc.setType("JDBC");
        dcc.setProperty("jdbcUrl", "jdbc:postgresql://172.17.0.2/postgres");
        dcc.setProperty("user", "postgres");
        dcc.setProperty("password", "postgres");
        config.addDataConnectionConfig(dcc);

    }
}
----

Configure an IMap named `my_map` with the map loader:

[source,java]
----
public class MapLoaderExampleApp {
    public static void main(String[] args) {
        // ...

        MapStoreConfig mapStoreConfig = new MapStoreConfig();
        mapStoreConfig.setClassName(SimpleMapLoader.class.getName());

        MapConfig mapConfig = new MapConfig("my_map");
        mapConfig.setMapStoreConfig(mapStoreConfig);
        config.addMapConfig(mapConfig);

    }
}
----

Create a `HazelcastInstance` with the `Config`, get the IMap, and read some data:

[source,java]
----
public class MapLoaderExampleApp {
    public static void main(String[] args) {
        // ...

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        IMap<Integer, String> map = hz.getMap("my_map");

        System.out.println("1 maps to " + map.get(1));
        // Key 42 is not in my_table, so the loader returns null
        System.out.println("42 maps to " + map.get(42));
    }
}
----

When you run this class you should see the following output:

[source,text]
----
1 maps to one
42 maps to null
----