Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdc/server/integration_test.go: added in a new test for the CDC server using the workload.Checker #1042

Merged
merged 1 commit into from
Nov 4, 2024

Conversation

ryanluu12345
Copy link
Contributor

@ryanluu12345 ryanluu12345 commented Oct 11, 2024

Integrated the logic from the existing CDC server integration test and the latest developments for the pglogical integration tests to create a test based on the workload.Checker that validates the CDC server. This implementation also aims to refactor the Changefeed creation string logic so that it's reusable for other types of tests or the workload command.

Resolves: #906
Release Note: None


This change is Reviewable

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch 2 times, most recently from 2efbcdc to 202c581 Compare October 11, 2024 03:32
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 1 unresolved discussion (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 455 at r1 (raw file):

// changefeed. This makes it so this is more portable for future work that needs
// this: for example, integrating the e2e workload checker with this logic.
func CreateChangefeedStatement(

The plan here is that when we tackle #926, we refactor this to a common location that both can access so that we have a helper to create a changefeed that is compatible with the workload helpers.

@ryanluu12345
Copy link
Contributor Author

Discussed offline but @bobvawter and I were considering using Go templates or the cockroachdb-parser/tree in order to create the changefeed statements. This is not scoped as part of this PR, but can be adapted for when we port this over the workload demo.

Additionally, all helpers will subsequently be moved over to sinktest which is where the helpers all live right now.

@ryanluu12345
Copy link
Contributor Author

ryanluu12345 commented Oct 11, 2024

@bobvawter , I'm currently digging into these failures and I have identified some issues that I'm patching locally:

  1. ✅ Webhook option not found because of unsupported version -> fixed this by bringing in the min version checks from testIntegration

  2. ✅ Data seems to be inconsistent for V23.2 tests; I thought this may be a case of all the data not being written yet, but it looks like the correct number of items got written: 8 items (7 + 1 expected; 6 + 2 got) -> fixed this by making sure we see 7 parents before we continue our checks

workload.go:130: 
        	Error Trace:	/home/runner/work/replicator/replicator/internal/sinktest/all/workload.go:130
        	            				/home/runner/work/replicator/replicator/internal/source/cdc/server/integration_test.go:740
        	            				/home/runner/work/replicator/replicator/internal/source/cdc/server/integration_test.go:618
        	Error:      	Should be empty, but was [expected 7 parents, got 6 expected 1 children, got 2 parent 8003786112974678833 not found parent 2022766499421304972 not found parent 255718460[8026](https://github.com/cockroachdb/replicator/actions/runs/11285840194/job/31389309792?pr=1042#step:7:8027)950244 not found]
        	Test:       	TestWorkload
    integration_test.go:740: 
        	Error Trace:	/home/runner/work/replicator/replicator/internal/source/cdc/server/integration_test.go:740
        	            				/home/runner/work/replicator/replicator/internal/source/cdc/server/integration_test.go:618
        	Error:      	Should be true
        	Test:       	TestWorkload
  1. Unknown schema for the source, even though I can clearly see that it is defined:
  • Integration test: CRDB 24.1 and permuations of each CRDB version
  • Integration test: CRDB 24.1 and MySQL / MariaDB / Oracle

What I'm thinking here is that for some reason it's the wrong schema name and this should actually be the target.... I'm not exactly sure what to make of this and when I should be using the "target" schema for this and "source" schema.

{
  "message": "create changefeed statement is CREATE CHANGEFEED FOR TABLE \"src-15840-196\".\"public\".\"tbl-33\", \"src-15840-196\".\"public\".\"tbl-34\" INTO 'webhook-https://127.0.0.1:33853/tgt-15840-198/public?insecure_tls_skip_verify=true'  WITH updated,     resolved='1s',     webhook_auth_header='***'",
  "severity": "DEBUG",
  "timestamp": {
    "seconds": 1728618370,
    "nanos": 330893135
  }
}
    integration_test.go:718: 
        	Error Trace:	/home/runner/work/replicator/replicator/internal/source/cdc/server/integration_test.go:718
        	            				/home/runner/work/replicator/replicator/internal/source/cdc/server/integration_test.go:618
        	Error:      	Received unexpected error:
        	            	unknown schema: "src-15840-196"."public"

Wondering if you have thoughts here on the schema I should be using? FWIW, this works for C2C case locally, so I'm wondering why it's off for heterogeneous DBs.

On the bright side, we are down to one last outstanding issue. The missing schema here haha.

@ryanluu12345
Copy link
Contributor Author

ryanluu12345 commented Oct 11, 2024

So I'm doing some testing locally just to better understand what may be wrong here. I've identified that I actually need to use the source accumulators and watchers so that the data ends up in the source (CRDB) instead of the target. However, I've done testing here and verified it matches what I see in this reference (link) and am now hitting a different error:

Error:          Received unexpected error:
                                update sent to unknown tables: src-91611-1.public.tbl-1
                                github.com/cockroachdb/replicator/internal/types.(*orderedAdapter).flush

Looking at the code, this appears to happen when order is an empty slice and acc is non-empty map. So basically we don't fully clear the acc map which means the mutations couldn't be applied.

func (t *orderedAdapter) flush(
	ctx context.Context,
	kind string,
	order []ident.Table,
	acc *ident.TableMap[[]Mutation],
	opts *AcceptOptions,
) error {

This is with this setup:

acc := types.OrderedAcceptorFrom(sourceFixture.ApplyAcceptor, sourceFixture.Watchers)

	for i := range maxIterations {
		batch := &types.MultiBatch{}
		sourceGeneratorWorkload.GenerateInto(batch, hlc.New(int64(i), i+1))

		r.NoError(err)
		r.NoError(acc.AcceptMultiBatch(ctx, batch, &types.AcceptOptions{TargetQuerier: tx}))
		r.NoError(tx.Commit())
	}

I've also done a sanity check and verified the tables are created on the source correctly:

source conn:  postgresql://root@localhost:26257/defaultdb
CREATE TABLE "src-91216-1"."public"."tbl-1" (parent BIGINT PRIMARY KEY, val BIGINT DEFAULT 0 NOT NULL)
CREATE TABLE "src-91216-1"."public"."tbl-2" (
child BIGINT PRIMARY KEY,
parent BIGINT NOT NULL,
val BIGINT DEFAULT 0 NOT NULL,
CONSTRAINT parent_fk FOREIGN KEY(parent) REFERENCES "src-91216-1"."public"."tbl-1"(parent)
)

If I understand correctly, it should accept the batch and be able to write it to the source. The thing that is a bit suspect is the source conn string, which points to defaultdb. I'm wondering if it's an issue because that source connection should be pointing towards the src schema instead. I'm not touching the source or target pool connection strings, so I wonder why it's not using the correct schema on the source.

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 4b62f86 to fee1535 Compare October 14, 2024 22:10
@ryanluu12345
Copy link
Contributor Author

@bobvawter , addressed all major concerns mentioned above and also added the version gate (22.1) for union in recursive CTEs. This is ready for a final review. I'll clean up the commits and squash them after addressing any comments you have here.

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 0bf928b to 98ae9ca Compare October 15, 2024 19:05
Copy link
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @Jeremyyang920 and @ryanluu12345)


internal/source/cdc/server/integration_test.go line 455 at r1 (raw file):

Previously, ryanluu12345 wrote…

The plan here is that when we tackle #926, we refactor this to a common location that both can access so that we have a helper to create a changefeed that is compatible with the workload helpers.

Makes sense.


internal/source/cdc/server/integration_test.go line 765 at r4 (raw file):

		r.NoError(err)
		// We know we generate 7 items here in the parent.
		if ct >= 7 {

This test is brittle. Remove 7 in favor of checking for the number of expected values per the workload checker.


internal/target/schemawatch/watcher.go line 221 at r4 (raw file):

			var dbNameRaw string
			fmt.Println(parts[0].Raw())
			fmt.Println(tx.ConnectionString)

Remove printf.

Copy link
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, reflow to 100 columns.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @Jeremyyang920 and @ryanluu12345)

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 98ae9ca to 4a77fdb Compare October 22, 2024 22:10
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 765 at r4 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

This test is brittle. Remove 7 in favor of checking for the number of expected values per the workload checker.

Done.


internal/target/schemawatch/watcher.go line 221 at r4 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Remove printf.

Done.

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 4a77fdb to 40b20e1 Compare October 22, 2024 22:15
@ryanluu12345
Copy link
Contributor Author

Hi @bobvawter , just wanted to bump this PR since I've addressed all outstanding comments here. I'll fix the issues on the other PRs you've noted.

Copy link
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @Jeremyyang920 and @ryanluu12345)


internal/source/cdc/server/integration_test.go line 389 at r5 (raw file):

// Union recursive CTEs are only supported in v22.1 and later.
func supportsUnionRecursiveCTEs(version string) (bool, error) {

Sort.


internal/source/cdc/server/integration_test.go line 412 at r5 (raw file):

func getConfig(
	cfg *testConfig, fixture *all.Fixture, targetPool *types.TargetPool,
) (*Config, error) {

Why error ?


internal/source/cdc/server/integration_test.go line 458 at r5 (raw file):

// the input parameters as specific as possible so that the caller doesn't need
// to construct whole structs with unrelated information to create the
// changefeed. This makes it so this is more portable for future work that needs

Huh? Any use of a create changefeed template function is going to require more-or-less the same data to be passed in and this is a function signature that's only going to get larger. This change wants to be in a utility package and the CreateChangefeedStatement is just that type's String() method.


internal/source/cdc/server/integration_test.go line 646 at r5 (raw file):

func TestWorkload(t *testing.T) {
	testWorkload(t)

Silly delegation.


internal/source/cdc/server/integration_test.go line 660 at r5 (raw file):

	// Union for recursive CTEs is required in order to use the workload checker
	// since there is a UNION under the hood.

That's not quite correct. The checker itself is fine, the complication is the schema-inspection code that's part of setting up the fixture.


internal/source/cdc/server/integration_test.go line 668 at r5 (raw file):

	}

	cfg := &testConfig{webhook: true}

Should this cycle through other flavors?


internal/source/cdc/server/integration_test.go line 677 at r5 (raw file):

	supportsQueries, err := supportsQueries(sourceVersion)
	r.NoError(err)
	if cfg.queries && !supportsQueries {

This is always false.


internal/source/cdc/server/integration_test.go line 691 at r5 (raw file):

	child := ident.NewTable(sourceSchema, targetChecker.Child.Name().Table())
	sourceGeneratorWorkload := workload.NewGeneratorBase(parent, child)
	r.NoError(err)

This check is out of place.


internal/source/cdc/server/integration_test.go line 698 at r5 (raw file):

	sourcePool := targetFixture.SourcePool
	parent = sourceGeneratorWorkload.Parent
	child = sourceGeneratorWorkload.Child

Didn't you just set this a couple of lines above?


internal/source/cdc/server/integration_test.go line 723 at r5 (raw file):

	// Create the test server fixture.
	// Preflight sets default values that are not set in the testConfig.
	r.NoError(serverCfg.Preflight())

Why doesn't it?


internal/source/cdc/server/integration_test.go line 725 at r5 (raw file):

	r.NoError(serverCfg.Preflight())

	timeoutCtx, cancel := context.WithTimeout(ctx, 30*time.Second)

The Context in a test fixture already has a timeout behavior.


internal/source/cdc/server/integration_test.go line 749 at r5 (raw file):

		diff:    cfg.diff,
		queries: cfg.queries,
		webhook: cfg.webhook,

Are these fields in cfg set?


internal/source/cdc/server/integration_test.go line 763 at r5 (raw file):

	// Make this the target fixture for the accumulator. This is
	// required for the data to write properly later on when
	// we accumulate the batch.

I don't understand what this comment is trying to say.


internal/source/cdc/server/integration_test.go line 792 at r5 (raw file):

			break
		}
		log.Debug("waiting for target rows to be written")

This loop should probably also check for the expected number of child rows, since the generator's last action could be to create a new child.


internal/source/cdc/server/integration_test.go line 799 at r5 (raw file):

	// We need to wait for the connection to shut down, otherwise the
	// database cleanup callbacks (to drop the publication, etc.) from

Publication?

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 40b20e1 to 3bcc4c4 Compare October 25, 2024 18:29
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 412 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Why error ?

Good call out, removing.


internal/source/cdc/server/integration_test.go line 660 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

That's not quite correct. The checker itself is fine, the complication is the schema-inspection code that's part of setting up the fixture.

Done.


internal/source/cdc/server/integration_test.go line 691 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

This check is out of place.

Good catch, removing.


internal/source/cdc/server/integration_test.go line 698 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Didn't you just set this a couple of lines above?

Updated this so we don't create this twice. I'm going to just pass in the result of ident.NewTable to the NewGeneratorBase call and then use the sourceGeneratorWorkload as the source of truth to grab the parent and child table SQLs.


internal/source/cdc/server/integration_test.go line 792 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

This loop should probably also check for the expected number of child rows, since the generator's last action could be to create a new child.

Checking for both the child and parent rows now!


internal/source/cdc/server/integration_test.go line 799 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Publication?

Removed and updated comment to be more clear.

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 763 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

I don't understand what this comment is trying to say.

Yeah comment was a note to self when I didn't really understand how it works. I don't think a comment is really needed here. We just want to create a multiacceptor so that we can accept the batches and write them over to the source database.

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 749 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Are these fields in cfg set?

They aren't currently in this test, but I'd imagine for other tests, they're useful to set so that they can run with diff mode, in cdc queries mode, or webhook

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 677 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

This is always false.

Removing the cfg.queries since you're right, it's always false

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 3bcc4c4 to 26df784 Compare October 25, 2024 18:41
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 646 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Silly delegation.

Resolving.


internal/source/cdc/server/integration_test.go line 668 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Should this cycle through other flavors?

Decided to test the simplest case, webhook, since the setup for the changefeed is simpler (doesn't require multiple changefeed queries for multiple tables). Also, we have coverage for the other flavors already inside of TestIntegration above.


internal/source/cdc/server/integration_test.go line 725 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

The Context in a test fixture already has a timeout behavior.

Good call, going to just use targetFixture.Context

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/source/cdc/server/integration_test.go line 389 at r5 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Sort.

Done.

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch 2 times, most recently from 5df8355 to 4097471 Compare October 28, 2024 22:05
Copy link
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 5 files at r10.
Reviewable status: 3 of 6 files reviewed, 15 unresolved discussions (waiting on @Jeremyyang920 and @ryanluu12345)


internal/sinktest/changefeed.go line 1 at r10 (raw file):

package sinktest

Copyright.


internal/sinktest/changefeed.go line 14 at r10 (raw file):

// of the created changefeed statement depending
// on the options defined. This can be extended later on
// to handle various configurations and webhook parameters.

Reflow comments to 72 chars, here and elsewhere.


internal/sinktest/changefeed.go line 16 at r10 (raw file):

// to handle various configurations and webhook parameters.
type ChangefeedConfig struct {
	Webhook    bool

Sort.


internal/sinktest/changefeed.go line 18 at r10 (raw file):

	Webhook    bool
	Diff       bool
	Queries    bool

Sort.


internal/sinktest/changefeed.go line 26 at r10 (raw file):

// useful for creating changefeeds on CRDB sources.
type ChangefeedStatement struct {
	Cfg                    *ChangefeedConfig

This seems like an unnecessary separation of concerns. Why would some fields go into ChangefeedConfig, but not into ChangefeedStatement ? Does ChangefeedConfig need to exist at all?


internal/sinktest/changefeed.go line 32 at r10 (raw file):

	Tables                 []ident.Table
	SourceVersion          string
	QueryProjectionColumns string

Is is only columns that can be in this field? If so, a []ident.Ident seems like the right choice. If it's an arbitrary projection, just call it QueryProjection.


internal/sinktest/changefeed.go line 46 at r10 (raw file):

	} else {
		// Creating the comma-separated table string required by the changefeed.
		TablesStr := ""

Uncapitalize local variables.


internal/sinktest/changefeed_test.go line 1 at r10 (raw file):

package sinktest

Copyright.


internal/source/cdc/server/integration_test.go line 425 at r10 (raw file):

func getConfig(
	r *require.Assertions, cfg *testConfig, fixture *all.Fixture, targetPool *types.TargetPool,

Return (*Config, error). Passing in the *require.Assertions mixes concerns of producing a valid configuration, versus test flow.


internal/source/cdc/server/integration_test.go line 459 at r10 (raw file):

	// Preflight sets default values that are not set in the testConfig.
	r.NoError(fixtureCfg.Preflight())
	return fixtureCfg

You can return fixtureCfg, fixtureCfg.Preflight()


internal/source/cdc/server/integration_test.go line 462 at r10 (raw file):

}

const maxIterations = 25

If this is specific only to the testWork(), declare it at the top of the function.


internal/source/cdc/server/integration_test.go line 469 at r10 (raw file):

		cfg  *testConfig
	}{
		{"webhook true diff true queries false", &testConfig{webhook: true, diff: true, queries: false}},

Reflow to 100 columns.


internal/source/cdc/server/integration_test.go line 471 at r10 (raw file):

		{"webhook true diff true queries false", &testConfig{webhook: true, diff: true, queries: false}},
		{"webhook true diff false queries false", &testConfig{webhook: true, diff: false, queries: false}},
		{"webhook true diff true queries true key in value true", &testConfig{webhook: true, diff: true, queries: true, keyInValue: true}},

This should also test non-webhook delivery. Webhook versus ndjson shouldn't make any difference, but it's good to verify across the combinatorial explosion that is changefeed configurations.


internal/source/cdc/server/integration_test.go line 514 at r10 (raw file):

	ctx := targetFixture.Context
	targetChecker, _, err := targetFixture.NewWorkload(ctx, &all.WorkloadConfig{
		DisableFK: false,

Shouldn't this be conditional on changefeed queries being enabled?


internal/source/cdc/server/integration_test.go line 623 at r10 (raw file):

			Tables:                 tables,
			SourceVersion:          sourceVersion,
			QueryProjectionColumns: "", // Empty queryProjectionColumns

Unnecessary assignment of zero value.

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 6 files reviewed, 15 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/sinktest/changefeed.go line 1 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Copyright.

Done.


internal/sinktest/changefeed.go line 14 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Reflow comments to 72 chars, here and elsewhere.

Done.


internal/sinktest/changefeed.go line 16 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Sort.

Done.


internal/sinktest/changefeed.go line 18 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Sort.

Done.


internal/sinktest/changefeed.go line 46 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Uncapitalize local variables.

Sorry, this was a good catch. I did a find and replace in order to update the table field to Table. I think this was an unintended effect.


internal/sinktest/changefeed_test.go line 1 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Copyright.

Done.


internal/source/cdc/server/integration_test.go line 425 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Return (*Config, error). Passing in the *require.Assertions mixes concerns of producing a valid configuration, versus test flow.

Got it!


internal/source/cdc/server/integration_test.go line 459 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

You can return fixtureCfg, fixtureCfg.Preflight()

Yep, cleaner! I'll do it this way


internal/source/cdc/server/integration_test.go line 462 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

If this is specific only to the testWork(), declare it at the top of the function.

Yep only used here, I'll update this to be at the top of the function.


internal/source/cdc/server/integration_test.go line 623 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Unnecessary assignment of zero value.

Done.

Copy link
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 6 files reviewed, 12 unresolved discussions (waiting on @Jeremyyang920 and @ryanluu12345)


internal/sinktest/changefeed.go line 1 at r10 (raw file):

Previously, ryanluu12345 wrote…

Done.

It's 2024.

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 6330dff to db39513 Compare October 29, 2024 21:00
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 6 files reviewed, 12 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)


internal/sinktest/changefeed.go line 1 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

It's 2024.

Done.


internal/sinktest/changefeed.go line 26 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

This seems like an unnecessary separation of concerns. Why would some fields go into ChangefeedConfig, but not into ChangefeedStatement ? Does ChangefeedConfig need to exist at all?

Good point, I'll go ahead and remove this struct wrapper. Not needed here anymore.


internal/sinktest/changefeed.go line 32 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Is is only columns that can be in this field? If so, a []ident.Ident seems like the right choice. If it's an arbitrary projection, just call it QueryProjection.

For now I'm going to support just columns. If we need more flexibility later, I'll make this an arbitrary projection. I think []ident.Ident is a good move here and will support this + update the code and tests to reflect this.


internal/source/cdc/server/integration_test.go line 469 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Reflow to 100 columns.

Done.


internal/source/cdc/server/integration_test.go line 471 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

This should also test non-webhook delivery. Webhook versus ndjson shouldn't make any difference, but it's good to verify across the combinatorial explosion that is changefeed configurations.

Yep good call out here. I'm also planning to test against:

  • webhook == false, queries == false, diff = false | diff = true
  • webhook == false, queries == diff == key_in_value == true (since these must be set together in order to emit all deletes)

internal/source/cdc/server/integration_test.go line 514 at r10 (raw file):

Previously, bobvawter (Bob Vawter) wrote…

Shouldn't this be conditional on changefeed queries being enabled?

Yep, it should, I made it key off of cfg.queries

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch 2 times, most recently from fcbe36c to be5b169 Compare October 29, 2024 21:16
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I've gone through the log lines for the test cases that failed and I've noticed that this the common thread between all the failed runs is the test case: TestWorkload/webhook\_true\_diff\_true\_queries\_true\_key\_in\_value\_true and the following logging:

"error": "proposed checkpoint timestamp for group=\"tgt-15510-396.public\" [ \"tgt-15510-396\".\"public\".\"tbl-69\", \"tgt-15510-396\".\"public\".\"tbl-70\" ], partition=\"tgt-15510-396.public\" is going backwards: 1730245103746546739.0000000000; verify changefeed cursor or remove already-applied checkpoint timestamp entries",

We see several log messages of this form for every failed run. However, for all runs that succeeded, we did not see any failures of this types (grepped for proposed checkpoint timestamp).

I have seen these failures on the following targets:

  • MySQL 5.6 as target
  • MySQL Maria DB v10
  • Postgres v12
  • Oracle v21.3
  • Objstore
  • Postgres v13

I know that we previously implemented the configuration option to be able to disable this check, but I don't want to paper over this if this is a legitimate issue. I'm wondering if you have any insights into this noted behavior?

I'll attach some raw logs of a few failure cases below:
https://productionresultssa17.blob.core.windows.net/actions-results/265f65ce-26a7-4892-848d-c634a7627b7c/workflow-job-run-9651b4f4-7e76-59bc-40cd-903b1abdbf95/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-10-30T00%3A21%3A38Z&sig=qG8o1kliQWmvUQ77cYpkZE%2FgReI9w2Mtarz4xrlajDY%3D&ske=2024-10-30T08%3A46%3A07Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-10-29T20%3A46%3A07Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2024-08-04&sp=r&spr=https&sr=b&st=2024-10-30T00%3A11%3A33Z&sv=2024-08-04

https://productionresultssa17.blob.core.windows.net/actions-results/265f65ce-26a7-4892-848d-c634a7627b7c/workflow-job-run-8e0db1fe-7ff6-5c40-1ba5-2022544b595a/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-10-30T00%3A21%3A27Z&sig=kA1E1j%2BqwIXuvbX9mXcZ6ocgzs1L5ZsECKRgOP2sliM%3D&ske=2024-10-30T08%3A54%3A12Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-10-29T20%3A54%3A12Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2024-08-04&sp=r&spr=https&sr=b&st=2024-10-30T00%3A11%3A22Z&sv=2024-08-04

https://productionresultssa17.blob.core.windows.net/actions-results/265f65ce-26a7-4892-848d-c634a7627b7c/workflow-job-run-116d4943-12ce-5a1c-6e9a-64c4c5013f12/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-10-30T00%3A21%3A00Z&sig=UieORS09NbOHCOkX3SDaWqBxwHv5xrYIksKJVUq0UFE%3D&ske=2024-10-30T08%3A46%3A38Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-10-29T20%3A46%3A38Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2024-08-04&sp=r&spr=https&sr=b&st=2024-10-30T00%3A10%3A55Z&sv=2024-08-04

Reviewable status: 1 of 6 files reviewed, 12 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)

ryanluu12345

This comment was marked as outdated.

@bobvawter
Copy link
Contributor

That message indicates a legitimate problem where, for some target schema, the resolved timestamps received from the changefeed aren't proceeding in a monotonic ordering. I filed cockroachdb/cockroach#112880 a year ago to make this less of an issue for Replicator.

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for linking the relevant ticket. Makes sense to me, and I'll keep this in mind. Given this context, enabling the SkipBackwardsDataCheck will ensure that tests pass in this scenario where resolved timestamps aren't arriving in monotonic ordering. However, it's something we must do until the ticket you linked is resolved and we use the upstream message data within replicator to mitigate or fix this issue.

What are your thoughts on keeping this test as is right now with the SkipBackwardsDataCheck enabled since this exercises and validates the e2e CDC server flow? Then, whenever the changefeedccl issue gets resolved, we can make the follow up change and reset SkipBackwardsDataCheck to false.

Reviewable status: 1 of 6 files reviewed, 12 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)

Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rereading the ticket, it seems that even with the changes requested in that ticket, we'll still potentially hit this issue but just be able to fail with more informative detail. So in this case I think it would still require the SkipBackwardsDataCheck. Later, can add testing that verifies we get the correct informative error logging.

Reviewable status: 1 of 6 files reviewed, 12 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)

@bobvawter
Copy link
Contributor

The check is telling you there's a problem with your test.

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch 4 times, most recently from 722053a to 75d0106 Compare November 1, 2024 00:51
Copy link
Contributor Author

@ryanluu12345 ryanluu12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this offline. Because in the multiple changefeed to same target schema case we run into the data moving backwards issue, since resolved timestamps are not currently differentiated right now, we need to disable the backwards data check. Added in a conditional setting of this SkipBackwardsDataCheck to true.

Reviewable status: 1 of 9 files reviewed, 12 unresolved discussions (waiting on @bobvawter and @Jeremyyang920)

Copy link
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: w/ nits

Reviewed 2 of 5 files at r10, 1 of 3 files at r12, 5 of 5 files at r17, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Jeremyyang920 and @ryanluu12345)


internal/sinktest/changefeed.go line 35 at r17 (raw file):

	Host                   string
	KeyInValue             bool
	Queries                bool

Isn't this redundant with a non-empty projection?


internal/sinktest/changefeed_test.go line 27 at r17 (raw file):

// Note that the `sourceVersion` must contain a space at the end to satisfy the
// semver regex (i.e. "CockroachDB CCL v24.2.1 ").

Your test string is missing the (platform) portion of a CRDB version() output.

https://github.com/cockroachdb/replicator/blob/master/internal/util/stdpool/crdb_hints_test.go#L48


internal/source/cdc/server/integration_test.go line 432 at r17 (raw file):

				// different streams. Therefore, the backwards data check
				// should be disabled in the multiple changefeed case since we
				// know the backwards data check will fail without disabling it.

Add links to cockroachdb/cockroach#112880 and #574 to close the loop later on.


internal/source/cdc/server/integration_test.go line 632 at r17 (raw file):

			Target:        targetParent,
			Token:         token,
			Tables:        tables,

Sort. Here and above.

@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 75d0106 to 134a8e5 Compare November 4, 2024 20:12
…r using the workload.Checker

Integrated the logic from the existing CDC server integration test and the
latest developments for the pglogical integration tests to create a test based
on the workload.Checker that validates the CDC server. There are variants of
this test that support various changefeed configuration options such as
key_in_value == diff == queries == true or setting webhook == false, among others.
This gives confidence that various ways that customers define changefeeds will
work with the replicator CDC server.

There are now helper methods that help to create changefeeds for a variety
of configurations, along with relevant tests that verify the create changefeed
string is as expected. There are also new helpers added to help gate when the
workload checker runs since there is a minimum version requirement for its
operation. These shared utilities now live within the sinktest package.

Resolves: #906
Release Note: None
@ryanluu12345 ryanluu12345 force-pushed the user/rluu/cdc-integration-test-wl-check branch from 134a8e5 to 1a17894 Compare November 4, 2024 20:23
@ryanluu12345 ryanluu12345 added this pull request to the merge queue Nov 4, 2024
Merged via the queue into master with commit b8af3ae Nov 4, 2024
51 of 52 checks passed
@ryanluu12345 ryanluu12345 deleted the user/rluu/cdc-integration-test-wl-check branch November 4, 2024 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update cdc/server/integration_test.go with workload.Checker
2 participants