S3Client#getObject response input stream contains more than it should #5819

Open
1 task done
mihohren opened this issue Jan 23, 2025 · 2 comments
Labels
bug This issue is a bug. p1 This is a high priority issue potential-regression Marking this issue as a potential regression to be checked by team member

mihohren commented Jan 23, 2025

Describe the bug

When switching from v2.29.x to v2.30.x, I noticed that the ResponseInputStream returned by S3Client#getObject contains more than it should. The running example is an object containing abc.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

S3Client#getObject#readAllBytes contains only

abc

Current Behavior

S3Client#getObject#readAllBytes contains

3
abc
0
x-amz-checksum-crc32:NSRBwg==
  

Reproduction Steps

Download an existing object through S3Client#getObject.

Possible Solution

Revert to 2.29.x

Additional Information/Context

Discovered through test in LocalStack testcontainer localstack/localstack:0.11.3

AWS Java SDK version used

v2.30.3 (bug confirmed for v2.30.0 - v2.30.3)

JDK version used

openjdk version "21.0.5" 2024-10-15 OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu122.04) OpenJDK 64-Bit Server VM (build 21.0.5+11-Ubuntu-1ubuntu122.04, mixed mode, sharing)

Operating System and version

Linux 6.8.0-51-generic #52~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Dec 9 15:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

bhoradc commented Jan 23, 2025

Hello @mihohren,

Thank you for reporting the issue.

The extra content in the S3Client#putObject request (including checksum information such as x-amz-checksum-crc32:NSRBwg==) seems to be related to the changes introduced in AWS SDK for Java v2.30.0; see the related Discussion Announcement.
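As a quick sanity check that the trailer really is a checksum of the object body, the value can be recomputed locally. The sketch below is not SDK code; the class and method names are made up for illustration, and it assumes the trailer value is the Base64 encoding of the big-endian CRC32 of the payload.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only (not part of the AWS SDK).
public class Crc32TrailerCheck {

    /** Returns the Base64 encoding of the big-endian CRC32 of the given bytes. */
    static String crc32Base64(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        // CRC32#getValue returns the checksum in the low 32 bits of a long;
        // ByteBuffer writes ints in big-endian order by default.
        byte[] bigEndian = ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
        return Base64.getEncoder().encodeToString(bigEndian);
    }

    public static void main(String[] args) {
        String encoded = crc32Base64("abc".getBytes(StandardCharsets.US_ASCII));
        // prints: x-amz-checksum-crc32:NSRBwg==
        System.out.println("x-amz-checksum-crc32:" + encoded);
    }
}
```

The result matches the trailer value reported above for the object containing abc.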

As mentioned in the linked announcement, if you need to disable this behavior (especially when working with third-party services like LocalStack that might not support this feature yet), you can do so by setting requestChecksumCalculation and responseChecksumValidation to WHEN_REQUIRED on the client, or by using the related AWS shared config file settings or environment variables.
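For setups where changing the client builder is inconvenient, a rough sketch of the non-builder route follows. The JVM system property names, environment variable names, and config keys used here are assumptions that should be checked against the linked announcement; the class name is hypothetical. The properties would need to be set before the S3 client is built.

```java
// Hypothetical sketch; setting names are assumptions to verify against the announcement.
public class ChecksumSettingsSketch {

    // Assumed equivalents outside the JVM:
    //   environment variables: AWS_REQUEST_CHECKSUM_CALCULATION, AWS_RESPONSE_CHECKSUM_VALIDATION
    //   shared config file:    request_checksum_calculation, response_checksum_validation
    static void useWhenRequiredChecksums() {
        // Assumed JVM system property names read by the SDK at client creation time.
        System.setProperty("aws.requestChecksumCalculation", "WHEN_REQUIRED");
        System.setProperty("aws.responseChecksumValidation", "WHEN_REQUIRED");
    }

    public static void main(String[] args) {
        useWhenRequiredChecksums();
        System.out.println(System.getProperty("aws.requestChecksumCalculation"));
        System.out.println(System.getProperty("aws.responseChecksumValidation"));
    }
}
```

Setting the values on the builder, as in the sample in this comment, keeps the choice visible in code; the property route may be handier for test harnesses such as a LocalStack testcontainer.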

For instance, the sample below works for both LocalStack and AWS S3:

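        import java.net.URI;
        import software.amazon.awssdk.core.checksums.RequestChecksumCalculation;
        import software.amazon.awssdk.core.checksums.ResponseChecksumValidation;
        import software.amazon.awssdk.services.s3.S3Client;
        import software.amazon.awssdk.services.s3.S3Configuration;

        // Calculate and validate checksums only when the operation requires them.
        S3Client s3Client = S3Client.builder()
                .endpointOverride(URI.create("http://localhost:4566"))
                .serviceConfiguration(S3Configuration.builder()
                        .pathStyleAccessEnabled(true)
                        .build())
                .responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED)
                .requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED)
                .build();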

With the following snippet, contentString contains just abc:

        try (ResponseInputStream<GetObjectResponse> responseGet = s3Client.getObject(getObjectRequest)) {
            // Read the whole body and decode it with an explicit charset
            // (java.nio.charset.StandardCharsets).
            byte[] content = responseGet.readAllBytes();
            String contentString = new String(content, StandardCharsets.UTF_8);
            System.out.println("Full response content:");
            System.out.println(contentString);
        }

Also, I don't see the additional data below for the getObject call. Please share the exact code sample you are using to reproduce this behavior with the S3 getObject API.

3 - chunk size
abc - actual content
0 - marking the end of chunks
x-amz-checksum-crc32:NSRBwg== - trailer
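To make the framing above concrete, here is a minimal sketch that pulls the payload out of such a body. It is not the SDK's or LocalStack's implementation; the class and method names are hypothetical, and it assumes CRLF line endings and a hexadecimal chunk size, optionally followed by `;`-separated extensions, as in HTTP chunked encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder, for illustration only.
public class ChunkedBodySketch {

    /** Returns the concatenated chunk data, ignoring any trailers after the final chunk. */
    static byte[] decodePayload(byte[] body) {
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        int pos = 0;
        while (pos < body.length) {
            int lineEnd = indexOfCrlf(body, pos);
            if (lineEnd < 0) {
                throw new IllegalArgumentException("missing CRLF after chunk size");
            }
            String sizeLine = new String(body, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            // Drop chunk extensions such as ";chunk-signature=..." if present.
            int semicolon = sizeLine.indexOf(';');
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            pos = lineEnd + 2; // skip the size line and its CRLF
            if (size == 0) {
                break; // final chunk; only trailers (e.g. the checksum) follow
            }
            if (pos + size > body.length) {
                throw new IllegalArgumentException("truncated chunk");
            }
            payload.write(body, pos, size);
            pos += size + 2; // skip the chunk data and its trailing CRLF
        }
        return payload.toByteArray();
    }

    /** Index of the next CRLF at or after 'from', or -1 if there is none. */
    static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String raw = "3\r\nabc\r\n0\r\nx-amz-checksum-crc32:NSRBwg==\r\n\r\n";
        byte[] payload = decodePayload(raw.getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(payload, StandardCharsets.US_ASCII)); // prints: abc
    }
}
```

A server that stores the request body without undoing this framing would hand back the size line, the terminating 0, and the trailer along with the content, which is consistent with the output reported in this issue.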

Regards,
Chaitanya

mihohren (Author) commented

@bhoradc Thank you for the quick response. Adding WHEN_REQUIRED to both requestChecksumCalculation and responseChecksumValidation seems to have fixed the problem.
