= S3 Integration

https://aws.amazon.com/s3/[S3] allows storing files in the cloud.
A Spring Boot starter is provided to auto-configure the various S3 integration related components.

Maven coordinates, using <<index.adoc#bill-of-materials, Spring Cloud AWS BOM>>:

[source,xml]
----
<dependency>
    <groupId>io.awspring.cloud</groupId>
    <artifactId>spring-cloud-aws-starter-s3</artifactId>
</dependency>
----

== Using S3 client

The starter automatically configures and registers an `S3Client` bean in the Spring application context. The `S3Client` bean can be used to perform operations on S3 buckets and objects.

[source,java]
----
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import software.amazon.awssdk.core.ResponseInputStream;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

import org.springframework.stereotype.Component;
import org.springframework.util.StreamUtils;

@Component
class S3ClientSample {

    private final S3Client s3Client;

    S3ClientSample(S3Client s3Client) {
        this.s3Client = s3Client;
    }

    void readFile() throws IOException {
        // try-with-resources closes the response stream once the content has been read
        try (ResponseInputStream<GetObjectResponse> response = s3Client.getObject(
                request -> request.bucket("bucket-name").key("file-name.txt"))) {
            String fileContent = StreamUtils.copyToString(response, StandardCharsets.UTF_8);
            System.out.println(fileContent);
        }
    }
}
----

== Using S3TransferManager and CRT-based S3 Client

AWS https://aws.amazon.com/blogs/developer/introducing-crt-based-s3-client-and-the-s3-transfer-manager-in-the-aws-sdk-for-java-2-x/[launched] a high-level file transfer utility called Transfer Manager, together with a CRT-based S3 client.

The starter automatically configures and registers a `software.amazon.awssdk.transfer.s3.S3TransferManager` bean if the following dependency is added to the project:

[source,xml]
----
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3-transfer-manager</artifactId>
</dependency>
----

Transfer Manager works best with the CRT S3 client. To auto-configure the CRT-based `S3AsyncClient`, add the following dependency to your project:

[source,xml]
----
<dependency>
    <groupId>software.amazon.awssdk.crt</groupId>
    <artifactId>aws-crt</artifactId>
</dependency>
----

When no `S3AsyncClient` bean is created, the default `S3AsyncClient` created through the AWS SDK is used. To benefit from maximum throughput, multipart upload/download, and resumable file upload, consider using the CRT-based `S3AsyncClient`.

== S3 Objects as Spring Resources

https://docs.spring.io/spring/docs/current/spring-framework-reference/html/resources.html[Spring Resources] are an abstraction for a number of low-level resources, such as file system files, classpath files, servlet context-relative files, etc.
Spring Cloud AWS adds a new resource type: an `S3Resource` object.

The Spring Resource Abstraction for S3 allows S3 objects to be accessed by their S3 URL using the `@Value` annotation:

[source,java]
----
@Value("s3://[S3_BUCKET_NAME]/[FILE_NAME]")
private Resource s3Resource;
----

...or the Spring application context:

[source,java]
----
SpringApplication.run(...).getResource("s3://[S3_BUCKET_NAME]/[FILE_NAME]");
----

This creates a `Resource` object that can be used to read the file, among https://docs.spring.io/spring/docs/current/spring-framework-reference/html/resources.html#resources-resource[other possible operations].
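
To make the `s3://[S3_BUCKET_NAME]/[FILE_NAME]` location format concrete, here is a minimal sketch of how such a location can be split into a bucket name and an object key. The class `S3LocationSketch` is hypothetical and written for illustration only; it is not the parsing code used by Spring Cloud AWS, and it assumes that everything after the first `/` following the bucket name is the object key.

```java
/** Hypothetical helper, for illustration only: splits an s3:// location into bucket and key. */
final class S3LocationSketch {

    private static final String PREFIX = "s3://";

    final String bucket;
    final String key;

    private S3LocationSketch(String bucket, String key) {
        this.bucket = bucket;
        this.key = key;
    }

    /** Parses locations of the form s3://bucket/key; the key may itself contain slashes. */
    static S3LocationSketch parse(String location) {
        if (location == null || !location.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not an S3 location: " + location);
        }
        String rest = location.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        // Require a non-empty bucket name and a non-empty object key
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Expected s3://bucket/key but got: " + location);
        }
        return new S3LocationSketch(rest.substring(0, slash), rest.substring(slash + 1));
    }
}
```

With this sketch, `S3LocationSketch.parse("s3://my-bucket/folder/file.txt")` yields the bucket `my-bucket` and the key `folder/file.txt`.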

It is also possible to write to a `Resource`, although a `WritableResource` is required.

[source,java]
----
@Value("s3://[S3_BUCKET_NAME]/[FILE_NAME]")
private Resource s3Resource;
...
try (OutputStream os = ((WritableResource) s3Resource).getOutputStream()) {
    os.write("content".getBytes());
}
----

To work with the `Resource` as an S3 resource, cast it to `io.awspring.cloud.s3.S3Resource`.
Using `S3Resource` directly lets you set the https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html[S3 object metadata].

[source,java]
----
@Value("s3://[S3_BUCKET_NAME]/[FILE_NAME]")
private Resource resource;
...
// Cast to S3Resource to access the S3-specific methods
S3Resource s3Resource = (S3Resource) resource;
ObjectMetadata objectMetadata = ObjectMetadata.builder()
    .contentType("application/json")
    .serverSideEncryption(ServerSideEncryption.AES256)
    .build();
s3Resource.setObjectMetadata(objectMetadata);
try (OutputStream outputStream = s3Resource.getOutputStream()) {
    outputStream.write("content".getBytes(StandardCharsets.UTF_8));
}
----

=== S3 Output Stream

Under the hood, `S3Resource` uses `io.awspring.cloud.s3.InMemoryBufferingS3OutputStream` by default. When data is written to the resource, it gets sent to S3 using multipart upload.
If a network error occurs during upload, `S3Client` has a built-in retry mechanism that will retry each failed part. If the upload fails after retries, the multipart upload gets aborted and `S3Resource` throws `io.awspring.cloud.s3.S3Exception`.
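
The general idea of buffering writes in memory and sending them part by part can be sketched as follows. This is an illustration under stated assumptions, not the code of `InMemoryBufferingS3OutputStream`: the class `PartBufferingOutputStream` and its `PartSink` callback are hypothetical, and the callback stands in for whatever call uploads a single part. Retries and aborting the upload are left out.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Hypothetical sketch: collects written bytes in memory and emits them as fixed-size parts. */
class PartBufferingOutputStream extends OutputStream {

    /** Hypothetical callback standing in for an "upload one part" call. */
    interface PartSink {
        void accept(int partNumber, byte[] part) throws IOException;
    }

    private final byte[] buffer;
    private final PartSink sink;
    private int position = 0;
    private int partNumber = 1; // part numbers start at 1 in this sketch
    private boolean closed = false;

    PartBufferingOutputStream(int partSize, PartSink sink) {
        this.buffer = new byte[partSize];
        this.sink = sink;
    }

    @Override
    public void write(int b) throws IOException {
        if (closed) {
            throw new IOException("Stream is closed");
        }
        buffer[position++] = (byte) b;
        // A full buffer becomes one part, and the buffer is reused for the next one
        if (position == buffer.length) {
            flushPart();
        }
    }

    private void flushPart() throws IOException {
        sink.accept(partNumber++, Arrays.copyOf(buffer, position));
        position = 0;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        // The last part may be smaller than the configured part size
        if (position > 0) {
            flushPart();
        }
    }
}
```

Writing 10 bytes with a part size of 4 hands three parts to the callback: two full parts and a final part of 2 bytes, emitted when the stream is closed.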

If the `InMemoryBufferingS3OutputStream` behavior does not fit your needs, you can use `io.awspring.cloud.s3.DiskBufferingS3OutputStream` by defining a bean of type `DiskBufferingS3OutputStreamProvider`, which overrides the default output stream provider.
With `DiskBufferingS3OutputStream`, data written to the resource is first stored on disk in the operating system's `tmp` directory. Once the stream gets closed, the file gets uploaded with the https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Client.html#putObject-java.util.function.Consumer-java.nio.file.Path-[S3Client#putObject] method.
If a network error occurs during upload, `S3Client` has a built-in retry mechanism. If the upload fails after retries, `S3Resource` throws an `io.awspring.cloud.s3.UploadFailedException` containing the location of the file in the temporary directory of the file system.

[source,java]
----
try (OutputStream outputStream = s3Resource.getOutputStream()) {
    outputStream.write("content".getBytes(StandardCharsets.UTF_8));
} catch (UploadFailedException e) {
    // e.getPath() contains the file location in the temporary folder
}
----
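
The disk-buffering approach described above can be sketched with the JDK alone. The following is a simplified illustration, not the code of `DiskBufferingS3OutputStream`: the names `TempFileBufferingOutputStream`, `Uploader`, and `UploadFailedSketchException` are hypothetical, and the `Uploader` callback stands in for the actual upload call.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical exception that carries the location of the data that could not be uploaded. */
class UploadFailedSketchException extends IOException {

    private final Path path;

    UploadFailedSketchException(Path path, Throwable cause) {
        super("Upload failed, data kept at " + path, cause);
        this.path = path;
    }

    Path getPath() {
        return path;
    }
}

/** Hypothetical sketch: writes to a temporary file and uploads it when the stream is closed. */
class TempFileBufferingOutputStream extends OutputStream {

    /** Hypothetical callback standing in for the call that uploads a file. */
    interface Uploader {
        void upload(Path file) throws IOException;
    }

    private final Path tempFile;
    private final OutputStream delegate;
    private final Uploader uploader;
    private boolean closed = false;

    TempFileBufferingOutputStream(Uploader uploader) throws IOException {
        this.tempFile = Files.createTempFile("s3-upload-", ".tmp");
        this.delegate = Files.newOutputStream(tempFile);
        this.uploader = uploader;
    }

    @Override
    public void write(int b) throws IOException {
        delegate.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        delegate.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        delegate.close();
        try {
            uploader.upload(tempFile);
        } catch (IOException e) {
            // Keep the temporary file so the caller can recover or retry the data
            throw new UploadFailedSketchException(tempFile, e);
        }
        // The temporary file is removed only after a successful upload
        Files.deleteIfExists(tempFile);
    }
}
```

The design choice worth noting is that the temporary file survives a failed upload, which is what allows the exception to point the caller at data that would otherwise be lost.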

If you are using the `S3TransferManager`, the default implementation will switch to `io.awspring.cloud.s3.TransferManagerS3OutputStream`. This output stream also writes the data to a temporary file on disk before uploading it to S3, but it may be faster because it uses multipart upload under the hood.

=== Searching resources

The Spring resource loader also supports collecting resources based on an Ant-style path specification. Spring Cloud AWS
offers the same support to resolve resources within a bucket and even across buckets. To search S3 buckets, the actual
resource loader needs to be wrapped with the Spring Cloud AWS one; for locations that are not S3 buckets, the resource loader
falls back to the original one. The next example shows resource resolution using different patterns.

[source,java,indent=0]
----
import java.io.IOException;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.core.io.support.ResourcePatternResolver;
import org.springframework.core.io.Resource;

import io.awspring.cloud.s3.S3PathMatchingResourcePatternResolver;

import software.amazon.awssdk.services.s3.S3Client;

public class SimpleResourceLoadingBean {

    // Not final: the resolver is assigned in the @Autowired method below
    private ResourcePatternResolver resourcePatternResolver;

    @Autowired
    public void setupResolver(S3Client s3Client, ApplicationContext applicationContext) {
        this.resourcePatternResolver = new S3PathMatchingResourcePatternResolver(s3Client, applicationContext);
    }

    public void resolveAndLoad() throws IOException {
        Resource[] allTxtFilesInFolder = this.resourcePatternResolver.getResources("s3://bucket/name/*.txt");
        Resource[] allTxtFilesInBucket = this.resourcePatternResolver.getResources("s3://bucket/**/*.txt");
        Resource[] allTxtFilesGlobally = this.resourcePatternResolver.getResources("s3://**/*.txt");
    }
}
----

[WARNING]
====
Resolving resources throughout all buckets can be very time consuming depending on the number of buckets a user owns.
====
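
To illustrate how the patterns above differ, here is a minimal sketch of Ant-style matching applied to object keys. It is a simplified approximation written for this example, not the matcher used by Spring or Spring Cloud AWS: the class `AntPatternSketch` is hypothetical, and it assumes that `*` and `?` stay within one path segment while `**` may span several.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: translates a simplified Ant-style pattern into a regular expression. */
final class AntPatternSketch {

    private AntPatternSketch() {
    }

    static Pattern toRegex(String antPattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < antPattern.length()) {
            char c = antPattern.charAt(i);
            if (antPattern.startsWith("**/", i)) {
                // Zero or more complete path segments
                regex.append("(?:.*/)?");
                i += 3;
            } else if (antPattern.startsWith("**", i)) {
                // Anything, including path separators
                regex.append(".*");
                i += 2;
            } else if (c == '*') {
                // Any run of characters inside a single path segment
                regex.append("[^/]*");
                i++;
            } else if (c == '?') {
                // Exactly one character inside a path segment
                regex.append("[^/]");
                i++;
            } else {
                // Every other character is matched literally
                regex.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns true when the whole object key matches the pattern. */
    static boolean matches(String antPattern, String key) {
        return toRegex(antPattern).matcher(key).matches();
    }
}
```

Under these assumptions, `name/*.txt` matches the key `name/a.txt` but not `name/sub/a.txt`, whereas `**/*.txt` matches both `a.txt` and `name/sub/a.txt`.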

== Using S3Template

Spring Cloud AWS provides a higher-level abstraction on top of `S3Client`, with methods for the most common use cases when working with S3.

In addition to self-explanatory methods for creating and deleting buckets, `S3Template` provides simple methods for uploading and downloading files:

[source,java]
----
@Autowired
private S3Template s3Template;

InputStream is = ...
// uploading file without metadata
s3Template.upload(BUCKET, "file.txt", is);

// uploading file with metadata
s3Template.upload(BUCKET, "file.txt", is, ObjectMetadata.builder().contentType("text/plain").build());
----

Another feature of `S3Template` is the ability to generate signed URLs for getting/putting S3 objects in a single method call.

[source,java]
----
URL signedGetUrl = s3Template.createSignedGetUrl("bucket_name", "file.txt", Duration.ofMinutes(5));
----

`S3Template` also allows storing and retrieving Java objects.

[source,java]
----
Person p = new Person("John", "Doe");
s3Template.store(BUCKET, "person.json", p);

Person loadedPerson = s3Template.read(BUCKET, "person.json", Person.class);
----

By default, if Jackson is on the classpath, `S3Template` uses the `ObjectMapper`-based `Jackson2JsonS3ObjectConverter` to convert from S3 object to Java object and vice versa.
This behavior can be overridden by providing a custom bean of type `S3ObjectConverter`.

== Determining S3 Objects Content Type

All S3 objects stored in S3 through `S3Template`, `S3Resource` or `S3OutputStream` automatically get the `contentType` property set on the S3 object metadata, based on the S3 object key (file name).

By default, `PropertiesS3ObjectContentTypeResolver`, a component supporting over 800 file extensions, is responsible for content type resolution.
If this content type resolution does not meet your needs, you can provide a custom bean of type `S3ObjectContentTypeResolver`, which will automatically be used in all components responsible for uploading files.
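
As a rough illustration of resolving a content type from an object key, the sketch below looks up the file extension in a small table. It is not the implementation of `PropertiesS3ObjectContentTypeResolver` and does not implement `S3ObjectContentTypeResolver`: the class `ExtensionContentTypeSketch`, its three-entry table, and the choice to return an empty `Optional` for unknown extensions are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: resolves a content type from the extension of an S3 object key. */
final class ExtensionContentTypeSketch {

    // Tiny illustrative table; a real resolver would cover many more extensions
    private static final Map<String, String> TYPES = new HashMap<String, String>();

    static {
        TYPES.put("txt", "text/plain");
        TYPES.put("json", "application/json");
        TYPES.put("png", "image/png");
    }

    private ExtensionContentTypeSketch() {
    }

    static Optional<String> resolve(String key) {
        // Only the last path segment of the key is treated as the file name
        String fileName = key.substring(key.lastIndexOf('/') + 1);
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(TYPES.get(extension));
    }
}
```

For example, `ExtensionContentTypeSketch.resolve("reports/2024/data.JSON")` returns `application/json`, while a key without an extension, such as `README`, yields an empty result.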

== Configuration

The Spring Boot Starter for S3 provides the following configuration options:

[cols="2,3,1,1"]
|===
| Name | Description | Required | Default value
| `spring.cloud.aws.s3.enabled` | Enables the S3 integration. | No | `true`
| `spring.cloud.aws.s3.endpoint` | Configures endpoint used by `S3Client`. | No | `http://localhost:4566`
| `spring.cloud.aws.s3.region` | Configures region used by `S3Client`. | No | `eu-west-1`
| `spring.cloud.aws.s3.accelerate-mode-enabled` | Option to enable using the accelerate endpoint when accessing S3. Accelerate endpoints allow faster transfer of objects by using Amazon CloudFront's globally distributed edge locations. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.checksum-validation-enabled` | Option to disable doing a validation of the checksum of an object stored in S3. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.chunked-encoding-enabled` | Option to enable using chunked encoding when signing the request payload for `PutObjectRequest` and `UploadPartRequest`. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.path-style-access-enabled` | Option to enable using path style access for accessing S3 objects instead of DNS style access. DNS style access is preferred as it will result in better load balancing when accessing S3. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.use-arn-region-enabled` | If an S3 resource ARN is passed in as the target of an S3 operation that has a different region to the one the client was configured with, this flag must be set to 'true' to permit the client to make a cross-region call to the region specified in the ARN; otherwise an exception will be thrown. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.crt.minimum-part-size-in-bytes` | Sets the minimum part size for transfer parts. Decreasing the minimum part size causes multipart transfer to be split into a larger number of smaller parts. Setting this value too low has a negative effect on transfer speeds, causing extra latency and network communication for each part. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.crt.initial-read-buffer-size-in-bytes` | Configure the starting buffer size the client will use to buffer the parts downloaded from S3. Maintain a larger window to keep up a high download throughput; parts cannot download in parallel unless the window is large enough to hold multiple parts. Maintain a smaller window to limit the amount of data buffered in memory. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.crt.target-throughput-in-gbps` | The target throughput for transfer requests. Higher value means more S3 connections will be opened. Whether the transfer manager can achieve the configured target throughput depends on various factors such as the network bandwidth of the environment and the configured `max-concurrency`. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.crt.max-concurrency` | Specifies the maximum number of S3 connections that should be established during transfer. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.transfer-manager.max-depth` | Specifies the maximum number of levels of directories to visit in `S3TransferManager#uploadDirectory` operation. | No | `null` (falls back to SDK default)
| `spring.cloud.aws.s3.transfer-manager.follow-symbolic-links` | Specifies whether to follow symbolic links when traversing the file tree in `S3TransferManager#uploadDirectory` operation. | No | `null` (falls back to SDK default)
|===

== IAM Permissions

The following IAM permissions are required by Spring Cloud AWS:

[cols="2,1"]
|===
| Downloading files | `s3:GetObject`
| Searching files | `s3:ListBucket`
| Uploading files | `s3:PutObject`
|===

Sample IAM policy granting access to the `spring-cloud-aws-demo` bucket:

[source,json,indent=0]
----
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::spring-cloud-aws-demo"
        },
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::spring-cloud-aws-demo/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::spring-cloud-aws-demo/*"
        }
    ]
}
----