-
Notifications
You must be signed in to change notification settings - Fork 120
/
Copy pathcc-blobstore.html.md.erb
173 lines (125 loc) · 9.86 KB
/
cc-blobstore.html.md.erb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
---
title: Cloud Controller blobstore
owner: CAPI
---
To stage and run apps, <%= vars.app_runtime_abbr %> manages and stores the following types of
binary large object (blob) files:
<table><thead>
<tr>
<th>Blob Type</th>
<th>Description</th>
<th>Location in Blobstore</th>
</tr></thead>
<tbody><tr>
<td>App Packages</td>
<td>Full contents of app directories, including source code and resource files, zipped into single blob files.</td>
<td><code>/cc-packages</code></td>
</tr>
<tr>
<td>Buildpacks</td>
<td>Buildpack directories, which Diego cells download to compile and stage apps with.</td>
<td><code>/cc-buildpacks</code></td>
</tr>
<tr>
<td>Resource Cache</td>
<td>Large files from app packages that the Cloud Controller stores with a SHA for later re-use. To save bandwidth, the <%= vars.app_runtime_abbr %> Command Line Interface (cf CLI) only uploads large application files that the Cloud Controller has not already stored in the resource cache.</td>
<td><code>/cc-resources</code></td>
</tr>
<tr>
<td>Buildpack Cache</td>
<td>Large files that buildpacks generate during staging, stored for later re-use. This cache lets buildpacks run more quickly when staging apps that have been staged previously.</td>
<td><code>cc-droplets/buildpack_cache</code></td>
</tr>
<tr>
<td>Droplets</td>
<td>Staged apps packaged with everything needed to run in a container.</td>
<td><code>/cc-droplets</code></td>
</tr></tbody>
</table>
<%= vars.app_runtime_abbr %> blobstores use the [Fog](http://fog.io/) Ruby gem to store blobs in
services like Amazon S3, WebDAV, or the NFS filesystem. The file system location of an internal blobstore is `/var/vcap/store/shared`.
A single blobstore typically stores all five types of blobs, but you can configure the Cloud Controller to use separate blobstores for each type.
This topic references staging and treats all blobstores as generic object stores.
For more information about staging, see [How Apps Are Staged](../concepts/how-applications-are-staged.html).
<% if vars.platform_code == "CF" || vars.platform_code == "PCF" %>
For more information about how specific third-party blobstores can be configured, see <%= vars.blobstore_link %>.
<% end %>
## <a id='stage-blobstore'></a> Staging using the blobstore
This section describes how staging buildpack apps uses the blobstore.
The following diagram illustrates how the staging process uses the blobstore.
To walk through the same diagram in an app staging context, see [How Diego Stages Buildpack Apps](./how-applications-are-staged.html).
The staging process for buildpack apps includes a developer and the following components: CF Command Line, Cloud Controller (CCNG), Blobstore, cc-uploader, Diego Cell (Staging), and Diego Cell (Running).
* Step 1: cf push from Developer to CF CLI.
* Step 2: Checksum source files from Developer to CF CLI.
* Step 3: Resource Match from CF CLI to the CCNG.
* Step 4: Check file existence from the CCNG to Blobstore.
* Step 5: Upload unmatched files from CF CLI to CCNG.
* Step 6: Download cached files from Blobstore to CCNG.
* Step 7: Upload complete package from CCNG to Blobstore.
* Step 8: Download package and buildpack from Blobstore to Diego Cell (Staging).
* Step 9: Upload droplet from Diego Cell (Staging) through cc-uploader, then CCNG, to the Blobstore.
* Step 10: Download droplet from Blobstore to CCNG.
![Full description of this diagram is in the text.](./images/cc-blobstore-staging.png)
[//]: https://docs.google.com/drawings/d/1LbM5mRjJMOfQXVvJXmDb-csfV61kHaPJaCJy20WVqMI/edit?usp=sharing
The process in which the staging process uses the blobstore is as follows:
1. **cf push:** A developer runs `cf push`.
1. **Create app:** The Cloud Foundry Command Line Interface (cf CLI) gathers local source code files and computes a checksum of each.
1. **Store app metadata:** The cf CLI makes a `resource_matches` request, which matches resources to Cloud Controller. The request lists file names and their checksums. For more information and an example API request, see [Resource Matches](http://v3-apidocs.cloudfoundry.org/version/3.74.0/index.html#resource-matches) in the <%= vars.app_runtime_abbr %> API documentation.
1. **Check file existence** includes the following:
<ol type="a">
<li>The Cloud Controller makes a series of `HEAD` requests to the blobstore to find out which files it has cached.</li>
<li>Cloud Controller content-addresses its cached files so that changes to a file result in it being stored as a different object.</li>
<li>Cloud Controller computes which files it has and which it needs the cf CLI to upload. This process can take a long time.
<li>In response to the resource match request, Cloud Controller lists the files the cf CLI needs to upload.</li>
</ol>
1. **Upload unmatched files:** The cf CLI compresses and uploads the unmatched files to Cloud Controller.
1. **Download cached files:** Cloud Controller downloads, to its local disk, the matched files that are cached in the blobstore.
1. **Upload complete package** includes the following:
<ol type="a">
<li>Cloud Controller compresses the newly uploaded files with the downloaded cached files in a ZIP file.</li>
<li>Cloud Controller uploads the complete package to the blobstore.</li>
</ol>
1. **Download package & buildpack(s):** A Diego Cell downloads the package and its buildpacks into a container and stages the app.
1. **Upload droplet** includes the following:
<ol type="a">
<li>After the app has been staged, the Diego Cell uploads the complete droplet to `cc-uploader`.</li>
<li>`cc-uploader` makes a multi-part upload request to upload the droplet to Cloud Controller.</li>
<li>Cloud Controller enqueues an asynchronous job to upload to the blobstore.</li>
</ol>
1. **Download droplet** includes the following:
<ol type="a">
<li>A Diego Cell attempts to download the droplet from Cloud Controller into the app container.</li>
<li>Cloud Controller asks the blobstore for a signed URL.</li>
<li>Cloud Controller redirects the Diego Cell droplet download request to the blobstore.</li>
<li>A Diego Cell downloads the app droplet from the blobstore and runs it.</li>
</ol>
## <a id='blobstore-cleanup'></a> Blobstore Cleanup
### <a id='blobstore-expiry'></a> How Cloud Controller reaps expired packages, droplets, and buildpacks
As new droplets and packages are created, the oldest ones associated with an app are marked as `EXPIRED` if they exceed the configured limits for packages and droplets stored per app.
Each night, starting at midnight, Cloud Controller runs a series of jobs to delete the data associated with expired packages, droplets, and buildpacks.
Enabling the native versioning feature on your blobstore increases the number of resources stored and costs. For more information, see [Using Versioning](https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html) in the AWS documentation.
### <a id='automatic-clean'></a> Automatic Orphaned Blob Cleanup
Orphaned Blobs are blobs that are stored in the blobstore, but Cloud Controller does not list in its database. These are distinct from [Expired Blobs](#blobstore-expiry), which Cloud Controller can still link to a particular package, droplet, or buildpack.
Orphaned Blobs are typically created after a blob deletion fails silently or something else goes wrong.
Cloud Controller detects and removes Orphaned Blobs by scanning part of the blobstore daily and checking for any blobs that its database does not account for. The process scans through the entire blobstore every week, and only removes blobs that show as orphans for three consecutive days.
Cloud Controller performs this automatic cleanup when the `cloud_controller_worker` job property `cc.perform_blob_cleanup` is set to `true`.
<p class="note">
<span class="note__title"><strong>Note</strong></span>
Two instances of Cloud Foundry should <strong>not</strong> use the same blobstore buckets, as the uploaded blobs will be marked as 'orphaned' by the other Cloud Foundry instance and deleted.</p>
### <a id='manual-clean'></a>Manual blob cleanup
Cloud Controller does not track resource cache and buildpack cache blob-types in its database, so it does not [reap them automatically](#blobstore-expiry) as it does with app package, buildpack, and droplet type blobs.
To clean up the buildpack cache, admin users can run `cf curl -X DELETE /v2/blobstores/buildpack_cache`. This empties the buildpack cache completely, which is a safe operation.
To clean up the resource cache, delete it as follows:
* **Internal blobstore**: Run `bosh ssh` to connect to the blobstore VM (NFS or WebDav) and `rm *` the contents of the `/var/vcap/store/shared/cc-resources` directory.
* **External blobstore**: Use the file store's API to delete the contents of the `resources` bucket.
Do not manually delete app package, buildpack, or droplet blobs from the blobstore. To free up resources from those locations, run `cf delete-buildpack` for buildpacks or `cf delete` for app packages and droplets.
## <a id='blobstore-expiry'></a> Blobstore load
The load that Cloud Controller generates on its blobstore is unique due to resource matching.
Many blobstores that perform well under normal read, write, and delete loada are overwhelmed by
Cloud Controller's heavy use of HEAD requests during resource matching.
Pushing an app with large number of files causes Cloud Controller to check the blobstore for the existence of each file.
Parallel BOSH deployments of Diego Cells can also generate significant read load on the Cloud Controller blobstore as the cells perform evacuation. For more information, see the [Evacuation](https://docs.cloudfoundry.org/devguide/deploy-apps/app-lifecycle.html#evacuation) section of the _App Container Lifecycle_ topic.
## <a id='blobstore-timeouts'></a> Blobstore interaction timeouts
Cloud Controller inherits default blobstore operation timeouts from Excon.
Excon defaults to 60 second read, write, and connect timeouts.
For more information, see the [excon](https://github.com/excon/excon) repository on GitHub.