Cloudflare Git LFS Worker
Cloudflare Workers
A Cloudflare Worker is a serverless application executed in a CDN-style approach: as close as possible to the system making the request. It is an application that runs in isolation whenever a request is received: it handles the request and, once it has returned a response, it is terminated.
This means there are limited capabilities for sharing data between two different requests. There are several solutions to achieve persistence:
- Databases
- Key/Value stores (Cloudflare has its own)
- Durable Objects (Cloudflare concept)
Ideally though, a Worker should be as light as possible and only use persistence mechanisms if there really is a need.
Requirements
A Worker that serves as a Git LFS endpoint/service should have the following capabilities/requirements, listed roughly in order of priority.
- Implement the Git LFS Batch API, with the following actions:
- upload
- download
- verify
- Implement token-based authentication, with a minimum of two unique entities:
- read-only
- read-write
- Implement the basic transfer adapter
- Use an R2 bucket for storage
- Support any type of file and file size
- Return proper response codes as per Git LFS API design
- Limit the amount of time a link to upload, download or verify is valid
- Must be configurable
- (Optional) Implement Locking API
- If not implemented, it should return 404
Git LFS API
Below are some relevant details about the Git LFS API that needs to be implemented, and how this affects the Worker that will be implemented.
For full details, consult the Git LFS API documentation
Overview
In general the Git LFS API is designed in such a way that it doesn't handle the files itself. It has a Batch API component, which receives a batch of requests for operations/actions on a file, for which it returns a URL for the operation/action to be executed. It also has a Locking API, which will put a lock on a file so no other process can manipulate that file.
By separating the Batch API from the actual file operations, the LFS API server can be an entity that only generates URLs pointing to a secondary service (for instance the R2 API), so the LFS API server does not have to handle the file manipulation itself.
All Batch API requests are executed as HTTP POST requests, and the Batch API is always JSON with a special Content-Type: `application/vnd.git-lfs+json`.
Batch API
The Batch API is used by clients to indicate which operations they want to execute on files in the Git LFS storage. The only supported operations are download and upload. The Batch API does not perform these operations itself; instead it returns a set of actions (URLs) to use to execute the requested operations.
The Batch API receives requests in a format that has the following components:
- `operation` - Type of operation the client wants to take on the objects (`download` or `upload`)
- `transfers` - Client side supported transfer types, typically just `[ "basic" ]`
- `ref`
  - `name` - A git reference to which the objects belong (example: `refs/heads/main`)
- `objects` - List of objects the client wants to perform the `operation` on
  - `oid` - Unique object ID, which is a unique hash computed with the `hash_algo`
  - `size` - Size of the object
- `hash_algo` - Hashing algorithm for determining the `oid`, default is `sha256`

Note on `transfers` - Officially only `basic` is supported, but there are some experimental ones.
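As an illustration, a minimal `download` request for a single object could look like the following sketch; the repository, `oid` and `size` values are made up for the example:

```typescript
// Illustrative Batch API request body for a single-object download.
// Sent as POST <server>/<repository>/objects/batch
// with Content-Type: application/vnd.git-lfs+json
const exampleBatchRequest = {
  operation: "download",
  transfers: ["basic"],
  ref: { name: "refs/heads/main" },
  objects: [
    {
      oid: "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
      size: 1048576,
    },
  ],
  hash_algo: "sha256",
};
```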
The response to a Batch API request contains the following components:
- `transfer` - Server side preferred transfer type (typically `basic`)
- `objects` - List of the objects to execute the operation on; should be the same set of objects as in the request
  - `oid` - Unique object ID, same as in the request
  - `size` - Object size in bytes, same as in the request
  - `authenticated` - Indicates if the actions to take are authenticated
  - `actions` - Object with the actions to take to meet the requested operation. Each entry in this object references either `download`, `upload` or `verify`
    - `href` - URL to use for the action
    - `header` - Any special headers that need to be provided with the request, for instance an authentication header
    - `expires_in` - Indicates for how long the `href` and `header` will be valid (in seconds)
    - `expires_at` - Exact time the `href` and `header` expire
- `hash_algo` - Similar to the request, defaults to `sha256`
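For reference, a response to an `upload` request could look roughly as follows; the URLs, token and expiry values are purely illustrative:

```typescript
// Illustrative Batch API response for an upload operation, including a verify action.
const exampleBatchResponse = {
  transfer: "basic",
  objects: [
    {
      oid: "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
      size: 1048576,
      authenticated: true,
      actions: {
        upload: {
          href: "https://<account>.r2.cloudflarestorage.com/<bucket>/_GLOBAL_/aircraft.git/9f86d0...?X-Amz-Signature=...",
          expires_in: 3600,
        },
        verify: {
          href: "https://<worker-domain>/aircraft.git/verify",
          header: { "X-FBW-GITLFS-TOKEN": "<generated token>" },
          expires_in: 3600,
        },
      },
    },
  ],
  hash_algo: "sha256",
};
```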
Operations vs Actions
Operations that are requested can be `download` or `upload`, while the `actions` returned can be `download`, `upload` and `verify`.

For the `download` operation, the only action returned has to be `download`. This tells the LFS client to do an HTTP GET request to the `href` provided.

For the `upload` operation, the minimum action returned has to be `upload`, which results in an HTTP PUT request to the `href`. Optionally, a `verify` action can be returned in addition to the `upload` action. The `verify` action results in an HTTP POST request to the `href`.

The `verify` action can be used to verify that the `upload` action has succeeded, by checking that the `oid` and `size` are what the client expects them to be.
Errors
Several errors can occur; the full list of response codes can be found in the Git LFS API documentation. In general, an error will always return a JSON object with the following properties:

- `message` - Human readable error message
- `documentation_url` - Optional URL for more information
- `request_id` - Optional request identifier from the client
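An error body could therefore look like this; the values are illustrative:

```typescript
// Illustrative Git LFS error response body.
const exampleError = {
  message: "Object does not exist",
  documentation_url: "https://<worker-domain>/docs/errors",
  request_id: "a1b2c3",
};
```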
Locking API
The File Locking API is used to create, list, and delete locks, as well as verify that locks are respected in Git pushes.
Overall this is used to prevent multiple pushes modifying the same file at the same time.
This API requires persistence of state across requests and will initially not be implemented.
Git LFS Worker Design
High Level
To meet the requirements the following high level items will be implemented:
- Authentication using 3 unique configurable tokens:
- read
- write
- verify
- Batch API will be implemented to support
- download
- upload
- verify
- Downloads and uploads will go directly to the R2 bucket; file uploads/downloads will never pass through the Worker (Cloudflare has a 100 MB per-request limit for Workers)
- The Batch API will use the R2 API to generate unique secure URLs for downloads and uploads which are only valid for a configurable duration
- The Batch API will generate verify URLs with a unique token that is only valid for a configurable duration
- The Worker will manage verify API calls itself and handle them as follows:
- Check the token is valid (both authentication wise as well as expiration)
- Verify the size in the request matches with the size as seen in the R2 bucket
- Return a 404 error for the Locks API, as per Locks API specifications if not implemented
- Return proper response codes for all situations as highlighted by the Git LFS API documentation
Configurable Settings
The following settings can be configured through environment variables for the Worker:
- `DOWNLOAD_KEY` - A unique key/token that must be provided by the client to be able to request the `download` operation of the Batch API
- `UPLOAD_KEY` - A unique key/token that must be provided by the client to be able to request the `upload` operation of the Batch API
- `VERIFY_KEY` - A unique key/token that is used to generate a token with an expiration time based on the `VALIDITY_DURATION`, which the client needs to use for every `verify` action
- `VALIDITY_DURATION` - A value in seconds that defines how long a `verify` token is valid and, similarly, how long a `download` or `upload` URL for the R2 bucket is valid
- `R2_BUCKET_URL` - The URL of the R2 Bucket
- `R2_BUCKET_NAME` - The name of the R2 Bucket
- `R2_BUCKET_KEY_ID` - The unique key ID for the key that can manage the R2 Bucket
- `R2_BUCKET_KEY_SECRET` - The unique key secret for the key that can manage the R2 Bucket
- `bucket_name` - Needed for the Worker system to know which bucket to allow access to
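In TypeScript terms, the Worker environment could be typed roughly as follows; the interface name is an assumption for illustration:

```typescript
// Sketch of the Worker environment bindings, assuming a wrangler.toml that defines
// these vars/secrets and an R2 bucket binding named `bucket_name`.
export interface Env {
  DOWNLOAD_KEY: string;
  UPLOAD_KEY: string;
  VERIFY_KEY: string;
  VALIDITY_DURATION: string; // seconds, provided as an environment variable string
  R2_BUCKET_URL: string;
  R2_BUCKET_NAME: string;
  R2_BUCKET_KEY_ID: string;
  R2_BUCKET_KEY_SECRET: string;
  bucket_name: R2Bucket; // R2 bucket binding, used for the verify `head` lookup
}
```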
Security
The solution will allow uploading files into an R2 Bucket, which means it needs to be protected from external uploads that are not meant for Git LFS. This means security needs to be implemented at multiple levels:
- Batch API - Authenticate and authorize requests for operations, both download and upload
- Verify API - While not a dangerous API, it is best to secure all access
- R2 Bucket access - Generate URLs that are valid for a short period, to make sure the URLs can't be reused or abused
Security Configuration
Security can be configured through the following settings:
- `DOWNLOAD_KEY`
- `UPLOAD_KEY`
- `VERIFY_KEY`
- `VALIDITY_DURATION`
In the initial implementation, the same keys will be used by all users. A future enhancement could be to have a unique token per user.
It is advised that the keys are 128 characters or more and consist of alphanumeric characters.
Bucket API Authentication
Every request to the Bucket API must have an `Authorization` header. This `Authorization` header must match the `Basic <base64 encoded username:token>` format. A username has been added for potential future enhancements where each user has their own unique token.

As an example, if the user `john` wants to make a request to download files, and the configured `DOWNLOAD_KEY` equals `Sae9phua8ieghahK9aeH`, the `Authorization` header would look like:

`Authorization: Basic am9objpTYWU5cGh1YThpZWdoYWhLOWFlSA==`

If the request uses the `UPLOAD_KEY`, it can request both the download and upload operations. If the request uses the `DOWNLOAD_KEY`, only the download operation will be allowed.
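A minimal sketch of how such a header could be checked; the function and type names are illustrative, not the actual implementation:

```typescript
// Sketch: decode "Basic <base64 username:token>" and map the token to an access level.
type Access = "none" | "read" | "write";

function checkAuthorization(
  header: string | null,
  env: { DOWNLOAD_KEY: string; UPLOAD_KEY: string },
): Access {
  if (!header || !header.startsWith("Basic ")) return "none";
  let decoded: string;
  try {
    decoded = atob(header.slice("Basic ".length));
  } catch {
    return "none";
  }
  // Everything after the first ":" is the token; the username is currently ignored.
  const token = decoded.slice(decoded.indexOf(":") + 1);
  if (token === env.UPLOAD_KEY) return "write"; // read/write: upload and download
  if (token === env.DOWNLOAD_KEY) return "read"; // read-only: download
  return "none";
}
```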
R2 URL Security
The R2 Bucket API of Cloudflare is the same API as AWS uses for their S3 buckets, so the same tools can be used, as per their documentation.
Using `aws4fetch`, the Bucket API handler will use the `AwsClient` `sign` method to have the R2 Bucket API securely sign an R2 Bucket HTTP GET (`download` operation) or HTTP PUT (`upload` operation) request. This request will also have a max validity period, equal to `VALIDITY_DURATION`.
This secures downloading and uploading files directly from/to the R2 bucket and prevents unauthenticated access to the data.
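A minimal sketch of generating such a presigned URL with `aws4fetch`, assuming the configuration settings described earlier; the exact code in the Worker may differ:

```typescript
import { AwsClient } from "aws4fetch";

// Sketch: presign an R2 GET (download) or PUT (upload) URL valid for VALIDITY_DURATION seconds.
async function signR2Url(
  env: {
    R2_BUCKET_URL: string;
    R2_BUCKET_NAME: string;
    R2_BUCKET_KEY_ID: string;
    R2_BUCKET_KEY_SECRET: string;
    VALIDITY_DURATION: string;
  },
  method: "GET" | "PUT",
  objectPath: string, // e.g. "_GLOBAL_/aircraft.git/<oid>"
): Promise<string> {
  const client = new AwsClient({
    accessKeyId: env.R2_BUCKET_KEY_ID,
    secretAccessKey: env.R2_BUCKET_KEY_SECRET,
  });

  const url = new URL(env.R2_BUCKET_URL);
  url.pathname = `/${env.R2_BUCKET_NAME}/${objectPath}`;
  url.searchParams.set("X-Amz-Expires", env.VALIDITY_DURATION);

  // signQuery places the signature in the query string, producing a presigned URL.
  const signed = await client.sign(new Request(url, { method }), {
    aws: { signQuery: true },
  });
  return signed.url;
}
```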
Verify API Authentication
The verify API is the least significant and least security-sensitive API. When the Bucket API needs to generate a `verify` `href` as part of the `upload` operation, it will encode the current date/time (Unix timestamp format) using the `VERIFY_KEY`. It will add that as a special `X-FBW-GITLFS-TOKEN` header to the verify action in the response.

When the verify action is executed by the client, the Worker will decode the `X-FBW-GITLFS-TOKEN` header and check the current time against the time encoded in the token. If the difference is larger than the `VALIDITY_DURATION`, it will reject the verify request.
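The exact token format is an implementation detail that the design above does not pin down. One possible sketch, assuming an HMAC of the timestamp keyed with `VERIFY_KEY` via the Web Crypto API available in Workers:

```typescript
// Sketch only: token format "<unix timestamp>.<hex HMAC-SHA-256 of the timestamp>".
async function hmacHex(key: string, message: string): Promise<string> {
  const cryptoKey = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(key),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"],
  );
  const signature = await crypto.subtle.sign("HMAC", cryptoKey, new TextEncoder().encode(message));
  return [...new Uint8Array(signature)].map((b) => b.toString(16).padStart(2, "0")).join("");
}

async function createVerifyToken(verifyKey: string): Promise<string> {
  const timestamp = Math.floor(Date.now() / 1000).toString();
  return `${timestamp}.${await hmacHex(verifyKey, timestamp)}`;
}

async function checkVerifyToken(token: string, verifyKey: string, validitySeconds: number): Promise<boolean> {
  const [timestamp, mac] = token.split(".");
  if (!timestamp || !mac) return false;
  if (mac !== (await hmacHex(verifyKey, timestamp))) return false; // authentication check
  const age = Math.floor(Date.now() / 1000) - Number(timestamp);
  return age >= 0 && age <= validitySeconds; // expiration check
}
```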
Functional Design
This section provides a relatively high level view of the main functionality of the Git LFS Worker, covering the 4 major components:
- Initial Request handling
- Batch API handling
- Verify API handling
- Lock API handling
Initial Request Handling
When a request is made and the Worker is executed the following initial actions are taken:
- Verify the request method is supported (supported methods: `OPTIONS`, `POST`, `PUT`, `GET`)
- Split up the URL path into a `repository` and an `action`, for example `aircraft.git` and `objects/batch`, which would indicate a Batch API request for the `aircraft.git` repository

If the first check fails, a HTTP 405 Error is returned.

If the `OPTIONS` HTTP method is used, a response is sent back with the `allow` header containing the allowed methods.
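A rough sketch of this initial handling; the function names are illustrative:

```typescript
// Sketch: method check, OPTIONS handling, and splitting "/aircraft.git/objects/batch"
// into repository ("aircraft.git") and action ("objects/batch").
const ALLOWED_METHODS = ["OPTIONS", "POST", "PUT", "GET"];

function parsePath(pathname: string): { repository: string; action: string } | null {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  if (segments.length < 2) return null;
  return { repository: segments[0], action: segments.slice(1).join("/") };
}

function handleInitial(request: Request): Response | null {
  if (!ALLOWED_METHODS.includes(request.method)) {
    return new Response(null, { status: 405 });
  }
  if (request.method === "OPTIONS") {
    // The status code here is an assumption; the design only requires the allow header.
    return new Response(null, { status: 204, headers: { allow: ALLOWED_METHODS.join(", ") } });
  }
  return null; // continue to the Batch, Verify or Lock handlers
}
```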
Batch API Handling
If the HTTP method is `POST` and the `action` is `objects/batch`, the Batch API handling is executed.
Verify Authentication
The `Authorization` header is retrieved and the handler determines if the user has read-only (download) or read/write (upload) access.
If the user has no access at all, a HTTP 401 (Unauthorized) error is returned.
Verifying Request
Several headers in the request are verified according to the Git LFS API specifications:
- `Accept` header must be `application/vnd.git-lfs+json`, else a HTTP 406 (Accept Error) is returned.
- `Content-Type` header must be `application/vnd.git-lfs+json`, else a HTTP 422 (Content-Type Error) is returned.
- `Host` header must be set, else a HTTP 422 (Host Error) is returned.
If no body is found in the request, a HTTP 422 (Validation Error) is returned.
To facilitate the handling of the Batch API, the Batch API request and response objects are verified against the proper schemas. This is done using the `ts-interface-checker` library, which is used to build a proper schema from an interface definition based on the Batch request and response schemas.
If the request does not match the schema, a HTTP 422 error is returned, including details of what part of the schema check failed.
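A sketch of how that check could look with `ts-interface-checker`; the generated type-suite module (`batch-ti`) and the checker name (`BatchRequest`) are assumptions based on how the library is typically used together with `ts-interface-builder`:

```typescript
import { createCheckers } from "ts-interface-checker";
// Type suite generated by ts-interface-builder from the Batch request/response interfaces.
import batchTI from "./batch-ti";

const checkers = createCheckers(batchTI);

// Returns null if the body matches the schema, or a message describing the failure.
function validateBatchRequest(body: unknown): string | null {
  try {
    checkers.BatchRequest.check(body); // throws a descriptive error on mismatch
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : "Schema validation failed";
  }
}
```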
Building the Response
Each object in the `objects` property of the request is individually handled, and based on the `operation` property of the request, a different response is prepared for the object.
The handler will build a list of objects and their actions to be included in the final Batch API Response.
Every time an R2 URL is signed, the composition of the URL is:

- `URL` is set to `R2_BUCKET_URL`
- `pathname` is set to `<R2_BUCKET_NAME>/<repoUser>/<repository>/<oid>`, where `repoUser` is currently always `_GLOBAL_`
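In code, that composition could look roughly like this; the function name is illustrative:

```typescript
// Sketch: compose the R2 URL that will be signed for an object.
function buildObjectUrl(
  env: { R2_BUCKET_URL: string; R2_BUCKET_NAME: string },
  repository: string,
  oid: string,
): URL {
  const repoUser = "_GLOBAL_"; // currently always _GLOBAL_
  const url = new URL(env.R2_BUCKET_URL);
  url.pathname = `/${env.R2_BUCKET_NAME}/${repoUser}/${repository}/${oid}`;
  return url;
}
```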
download
If the `operation` is `download`, the handler will use the `AwsClient` to sign a request for a HTTP GET method with the URL pointing to the object `oid` and proper location, and an expiration within `VALIDITY_DURATION` time.
upload
If the `operation` is `upload`, the handler will use the `AwsClient` to sign a request for a HTTP PUT method with the URL pointing to the object `oid` and proper location, and an expiration within `VALIDITY_DURATION` time.

The handler will also generate a URL for the Verify API for this object, including a proper `X-FBW-GITLFS-TOKEN` header.
In case the user has no write access to the object, a 403 error will be raised for that specific object, stating "No write access to object".
Final Response
After generating the correct actions for each object, a Batch API Response object is built and returned to the client.
Verify API Handling
When a `POST` HTTP request is received with a proper `verify` path, the handler will verify that the `X-FBW-GITLFS-TOKEN` is valid (expiration and authentication). If not, a HTTP 422 Error (Unauthorized) is returned.
Similarly, if no body in the request is found, a HTTP 422 Error (Validation Error) is returned.
For a valid request, the handler will execute a `head` request to the R2 bucket for the object and compare the returned size with the size provided in the request. If these match, a HTTP 200 response is returned. If these do not match, a HTTP 500 Error (Verify Error) is returned.
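A sketch of that check, assuming the R2 bucket binding (`bucket_name`) is used for the `head` lookup and the repository was parsed from the request path:

```typescript
// Sketch: compare the object size from the verify request with the size stored in R2.
async function handleVerify(
  env: { bucket_name: R2Bucket },
  repository: string,
  body: { oid: string; size: number },
): Promise<Response> {
  const head = await env.bucket_name.head(`_GLOBAL_/${repository}/${body.oid}`);
  if (head !== null && head.size === body.size) {
    return new Response(null, { status: 200 });
  }
  return new Response(JSON.stringify({ message: "Verify failed" }), {
    status: 500,
    headers: { "Content-Type": "application/vnd.git-lfs+json" },
  });
}
```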
Lock API Handling
Currently the Lock API is not supported, so any request for the Lock API is responded to with a HTTP 404 Error (Not found).
Fallback Response
In case a request is not handled by any of the previous handlers, a HTTP 501 Error (Method not implemented) is returned.
Setup and Deployment
R2 Bucket Configuration
A Cloudflare R2 Bucket is needed and should be reachable from a controlled URL. This does not need to be a custom domain.
An API token needs to be created with Object Read & Write access to the R2 Bucket.
To configure the Worker correctly, set the following configuration options:
- `R2_BUCKET_URL` - Set this to the S3 API URL, without the name of the bucket
- `R2_BUCKET_NAME` - Set this to the name of the R2 Bucket
- `R2_BUCKET_KEY_ID` - Set this to the API token ID
- `R2_BUCKET_KEY_SECRET` - Set this to the API token Secret
The `R2_BUCKET_URL` can initially cause some confusion. As an example: if the S3 API URL for the R2 Bucket is `https://8f6b3614f05e66b371351cd51293e44b.r2.cloudflarestorage.com/fbw-git-lfs-bucket`, then the `R2_BUCKET_URL` must be set to `https://8f6b3614f05e66b371351cd51293e44b.r2.cloudflarestorage.com` (do not include `fbw-git-lfs-bucket`).
Worker Configuration
A Cloudflare Worker can either be created manually before deploying the code, using the Cloudflare UI (or API), or Wrangler can be used to deploy the Worker directly from the code.
What is important to verify is:
- Correct R2 Bucket bindings to the correct R2 Bucket
- Configure a custom domain in case you want the Worker to be accessible over an easy to remember URL
- Check the environment variables, and make sure to encrypt all the sensitive data (anything with `KEY` in the name)
Deployment
- Make sure to install Wrangler: `npm install -g wrangler`
- Update the `wrangler.toml` file with the appropriate configuration (a sketch is shown after this list).
- Use Wrangler to deploy the LFS server: `wrangler deploy`
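A sketch of what such a `wrangler.toml` could contain; all names and values are placeholders, and the sensitive keys should be stored as encrypted secrets rather than plain vars:

```toml
# Sketch of a wrangler.toml; names and values are placeholders.
name = "git-lfs-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[vars]
R2_BUCKET_URL = "https://<account-id>.r2.cloudflarestorage.com"
R2_BUCKET_NAME = "<bucket-name>"
VALIDITY_DURATION = "3600"
# DOWNLOAD_KEY, UPLOAD_KEY, VERIFY_KEY, R2_BUCKET_KEY_ID and R2_BUCKET_KEY_SECRET
# should be added with `wrangler secret put <NAME>` instead of being listed here.

[[r2_buckets]]
binding = "bucket_name"
bucket_name = "<bucket-name>"
```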
Troubleshooting
You can run a local server running your code using `wrangler dev`.
Or you can tail the logs of the deployed Worker using `wrangler tail`.