Network Working Group | J. Reschke |
Internet-Draft | greenbytes |
Intended status: Standards Track | S. Loreto |
Expires: September 18, 2016 | Ericsson |
March 17, 2016 |
This document describes a Hypertext Transfer Protocol (HTTP) content coding that can be used to describe the location of a secondary resource that contains the payload.¶
Distribution of this document is unlimited. Although this is not a work item of the HTTPbis Working Group, comments should be sent to the Hypertext Transfer Protocol (HTTP) mailing list at ietf-http-wg@w3.org, which may be joined by sending a message with subject "subscribe" to ietf-http-wg-request@w3.org.¶
Discussions of the HTTPbis Working Group are archived at <http://lists.w3.org/Archives/Public/ietf-http-wg/>.¶
XML versions, latest edits, and issue tracking for this document are available from <https://github.com/reschke/oobencoding> and <http://greenbytes.de/tech/webdav/#draft-reschke-http-oob-encoding>.¶
The changes in this draft are summarized in Appendix C.4.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress”.¶
This Internet-Draft will expire on September 18, 2016.¶
Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
This document describes a Hypertext Transfer Protocol (HTTP) content coding (Section 3.1.2.1 of [RFC7231]) that can be used to describe the location of a secondary resource that contains the payload.¶
The primary use case for this content coding is to enable origin servers to delegate the delivery of content to a secondary server that might be "closer" to the client (with respect to network topology) and/or able to cache content, leveraging content encryption, as described in [ENCRYPTENC].¶
The 'Out-Of-Band' content coding is used to direct the recipient to retrieve the actual message representation (Section 3 of [RFC7231]) from a secondary resource, such as a public cache:¶
Client                  Secondary Server           Origin Server

     sends GET request with Accept-Encoding: out-of-band
(1) |---------------------------------------------------------\
                status 200 and Content-Encoding: out-of-band   |
(2) <---------------------------------------------------------/

     GET to secondary server
(3) |---------------------------\
                     payload     |
(4) <---------------------------/

(5) Client combines payload received in (4)
    with metadata received in (2).
The name of the content coding is "out-of-band".¶
The payload format uses JavaScript Object Notation (JSON, [RFC7159]). The payload is a JSON object that describes the secondary resources plus OPTIONAL additional metadata:¶
The payload format uses a JSON array so that the origin server can specify multiple secondary resources. When a client receives a response containing multiple URIs, it is free to choose which of these to use.¶
New specifications can define new OPTIONAL header fields, thus clients MUST ignore unknown fields. Extension specifications will have to update this specification. [rfc.comment.1: or we define a registry] ¶
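The parsing rules above can be illustrated with a minimal sketch. The member names "URIs" and "fallback" are taken from the examples in this document; the function name, the returned dictionary shape, and the requirement that "URIs" be a non-empty array of strings are assumptions made for illustration only.

```python
import json


def parse_oob_payload(body: bytes) -> dict:
    """Parse an out-of-band payload (hypothetical helper, not normative)."""
    data = json.loads(body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("out-of-band payload must be a JSON object")

    uris = data.get("URIs")
    # Assumption: at least one URI, and every entry is a string.
    if (not isinstance(uris, list) or not uris
            or not all(isinstance(u, str) for u in uris)):
        raise ValueError('"URIs" must be a non-empty array of strings')

    fallback = data.get("fallback")
    if fallback is not None and not isinstance(fallback, str):
        raise ValueError('"fallback" must be a string when present')

    # Any other member is unknown to this sketch and is ignored.
    return {"uris": uris, "fallback": fallback}
```

A payload carrying a member this sketch does not know about is accepted, and the unknown member is simply dropped from the result.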
The client then obtains the original message by:¶
If the client is unable to retrieve the secondary resource's representation (host can't be reached, non-2xx response status code, payload failing integrity check, etc.), it can choose an alternate secondary resource (if specified), try the fallback URI (if given), or simply retry the request to the origin server without including "out-of-band" in the Accept-Encoding request header field. In the latter case, it can be useful to inform the origin server about what problems were encountered when trying to access the secondary resource; see Section 3.3 for details.¶
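One possible ordering of these recovery options is sketched below. The specification leaves the choice to the client, so the order shown (listed URIs first, then the fallback) is only an assumption, as is resolving a relative fallback URI against the URI of the original request. The `fetch` callable is hypothetical: it is expected to return the payload bytes on success and `None` on any failure.

```python
from typing import Callable, List, Optional, Sequence, Tuple
from urllib.parse import urljoin


def fetch_payload(
    request_uri: str,
    uris: Sequence[str],
    fallback: Optional[str],
    fetch: Callable[[str], Optional[bytes]],
) -> Tuple[Optional[bytes], List[str]]:
    """Try each secondary URI, then the fallback; report the URIs that failed."""
    failed: List[str] = []
    for uri in uris:
        body = fetch(uri)
        if body is not None:
            return body, failed
        failed.append(uri)

    if fallback is not None:
        # Assumption: a relative fallback is resolved against the request URI.
        body = fetch(urljoin(request_uri, fallback))
        if body is not None:
            return body, failed

    # Nothing worked: the caller can retry the origin server
    # without "out-of-band" in Accept-Encoding.
    return None, failed
```

The list of failed URIs is returned so that the caller can report the problems to the origin server on a subsequent request.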
Note that although this mechanism causes the inclusion of external content, it will not affect the application-level security properties of the reconstructed message, such as its web origin ([RFC6454]).¶
The cacheability of the response for the secondary resource does not affect the cacheability of the reconstructed response message, which is the same as for the origin server's response.¶
Note that because the server's response depends on the request's Accept-Encoding header field, the response usually will need to be declared to vary on that. See Section 7.1.4 of [RFC7231] and Section 2.3 of [RFC7232] for details.¶
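On the origin server side, the dependency on Accept-Encoding might be handled roughly as follows. This is a simplified sketch: both function names are hypothetical, the "*" wildcard is deliberately not treated as opting in to "out-of-band", and malformed qvalues are treated as zero.

```python
from typing import List, Tuple


def accepts_out_of_band(accept_encoding: str) -> bool:
    """Return True if "out-of-band" is listed with a non-zero qvalue."""
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        # Content coding names are case-insensitive.
        if parts[0].lower() != "out-of-band":
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: unparsable qvalue means "not acceptable"
        return q > 0
    return False


def select_coding_headers(accept_encoding: str) -> List[Tuple[str, str]]:
    """Header fields the origin server adds for either variant of the response."""
    # Vary is sent in both cases, since the choice depends on Accept-Encoding.
    headers = [("Vary", "Accept-Encoding")]
    if accepts_out_of_band(accept_encoding):
        headers.append(("Content-Encoding", "out-of-band"))
    return headers
```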
When the client fails to obtain the secondary resource, it can be useful to inform the origin server about the condition. This can be accomplished by adding a "Link" header field ([RFC5988]) to a subsequent request to the origin server, detailing the URI of the secondary resource and the failure reason.¶
The following link extension relations are defined:¶
Used in case the server was not reachable.¶
Link relation:
http://purl.org/NET/linkrel/not-reachable
Used in case the server responded, but the object could not be obtained.¶
Link relation:
http://purl.org/NET/linkrel/resource-not-found
Used in case the payload could be obtained, but wasn't usable (for instance, because integrity checks failed).¶
Link relation:
http://purl.org/NET/linkrel/payload-unusable
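A client could assemble the Link header field value for these relations along the following lines. The three relation URIs are those defined above; the function name, the short reason keys, and the rejection of URIs containing characters that would break the field syntax are illustrative assumptions.

```python
LINKREL_BASE = "http://purl.org/NET/linkrel/"
REASONS = {"not-reachable", "resource-not-found", "payload-unusable"}


def problem_link(secondary_uri: str, reason: str) -> str:
    """Build a Link field value reporting a failed secondary resource."""
    if reason not in REASONS:
        raise ValueError("unknown failure reason: " + reason)
    # Assumption: refuse characters that cannot appear inside <...>.
    if any(c in secondary_uri for c in '<> "'):
        raise ValueError("URI contains characters not allowed in a Link target")
    return '<{}>; rel="{}{}"'.format(secondary_uri, LINKREL_BASE, reason)
```

The returned string is the field value only; the client sends it in a "Link" request header field on the retried request to the origin server.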
Client request of primary resource:
GET /test HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, out-of-band
Response:
HTTP/1.1 200 OK
Date: Thu, 14 May 2015 18:52:00 GMT
Content-Type: text/plain
Cache-Control: max-age=10, public
Content-Encoding: out-of-band
Content-Length: 145
Vary: Accept-Encoding
{
"URIs": [
"http://example.net/bae27c36-fa6a-11e4-ae5d-00059a3c7a00"
],
"fallback": "/c/bae27c36-fa6a-11e4-ae5d-00059a3c7a00"
}
(Note that the Content-Type header field describes the media type of the secondary resource's representation, and that the origin server supplied a fallback URI.)
Client request for secondary resource:
GET /bae27c36-fa6a-11e4-ae5d-00059a3c7a00 HTTP/1.1
Host: example.net
Response:
HTTP/1.1 200 OK
Date: Thu, 14 May 2015 18:52:10 GMT
Cache-Control: private
Content-Length: 15
Hello, world.
(Note that no Content-Type header field is present here, because the secondary server does not know the media type of the payload.)
Final message after recombining header fields:
HTTP/1.1 200 OK
Date: Thu, 14 May 2015 18:52:00 GMT
Content-Length: 15
Cache-Control: max-age=10, public
Content-Type: text/plain
Hello, world.
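The recombination shown in this example can be approximated by the sketch below, which is not the normative procedure. It assumes that the client removes "out-of-band" as the last-applied coding from Content-Encoding, replaces Content-Length with the length of the retrieved payload, and copies every other header field of the origin server's response unchanged. The final message above does not show the Vary field; the sketch makes no attempt to decide which further fields to drop and therefore keeps it. The function name is hypothetical, and a response with several Content-Encoding field lines is not handled.

```python
from typing import List, Tuple

Headers = List[Tuple[str, str]]


def recombine(origin_headers: Headers, payload: bytes) -> Headers:
    """Merge origin response header fields with the secondary payload."""
    result: Headers = []
    saw_oob = False
    for name, value in origin_headers:
        lname = name.lower()
        if lname == "content-length":
            continue  # replaced below with the payload length
        if lname == "content-encoding":
            codings = [c.strip() for c in value.split(",") if c.strip()]
            # Codings are listed in the order applied, so out-of-band is last.
            if not codings or codings[-1].lower() != "out-of-band":
                raise ValueError("out-of-band is not the outermost coding")
            codings.pop()
            saw_oob = True
            if codings:
                # Remaining codings (e.g. encryption) still need to be undone.
                result.append((name, ", ".join(codings)))
            continue
        result.append((name, value))
    if not saw_oob:
        raise ValueError("response does not use the out-of-band coding")
    result.append(("Content-Length", str(len(payload))))
    return result
```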
Given the example HTTP message from Section 5.4 of [ENCRYPTENC], a primary resource could use the "out-of-band" encoding to specify just the location of the secondary resource plus the contents of the "Crypto-Key" header field needed to decrypt the payload:¶
Response:
HTTP/1.1 200 OK
Date: Thu, 14 May 2015 18:52:00 GMT
Content-Encoding: aesgcm128, out-of-band
Content-Type: text/plain
Encryption: keyid="a1"; salt="vr0o6Uq3w_KDWeatc27mUg"
Crypto-Key: keyid="a1"; aesgcm128="csPJEXBYA5U-Tal9EdJi-w"
Content-Length: 87
Vary: Accept-Encoding
{
"URIs": [
"http://example.net/bae27c36-fa6a-11e4-ae5d-00059a3c7a00"
]
}
(Note that the Content-Type header field describes the media type of the secondary resource's representation.)
Response for secondary resource:
HTTP/1.1 200 OK
Date: Thu, 14 May 2015 18:52:10 GMT
Content-Length: ...
Cache-Control: private
fuag8ThIRIazSHKUqJ5OduR75UgEUuM76J8UFwadEvg
(payload body shown in base64 here)
Final message undoing all content codings:
HTTP/1.1 200 OK
Date: Thu, 14 May 2015 18:52:00 GMT
Content-Length: 15
Content-Type: text/plain
I am the walrus
Client requests primary resource as in Section 3.4.1, but the attempt to access the secondary resource fails.¶
Response:
HTTP/1.1 404 Not Found
Date: Tue, 08 Sep 2015 16:49:00 GMT
Content-Type: text/plain
Content-Length: 20
Resource Not Found
Client retries with the origin server and includes Link header field reporting the problem:
GET /test HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, out-of-band
Link: <http://example.net/bae27c36-fa6a-11e4-ae5d-00059a3c7a00>; rel="http://purl.org/NET/linkrel/resource-not-found"
New content codings can be deployed easily, as the client can use the "Accept-Encoding" header field (Section 5.3.4 of [RFC7231]) to signal which content codings are supported.¶
This specification does not define means to verify that the payload obtained from the secondary resource really is what the origin server expects it to be. Content signatures can address this concern (see [CONTENTSIG] and [MICE]).¶
The Out-Of-Band content coding could be used to circumvent the same-origin policy ([RFC6454], Section 3) of user agents: an attacking site that knows the URI of a secondary resource could use the out-of-band coding to trick the user agent into reading the contents of the secondary resource, which then, due to the security properties of out-of-band codings, would be handled as if it originated from the origin's resource.¶
This problem is not yet addressed by this specification. Possible defenses would be to rely on signatures and encryption, or to add an indication to the secondary resource's response that would prevent further processing in responses from "bad" origins (not unlike the "Access-Control-Allow-Origin" header field defined in Section 5.1 of [CORS]).¶
In general, content codings can be used in both requests and responses. This particular content coding has been designed for responses. When supported in requests, it creates a new attack vector where the receiving server can be tricked into including content that the client might not have access to otherwise (such as HTTP resources behind a firewall).¶
The IANA "HTTP Content Coding Registry", located at <http://www.iana.org/assignments/http-parameters>, needs to be updated with the registration below:¶
A plausible alternative approach would be to implement this functionality one level up, using a new redirect status code (Section 6.4 of [RFC7231]). However, this would have several drawbacks:¶
Another alternative would be to implement the indirection on the level of the media type using something similar to the type "message/external-body", defined in [RFC2017] and refined for use in the Session Initiation Protocol (SIP) in [RFC4483]. This approach though would share most of the drawbacks of the status code approach mentioned above.¶
We probably need to handle Range Requests. How would this work? Passing down the Range request header field to the secondary resource?¶
What about codes other than 200 and 206?¶
One use-case for this protocol is to enable a system of "blind caches", which would serve the secondary resources. These caches might only be populated on demand, thus it could happen that whatever mechanism is used to populate the cache hasn't finished when the client hits it (maybe due to race conditions, or because the cache is behind a middlebox which doesn't allow the origin server to push content to it).¶
In this particular case, it would be useful if the client were able to "piggyback" the URI of the fallback for the primary resource, giving the secondary server a means by which it could obtain the payload itself. This information could be provided in yet another Link header field:¶
GET /bae27c36-fa6a-11e4-ae5d-00059a3c7a00 HTTP/1.1
Host: example.net
Link: <http://www.example.com/c/bae27c36-fa6a-11e4-ae5d-00059a3c7a00>; rel="http://purl.org/NET/linkrel/primary-resource"
(continuing the example from Section 3.4.1)
When out-of-band encoding is used as part of a caching solution, the additional round trips to the origin server can be a significant performance problem, in particular when many small resources need to be loaded (such as scripts, images, or video fragments). In cases like these, it could be useful for the origin server to provide a "resource map", allowing clients to skip the round trips to the origin server for the mapped resources. Plausible ways to transmit the resource map could be:¶
This specification defines neither a format for the map nor a mechanism to transport it, but it's a given that some specification using "out-of-band" encoding will do so.¶
It might be a good idea to allow padding in the secondary resource's payload, in order to hide even the precise content length. This could be accomplished by adding range information to the out-of-band metadata, allowing the client to throw away parts of the payload when reconstructing the response body.¶
Mention media type approach.¶
Explain that clients can always fall back not to use oob when the secondary resource isn't available.¶
Add Vary response header field to examples and mention that it'll usually be needed (<https://github.com/reschke/oobencoding/issues/6>).¶
Experimentally add problem reporting using piggy-backed Link header fields (<https://github.com/reschke/oobencoding/issues/7>).¶
Updated ENCRYPTENC reference.¶
Add MICE reference.¶
Remove the ability of the secondary resource to contain anything but the payload (<https://github.com/reschke/oobencoding/issues/11>).¶
Changed JSON payload to be an object containing an array of URIs plus additional members. Specify "fallback" as one of these additional members, and update Appendix B.2 accordingly.¶
Discuss extensibility a bit.¶
Mention "Content Stealing" thread.¶
Mention padding.¶
Thanks to Christer Holmberg, Daniel Lindstrom, Goran Eriksson, John Mattsson, Kevin Smith, Magnus Westerlund, Mark Nottingham, Martin Thomson, and Roland Zink for feedback on this document.¶