draft-ietf-httpapi-ratelimit-headers-06.unpg.txt   draft-ietf-httpapi-ratelimit-headers-latest.txt 
HTTPAPI Working Group R. Polli HTTPAPI Working Group R. Polli
Internet-Draft Team Digitale, Italian Government Internet-Draft Team Digitale, Italian Government
Intended status: Standards Track A. Martinez Intended status: Standards Track A. Martinez
Expires: June 25, 2023 Red Hat Expires: April 7, 2025 Red Hat
December 22, 2022 D. Miller
Microsoft
October 04, 2024
RateLimit Fields for HTTP RateLimit header fields for HTTP
draft-ietf-httpapi-ratelimit-headers-06 draft-ietf-httpapi-ratelimit-headers-latest
Abstract Abstract
This document defines the RateLimit-Limit, RateLimit-Remaining, This document defines the RateLimit-Policy and RateLimit HTTP header
RateLimit-Reset and RateLimit-Policy HTTP fields for servers to fields for servers to advertise their service policy limits and the
advertise their current service rate limits, thereby allowing clients current limits, thereby allowing clients to avoid being throttled.
to avoid being throttled.
About This Document About This Document
This note is to be removed before publishing as an RFC. This note is to be removed before publishing as an RFC.
Status information for this document may be found at Status information for this document may be found at
<https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit- <https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-
headers/>. headers/>.
Discussion of this document takes place on the HTTPAPI Working Group Discussion of this document takes place on the HTTPAPI Working Group
skipping to change at line 51 skipping to change at page 2, line 7
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 25, 2023. This Internet-Draft will expire on April 7, 2025.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Goals 1.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2. Notational Conventions 1.2. Notational Conventions . . . . . . . . . . . . . . . . . 5
2. Concepts 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1. Quota Policy 2.1. Quota . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2. Time Window 2.2. Quota Unit . . . . . . . . . . . . . . . . . . . . . . . 5
2.3. Service Limit 2.3. Quota Partition . . . . . . . . . . . . . . . . . . . . . 5
3. RateLimit Field Definitions 2.4. Time Window . . . . . . . . . . . . . . . . . . . . . . . 5
3.1. RateLimit-Limit 2.5. Quota Policy . . . . . . . . . . . . . . . . . . . . . . 6
3.2. RateLimit-Policy 2.6. Service Limit . . . . . . . . . . . . . . . . . . . . . . 6
3.3. RateLimit-Remaining 3. RateLimit-Policy Field . . . . . . . . . . . . . . . . . . . 6
3.4. RateLimit-Reset 3.1. Quota Policy Item . . . . . . . . . . . . . . . . . . . . 6
4. Server Behavior 3.1.1. Quota Parameter . . . . . . . . . . . . . . . . . . . 7
4.1. Performance Considerations 3.1.2. Quota Unit Parameter . . . . . . . . . . . . . . . . 7
5. Client Behavior 3.1.3. Window Parameter . . . . . . . . . . . . . . . . . . 7
5.1. Intermediaries 3.1.4. Partition Key Parameter . . . . . . . . . . . . . . . 7
5.2. Caching 3.2. RateLimit Policy Field Examples . . . . . . . . . . . . . 7
6. Security Considerations 4. RateLimit Field . . . . . . . . . . . . . . . . . . . . . . . 8
6.1. Throttling does not prevent clients from issuing requests 4.1. Service Limit Item . . . . . . . . . . . . . . . . . . . 8
6.2. Information disclosure 4.1.1. Remaining Parameter . . . . . . . . . . . . . . . . . 8
6.3. Remaining quota units are not granted requests 4.1.2. Reset Parameter . . . . . . . . . . . . . . . . . . . 8
6.4. Reliability of RateLimit-Reset 4.1.3. Partition Key Parameter . . . . . . . . . . . . . . . 9
6.5. Resource exhaustion 4.2. RateLimit Field Examples . . . . . . . . . . . . . . . . 9
6.5.1. Denial of Service 5. Server Behavior . . . . . . . . . . . . . . . . . . . . . . . 9
7. Privacy Considerations 5.1. Performance Considerations . . . . . . . . . . . . . . . 10
8. IANA Considerations 6. Client Behavior . . . . . . . . . . . . . . . . . . . . . . . 10
8.1. RateLimit Parameters Registration 6.1. Intermediaries . . . . . . . . . . . . . . . . . . . . . 11
9. References 6.2. Caching . . . . . . . . . . . . . . . . . . . . . . . . . 12
9.1. Normative References 7. Security Considerations . . . . . . . . . . . . . . . . . . . 12
9.2. Informative References 7.1. Throttling does not prevent clients from issuing requests 12
9.3. URIs 7.2. Information disclosure . . . . . . . . . . . . . . . . . 12
Appendix A. Rate-limiting and quotas 7.3. Remaining quota units are not granted requests . . . . . 13
A.1. Interoperability issues 7.4. Reliability of the reset keyword . . . . . . . . . . . . 13
Appendix B. Examples 7.5. Resource exhaustion . . . . . . . . . . . . . . . . . . . 13
B.1. Unparameterized responses 7.5.1. Denial of Service . . . . . . . . . . . . . . . . . . 14
B.1.1. Throttling information in responses 8. Privacy Considerations . . . . . . . . . . . . . . . . . . . 15
B.1.2. Use in conjunction with custom fields 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15
B.1.3. Use for limiting concurrency 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 15
B.1.4. Use in throttled responses 10.1. Normative References . . . . . . . . . . . . . . . . . . 15
B.2. Parameterized responses 10.2. Informative References . . . . . . . . . . . . . . . . . 16
B.2.1. Throttling window specified via parameter Appendix A. Rate-limiting and quotas . . . . . . . . . . . . . . 17
B.2.2. Dynamic limits with parameterized windows A.1. Interoperability issues . . . . . . . . . . . . . . . . . 18
B.2.3. Dynamic limits for pushing back and slowing down Appendix B. Examples . . . . . . . . . . . . . . . . . . . . . . 18
B.1. Responses without defining policies . . . . . . . . . . . 18
B.1.1. Throttling information in responses . . . . . . . . . 18
B.1.2. Multiple policies in response . . . . . . . . . . . . 19
B.1.3. Use for limiting concurrency . . . . . . . . . . . . 20
B.1.4. Use in throttled responses . . . . . . . . . . . . . 21
B.2. Responses with defined policies . . . . . . . . . . . . . 22
B.2.1. Throttling window specified via parameter . . . . . . 22
B.2.2. Dynamic limits with parameterized windows . . . . . . 22
B.2.3. Dynamic limits for pushing back and slowing down . . 23
B.3. Dynamic limits for pushing back with Retry-After and slow B.3. Dynamic limits for pushing back with Retry-After and slow
down down . . . . . . . . . . . . . . . . . . . . . . . . . . 23
B.3.1. Missing Remaining information B.3.1. Missing Remaining information . . . . . . . . . . . . 24
B.3.2. Use with multiple windows B.3.2. Use with multiple windows . . . . . . . . . . . . . . 25
FAQ FAQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
RateLimit fields currently used on the web RateLimit header fields currently used on the web . . . . . . . . 28
Acknowledgements Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 30
Changes Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
F.1. Since draft-ietf-httpapi-ratelimit-headers-03 F.1. Since draft-ietf-httpapi-ratelimit-headers-07 . . . . . . 30
F.2. Since draft-ietf-httpapi-ratelimit-headers-02 F.2. Since draft-ietf-httpapi-ratelimit-headers-03 . . . . . . 30
F.3. Since draft-ietf-httpapi-ratelimit-headers-01 F.3. Since draft-ietf-httpapi-ratelimit-headers-02 . . . . . . 30
F.4. Since draft-ietf-httpapi-ratelimit-headers-00 F.4. Since draft-ietf-httpapi-ratelimit-headers-01 . . . . . . 30
Authors' Addresses F.5. Since draft-ietf-httpapi-ratelimit-headers-00 . . . . . . 31
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 31
1. Introduction 1. Introduction
Rate limiting HTTP clients has become a widespread practice, Rate limiting of HTTP clients has become a widespread practice,
especially for HTTP APIs. Typically, servers who do so limit the especially for HTTP APIs. Typically, servers who do so limit the
number of acceptable requests in a given time window (e.g. 10 number of acceptable requests in a given time window (e.g. 10
requests per second). See Appendix A for further information on the requests per second). See Appendix A for further information on the
current usage of rate limiting in HTTP. current usage of rate limiting in HTTP.
Currently, there is no standard way for servers to communicate quotas Currently, there is no standard way for servers to communicate quotas
so that clients can throttle its requests to prevent errors. This so that clients can throttle their requests to prevent errors. This
document defines a set of standard HTTP fields to enable rate document defines a set of standard HTTP header fields to enable rate
limiting: limiting:
o RateLimit-Limit: the server's quota for requests by the client in o RateLimit: to convey the server's current limit of quota units
the time window, available to the client in the policy time window, the remaining
quota units in the current window, and the time remaining in the
o RateLimit-Remaining: the remaining quota in the current window, current window, specified in seconds, and
o RateLimit-Reset: the time remaining in the current window,
specified in seconds, and
o RateLimit-Policy: the quota policy. o RateLimit-Policy: the service policy limits.
These fields allow the establishment of complex rate limiting These fields enable establishing complex rate limiting policies,
policies, including using multiple and variable time windows and including using multiple and variable time windows and dynamic
dynamic quotas, and implementing concurrency limits. quotas, and implementing concurrency limits.
The behavior of the RateLimit-Reset field is compatible with the The behavior of the RateLimit header field is compatible with the
delay-seconds notation of Retry-After. delay-seconds notation of Retry-After.
1.1. Goals 1.1. Goals
The goals of this document are: The goals of this document are:
Interoperability: Standardization of the names and semantics of Interoperability: Standardize the names and semantics of rate-limit
rate-limit headers to ease their enforcement and adoption; headers to ease their enforcement and adoption;
Resiliency: Improve resiliency of HTTP infrastructure by providing Resiliency: Improve resiliency of HTTP infrastructure by providing
clients with information useful to throttle their requests and clients with information useful to throttle their requests and
prevent 4xx or 5xx responses; prevent 4xx or 5xx responses;
Documentation: Simplify API documentation by eliminating the need to Documentation: Simplify API documentation by eliminating the need to
include detailed quota limits and related fields in API include detailed quota limits and related fields in API
documentation. documentation.
The following features are out of the scope of this document: The following features are out of the scope of this document:
Authorization: RateLimit fields are not meant to support Authorization: RateLimit header fields are not meant to support
authorization or other kinds of access controls. authorization or other kinds of access controls.
Throttling scope: This specification does not cover the throttling Response status code: RateLimit header fields may be returned in
scope, that may be the given resource-target, its parent path or both successful (see Section 15.3 of [HTTP]) and non-successful
the whole Origin (see Section 7 of [WEB-ORIGIN]). This can be
addressed using extensibility mechanisms such as the parameter
registry (Section 8.1).
Response status code: RateLimit fields may be returned in both
successful (see Section 15.3 of [HTTP]) and non-successful
responses. This specification does not cover whether non responses. This specification does not cover whether non
Successful responses count on quota usage, nor does it mandate any Successful responses count on quota usage, nor does it mandate
correlation between the RateLimit values and the returned status any correlation between the RateLimit values and the returned
code. status code.
Throttling policy: This specification does not mandate a specific Throttling algorithm: This specification does not mandate a specific
throttling policy. The values published in the fields, including throttling algorithm. The values published in the fields,
the window size, can be statically or dynamically evaluated. including the window size, can be statically or dynamically
evaluated.
Service Level Agreement: Conveyed quota hints do not imply any Service Level Agreement: Conveyed quota hints do not imply any
service guarantee. The server is free to throttle respectful clients service guarantee. The server is free to throttle respectful clients
under certain circumstances. under certain circumstances.
1.2. Notational Conventions 1.2. Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
This document uses the Augmented BNF defined in [RFC5234] and updated
by [RFC7405] along with the "#rule" extension defined in
Section 5.6.1 of [HTTP].
The term Origin is to be interpreted as described in Section 7 of The term Origin is to be interpreted as described in Section 7 of
[WEB-ORIGIN]. [WEB-ORIGIN].
This document uses the terms List, Item and Integer from Section 3 of This document uses the terms List, Item and Integer from Section 3 of
[STRUCTURED-FIELDS] to specify syntax and parsing, along with the [STRUCTURED-FIELDS] to specify syntax and parsing, along with the
concept of "bare item". concept of "bare item".
The fields defined in this document are collectively referred to as 2. Terminology
"RateLimit fields".
2. Concepts 2.1. Quota
2.1. Quota Policy A quota is an allocation of capacity to enable a server to limit
client requests. That capacity is counted in quota units and may be
reallocated at the end of a time window (Section 2.4).
A quota policy is described in terms of quota units (Section 2.3) and 2.2. Quota Unit
a time window (Section 2.2). It is an Item whose bare item is a
service limit (Section 2.3), along with associated Parameters.
The following parameters are defined in this specification: A quota unit is the unit of measure used to count the activity of a
client.
w: The REQUIRED "w" parameter value conveys a time window value as 2.3. Quota Partition
defined in Section 2.2.
Other parameters are allowed and can be regarded as comments. They A quota partition is a division of a server's capacity across
ought to be registered within the "Hypertext Transfer Protocol (HTTP) different clients, users and owned resources.
RateLimit Parameters Registry", as described in Section 8.1.
For example, a quota policy of 100 quota units per minute: 2.4. Time Window
100;w=60 A time window indicates a period of time associated to the allocated
quota.
The definition of a quota policy does not imply any specific The time window is a non-negative Integer value expressing an
distribution of quota units within the time window. If applicable, interval in seconds, similar to the "delay-seconds" rule defined in
these details can be conveyed as extension parameters. Section 10.2.3 of [HTTP]. Sub-second precision is not supported.
For example, two quota policies containing further details via 2.5. Quota Policy
extension parameters:
100;w=60;comment="fixed window" A quota policy is maintained by a server to limit the activity
12;w=1;burst=1000;policy="leaky bucket" (counted in quota units (Section 2.2)) of a given quota partition
(Section 2.3) over a period of time (known as the time window
(Section 2.4)) to a specified amount known as the quota (Section 2.1).
To avoid clashes, implementers SHOULD prefix unregistered parameters Quota policies can be advertised by servers (see Section 3), but they
with a vendor identifier, e.g. "acme-policy", "acme-burst". While it are not required to be, and more than one quota policy can affect a
is useful to define a clear syntax and semantics even for custom given request from a client to a server.
parameters, it is important to note that user agents are not required
to process quota policy information.
2.2. Time Window 2.6. Service Limit
Rate limit policies limit the number of acceptable requests within a A service limit is the current limit of the amount of activity that a
given time interval, known as a time window. server will allow based on the remaining quota for a particular quota
partition within the time-window, if defined.
The time window is a non-negative Integer value expressing that 3. RateLimit-Policy Field
interval in seconds, similar to the "delay-seconds" rule defined in
Section 10.2.3 of [HTTP]. Subsecond precision is not supported.
2.3. Service Limit The "RateLimit-Policy" response header field is a non-empty List of
Quota Policy Items (Section 3.1). Its value is informative. The values are expected to
remain consistent over the lifetime of a connection. It is this
characteristic that differentiates it from the RateLimit (Section 4)
that contains values that may change on every request.
The service limit is associated with the maximum number of requests RateLimit-Policy: burst;q=100;w=60,daily;q=1000;w=86400
that the server is willing to accept from one or more clients on a
given basis (originating IP, authenticated user, geographical, ..)
during a time window (Section 2.2).
The service limit is a non-negative Integer expressed in quota units. 3.1. Quota Policy Item
The service limit SHOULD match the maximum number of acceptable A quota policy Item contains information about a server's capacity
requests. However, the service limit MAY differ from the total allocation for a quota partition associated with the request.
number of acceptable requests when weight mechanisms, bursts, or
other server policies are implemented.
If the service limit does not match the maximum number of acceptable The following parameters are defined in this specification:
requests the relation with that SHOULD be communicated out-of-band.
Example: A server could q: The REQUIRED "q" parameter indicates the quota allocated.
(Section 3.1.1)
o count once requests like "/books/{id}" qu: The OPTIONAL "qu" parameter value conveys the quota units
associated to the "q" parameter. The default quota unit is
"request". (Section 3.1.2)
o count twice search requests like "/books?author=WuMing" w: The OPTIONAL "w" parameter value conveys a time "window"
(Section 2.4). (Section 3.1.3)
so that we have the following counters pk: The OPTIONAL "pk" parameter value conveys the partition key
associated to the corresponding request. (Section 3.1.4)
GET /books/123 ; service-limit=4, remaining: 3, status=200 Other parameters are allowed and can be regarded as comments.
GET /books?author=WuMing ; service-limit=4, remaining: 1, status=200
GET /books?author=Eco ; service-limit=4, remaining: 0, status=429
3. RateLimit Field Definitions Implementation- or service-specific parameters SHOULD be prefixed
with a vendor identifier, e.g. "acme-policy", "acme-burst".
The following RateLimit response fields are defined. 3.1.1. Quota Parameter
3.1. RateLimit-Limit The "q" parameter uses a non-negative integer value to indicate the
quota allocated for client activity (counted in quota units) for a
given quota partition (Section 2.3).
The "RateLimit-Limit" response field indicates the service limit 3.1.2. Quota Unit Parameter
(Section 2.3) associated with the client in the current time window
(Section 2.2). If the client exceeds that limit, it MAY not be
served.
The field is an Item and its value is a non-negative Integer referred The "qu" parameter value conveys the quota units associated to the
to as the "expiring-limit". This specification does not define "q" parameter.
Parameters for this field. If they appear, they MUST be ignored.
The expiring-limit MUST be set to the service limit that is closest 3.1.3. Window Parameter
to reaching its limit, and the associated time window MUST either be:
o inferred by the value of RateLimit-Reset field at the moment of The "w" parameter value conveys a time "window" in seconds
the reset, or (Section 2.4).
o communicated out-of-band (e.g. in the documentation). 3.1.4. Partition Key Parameter
The RateLimit-Policy field (see Section 3.2), might contain The "pk" parameter value conveys the partition key associated to the
information on the associated time window. request. Servers MAY use the partition key to divide server capacity
across different clients and resources. Quotas are allocated per
partition key.
RateLimit-Limit: 100 3.2. RateLimit Policy Field Examples
This field can be sent in a trailer section. This field MAY convey the time window associated with the expiring-
limit, as shown in this example:
3.2. RateLimit-Policy RateLimit-Policy: default;q=100;w=10
The "RateLimit-Policy" response field indicates the quota policies These examples show multiple policies being returned:
currently associated with the client. Its value is informative.
The field is a non-empty List of Items. Each item is a quota policy RateLimit-Policy: permin;q=50;w=60,perhr;q=1000;w=3600,perday;q=5000;w=86400
(Section 2.1).
This field can convey the time window associated with the expiring- The following example shows a policy with a partition key:
limit, as shown in this example:
RateLimit-Policy: 100;w=10 RateLimit-Policy: peruser;q=100;w=60;pk=user123
RateLimit-Limit: 100
These examples show multiple policies being returned: The following example shows a policy with a partition key and a quota
unit:
RateLimit-Policy: 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400 RateLimit-Policy: peruser;q=65535;w=10;pk=user123;qu=bytes
RateLimit-Policy: 10;w=1;burst=1000, 1000;w=3600
This field can be sent in a trailer section. This field cannot appear in a trailer section.
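The parameters defined in Section 3.1 ("q", "qu", "w" and "pk") can be read from a RateLimit-Policy field value along the following lines. This is a minimal sketch, not a conforming [STRUCTURED-FIELDS] parser: it assumes that neither commas nor semicolons occur inside quoted strings, and the names QuotaPolicy and parse_ratelimit_policy are hypothetical and not defined by this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuotaPolicy:
    """One quota policy item (hypothetical container, not part of the draft)."""
    name: str
    q: int                     # REQUIRED quota allocated
    w: Optional[int] = None    # OPTIONAL time window, in seconds
    qu: str = "request"        # OPTIONAL quota unit; default is "request"
    pk: Optional[str] = None   # OPTIONAL partition key


def parse_ratelimit_policy(value: str) -> list[QuotaPolicy]:
    """Simplified parse of a RateLimit-Policy field value.

    Assumption: no commas or semicolons appear inside quoted strings.
    """
    policies = []
    for item in value.split(","):
        name, *raw_params = [part.strip() for part in item.split(";")]
        params = {}
        for raw in raw_params:
            key, _, val = raw.partition("=")
            params[key.strip()] = val.strip().strip('"')
        # "q" is REQUIRED; items without it are skipped in this sketch.
        if not name or "q" not in params:
            continue
        policies.append(
            QuotaPolicy(
                name=name,
                q=int(params["q"]),
                w=int(params["w"]) if "w" in params else None,
                qu=params.get("qu", "request"),
                pk=params.get("pk"),
            )
        )
    return policies
```

Applied to the value "burst;q=100;w=60,daily;q=1000;w=86400", the sketch yields two policies, each with the default quota unit "request". Unknown parameters are ignored, in line with the text allowing them to be regarded as comments.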
3.3. RateLimit-Remaining 4. RateLimit Field
The "RateLimit-Remaining" response field indicates the remaining A server uses the "RateLimit" response header field to communicate
quota units associated to the expiring-limit. the service limit for a quota policy for a particular partition key.
The field is an Item and its value is a non-negative Integer The field is expressed as a List of Service Limit Items (Section 4.1).
expressed in quota units (Section 2.3). This specification does not
define Parameters for this field. If they appear, they MUST be
ignored.
This field can be sent in a trailer section. RateLimit: default;r=50;t=30
Clients MUST NOT assume that a positive RateLimit-Remaining field 4.1. Service Limit Item
value is a guarantee that further requests will be served.
When the value of RateLimit-Remaining is low, it indicates that the Each service limit item identifies the quota policy associated
server may soon throttle the client (see Section 4). with the request.
For example: The following parameters are defined in this specification:
RateLimit-Remaining: 50 r: This parameter value conveys the remaining quota units for the
identified policy (Section 4.1.1).
3.4. RateLimit-Reset t: This OPTIONAL parameter value conveys the time window reset time
for the identified policy (Section 4.1.2).
The "RateLimit-Reset" field response field indicates the number of pk: The OPTIONAL "pk" parameter value conveys the partition key
seconds until the quota associated to the expiring-limit resets. associated to the corresponding request.
The field is a non-negative Integer compatible with the delay-seconds This field cannot appear in a trailer section. Other parameters are
rule, because: allowed and can be regarded as comments.
Implementation- or service-specific parameters SHOULD be prefixed
with a vendor identifier, e.g. "acme-policy", "acme-burst".
4.1.1. Remaining Parameter
The "r" parameter indicates the remaining quota units for the
identified policy.
It is a non-negative Integer expressed in quota units (Section 2.2).
Clients MUST NOT assume that a positive remaining value is a
guarantee that further requests will be served. When the remaining
parameter value is low, it indicates that the server may soon
throttle the client (see Section 5).
4.1.2. Reset Parameter
The "t" parameter indicates the number of seconds until the quota
associated with the quota policy resets.
It is a non-negative Integer compatible with the delay-seconds rule,
because:
o it does not rely on clock synchronization and is resilient to o it does not rely on clock synchronization and is resilient to
clock adjustment and clock skew between client and server (see clock adjustment and clock skew between client and server (see
Section 5.6.7 of [HTTP]); Section 5.6.7 of [HTTP]);
o it mitigates the risk related to thundering herd when too many o it mitigates the risk related to thundering herd when too many
clients are serviced with the same timestamp. clients are serviced with the same timestamp.
This specification does not define Parameters for this field. If The client MUST NOT assume that all its service limit will be reset
they appear, they MUST be ignored. at the moment indicated by the reset keyword. The server MAY
arbitrarily alter the reset parameter value between subsequent
requests; for example, in case of resource saturation or to implement
sliding window policies.
This field can be sent in a trailer section. 4.1.3. Partition Key Parameter
An example of RateLimit-Reset field use is below. The "pk" parameter value conveys the partition key associated to the
request. Servers MAY use the partition key to divide server capacity
across different clients and resources. Quotas are allocated per
partition key.
RateLimit-Reset: 50 4.2. RateLimit Field Examples
The client MUST NOT assume that all its service limit will be reset This example shows a RateLimit field with a remaining quota of 50
at the moment indicated by the RateLimit-Reset field. The server MAY units and a time window reset in 30 seconds:
arbitrarily alter the RateLimit-Reset field value between subsequent
requests; for example, in case of resource saturation or to implement
sliding window policies.
4. Server Behavior RateLimit: default;r=50;t=30
A server uses the RateLimit fields to communicate its quota policies. This example shows a remaining quota of 999 requests for a partition
Sending the RateLimit-Limit and RateLimit-Reset fields is REQUIRED; key that has no time window reset:
sending RateLimit-Remaining field is RECOMMENDED.
A server MAY return RateLimit fields independently of the response RateLimit: default;r=999;pk=trial-121323
status code. This includes on throttled responses. This document
does not mandate any correlation between the RateLimit field values
and the returned status code.
Servers should be careful when returning RateLimit fields in This example shows a 300MB remaining quota for an application in the
next 60 seconds:
RateLimit: default;r=300000000;pk=App-999;t=60;qu=bytes
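On the server side, a service limit item can be assembled from the policy name and the "r", "t" and "pk" parameters. The sketch below is illustrative only: the function name format_ratelimit is hypothetical, the partition key is emitted unquoted as in the examples above, and no validation of the policy name or key syntax is attempted.

```python
from typing import Optional


def format_ratelimit(policy: str, remaining: int,
                     reset: Optional[int] = None,
                     pk: Optional[str] = None) -> str:
    """Build one service limit item for the RateLimit field (sketch)."""
    # Both "r" and "t" are non-negative Integers.
    if remaining < 0 or (reset is not None and reset < 0):
        raise ValueError("remaining and reset must be non-negative")
    parts = [policy, f"r={remaining}"]
    if reset is not None:
        parts.append(f"t={reset}")   # seconds until the quota resets
    if pk is not None:
        parts.append(f"pk={pk}")     # partition key, unquoted (assumption)
    return ";".join(parts)
```

Calling format_ratelimit("default", 50, 30) produces the value of the first example in this section, and format_ratelimit("default", 999, pk="trial-121323") produces the second.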
5. Server Behavior
A server MAY return RateLimit header fields independently of the
response status code. This includes on throttled responses. This
document does not mandate any correlation between the RateLimit
header field values and the returned status code.
Servers should be careful when returning RateLimit header fields in
redirection responses (i.e., responses with 3xx status codes) because redirection responses (i.e., responses with 3xx status codes) because
a low RateLimit-Remaining field value could prevent the client from a low remaining keyword value could prevent the client from issuing
issuing requests. For example, given the RateLimit fields below, a requests. For example, given the RateLimit header fields below, a
client could decide to wait 10 seconds before following the client could decide to wait 10 seconds before following the
"Location" header field (see Section 10.2.2 of [HTTP]), because the "Location" header field (see Section 10.2.2 of [HTTP]), because the
RateLimit-Remaining field value is 0. remaining keyword value is 0.
HTTP/1.1 301 Moved Permanently HTTP/1.1 301 Moved Permanently
Location: /foo/123 Location: /foo/123
RateLimit-Remaining: 0 RateLimit: problemPolicy;r=0;t=10
RateLimit-Limit: 10
RateLimit-Reset: 10
If a response contains both the Retry-After and the RateLimit-Reset
fields, the RateLimit-Reset field value SHOULD reference the same
point in time as the Retry-After field value.
When using a policy involving more than one time window, the server If a response contains both the Retry-After and the RateLimit header
MUST reply with the RateLimit fields related to the time window with fields, the reset keyword value SHOULD reference the same point in
the lower RateLimit-Remaining field values. time as the Retry-After field value.
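One way to keep the Retry-After field and the reset value aligned is to derive both from the same number of seconds. The following sketch assumes a throttled response (for example, status code 429) with no remaining quota; the function name throttled_headers is hypothetical.

```python
def throttled_headers(policy: str, reset_seconds: int) -> dict[str, str]:
    """Return header fields for a throttled response (sketch).

    Retry-After uses the delay-seconds form, so it refers to the same
    point in time as the "t" parameter of the RateLimit field.
    """
    if reset_seconds < 0:
        raise ValueError("reset_seconds must be non-negative")
    return {
        "Retry-After": str(reset_seconds),
        "RateLimit": f"{policy};r=0;t={reset_seconds}",
    }
```

Deriving both values from a single variable avoids the two fields drifting apart when the reset time is recomputed between requests.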
A service using RateLimit fields MUST NOT convey values exposing an A service using RateLimit header fields MUST NOT convey values
unwanted volume of requests and SHOULD implement mechanisms to cap exposing an unwanted volume of requests and SHOULD implement
the ratio between RateLimit-Remaining and RateLimit-Reset field mechanisms to cap the ratio between the remaining and the reset
values (see Section 6.5); this is especially important when a quota keyword values (see Section 7.5); this is especially important when a
policy uses a large time window. quota policy uses a large time window.
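A possible capping mechanism limits the advertised remaining value so that its ratio to the reset value never exceeds a rate the server is prepared to absorb. This is only a sketch of one such mechanism: the function name cap_remaining and the max_rate parameter (quota units per second) are assumptions, not part of this specification.

```python
def cap_remaining(remaining: int, reset_seconds: int, max_rate: float) -> int:
    """Cap the advertised remaining value (sketch).

    Ensures remaining / reset_seconds does not exceed max_rate, where
    max_rate is a hypothetical server-chosen ceiling in quota units
    per second.
    """
    # Treat a reset of 0 as a one-second window to avoid a zero ceiling.
    window = max(reset_seconds, 1)
    ceiling = int(max_rate * window)
    return min(remaining, ceiling)
```

With max_rate set to 100, a remaining quota of 10000 units and a reset of 10 seconds would be advertised as 1000, while a remaining quota of 50 units with a reset of 30 seconds is left unchanged.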
Under certain conditions, a server MAY artificially lower RateLimit Under certain conditions, a server MAY artificially lower RateLimit
field values between subsequent requests, e.g. to respond to Denial header field values between subsequent requests, e.g. to respond to
of Service attacks or in case of resource saturation. Denial of Service attacks or in case of resource saturation.
Servers usually establish whether the request is in-quota before
creating a response, so the RateLimit field values should be already
available in that moment. Nonetheless servers MAY decide to send the
RateLimit fields in a trailer section.
4.1. Performance Considerations 5.1. Performance Considerations
Servers are not required to return RateLimit fields in every Servers are not required to return RateLimit header fields in every
response, and clients need to take this into account. For example, response, and clients need to take this into account. For example,
an implementer concerned with performance might provide RateLimit an implementer concerned with performance might provide RateLimit
fields only when a given quota is going to expire. header fields only when a given quota is close to exhaustion.
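As a non-normative illustration of the performance trade-off above, a server might emit the RateLimit header fields only once the remaining quota falls below some fraction of the limit. The function name and the 20% default threshold are assumptions made for this sketch; neither appears in the specification.

```python
def should_send_ratelimit(remaining: int, limit: int, threshold: float = 0.2) -> bool:
    """Return True when the remaining quota is at or below `threshold` of the limit.

    `threshold` is a hypothetical tuning knob chosen by the implementer.
    """
    if limit <= 0:
        # No meaningful limit is known: err on the side of informing the client.
        return True
    return remaining <= limit * threshold
```

Clients cannot rely on this behavior, since servers are not required to return the fields in every response.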
Implementers concerned with response fields' size might take into Implementers concerned with response fields' size might take into
account their ratio with respect to the content length, or use account their ratio with respect to the content length, or use
header-compression HTTP features such as [HPACK]. header-compression HTTP features such as [HPACK].
5. Client Behavior 6. Client Behavior
The RateLimit fields can be used by clients to determine whether the The RateLimit header fields can be used by clients to determine
associated request respected the server's quota policy, and as an whether the associated request respected the server's quota policy,
indication of whether subsequent requests will. However, the server and as an indication of whether subsequent requests will. However,
might apply other criteria when servicing future requests, and so the the server might apply other criteria when servicing future requests,
quota policy may not completely reflect whether they will succeed. and so the quota policy may not completely reflect whether requests
will succeed.
For example, a successful response with the following fields: For example, a successful response with the following fields:
RateLimit-Limit: 10 RateLimit: default;r=1;t=7
RateLimit-Remaining: 1
RateLimit-Reset: 7
does not guarantee that the next request will be successful. does not guarantee that the next request will be successful.
Servers' behavior may be subject to other conditions like the one Servers' behavior may be subject to other conditions.
shown in the example from Section 2.3.
A client MUST validate the RateLimit fields before using them and
check if there are significant discrepancies with the expected ones.
This includes a RateLimit-Reset field moment too far in the future
(e.g. similarly to receiving "Retry-after: 1000000") or a service-
limit too high.
A client receiving RateLimit fields MUST NOT assume that future A client is responsible for ensuring that RateLimit header field
responses will contain the same RateLimit fields, or any RateLimit values returned cause reasonable client behavior with respect to
fields at all. throughput and latency (see Section 7.5 and Section 7.5.1).
Malformed RateLimit fields MUST be ignored. A client receiving RateLimit header fields MUST NOT assume that
future responses will contain the same RateLimit header fields, or
any RateLimit header fields at all.
A client SHOULD NOT exceed the quota units conveyed by the RateLimit- Malformed RateLimit header fields MUST be ignored.
Remaining field before the time window expressed in RateLimit-Reset
field.
A client MAY still probe the server if the RateLimit-Reset field is A client SHOULD NOT exceed the quota units conveyed by the remaining
considered too high. keyword before the time window expressed in the reset keyword.
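The rule that malformed RateLimit header fields are ignored can be sketched as follows. This is a deliberately simplified parser and not a full Structured Fields implementation: it assumes a single list member of the form `name;r=<int>;t=<int>`, and the function name is hypothetical. Anything that does not match is reported as `None`, which the caller treats as if the field were absent.

```python
import re
from typing import Optional, Tuple

# One member such as "default;r=1;t=7": a name followed by integer parameters.
# Simplification: only non-negative integer parameter values are accepted.
_MEMBER = re.compile(r"^\s*([A-Za-z][\w.\-]*)((?:; *[a-z]\w*=\d+)+)\s*$")


def parse_ratelimit(value: str) -> Optional[Tuple[str, int, int]]:
    """Return (policy_name, remaining, reset) or None if the value is malformed."""
    match = _MEMBER.match(value)
    if match is None:
        return None
    params = {}
    for part in match.group(2).split(";"):
        part = part.strip()
        if part:
            key, raw = part.split("=")
            params[key] = int(raw)
    # Both the remaining (r) and reset (t) keywords are needed to act on the field.
    if "r" not in params or "t" not in params:
        return None
    return match.group(1), params["r"], params["t"]
```

A client that receives `None` simply proceeds as though no RateLimit header field had been sent.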
The value of RateLimit-Reset field is generated at response time: a The value of the reset keyword is generated at response time: a
client aware of a significant network latency MAY behave accordingly client aware of a significant network latency MAY behave accordingly
and use other information (e.g. the "Date" response header field, or and use other information (e.g. the "Date" response header field, or
otherwise gathered metrics) to better estimate the RateLimit-Reset otherwise gathered metrics) to better estimate the reset keyword
field moment intended by the server. moment intended by the server.
The details provided in RateLimit-Policy field are informative and The details provided in the RateLimit-Policy header field are
MAY be ignored. informative and MAY be ignored.
If a response contains both the RateLimit-Reset and Retry-After If a response contains both the RateLimit and Retry-After fields, the
fields, the Retry-After field MUST take precedence and the RateLimit- Retry-After field MUST take precedence and the reset keyword MAY be
Reset field MAY be ignored. ignored.
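The precedence of Retry-After over the reset value can be sketched as below. The function names are hypothetical, and the current time is passed in explicitly so the example stays deterministic; a real client would likely also account for clock skew.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(retry_after: str, now: datetime) -> Optional[float]:
    """Convert a Retry-After value (delay-seconds or HTTP-date) to seconds."""
    value = retry_after.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable value: caller falls back to the reset value
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())


def wait_seconds(retry_after: Optional[str], reset: Optional[int],
                 now: datetime) -> Optional[float]:
    """Prefer Retry-After when present and parseable; otherwise use reset."""
    if retry_after is not None:
        delay = retry_after_seconds(retry_after, now)
        if delay is not None:
            return delay
    return float(reset) if reset is not None else None
```

With a Date of "Mon, 05 Aug 2019 09:27:00 GMT" as `now`, a Retry-After of "Mon, 05 Aug 2019 09:27:05 GMT" yields a wait of 5 seconds regardless of the reset value.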
This specification does not mandate a specific throttling behavior This specification does not mandate a specific throttling behavior
and implementers can adopt their preferred policies, including: and implementers can adopt their preferred policies, including:
o slowing down or preemptively backing off their request rate when o slowing down or pre-emptively backing off their request rate when
approaching quota limits; approaching quota limits;
o consuming all the quota according to the exposed limits and then o consuming all the quota according to the exposed limits and then
waiting. waiting.
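The two example policies listed above can be contrasted in a short sketch. The strategy names "pace" and "burst" and the function name are labels invented for this illustration; the specification does not define them.

```python
def next_delay(remaining: int, reset: int, strategy: str = "pace") -> float:
    """Seconds to wait before the next request under a given client strategy.

    "pace"  spreads the remaining quota units evenly over the time to reset.
    "burst" spends quota immediately and waits only once it is exhausted.
    """
    if remaining <= 0:
        return float(reset)  # quota exhausted: wait for the window to reset
    if strategy == "burst":
        return 0.0
    if strategy == "pace":
        return reset / remaining
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For example, with 10 quota units remaining and 50 seconds to reset, pacing yields one request every 5 seconds, while bursting sends immediately.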
5.1. Intermediaries 6.1. Intermediaries
This section documents the considerations advised in Section 16.3.2 This section documents the considerations advised in Section 16.3.2
of [HTTP]. of [HTTP].
An intermediary that is not part of the originating service An intermediary that is not part of the originating service
infrastructure and is not aware of the quota policy semantic used by infrastructure and is not aware of the quota policy semantic used by
the Origin Server SHOULD NOT alter the RateLimit fields' values in the Origin Server SHOULD NOT alter the RateLimit header fields'
such a way as to communicate a more permissive quota policy; this values in such a way as to communicate a more permissive quota
includes removing the RateLimit fields. policy; this includes removing the RateLimit header fields.
An intermediary MAY alter the RateLimit fields in such a way as to An intermediary MAY alter the RateLimit header fields in such a way
communicate a more restrictive quota policy when: as to communicate a more restrictive quota policy when:
o it is aware of the quota unit semantic used by the Origin Server; o it is aware of the quota unit semantic used by the Origin Server;
o it implements this specification and enforces a quota policy which o it implements this specification and enforces a quota policy which
is more restrictive than the one conveyed in the fields. is more restrictive than the one conveyed in the fields.
An intermediary SHOULD forward a request even when presuming that it An intermediary SHOULD forward a request even when presuming that it
might not be serviced; the service returning the RateLimit fields is might not be serviced; the service returning the RateLimit header
solely responsible for enforcing the communicated quota policy, and fields is solely responsible for enforcing the communicated quota
it is always free to service incoming requests. policy, and it is always free to service incoming requests.
This specification does not mandate any behavior on intermediaries This specification does not mandate any behavior on intermediaries
with respect to retries, nor does it require that intermediaries have any role in with respect to retries, nor does it require that intermediaries have any role in
respecting quota policies. For example, it is legitimate for a proxy respecting quota policies. For example, it is legitimate for a proxy
to retransmit a request without notifying the client, and thus to retransmit a request without notifying the client, and thus
consuming quota units. consuming quota units.
Privacy considerations (Section 7) provide further guidance on Privacy considerations (Section 8) provide further guidance on
intermediaries. intermediaries.
5.2. Caching 6.2. Caching
[HTTP-CACHING] defines how responses can be stored and reused for [HTTP-CACHING] defines how responses can be stored and reused for
subsequent requests, including those with RateLimit fields. Because subsequent requests, including those with RateLimit header fields.
the information in RateLimit fields on a cached response may not be Because the information in RateLimit header fields on a cached
current, they SHOULD be ignored on responses that come from cache response may not be current, they SHOULD be ignored on responses that
(i.e., those with a positive current_age; see Section 4.2.3 of come from cache (i.e., those with a positive current_age; see
[HTTP-CACHING]). Section 4.2.3 of [HTTP-CACHING]).
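A client-side check for the caching guidance above might look like the following sketch. It uses the Age response header field as a stand-in for current_age, which is a simplification of the calculation in [HTTP-CACHING]; the function name and the assumption that header names are already normalized to the spelling "Age" are specific to this example.

```python
from typing import Mapping


def ratelimit_is_usable(headers: Mapping[str, str]) -> bool:
    """Return False when the response appears to have been served from a cache."""
    age = headers.get("Age", "0").strip()
    if not age.isdigit():
        # Unparseable Age: conservatively treat the response as possibly stale.
        return False
    return int(age) == 0
```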
6. Security Considerations 7. Security Considerations
6.1. Throttling does not prevent clients from issuing requests 7.1. Throttling does not prevent clients from issuing requests
This specification does not prevent clients from making requests. This specification does not prevent clients from making requests.
Servers should always implement mechanisms to prevent resource Servers should always implement mechanisms to prevent resource
exhaustion. exhaustion.
6.2. Information disclosure 7.2. Information disclosure
Servers should not disclose to untrusted parties operational capacity Servers should not disclose to untrusted parties operational capacity
information that can be used to saturate its infrastructural information that can be used to saturate its infrastructural
resources. resources.
While this specification does not mandate whether non-successful While this specification does not mandate whether non-successful
responses consume quota, if error responses (such as 401 responses consume quota, if error responses (such as 401
(Unauthorized) and 403 (Forbidden)) count against quota, a malicious (Unauthorized) and 403 (Forbidden)) count against quota, a malicious
client could probe the endpoint to get traffic information of another client could probe the endpoint to get traffic information of another
user. user.
As intermediaries might retransmit requests and consume quota units As intermediaries might retransmit requests and consume quota units
without prior knowledge of the user agent, RateLimit fields might without prior knowledge of the user agent, RateLimit header fields
reveal the existence of an intermediary to the user agent. might reveal the existence of an intermediary to the user agent.
6.3. Remaining quota units are not granted requests Where partition keys contain identifying information, either of the
client application or the user, servers should be aware of the
potential for impersonation and apply the appropriate security
mechanisms.
RateLimit fields convey hints from the server to the clients in order 7.3. Remaining quota units are not granted requests
to help them avoid being throttled out.
Clients MUST NOT consider the quota units (Section 2.3) returned in RateLimit header fields convey hints from the server to the clients
RateLimit-Remaining field as a service level agreement. in order to help them avoid being throttled out.
Clients MUST NOT consider the quota units (Section 2.6) returned in
the remaining keyword as a service level agreement.
In case of resource saturation, the server MAY artificially lower the In case of resource saturation, the server MAY artificially lower the
returned values or not serve the request regardless of the advertised returned values or not serve the request regardless of the advertised
quotas. quotas.
6.4. Reliability of RateLimit-Reset 7.4. Reliability of the reset keyword
Consider that service limit might not be restored after the moment Consider that quota might not be restored after the moment referenced
referenced by RateLimit-Reset field, and the RateLimit-Reset field by the reset keyword (Section 4.1.2), and the reset parameter value
value may not be fixed nor constant. may not be constant.
Subsequent requests might return a higher RateLimit-Reset field value Subsequent requests might return a higher reset parameter value to
to limit concurrency or implement dynamic or adaptive throttling limit concurrency or implement dynamic or adaptive throttling
policies. policies.
6.5. Resource exhaustion 7.5. Resource exhaustion
When returning RateLimit-Reset field you must be aware that many When returning reset values, servers must be aware that many
throttled clients may come back at the very moment specified. throttled clients may come back at the very moment specified.
This is true for Retry-After too. This is true for Retry-After too.
For example, if the quota resets every day at "18:00:00" and your For example, if the quota resets every day at "18:00:00" and your
server returns the RateLimit-Reset field accordingly server returns the reset parameter accordingly
Date: Tue, 15 Nov 1994 18:00:00 GMT
Date: Tue, 15 Nov 1994 08:00:00 GMT RateLimit: daily;r=1;t=36400
RateLimit-Reset: 36000
there's a high probability that all clients will show up at there's a high probability that all clients will show up at
"18:00:00". "18:00:00".
This could be mitigated by adding some jitter to the field-value. This could be mitigated by adding some jitter to the reset value.
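One possible way to add such jitter on the server side is sketched below. The function name and the `max_jitter` parameter are assumptions of this example. The jitter is only ever added, never subtracted, so that no client is told to return before the quota actually resets.

```python
import random


def jittered_reset(reset: int, max_jitter: int, rng: random.Random) -> int:
    """Return the reset value plus a random delay of 0..max_jitter seconds."""
    return reset + rng.randint(0, max_jitter)
```

With a reset of 36000 seconds and up to 300 seconds of jitter, throttled clients are spread over a five-minute interval instead of arriving at the same instant.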
Resource exhaustion issues can be associated with quota policies Resource exhaustion issues can be associated with quota policies
using a large time window, because a user agent by chance or on using a large time window, because a user agent by chance or on
purpose might consume most of its quota units in a significantly purpose might consume most of its quota units in a significantly
shorter interval. shorter interval.
This behavior can even be triggered by the provided RateLimit fields. This behavior can even be triggered by the provided RateLimit header
The following example describes a service with an unconsumed quota fields. The following example describes a service with an unconsumed
policy of 10000 quota units per 1000 seconds. quota policy of 10000 quota units per 1000 seconds.
RateLimit-Limit: 10000 RateLimit-Policy: somepolicy;l=10000;w=1000
RateLimit-Policy: 10000;w=1000 RateLimit: somepolicy;r=10000;t=10
RateLimit-Remaining: 10000
RateLimit-Reset: 10
A client implementing a simple ratio between RateLimit-Remaining A client implementing a simple ratio between remaining keyword and
field and RateLimit-Reset field could infer an average throughput of reset keyword could infer an average throughput of 1000 quota units
1000 quota units per second, while the RateLimit-Limit field conveys per second, while the limit keyword conveys a quota-policy with an
a quota-policy with an average of 10 quota units per second. If the average of 10 quota units per second. If the service cannot handle
service cannot handle such load, it should return either a lower such load, it should return either a lower remaining keyword value or
RateLimit-Remaining field value or a higher RateLimit-Reset field a higher reset keyword value. Moreover, complementing large time
value. Moreover, complementing large time window quota policies with window quota policies with a short time window one mitigates those
a short time window one mitigates those risks. risks.
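The arithmetic behind this example, and one way a server could cap the advertised ratio, can be sketched as follows. The function names and the `max_rate` parameter are hypothetical; the specification only says that servers should cap the ratio, not how.

```python
import math


def implied_rate(remaining: int, reset: int) -> float:
    """Throughput (quota units per second) a naive client would infer."""
    if reset <= 0:
        raise ValueError("reset must be positive")
    return remaining / reset


def cap_remaining(remaining: int, reset: int, max_rate: float) -> int:
    """Lower the advertised remaining value so remaining/reset <= max_rate."""
    if reset <= 0:
        raise ValueError("reset must be positive")
    return min(remaining, math.floor(max_rate * reset))
```

With 10000 quota units remaining and 10 seconds to reset, the implied rate is 1000 units per second. Capping at the policy average of 10 units per second lowers the advertised remaining value to 100.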
6.5.1. Denial of Service 7.5.1. Denial of Service
RateLimit fields may contain unexpected values by chance or on RateLimit header fields may contain unexpected values by chance or on
purpose. For example, an excessively high RateLimit-Remaining field purpose. For example, an excessively high remaining keyword value
value may be: may be:
o used by a malicious intermediary to trigger a Denial of Service o used by a malicious intermediary to trigger a Denial of Service
attack or consume client resources boosting its requests; attack or consume client resources boosting its requests;
o passed by a misconfigured server; o passed by a misconfigured server;
or a high RateLimit-Reset field value could inhibit clients from or a high reset keyword value could inhibit clients from contacting the
contacting the server. server (e.g. similarly to receiving "Retry-after: 1000000").
Clients MUST validate the received values to mitigate those risks. To mitigate this risk, clients can set thresholds that they consider
reasonable in terms of quota units, time window, concurrent requests
or throughput, and define a consistent behavior when the RateLimit
header field values exceed those thresholds. For example, this means capping the
maximum number of requests per second, or implementing retries when the reset
keyword exceeds ten minutes.
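A client-side version of such thresholds might look like the sketch below. The defaults of 5 requests per second and 600 seconds (ten minutes) are illustrative values picked for this example, and the function name is hypothetical.

```python
from typing import Tuple


def apply_thresholds(remaining: int, reset: int,
                     max_rate: float = 5.0, max_wait: int = 600) -> Tuple[float, int]:
    """Return (requests_per_second, wait_seconds) bounded by client thresholds.

    The rate is the naive remaining/reset ratio capped at `max_rate`; the wait
    is the reset value capped at `max_wait`, after which the client retries.
    """
    wait = min(reset, max_wait)
    rate = remaining / reset if reset > 0 else float(remaining)
    return min(rate, max_rate), wait
```

An advertised 10000 units over 10 seconds is reduced to 5 requests per second, and a reset of 1000000 seconds is treated as a ten-minute wait before the next probe.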
7. Privacy Considerations The considerations above are not limited to RateLimit header fields,
but apply to all fields affecting how clients behave in subsequent
requests (e.g. Retry-After).
8. Privacy Considerations
Clients that act upon a request to rate limit are potentially re- Clients that act upon a request to rate limit are potentially re-
identifiable (see Section 5.2.1 of [PRIVACY]) because they react to identifiable (see Section 5.2.1 of [PRIVACY]) because they react to
information that might only be given to them. Note that this might information that might only be given to them. Note that this might
apply to other fields too (e.g. Retry-After). apply to other fields too (e.g. Retry-After).
Since rate limiting is usually implemented in contexts where clients Since rate limiting is usually implemented in contexts where clients
are either identified or profiled (e.g. assigning different quota are either identified or profiled (e.g. assigning different quota
units to different users), this is rarely a concern. units to different users), this is rarely a concern.
Privacy enhancing infrastructures using RateLimit fields can define Privacy enhancing infrastructures using RateLimit header fields can
specific techniques to mitigate the risks of re-identification. define specific techniques to mitigate the risks of re-
identification.
8. IANA Considerations 9. IANA Considerations
IANA is requested to update one registry and create one new registry. IANA is requested to update one registry and create one new registry.
Please add the following entries to the "Hypertext Transfer Protocol Please add the following entries to the "Hypertext Transfer Protocol
(HTTP) Field Name Registry" registry ([HTTP]): (HTTP) Field Name Registry" registry ([HTTP]):
+---------------------+-----------+-------------------------+ +------------------+-----------+-----------------------+
| Field Name | Status | Specification | | Field Name | Status | Specification |
+---------------------+-----------+-------------------------+ +------------------+-----------+-----------------------+
| RateLimit-Limit | permanent | Section 3.1 of RFC nnnn | | RateLimit | permanent | Section 4 of RFC nnnn |
| RateLimit-Remaining | permanent | Section 3.3 of RFC nnnn | | RateLimit-Policy | permanent | Section 3 of RFC nnnn |
| RateLimit-Reset | permanent | Section 3.4 of RFC nnnn | +------------------+-----------+-----------------------+
| RateLimit-Policy | permanent | Section 3.2 of RFC nnnn |
+---------------------+-----------+-------------------------+
8.1. RateLimit Parameters Registration
IANA is requested to create a new registry to be called "Hypertext
Transfer Protocol (HTTP) RateLimit Parameters Registry", to be
located at https://www.iana.org/assignments/http-ratelimit-parameters
[1]. Registration is done on the advice of a Designated Expert,
appointed by the IESG or their delegate. All entries are
Specification Required ([IANA], Section 4.6).
Registration requests consist of the following information:
o Parameter name: The parameter name, conforming to
[STRUCTURED-FIELDS].
o Field name: The RateLimit field for which the parameter is
registered. If a parameter is intended to be used with multiple
fields, it has to be registered for each one.
o Description: A brief description of the parameter.
o Specification document: A reference to the document that specifies
the parameter, preferably including a URI that can be used to
retrieve a copy of the document.
o Comments (optional): Any additional information that can be
useful.
The initial contents of this registry should be:
+------------------+----------------+-------------+-------------------------+---------------------+
| Field Name       | Parameter name | Description | Specification           | Comments (optional) |
+------------------+----------------+-------------+-------------------------+---------------------+
| RateLimit-Policy | w              | Time window | Section 2.1 of RFC nnnn |                     |
+------------------+----------------+-------------+-------------------------+---------------------+
9. References 10. References
9.1. Normative References 10.1. Normative References
[HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, [HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
Ed., "HTTP Semantics", STD 97, RFC 9110, Ed., "HTTP Semantics", STD 97, RFC 9110,
DOI 10.17487/RFC9110, June 2022, DOI 10.17487/RFC9110, June 2022,
<https://www.rfc-editor.org/info/rfc9110>. <https://www.rfc-editor.org/info/rfc9110>.
[IANA] Cotton, M., Leiba, B., and T. Narten, "Guidelines for [IANA] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26, Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017, RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc8126>. <https://www.rfc-editor.org/info/rfc8126>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>.
[RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF",
RFC 7405, DOI 10.17487/RFC7405, December 2014,
<https://www.rfc-editor.org/info/rfc7405>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[STRUCTURED-FIELDS] [STRUCTURED-FIELDS]
Nottingham, M. and P-H. Kamp, "Structured Field Values for Nottingham, M. and P. Kamp, "Structured Field Values for
HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021, HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021,
<https://www.rfc-editor.org/info/rfc8941>. <https://www.rfc-editor.org/info/rfc8941>.
[WEB-ORIGIN] [WEB-ORIGIN]
Barth, A., "The Web Origin Concept", RFC 6454, Barth, A., "The Web Origin Concept", RFC 6454,
DOI 10.17487/RFC6454, December 2011, DOI 10.17487/RFC6454, December 2011,
<https://www.rfc-editor.org/info/rfc6454>. <https://www.rfc-editor.org/info/rfc6454>.
9.2. Informative References 10.2. Informative References
[HPACK] Peon, R. and H. Ruellan, "HPACK: Header Compression for [HPACK] Peon, R. and H. Ruellan, "HPACK: Header Compression for
HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015, HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015,
<https://www.rfc-editor.org/info/rfc7541>. <https://www.rfc-editor.org/info/rfc7541>.
[HTTP-CACHING] [HTTP-CACHING]
Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
Ed., "HTTP Caching", STD 98, RFC 9111, Ed., "HTTP Caching", STD 98, RFC 9111,
DOI 10.17487/RFC9111, June 2022, DOI 10.17487/RFC9111, June 2022,
<https://www.rfc-editor.org/info/rfc9111>. <https://www.rfc-editor.org/info/rfc9111>.
skipping to change at line 800 skipping to change at page 17, line 12
Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002, Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
<https://www.rfc-editor.org/info/rfc3339>. <https://www.rfc-editor.org/info/rfc3339>.
[RFC6585] Nottingham, M. and R. Fielding, "Additional HTTP Status [RFC6585] Nottingham, M. and R. Fielding, "Additional HTTP Status
Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012, Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
<https://www.rfc-editor.org/info/rfc6585>. <https://www.rfc-editor.org/info/rfc6585>.
[UNIX] The Open Group, "The Single UNIX Specification, Version 2 [UNIX] The Open Group, "The Single UNIX Specification, Version 2
- 6 Vol Set for UNIX 98", February 1997. - 6 Vol Set for UNIX 98", February 1997.
9.3. URIs 10.3. URIs
[1] https://www.iana.org/assignments/http-ratelimit-parameters
[2] https://github.com/httpwg/http-core/
pull/317#issuecomment-585868767
[3] https://github.com/ioggstream/draft-polli-ratelimit-headers/
issues/70
[4] https://community.ntppool.org/t/another-ntp-client-failure- [1] https://community.ntppool.org/t/another-ntp-client-failure-
story/1014/ story/1014/
[5] https://lists.w3.org/Archives/Public/ietf-http- [2] https://lists.w3.org/Archives/Public/ietf-http-
wg/2019JulSep/0202.html wg/2019JulSep/0202.html
[6] https://github.com/ioggstream/draft-polli-ratelimit-headers/
issues/34#issuecomment-519366481
Appendix A. Rate-limiting and quotas Appendix A. Rate-limiting and quotas
Servers use quota mechanisms to avoid systems overload, to ensure an Servers use quota mechanisms to avoid systems overload, to ensure an
equitable distribution of computational resources or to enforce other equitable distribution of computational resources or to enforce other
policies - e.g. monetization. policies - e.g. monetization.
A basic quota mechanism limits the number of acceptable requests in a A basic quota mechanism limits the number of acceptable requests in a
given time window, e.g. 10 requests per second. given time window, e.g. 10 requests per second.
When quota is exceeded, servers usually do not serve the request When quota is exceeded, servers usually do not serve the request
skipping to change at line 882 skipping to change at page 18, line 35
o header field names proliferates. o header field names proliferates.
User agents interfacing with different servers may thus need to User agents interfacing with different servers may thus need to
process different headers, or the very same application interface process different headers, or the very same application interface
that sits behind different reverse proxies may reply with different that sits behind different reverse proxies may reply with different
throttling headers. throttling headers.
Appendix B. Examples Appendix B. Examples
B.1. Unparameterized responses B.1. Responses without defining policies
Some servers may not expose the policy limits in the RateLimit-Policy
header field. Clients can still use the RateLimit header field to
throttle their requests.
B.1.1. Throttling information in responses B.1.1. Throttling information in responses
The client exhausted its service-limit for the next 50 seconds. The The client exhausted its quota for the next 50 seconds. The limit
time-window is communicated out-of-band or inferred by the field and time-window is communicated out-of-band.
values.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit: default;r=0;t=50
Ratelimit-Remaining: 0
Ratelimit-Reset: 50
{"hello": "world"} {"hello": "world"}
Since the field values are not necessarily correlated with the Since the field values are not necessarily correlated with the
response status code, a subsequent request is not required to fail. response status code, a subsequent request is not required to fail.
The example below shows that the server decided to serve the request The example below shows that the server decided to serve the request
even if the RateLimit-Remaining field value is 0. Another server, or the even if the remaining keyword value is 0. Another server, or the same
same server under other load conditions, could have decided to server under other load conditions, could have decided to throttle
throttle the request instead. the request instead.
Request: Request:
GET /items/456 HTTP/1.1 GET /items/456 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit: default;r=0;t=48
Ratelimit-Remaining: 0
Ratelimit-Reset: 48
{"still": "successful"} {"still": "successful"}
B.1.2. Use in conjunction with custom fields B.1.2. Multiple policies in response
The server uses two custom fields, namely "acme-RateLimit-DayLimit" The server uses two different policies to limit the client's
and "acme-RateLimit-HourLimit" to expose the following policy: requests:
o 5000 daily quota units; o 5000 daily quota units;
o 1000 hourly quota units. o 1000 hourly quota units.
The client consumed 4900 quota units in the first 14 hours. The client consumed 4900 quota units in the first 14 hours.
Despite the next hourly limit of 1000 quota units, the closest limit Despite the next hourly limit of 1000 quota units, the closest limit
to reach is the daily one. to reach is the daily one.
The server then exposes the RateLimit fields to inform the client The server then exposes the RateLimit header fields to inform the
that: client that:
o it has only 100 quota units left; o it has only 100 quota units left in the daily quota and the window
will reset in 10 hours;
o the window will reset in 10 hours. The server MAY choose to omit returning the hourly policy as it uses
the same quota units as the daily policy and the daily policy is the
one that is closest to being exhausted.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
acme-RateLimit-DayLimit: 5000 RateLimit: dayLimit;r=100;t=36000
acme-RateLimit-HourLimit: 1000
RateLimit-Limit: 5000
RateLimit-Remaining: 100
RateLimit-Reset: 36000
{"hello": "world"} {"hello": "world"}
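The selection made by the server in this example can be reproduced with a short sketch. The daily figures come from the example: 5000 units, 4900 consumed, and (24 - 14) * 3600 = 36000 seconds to reset. The hourly figures (nothing consumed in the current hour, 3600 seconds to reset), the name "hourLimit", and all type and function names are assumptions added only to make the comparison concrete.

```python
from typing import List, NamedTuple


class PolicyState(NamedTuple):
    name: str
    limit: int
    used: int
    seconds_to_reset: int


def closest_to_exhaustion(states: List[PolicyState]) -> PolicyState:
    """Pick the policy with the fewest remaining quota units."""
    return min(states, key=lambda s: s.limit - s.used)


def ratelimit_value(state: PolicyState) -> str:
    """Format a RateLimit field value such as 'dayLimit;r=100;t=36000'."""
    remaining = max(0, state.limit - state.used)
    return f"{state.name};r={remaining};t={state.seconds_to_reset}"


states = [
    PolicyState("dayLimit", 5000, 4900, (24 - 14) * 3600),
    PolicyState("hourLimit", 1000, 0, 3600),  # hypothetical hourly state
]
```

Calling `ratelimit_value(closest_to_exhaustion(states))` yields the field value shown in the response above.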
B.1.3. Use for limiting concurrency B.1.3. Use for limiting concurrency
Throttling fields may be used to limit concurrency, advertising RateLimit header fields may be used to limit concurrency, advertising
limits that are lower than the usual ones in case of saturation, thus limits that are lower than the usual ones in case of saturation, thus
increasing availability. increasing availability.
The server adopted a basic policy of 100 quota units per minute, and The server adopted a basic policy of 100 quota units per minute, and
in case of resource exhaustion adapts the returned values reducing in case of resource exhaustion adapts the returned values reducing
both RateLimit-Limit and RateLimit-Remaining field values. both limit and remaining keyword values.
After 2 seconds the client consumed 40 quota units After 2 seconds the client consumed 40 quota units
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit-Policy: basic;l=100;w=60
RateLimit-Remaining: 60 RateLimit: basic;r=60;t=58
RateLimit-Reset: 58
{"elapsed": 2, "issued": 40} {"elapsed": 2, "issued": 40}
At the subsequent request - due to resource exhaustion - the server At the subsequent request - due to resource exhaustion - the server
advertises only "RateLimit-Remaining: 20". advertises only "r=20".
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit-Policy: basic;l=100;w=60
RateLimit-Remaining: 20 RateLimit: basic;r=20;t=56
RateLimit-Reset: 56
{"elapsed": 4, "issued": 41} {"elapsed": 4, "issued": 41}
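A client could react to the shrinking remaining value by spacing out its requests. The following Python sketch is illustrative only and is not part of either draft revision: request_interval is a hypothetical helper, and spreading the remaining quota units evenly over the time left until reset is just one possible client strategy. It also assumes that every request costs exactly one quota unit.

```python
def request_interval(remaining: int, reset: int) -> float:
    """Seconds to wait between requests so that `remaining` quota units
    last for the `reset` seconds left in the window.

    Assumption: each request costs one quota unit.
    """
    if remaining <= 0:
        # Nothing left: wait for the window to reset.
        return float(reset)
    return reset / remaining


# First response above: r=60, t=58
first = request_interval(60, 58)
# Second response above: r=20, t=56
second = request_interval(20, 56)
print(f"{first:.2f}s then {second:.2f}s between requests")
```

With the two responses above, the suggested spacing grows from just under one second to 2.8 seconds, which is the slowdown the server is asking for.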
B.1.4. Use in throttled responses B.1.4. Use in throttled responses
A client exhausted its quota and the server throttles it sending A client exhausted its quota and the server throttles it sending
Retry-After. Retry-After.
In this example, the values of Retry-After and RateLimit-Reset field In this example, the values of Retry-After and RateLimit header field
reference the same moment, but this is not a requirement. reference the same moment, but this is not a requirement.
The 429 (Too Many Requests) HTTP status code is just used as an The 429 (Too Many Requests) HTTP status code is just used as an
example. example.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 429 Too Many Requests HTTP/1.1 429 Too Many Requests
Content-Type: application/json Content-Type: application/json
Date: Mon, 05 Aug 2019 09:27:00 GMT Date: Mon, 05 Aug 2019 09:27:00 GMT
Retry-After: Mon, 05 Aug 2019 09:27:05 GMT Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
RateLimit-Reset: 5 RateLimit: default;r=0;t=5
RateLimit-Limit: 100
Ratelimit-Remaining: 0
{ {
"title": "Too Many Requests", "title": "Too Many Requests",
"status": 429, "status": 429,
"detail": "You have exceeded your quota" "detail": "You have exceeded your quota"
} }
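To check that the two fields in the response above reference the same moment, a client can convert the HTTP-date form of Retry-After into a number of seconds. A minimal Python sketch follows; retry_after_seconds is a hypothetical helper name, and measuring against the response's own Date field (rather than the client clock) is an assumption made here to sidestep clock skew.

```python
from email.utils import parsedate_to_datetime


def retry_after_seconds(retry_after: str, date: str) -> int:
    """Return Retry-After as a non-negative number of seconds.

    Retry-After may be either delay-seconds or an HTTP-date; in the
    latter case the delay is computed relative to the Date field.
    """
    value = retry_after.strip()
    if value.isdigit():
        # Already expressed as delay-seconds.
        return int(value)
    delta = parsedate_to_datetime(value) - parsedate_to_datetime(date)
    return max(0, int(delta.total_seconds()))


wait = retry_after_seconds(
    "Mon, 05 Aug 2019 09:27:05 GMT",  # Retry-After from the response above
    "Mon, 05 Aug 2019 09:27:00 GMT",  # Date from the response above
)
print(wait)  # 5, the same value as the reset carried in the response
```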
B.2. Parameterized responses B.2. Responses with defined policies
B.2.1. Throttling window specified via parameter B.2.1. Throttling window specified via parameter
The client has 99 quota units left for the next 50 seconds. The time The client has 99 quota units left for the next 50 seconds. The time
window is communicated by the "w" parameter, so we know the window is communicated by the "w" parameter, so we know the
throughput is 100 quota units per minute. throughput is 100 quota units per minute.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit: fixedwindow;r=99;t=50
RateLimit-Policy: 100;w=60 RateLimit-Policy: fixedwindow;l=100;w=60
Ratelimit-Remaining: 99
Ratelimit-Reset: 50
{"hello": "world"} {"hello": "world"}
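The arithmetic behind "100 quota units per minute" can be sketched in Python for the right-hand (newer) syntax. This is a simplified illustration, not a complete Structured Fields parser: parse_items is a hypothetical helper that only understands policy names followed by integer parameters, and it raises ValueError on anything else (for example quoted string parameters). It needs Python 3.9 or later for the built-in generic type hints.

```python
def parse_items(field_value: str) -> dict[str, dict[str, int]]:
    """Parse e.g. 'fixedwindow;l=100;w=60' into
    {'fixedwindow': {'l': 100, 'w': 60}}.

    Simplification: members are split on ',' and parameters on ';',
    and every parameter value is assumed to be an integer.
    """
    items: dict[str, dict[str, int]] = {}
    for member in field_value.split(","):
        name, *params = (part.strip() for part in member.split(";"))
        if not name:
            continue
        values: dict[str, int] = {}
        for param in params:
            key, sep, raw = param.partition("=")
            if sep:
                values[key.strip()] = int(raw.strip())
        items[name.strip('"')] = values
    return items


policy = parse_items("fixedwindow;l=100;w=60")["fixedwindow"]
status = parse_items("fixedwindow;r=99;t=50")["fixedwindow"]

# Limit divided by window length, scaled to one minute.
per_minute = policy["l"] * 60 / policy["w"]
print(f"{status['r']} units left for {status['t']}s; "
      f"limit is {per_minute:.0f} units/minute")
```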
B.2.2. Dynamic limits with parameterized windows B.2.2. Dynamic limits with parameterized windows
The policy conveyed by the RateLimit-Limit field states that the The policy conveyed by the RateLimit header field states that the
server accepts 100 quota units per minute. server accepts 100 quota units per minute.
To avoid resource exhaustion, the server artificially lowers the To avoid resource exhaustion, the server artificially lowers the
actual limits returned in the throttling headers. actual limits returned in the throttling headers.
The RateLimit-Remaining field then advertises only 9 quota units for The remaining keyword then advertises only 9 quota units for the next
the next 50 seconds to slow down the client. 50 seconds to slow down the client.
Note that the server could have lowered even the other values in the Note that the server could have lowered even the other values in the
RateLimit-Limit field: this specification does not mandate any RateLimit header field: this specification does not mandate any
relation between the field values contained in subsequent responses. relation between the field values contained in subsequent responses.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 10 RateLimit-Policy: dynamic;l=100;w=60
RateLimit-Policy: 100;w=60 RateLimit: dynamic;r=9;t=50
Ratelimit-Remaining: 9
Ratelimit-Reset: 50
{ {
"status": 200, "status": 200,
"detail": "Just slow down without waiting." "detail": "Just slow down without waiting."
} }
B.2.3. Dynamic limits for pushing back and slowing down B.2.3. Dynamic limits for pushing back and slowing down
Continuing the previous example, let's say the client waits 10 Continuing the previous example, let's say the client waits 10
seconds and performs a new request which, due to resource exhaustion, seconds and performs a new request which, due to resource exhaustion,
the server rejects and pushes back, advertising "RateLimit-Remaining: the server rejects and pushes back, advertising "r=0" for the next 20
0" for the next 20 seconds. seconds.
The server advertises a smaller window with a lower limit to slow The server advertises a smaller window with a lower limit to slow
down the client for the rest of its original window after the 20 down the client for the rest of its original window after the 20
seconds elapse. seconds elapse.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 429 Too Many Requests HTTP/1.1 429 Too Many Requests
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 0 RateLimit-Policy: dynamic;l=15;w=20
RateLimit-Policy: 15;w=20 RateLimit: dynamic;r=0;t=20
Ratelimit-Remaining: 0
Ratelimit-Reset: 20
{ {
"status": 429, "status": 429,
"detail": "Wait 20 seconds, then slow down!" "detail": "Wait 20 seconds, then slow down!"
} }
B.3. Dynamic limits for pushing back with Retry-After and slow down B.3. Dynamic limits for pushing back with Retry-After and slow down
Alternatively, given the same context where the previous example Alternatively, given the same context where the previous example
starts, we can convey the same information to the client via Retry- starts, we can convey the same information to the client via Retry-
skipping to change at line 1151 skipping to change at page 24, line 17
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 429 Too Many Requests HTTP/1.1 429 Too Many Requests
Content-Type: application/json Content-Type: application/json
Retry-After: 20 Retry-After: 20
RateLimit-Limit: 15 RateLimit-Policy: dynamic;l=100;w=60
RateLimit-Policy: 100;w=60 RateLimit: dynamic;r=15;t=40
Ratelimit-Remaining: 15
Ratelimit-Reset: 40
{ {
"status": 429, "status": 429,
"detail": "Wait 20 seconds, then slow down!" "detail": "Wait 20 seconds, then slow down!"
} }
Note that in this last response the client is expected to honor Note that in this last response the client is expected to honor
Retry-After and perform no requests for the specified amount of time, Retry-After and perform no requests for the specified amount of time,
whereas the previous example would not force the client to stop whereas the previous example would not force the client to stop
requests before the reset time is elapsed, as it would still be free requests before the reset time is elapsed, as it would still be free
to query again the server even if it is likely to have the request to query again the server even if it is likely to have the request
rejected. rejected.
B.3.1. Missing Remaining information B.3.1. Missing Remaining information
The server does not expose RateLimit-Remaining field values (for The server does not expose remaining values (for example, because the
example, because the underlying counters are not available). underlying counters are not available). Instead, it resets the limit
Instead, it resets the limit counter every second. counter every second.
It communicates to the client the limit of 10 quota units per second It communicates to the client the limit of 10 quota units per second
always returning the couple RateLimit-Limit and RateLimit-Reset always returning the limit and reset keywords.
field.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 10 RateLimit-Policy: quota;l=10;w=1
Ratelimit-Reset: 1 RateLimit: quota;t=1
{"first": "request"} {"first": "request"}
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 10 RateLimit-Policy: quota;l=10
Ratelimit-Reset: 1 RateLimit: quota;t=1
{"second": "request"} {"second": "request"}
B.3.2. Use with multiple windows B.3.2. Use with multiple windows
This is a standardized way of describing the policy detailed in This is a standardized way of describing the policy detailed in
Appendix B.1.2: Appendix B.1.2:
o 5000 daily quota units; o 5000 daily quota units;
o 1000 hourly quota units. o 1000 hourly quota units.
The client consumed 4900 quota units in the first 14 hours. The client consumed 4900 quota units in the first 14 hours.
Despite the next hourly limit of 1000 quota units, the closest limit Despite the next hourly limit of 1000 quota units, the closest limit
to reach is the daily one. to reach is the daily one.
The server then exposes the RateLimit fields to inform the client The server then exposes the RateLimit header fields to inform the
that: client that:
o it has only 100 quota units left; o it has only 100 quota units left;
o the window will reset in 10 hours; o the window will reset in 10 hours;
o the expiring-limit is 5000. o the expiring-limit is 5000.
Request: Request:
GET /items/123 HTTP/1.1 GET /items/123 HTTP/1.1
Host: api.example Host: api.example
Response: Response:
HTTP/1.1 200 OK HTTP/1.1 200 OK
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 5000 RateLimit-Policy: hour;l=1000;w=3600, day;l=5000;w=86400
RateLimit-Policy: 1000;w=3600, 5000;w=86400 RateLimit: day;r=100;t=36000
RateLimit-Remaining: 100
RateLimit-Reset: 36000
{"hello": "world"} {"hello": "world"}
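On the server side, choosing which policy to report can be sketched as follows. This Python fragment is a hypothetical illustration rather than a behavior required by either revision: the Quota class and helper names are invented here, "closest to exhaustion" is interpreted as "fewest remaining quota units", and the hourly counter values (1000 units left, reset in 3600 seconds) are assumed because the example does not state them.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    name: str
    limit: int      # quota units per window
    window: int     # window length in seconds
    remaining: int  # quota units left in the current window
    reset: int      # seconds until the current window resets


def closest_to_exhaustion(quotas: list[Quota]) -> Quota:
    """Pick the quota with the fewest remaining units (an assumed rule)."""
    return min(quotas, key=lambda q: q.remaining)


def policy_value(quotas: list[Quota]) -> str:
    """Build the RateLimit-Policy value listing every policy."""
    return ", ".join(f"{q.name};l={q.limit};w={q.window}" for q in quotas)


def ratelimit_value(quota: Quota) -> str:
    """Build the RateLimit value for the single policy being reported."""
    return f"{quota.name};r={quota.remaining};t={quota.reset}"


quotas = [
    Quota("hour", 1000, 3600, remaining=1000, reset=3600),  # assumed values
    Quota("day", 5000, 86400, remaining=100, reset=36000),
]
print("RateLimit-Policy:", policy_value(quotas))
print("RateLimit:", ratelimit_value(closest_to_exhaustion(quotas)))
```

Under these assumptions the two printed lines match the right-hand response above.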
FAQ FAQ
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
1. Why defining standard fields for throttling? 1. Why defining standard fields for throttling?
To simplify enforcement of throttling policies. To simplify enforcement of throttling policies and enable clients
to constrain their requests to avoid being throttled.
2. Can I use RateLimit fields in throttled responses (eg with status 2. Can I use RateLimit header fields in throttled responses (eg with
code 429)? status code 429)?
Yes, you can. Yes, you can.
3. Are those specs tied to RFC 6585? 3. Are those specs tied to RFC 6585?
No. [RFC6585] defines the "429" status code and we use it just No. [RFC6585] defines the "429" status code and we use it just
as an example of a throttled request, that could instead use even as an example of a throttled request, that could instead use even
"403" or whatever status code. The goal of this specification is "403" or whatever status code.
to standardize the name and semantic of three ratelimit fields
widely used on the internet. Stricter relations with status
codes or error response payloads would impose behaviors to all
the existing implementations making the adoption more complex.
4. Why don't pass the throttling scope as a parameter?
The word "scope" can have different meanings: for example it can
be an URL, or an authorization scope. Since authorization is out
of the scope of this document (see Section 1.1), and that we rely
only on [HTTP], in Section 1.1 we defined "scope" in terms of
URL.
Since clients are not required to process quota policies (see
Section 5), we could add a new "RateLimit-Scope" field to this
spec. See this discussion on a similar thread [2]
Specific ecosystems can still bake their own prefixed parameters, 4. Why is the partition key necessary?
such as "acme-auth-scope" or "acme-url-scope" and ensure that
clients process them. This behavior cannot be relied upon when
communicating between different ecosystems.
We are open to suggestions: comment on this issue [3] Without a partition key, a server can only effectively only have
one scope (aka partition), which is impractical for most
services, or it needs to communicate the scopes out-of-band.
This prevents the development of generic connector code that can
be used to prevent requests from being throttled. Many APIs rely
on API keys, user identity or client identity to allocate quota.
As soon as a single client processes requests for more than one
partition, the client needs to know the corresponding partition
key to properly track requests against allocated quota.
5. Why using delay-seconds instead of a UNIX Timestamp? Why not 5. Why using delay-seconds instead of a UNIX Timestamp? Why not
using subsecond precision? using subsecond precision?
Using delay-seconds aligns with Retry-After, which is returned in Using delay-seconds aligns with Retry-After, which is returned in
similar contexts, eg on 429 responses. similar contexts, eg on 429 responses.
Timestamps require a clock synchronization protocol (see Timestamps require a clock synchronization protocol (see
Section 5.6.7 of [HTTP]). This may be problematic (e.g. clock Section 5.6.7 of [HTTP]). This may be problematic (e.g. clock
adjustment, clock skew, failure of hardcoded clock adjustment, clock skew, failure of hardcoded clock
synchronization servers, IoT devices, ..). Moreover timestamps synchronization servers, IoT devices, ..). Moreover timestamps
may not be monotonically increasing due to clock adjustment. See may not be monotonically increasing due to clock adjustment. See
Another NTP client failure story [4] Another NTP client failure story [1]
We did not use subsecond precision because: We did not use subsecond precision because:
* that is more subject to system clock correction like the one * that is more subject to system clock correction like the one
implemented via the adjtimex() Linux system call; implemented via the adjtimex() Linux system call;
* response-time latency may not make it worthwhile. A brief * response-time latency may not make it worthwhile. A brief
discussion on the subject is on the httpwg ml [5] discussion on the subject is on the httpwg ml [2]
* almost all rate-limit headers implementations do not use it. * almost all rate-limit headers implementations do not use it.
6. Why not support multiple quota remaining? 6. Shouldn't I limit concurrency instead of request rate?
While this might be of some value, my experience suggests that
overly-complex quota implementations results in lower
effectiveness of this policy. This spec allows the client to
easily focusing on RateLimit-Remaining and RateLimit-Reset.
7. Shouldn't I limit concurrency instead of request rate?
You can use this specification to limit concurrency at the HTTP You can use this specification to limit concurrency at the HTTP
level (see Appendix B.1.3) and help clients to level (see Appendix B.1.3) and help clients to
shape their requests avoiding being throttled out. shape their requests avoiding being throttled out.
A problematic way to limit concurrency is connection dropping, A problematic way to limit concurrency is connection dropping,
especially when connections are multiplexed (e.g. HTTP/2) especially when connections are multiplexed (e.g. HTTP/2)
because this results in unserviced client requests, which is because this results in unserviced client requests, which is
something we want to avoid. something we want to avoid.
A semantic way to limit concurrency is to return 503 + Retry- A semantic way to limit concurrency is to return 503 + Retry-
After in case of resource saturation (e.g. thrashing, connection After in case of resource saturation (e.g. thrashing, connection
queues too long, Service Level Objectives not met, ..). queues too long, Service Level Objectives not met, ..).
Saturation conditions can be either dynamic or static: all this Saturation conditions can be either dynamic or static: all this
is out of the scope for the current document. is out of the scope for the current document.
8. Does a positive value of the RateLimit-Remaining field imply any 7. Does a positive value of the remaining parameter imply any service
service guarantee for my future requests to be served? guarantee for my future requests to be served?
No. FAQ integrated in Section 3.3.
9. Is the quota-policy definition Section 2.1 too complex?
You can always return the simplest form of the 3 fields
RateLimit-Limit: 100
RateLimit-Remaining: 50
RateLimit-Reset: 60
The key runtime value is the first element of the list: "expiring-
limit", the others quota-policy are informative. So for the
following field:
RateLimit-Limit: 100
RateLimit-Policy: 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"
the key value is the one referencing the lowest limit: "100"
1. Can we use shorter names? Why don't put everything in one field? No. FAQ integrated in Section 4.1.1.
The most common syntax we found on the web is "X-RateLimit-*" and 8. Is the quota-policy definition Section 2.5 too complex?
when starting this I-D we opted for it [6]
The basic form of those fields is easily parseable, even by You can always return the simplest form
implementers processing responses using technologies like dynamic
interpreter with limited syntax.
Using a single field complicates parsing and takes a significantly RateLimit: default;r=50;t=60
different approach from the existing ones: this can limit adoption. The policy key clearly connects the current usage status of a policy
to the defined limits. So for the following field:
1. Why don't mention connections? RateLimit-Policy: sliding;l=100;w=60;burst=1000;comment="sliding window", fixed;l=5000;w=3600;burst=0;comment="fixed window"
RateLimit: sliding;r=50;t=44
Beware of the term "connection": - it is just the value "sliding" identifies the policy being reported.
_one_ possible saturation cause. Once you go that path
you will expose other infrastructural details (bandwidth, CPU, ..
see Section 6.2) and complicate client compliance;
- it is an infrastructural detail defined in terms of
server and network rather than the consumed service.
This specification protects the services first, and then the
infrastructures through client cooperation (see Section 6.1).
RateLimit fields enable sending _on the same
connection_ different limit values on each response,
depending on the policy scope (e.g. per-user, per-custom-key, ..)
2. Can intermediaries alter RateLimit fields? 1. Can intermediaries alter RateLimit header fields?
Generally, they should not because it might result in unserviced Generally, they should not because it might result in unserviced
requests. There are reasonable use cases for intermediaries requests. There are reasonable use cases for intermediaries
mangling RateLimit fields though, e.g. when they enforce stricter mangling RateLimit header fields though, e.g. when they enforce
quota-policies, or when they are an active component of the stricter quota-policies, or when they are an active component of
service. In those cases we will consider them as part of the the service. In those cases we will consider them as part of the
originating infrastructure. originating infrastructure.
3. Why the "w" parameter is just informative? Could it be used by a 2. Why the "w" parameter is just informative? Could it be used by a
client to determine the request rate? client to determine the request rate?
A non-informative "w" parameter might be fine in an environment A non-informative "w" parameter might be fine in an environment
where clients and servers are tightly coupled. Conveying where clients and servers are tightly coupled. Conveying
policies with this detail on a large scale would be very complex policies with this detail on a large scale would be very complex
and implementations would be likely not interoperable. We thus and implementations would be likely not interoperable. We thus
decided to leave "w" as an informational parameter and only rely decided to leave "w" as an informational parameter and only rely
on RateLimit-Limit, RateLimit-Remaining field and RateLimit-Reset on the limit, remaining and reset keywords for defining the
field for defining the throttling behavior. throttling behavior.
RateLimit fields currently used on the web 3. Can I use RateLimit fields in trailers? Servers usually
establish whether the request is in-quota before creating a
response, so the RateLimit field values should be already
available at that moment. Supporting trailers has the sole
advantage of providing more up-to-date information to
the client in case of slow responses. However, this complicates
client implementations with respect to combining fields from
headers and accounting for intermediaries that drop trailers.
Since there are no current implementations that use trailers, we
decided to leave this as a future-work.
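The FAQ answer above on delay-seconds versus UNIX timestamps can be illustrated from the client side. Because the reset value is a relative delay, a client can anchor it to its own monotonic clock and never compare wall clocks with the server. The Python sketch below is an assumption about how a client might do this; ResetDeadline is a hypothetical class, and the optional `now` arguments exist only to make the example deterministic.

```python
import time
from typing import Optional


class ResetDeadline:
    """Track a reset delay (in seconds) against the local monotonic clock."""

    def __init__(self, delay_seconds: int, now: Optional[float] = None):
        start = time.monotonic() if now is None else now
        # The deadline is local: it is unaffected by wall-clock adjustments.
        self.deadline = start + delay_seconds

    def seconds_left(self, now: Optional[float] = None) -> float:
        current = time.monotonic() if now is None else now
        return max(0.0, self.deadline - current)


# A response carrying a reset of 50 seconds, received at local time 1000.0
deadline = ResetDeadline(50, now=1000.0)
print(deadline.seconds_left(now=1010.0))  # 40.0
print(deadline.seconds_left(now=2000.0))  # 0.0, the window has reset
```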
RateLimit header fields currently used on the web
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
Commonly used header field names are: Commonly used header field names are:
o "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset"; o "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";
o "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining", "X-Rate-Limit-
Reset".
There are variants too, where the window is specified in the header There are variants too, where the window is specified in the header
field name, eg: field name, eg:
o "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x- o "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x-
ratelimit-limit-day" ratelimit-limit-day"
o "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x- o "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x-
ratelimit-remaining-day" ratelimit-remaining-day"
Here are some interoperability issues: Here are some interoperability issues:
skipping to change at line 1445 skipping to change at page 29, line 35
o different headers, with the same semantic, are used by different o different headers, with the same semantic, are used by different
implementers: implementers:
* X-RateLimit-Limit and X-Rate-Limit-Limit * X-RateLimit-Limit and X-Rate-Limit-Limit
* X-RateLimit-Remaining and X-Rate-Limit-Remaining * X-RateLimit-Remaining and X-Rate-Limit-Remaining
* X-RateLimit-Reset and X-Rate-Limit-Reset * X-RateLimit-Reset and X-Rate-Limit-Reset
The semantic of RateLimit-Remaining depends on the windowing The semantic of RateLimit depends on the windowing algorithm. A
algorithm. A sliding window policy for example may result in having sliding window policy for example, may result in having a remaining
a RateLimit-Remaining field value related to the ratio between the keyword value related to the ratio between the current and the
current and the maximum throughput. e.g. maximum throughput. e.g.
RateLimit-Limit: 12 RateLimit-Policy: sliding;l=12;w=1
RateLimit-Policy: 12;w=1 RateLimit: sliding;l=12;r=6;t=1 ; using 50% of throughput, that is 6 units/s
RateLimit-Remaining: 6 ; using 50% of throughput, that is 6 units/s
RateLimit-Reset: 1
If this is the case, the optimal solution is to achieve If this is the case, the optimal solution is to achieve
RateLimit-Limit: 12 RateLimit-Policy: sliding;l=12;w=1
RateLimit-Policy: 12;w=1 RateLimit: sliding;l=12;r=1;t=1 ; using 100% of throughput, that is 12 units/s
RateLimit-Remaining: 1 ; using 100% of throughput, that is 12 units/s
RateLimit-Reset: 1
At this point you should stop increasing your request rate. At this point you should stop increasing your request rate.
Acknowledgements Acknowledgements
Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
Nottingham for being the initial contributors of these Nottingham for being the initial contributors of these
specifications. Kudos to the first community implementers: Aapo specifications. Kudos to the first community implementers: Aapo
Talvensaari, Nathan Friedly and Sanyam Dogra. Talvensaari, Nathan Friedly and Sanyam Dogra.
In addition to the people above, this document owes a lot to the In addition to the people above, this document owes a lot to the
extensive discussion in the HTTPAPI workgroup, including Rich Salz, extensive discussion in the HTTPAPI workgroup, including Rich Salz,
Darrel Miller and Julian Reschke. Darrel Miller and Julian Reschke.
Changes Changes
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
F.1. Since draft-ietf-httpapi-ratelimit-headers-03 F.1. Since draft-ietf-httpapi-ratelimit-headers-07
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
o Split policy information in RateLimit-Policy #81 o Refactored both fields to lists of Items that identify policy and
use parameters
F.2. Since draft-ietf-httpapi-ratelimit-headers-02 o Added quota unit parameter
o Added partition key parameter
F.2. Since draft-ietf-httpapi-ratelimit-headers-03
This section is to be removed before publishing as an RFC.
o Split policy information in RateLimit-Policy #81
F.3. Since draft-ietf-httpapi-ratelimit-headers-02
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
o Address throttling scope #83 o Address throttling scope #83
F.3. Since draft-ietf-httpapi-ratelimit-headers-01 F.4. Since draft-ietf-httpapi-ratelimit-headers-01
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
o Update IANA considerations #60 o Update IANA considerations #60
o Use Structured fields #58 o Use Structured fields #58
o Reorganize document #67 o Reorganize document #67
F.4. Since draft-ietf-httpapi-ratelimit-headers-00 F.5. Since draft-ietf-httpapi-ratelimit-headers-00
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
o Use I-D.httpbis-semantics, which includes referencing delay- o Use I-D.httpbis-semantics, which includes referencing delay-
seconds instead of delta-seconds. #5 seconds instead of delta-seconds. #5
Authors' Addresses Authors' Addresses
Roberto Polli Roberto Polli
Team Digitale, Italian Government Team Digitale, Italian Government
Italy Italy
Email: robipolli@gmail.com Email: robipolli@gmail.com
Alejandro Martinez Ruiz Alejandro Martinez Ruiz
Red Hat Red Hat
Email: alex@flawedcode.org Email: alex@flawedcode.org
Darrel Miller
Microsoft
Email: darrel@tavis.ca
 End of changes. 211 change blocks. 
608 lines changed or deleted 515 lines changed or added

This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/