Traffic Shaping


To provide the fastest possible response times, Pitney Bowes uses traffic shaping to ensure that a very large number of closely spaced API requests from an individual developer account cannot negatively impact response times for all developer accounts. During any given second, Pitney Bowes throttles the requests for an individual developer if that developer’s requests would slow other developers’ requests.

Pitney Bowes uses a custom algorithm to determine when to apply traffic shaping. Traffic-shaping policy is handled on a per-API-operation basis, so the number of requests sent per second to one API has no effect on traffic shaping for any other API.

Safe Limit

For a given API operation, the Safe Limit is the number of transactions per second (TPS) a developer can send before traffic shaping might apply. To avoid traffic shaping, observe the Safe Limits in the table below.

Note that Pitney Bowes might adjust the traffic-shaping policy for a given API over time, and that additional APIs might be added. For these reasons, we suggest that your development team treat any API operation not specifically listed below as if it had a Safe Limit of 100 TPS.


API Operation                                                      Safe Limit in Transactions Per Second

Address Validation                                                 200 TPS
Create Shipment, Reprint Shipment, Retry Shipment, Void Shipment   100 TPS
Rate Shipment                                                      300 TPS
All other API operations                                           100 TPS (guideline)
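
One way to keep these limits in a single place in client code is a small lookup with the 100 TPS guideline as the fallback. The following is a minimal sketch: the dictionary keys and function names are hypothetical labels chosen for illustration, not official API identifiers, and the values must be updated by hand if Pitney Bowes adjusts its policy.

```python
# Safe Limits (TPS) copied from the table in this document.
# The key strings are hypothetical labels, not official operation names.
SAFE_LIMITS_TPS = {
    "address_validation": 200,
    "create_shipment": 100,
    "reprint_shipment": 100,
    "retry_shipment": 100,
    "void_shipment": 100,
    "rate_shipment": 300,
}

# Guideline for any operation that is not listed above.
DEFAULT_SAFE_LIMIT_TPS = 100


def safe_limit_tps(operation: str) -> int:
    """Return the Safe Limit for an operation, or the 100 TPS guideline."""
    return SAFE_LIMITS_TPS.get(operation, DEFAULT_SAFE_LIMIT_TPS)


def min_spacing_seconds(operation: str) -> float:
    """Return the smallest even spacing between requests that stays at the limit."""
    return 1.0 / safe_limit_tps(operation)
```

For example, `min_spacing_seconds("address_validation")` works out to 5 milliseconds between requests, which corresponds to 200 TPS.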

When Does Traffic Shaping Apply?

If your application exceeds your assigned Safe Limit for an API operation during a one-second period, Pitney Bowes might throttle your requests to that API operation for the rest of that second and for all subsequent seconds until one of the following two conditions is met:

  1. Your TPS for the API operation falls below your Safe Limit for a period of one second.

  2. The total TPS for the API operation for all developers falls below 50% of the maximum allowable bandwidth (as defined by the Pitney Bowes engineering teams) for a period of one second.
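
The two release conditions above can be read as a simple either/or check over the most recent second. The sketch below only illustrates that logic; the function and parameter names are hypothetical, and the maximum allowable bandwidth is an internal Pitney Bowes value that is not published here, so a client cannot actually evaluate the second condition.

```python
def shaping_can_lift(my_tps, safe_limit_tps, total_tps, max_bandwidth_tps):
    """Return True if either release condition held during the last second.

    max_bandwidth_tps stands in for the internally defined maximum
    allowable bandwidth; its real value is not known to clients.
    """
    # Condition 1: this developer's TPS fell below the Safe Limit.
    below_own_limit = my_tps < safe_limit_tps
    # Condition 2: total TPS across all developers fell below 50% of the maximum.
    system_has_headroom = total_tps < 0.5 * max_bandwidth_tps
    return below_own_limit or system_has_headroom
```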

If an API request is throttled, it’s placed in a queue that releases transactions using a “leaky bucket” algorithm. Transactions are released from the queue on a First In, First Out (FIFO) basis. If the maximum queue size has been reached, all new requests entering the queue will instead be returned as errors until space in the queue has opened up. In practice this means that slightly exceeding the Safe Limit for a few seconds might have little or no effect on response times, but drastically exceeding the Safe Limit, or moderately exceeding it for an extended period, could result in a high percentage of requests being returned as errors.
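
To make the queue behavior concrete, here is a minimal simulation of a bounded FIFO queue that is drained at a fixed rate. It is a simplified model, not the Pitney Bowes implementation: the drain rate and maximum queue size are hypothetical parameters, and every request passes through the queue, whereas the real service queues only throttled requests.

```python
from collections import deque


def simulate_leaky_bucket(arrivals_per_tick, drain_per_tick, max_queue):
    """Simulate a bounded FIFO queue drained at a fixed rate.

    arrivals_per_tick: number of requests arriving in each tick.
    Returns (released, rejected): the request IDs released in each tick,
    and the number of requests returned as errors in each tick.
    """
    queue = deque()
    released, rejected = [], []
    next_id = 0
    for arrivals in arrivals_per_tick:
        errors = 0
        for _ in range(arrivals):
            if len(queue) < max_queue:
                queue.append(next_id)   # room in the queue: wait in line
            else:
                errors += 1             # queue full: returned as an error
            next_id += 1
        # Release the oldest requests first, up to the drain rate.
        count = min(drain_per_tick, len(queue))
        released.append([queue.popleft() for _ in range(count)])
        rejected.append(errors)
    return released, rejected
```

With `simulate_leaky_bucket([3, 3, 0], drain_per_tick=2, max_queue=3)`, the first burst is absorbed, the second burst overflows by one request, and the backlog clears once arrivals stop. This mirrors the point above: brief overshoots mostly add delay, while sustained overshoots produce errors.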

Note that Pitney Bowes traffic-shaping policies do not, in general, specifically target concurrent (simultaneous) transactions. For example, a developer sending a burst of 200 concurrent transactions at the beginning of each second would be treated identically to a developer sending one transaction every 5 milliseconds, because both send 200 TPS. However, certain types of requests that require large amounts of system resources, such as rate-shopping Rate API requests, might have extra spacing applied on top of the traffic shaping described in this document.

Build TPS Limits Into Your Applications

As a best practice, build TPS limits into your applications to avoid sending more transactions per second than the Safe Limit allows. This might mean spacing out transactions by a set number of milliseconds, or limiting the number of threads in a multi-threaded application.
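
One way to space out transactions is a small pacer that reserves evenly spaced send slots and is shared by all worker threads. The following is a minimal sketch, not an official client component: the `Pacer` class and its parameters are hypothetical, and the `clock` and `sleep` arguments exist only so the behavior can be tested without real waiting.

```python
import threading
import time


class Pacer:
    """Spaces calls at least 1/max_tps seconds apart (thread-safe)."""

    def __init__(self, max_tps, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / max_tps
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def wait(self):
        """Block until this caller's reserved send slot; return the delay used."""
        with self._lock:
            now = self._clock()
            # Reserve the next free slot, never one in the past.
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)  # sleep outside the lock so other threads can reserve
        return delay
```

Each worker would call `pacer.wait()` immediately before sending a request. Creating one pacer per API operation matches the per-operation policy described earlier, and setting `max_tps` somewhat below the Safe Limit leaves headroom for timing jitter.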

Also review the code that handles retries, so that your retry policy does not compound traffic-related issues. For example, retrying every failed request immediately adds load at the moment the queue is already full.
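
A common way to keep retries from adding to the load is capped exponential backoff with random jitter, so that failed requests are retried progressively later and not all at once. This technique is a general practice rather than something this document prescribes. In the sketch below, `ThrottledError`, `send`, and the default delay values are all hypothetical: the exact error returned for a rejected request is not specified here, so map it to your own client's exception.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised by send() when a request is rejected."""


def backoff_delays(max_retries, base=0.5, cap=30.0, rng=random.random):
    """Return the wait in seconds before each retry.

    Each wait is a random fraction of an exponentially growing, capped ceiling.
    """
    return [rng() * min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(send, max_retries=4, sleep=time.sleep, **backoff):
    """Call send() until it succeeds or the retries are used up."""
    for delay in backoff_delays(max_retries, **backoff):
        try:
            return send()
        except ThrottledError:
            sleep(delay)  # back off before trying again
    return send()  # final attempt; any error propagates to the caller
```

Retried requests still count toward your TPS, so they should pass through the same client-side limit as first attempts.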