By Lina Sorg
In 2012, Amazon began installing self-service delivery and return lockers throughout the U.S., U.K., and other parts of Europe. These lockers are located at convenience stores, gas stations, office spaces, and even apartment complexes. When purchasing items from Amazon, consumers can choose to ship their packages to a locker location of their choice and pick them up at their leisure without worrying about package theft or adhering to leasing office pickup hours.
The convenience of this system means that a growing number of Amazon customers are opting for locker delivery. As demand increases, efficient capacity management becomes crucial to maintaining customer satisfaction. During a minisymposium at the 2018 SIAM Annual Meeting, currently taking place in Portland, Ore., Samyukta Sethuraman, an operations research scientist at Amazon, discussed the company’s efforts to calculate locker reservation levels, prevent overbooking, and accommodate indicated shipping preferences. Amazon Planning Research and Optimization Science designed a comprehensive algorithm that uses locker demand forecasts and dwell time probabilities to estimate the expected occupancy requirement for Amazon lockers, based on shipping speed and date.
“Capacity management consists of two parts,” Sethuraman said. “In real time, it tells us if we’ll have space for a package in the locker on the day it’s going to get delivered.” This calculation happens upon creation of the order. The second part considers the specifics of shipping options, which include same-day, next-day, two-day, and three-to-five-day delivery; one must also allocate space for customer returns. Orders with slower shipping speeds are placed before those with fast shipping speeds. However, Amazon also reserves locker slots for the “better” customers who request fast shipping, which frequently results in wasted space. Sethuraman seeks to determine how much space to reserve for packages in the lockers.
This calculation essentially involves starting from scratch. “One of the things that make this problem unique is that we don’t know how long a package is going to stay in the locker,” Sethuraman said. Customers must pick up their packages within three business days, but can do so at any time during this interval. If a customer does not pick up a package, a carrier fetches the item and returns it to Amazon. Carrier pickup can take as many as six days. “But we don’t want to reserve six days because most packages are picked up on same day,” Sethuraman said. “That would be a lot of wasted space.”
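To make the trade-off concrete, the following minimal Python sketch shows one way a demand forecast and dwell time probabilities could be combined into an expected occupancy per day. It is an illustration under stated assumptions, not Amazon's algorithm: the function name `expected_occupancy`, the forecast values, and the dwell probabilities are all hypothetical, and a package is assumed to occupy a slot on its delivery day and on every following day until pickup.

```python
def expected_occupancy(forecast, dwell_probs):
    """Expected number of occupied slots on each day.

    forecast[d]    -- forecast number of packages delivered on day d (hypothetical)
    dwell_probs[k] -- probability that a package stays k days after delivery,
                      where k = 0 means pickup on the delivery day (hypothetical)
    """
    # survival[k] = P(dwell >= k): chance the package still needs a slot
    # k days after it was delivered.
    survival = []
    remaining = 1.0
    for p in dwell_probs:
        survival.append(remaining)
        remaining -= p

    horizon = len(forecast) + len(dwell_probs) - 1
    occupancy = [0.0] * horizon
    for d, demand in enumerate(forecast):
        for k, s in enumerate(survival):
            occupancy[d + k] += demand * s
    return occupancy


# Illustrative numbers only: three days of forecast deliveries, and a dwell
# distribution in which half of all packages are picked up on the same day.
forecast = [10, 20, 10]
dwell_probs = [0.5, 0.25, 0.25]
print(expected_occupancy(forecast, dwell_probs))
```

With these made-up inputs, the expected peak is 25 slots on the second day. Holding a slot for the maximum two-day dwell for every package would instead require 40 slots on the third day, which is the kind of wasted space Sethuraman describes.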
She turns to locker demand forecasting to address this discrepancy. Lockers typically have between 30 and 150 available slots. While demand is often sparse for unpopular shipping options like next-day shipping, other options, such as same-day or two-day shipping, experience much higher demand. Because the number of Amazon lockers nearly doubled in 2018, most lockers have only a few months of data. This lack of data presents a problem for traditional time-series techniques. For example, existing methodology assumes a linear relationship between demand for home versus locker delivery. However, these are not linearly dependent because demand varies significantly based on location; suburban customers are much more likely than city-dwellers to opt for home delivery.
A new methodology called random forest regression works well with sparse data and allows for additional features that un-constrain the demand. One such feature is last-accepted order time, which reveals how many customers Amazon has granted or denied locker access. “For many time series models, we need to have some data that is unconstrained,” Sethuraman said. “But we are always constrained by capacity in high-demand lockers.” Random forest regression offers an 80 percent forecast accuracy over the existing methodology.
Sethuraman uses a dwell time probability estimate and considers shipping speed and delivery rate to approximate the probability that a package stays in the locker for zero, one, two, three, etc. days after delivery. Current methods fall short here as well due to sparse data and changing demographics. “The major challenge we face for most lockers is sparse data,” Sethuraman said. “The existing methodology is proportional to historical dwell likelihood and often results in overfitting due to this sparse data.” For instance, if a locker's only recorded pickups occurred on the first day and the sixth day, the calculated chance of a package getting picked up on each of those days becomes 50 percent, with a zero percent chance of pickup on the remaining days. This is clearly not accurate. In comparison, random forest classification can handle sparse datasets and only requires historical dwell times as an input. It can thus account for situations like weekend office building closures, which may prevent pickup from those lockers. The model output reserves certain spaces and allocates them based on shipping preferences, leaving a few reserved lockers free for use on a first-come, first-served basis.
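The overfitting in the first-day and sixth-day example can be reproduced with a short Python sketch. It assumes the existing methodology amounts to taking historical pickup proportions, as the quote suggests; the function name `empirical_pickup_probs` and the six-day window are hypothetical choices made for illustration.

```python
from collections import Counter


def empirical_pickup_probs(pickup_days, max_day=6):
    """Estimate pickup probabilities as raw historical proportions.

    pickup_days -- day (1..max_day) on which each past package was picked up
    """
    counts = Counter(pickup_days)
    total = len(pickup_days)
    # Days with no recorded pickups get probability zero, which is the
    # source of the overfitting when the history is sparse.
    return {day: counts[day] / total for day in range(1, max_day + 1)}


# A locker whose entire history is two pickups: one on day 1, one on day 6.
probs = empirical_pickup_probs([1, 6])
print(probs)
```

Two observations are enough to assign all of the probability to days one and six and none to the days in between, which is why a sparse history makes proportional estimates unreliable.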
In short, random forest classification proved more successful at maximizing Amazon locker storage efficiency while preventing overbooking. The new methodology yielded a decrease of up to 16 percent in the number of packages with a one-day prediction error. “The higher the throughput, the more number of customers that are satisfied,” Sethuraman said. “We want to serve as many customers as possible.”