Exploring Inventory Optimization with Machine Learning - Deep Q Network: A Beginner's Journey
Meet Mr. Frost (our fictional character), the proud owner of "Fresh Ice Cream," an ice cream shop known for its delicious, freshly churned treats.
Recently, Mr. Frost wanted to improve his forecast and inventory optimization to maximize his profits.
He started studying machine learning, a branch of artificial intelligence that develops algorithms and techniques that let computers learn from data and make predictions or decisions without being explicitly programmed.
He decided to start with Reinforcement Learning after reading Mr. Xie's article (https://towardsdatascience.com/a-reinforcement-learning-based-inventory-control-policy-for-retailers-ac35bc592278). In Reinforcement Learning, an agent learns by interacting with an environment: it takes actions and receives feedback in the form of rewards or penalties. Deep reinforcement learning extends traditional reinforcement learning by using deep neural networks to learn complex representations of the environment, which helps on problems whose state space is too large for a lookup table.
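To make the reward feedback loop concrete, here is a minimal sketch of a single tabular Q-learning update, the lookup-table method that Deep Q Networks build on. The function name, table size, and the values of `alpha` and `gamma` are illustrative assumptions and do not come from Mr. Xie's article or from Mr. Frost's code.

```python
import numpy as np

def q_learning_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    best_next = np.max(q_table[next_state])            # value of the best next action
    td_error = reward + gamma * best_next - q_table[state, action]
    q_table[state, action] += alpha * td_error         # move the estimate toward the target
    return td_error

# Toy example: 3 states, 2 actions, all estimates start at zero.
q = np.zeros((3, 2))
err = q_learning_update(q, state=0, action=1, reward=10.0, next_state=2)
```

A Deep Q Network replaces the table `q_table` with a neural network that estimates the same values from the state.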
Mr. Frost meticulously analyzed the demand patterns for his ice cream. From Monday to Thursday, the demand follows a normal distribution N(3, 1.5), while Friday sees a surge following distribution N(6, 1). The weekend brings even higher demand, with Saturday and Sunday following N(12, 2) distributions. He was amazed that the patterns were the same as the ones in Mr. Xie's article. (I used the same Deep Q Network Python code parameters as Mr. Xie; however, I added the shelf life challenge and used a time-based ordering system instead of one based on stock amount.)
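The demand pattern above can be sketched in a few lines of NumPy. This is only an illustration: it assumes the second parameter of each N(mean, x) is a standard deviation, that demand is rounded to whole units, and that negative draws are clipped to zero. The names `DEMAND_PARAMS` and `sample_demand` are hypothetical.

```python
import numpy as np

# (mean, std) per day, Monday = index 0 ... Sunday = index 6.
# Treating the second number as a standard deviation is an assumption.
DEMAND_PARAMS = [(3, 1.5)] * 4 + [(6, 1)] + [(12, 2)] * 2

def sample_demand(n_weeks, seed=None):
    """Return an (n_weeks, 7) integer array of simulated daily demand."""
    rng = np.random.default_rng(seed)
    means = np.array([m for m, _ in DEMAND_PARAMS], dtype=float)
    stds = np.array([s for _, s in DEMAND_PARAMS], dtype=float)
    raw = rng.normal(means, stds, size=(n_weeks, 7))   # one row per week
    return np.maximum(0, np.rint(raw)).astype(int)     # whole units, never negative

demand = sample_demand(n_weeks=13, seed=0)             # roughly one quarter
```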
Mr. Frost's Current Strategy: After conducting numerous simulations using Excel, Mr. Frost concluded that ordering ice cream once a week, specifically 42 units per week, aligned with the average weekly demand. To prevent wastage due to the ice cream's short 7-day shelf life, he set out to optimize his inventory management.
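The 42-unit figure is the sum of the daily means (4 × 3 + 6 + 2 × 12 = 42). The sketch below checks that arithmetic and counts units sold, scrapped, and lost under a once-a-week order. It is not Mr. Frost's Excel model: it assumes the order arrives before the first day of each week and that anything left after seven days is scrapped, and the function name `simulate_weekly_order` is hypothetical.

```python
def simulate_weekly_order(weekly_demands, order_qty=42):
    """Count units sold, scrapped and lost when ordering once per week.

    weekly_demands: list of weeks, each a list of 7 daily demand values.
    Assumes the order arrives before day 1 and leftovers expire after day 7.
    """
    sold = scrapped = lost = 0
    for week in weekly_demands:
        stock = order_qty
        for demand in week:
            served = min(stock, demand)
            sold += served
            lost += demand - served    # unmet demand is a lost sale
            stock -= served
        scrapped += stock              # 7-day shelf life: leftovers are thrown away
    return {"sold": sold, "scrapped": scrapped, "lost": lost}

daily_mean = [3, 3, 3, 3, 6, 12, 12]   # Monday to Sunday
weekly_mean = sum(daily_mean)          # average weekly demand
result = simulate_weekly_order([daily_mean], order_qty=weekly_mean)
```

Because demand varies around its mean, a real week will usually end with either scrap or lost sales, which is the gap the learned policy tries to close.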
Key Parameters:
This strategy was yielding around 1100 dollars per quarter for Mr. Frost. Could the machine learning algorithm improve his profits?
The Python Code: Mr. Frost's journey led him to develop Python code implementing the Deep Q Network to calculate profits for each interaction. His code incorporated considerations for shelf life, ensuring efficient utilization of inventory while minimizing waste.
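For readers new to Deep Q Networks, the core training signal is the temporal-difference target that the network is trained to match. The sketch below computes that target for a batch of transitions with plain NumPy. It illustrates the standard DQN target only; it is not an excerpt from Mr. Frost's code, and the function name and `gamma` value are assumptions.

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute r + gamma * max_a Q(s', a) for a batch; no bootstrap on terminal steps.

    next_q_values: array of shape (batch, n_actions), as predicted for the next states.
    dones: 1 where the episode ended on that transition, else 0.
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q_values = np.asarray(next_q_values, dtype=float)
    dones = np.asarray(dones, dtype=float)
    return rewards + gamma * next_q_values.max(axis=1) * (1.0 - dones)

targets = dqn_targets(
    rewards=[1.0, 2.0],
    next_q_values=[[0.0, 10.0], [5.0, 3.0]],
    dones=[0, 1],
    gamma=0.5,
)
```

In the inventory setting, the reward is the period's profit and each action is an order decision.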
Results and Insights: Training the model over 1000 iterations yielded promising results. The Reinforcement Learning model demonstrated an average profit of $2500, a significant improvement over Mr. Frost's original method, which yielded $1200 in profits.
Looking Ahead: Impressed by the outcomes, Mr. Frost is now considering further exploration into Machine Learning to refine his inventory policies. He envisions extending this project to benefit not only his ice cream parlor but also other ice cream shops seeking optimization.
As described above, I changed the initial parameters and created a new time-based inventory replenishment technique. I believe the most difficult part to rewrite would be the shelf life logic, so I am adding it below:
# Method of the environment class; requires `import numpy as np` at module level.
def step(self, action):
    # Place an order; y = 1 means the fixed order cost is charged this period.
    if action > 0:
        y = 1
        self.order_arrival_list.append([self.current_period + self.lead_time, action])
    else:
        y = 0
    # Receive the oldest outstanding order if it is due today (capped at capacity).
    if len(self.order_arrival_list) > 0:
        if self.current_period == self.order_arrival_list[0][0]:
            self.inv_level = min(self.capacity, self.inv_level + self.order_arrival_list[0][1])
            self.order_arrival_list.pop(0)
    # Serve demand and compute the reward for this period.
    demand = self.demand_list[self.current_period - 1]
    units_sold = demand if demand <= self.inv_level else self.inv_level
    reward = units_sold * self.unit_price - self.holding_cost * self.inv_level - y * self.fixed_order_cost
    self.inv_level = max(0, self.inv_level - demand)
    # Inventory position = stock on hand + orders still in transit.
    self.inv_pos = self.inv_level
    if len(self.order_arrival_list) > 0:
        for i in range(len(self.order_arrival_list)):
            self.inv_pos += self.order_arrival_list[i][1]
    self.day_of_week = (self.day_of_week + 1) % 7
    self.state = np.array([self.inv_pos] + self.convert_day_of_week(self.day_of_week))
    self.current_period += 1
    self.state_list.append(self.state)
    self.action_list.append(action)
    self.reward_list.append(reward)
    # First week: record each day's closing stock and accumulate weekly sales.
    if self.day_of_week == 0 and self.count_days < 7:
        self.sun_stock = self.inv_level
        self.inv_w_scrap = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales += units_sold  # fixed: was `+= self.week_sales + units_sold`, which doubled the total
        self.count_days += 1
    elif self.day_of_week == 1 and self.count_days < 7:
        self.mon_stock = self.inv_level
        self.inv_w_scrap = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales = self.week_sales + units_sold
        self.count_days += 1
    elif self.day_of_week == 2 and self.count_days < 7:
        self.tue_stock = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales = self.week_sales + units_sold
        self.count_days += 1
    elif self.day_of_week == 3 and self.count_days < 7:
        self.wed_stock = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales += units_sold
        self.count_days += 1
    elif self.day_of_week == 4 and self.count_days < 7:
        self.thu_stock = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales += units_sold
        self.count_days += 1
    elif self.day_of_week == 5 and self.count_days < 7:
        self.fri_stock = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales += units_sold
        self.count_days += 1
    elif self.day_of_week == 6 and self.count_days < 7:
        self.sat_stock = self.inv_level
        self.units_sold_day[self.day_of_week] = units_sold
        self.week_sales += units_sold
        self.count_days += 1
    # After 1 week, start calculating scrap: stock recorded 7 days ago that was
    # not covered by the last 7 days of sales has expired.
    if self.day_of_week == 0 and self.count_days >= 7 and self.sun_stock > self.week_sales:
        self.scrap_qty = self.sun_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)  # remove scrap from inventory
        self.sun_stock = self.inv_pos
    elif self.day_of_week == 0 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.sun_stock = self.inv_level  # New day stock. TODO: fix this in the future, or calculate inv_level with the scrap removed
    if self.day_of_week == 1 and self.count_days >= 7 and self.mon_stock > self.week_sales:
        self.scrap_qty = self.mon_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)
        self.mon_stock = self.inv_pos
    elif self.day_of_week == 1 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.mon_stock = self.inv_level  # New day stock (same TODO as above)
    if self.day_of_week == 2 and self.count_days >= 7 and self.tue_stock > self.week_sales:
        self.scrap_qty = self.tue_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)
        self.tue_stock = self.inv_pos
    elif self.day_of_week == 2 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.tue_stock = self.inv_level  # New day stock (same TODO as above)
    if self.day_of_week == 3 and self.count_days >= 7 and self.wed_stock > self.week_sales:
        self.scrap_qty = self.wed_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)
        self.wed_stock = self.inv_pos
    elif self.day_of_week == 3 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.wed_stock = self.inv_level  # New day stock (same TODO as above)
    if self.day_of_week == 4 and self.count_days >= 7 and self.thu_stock > self.week_sales:
        self.scrap_qty = self.thu_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)
        self.thu_stock = self.inv_pos
    elif self.day_of_week == 4 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.thu_stock = self.inv_level  # New day stock (same TODO as above)
    if self.day_of_week == 5 and self.count_days >= 7 and self.fri_stock > self.week_sales:
        self.scrap_qty = self.fri_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)
        self.fri_stock = self.inv_pos
    elif self.day_of_week == 5 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.fri_stock = self.inv_level  # New day stock (same TODO as above)
    if self.day_of_week == 6 and self.count_days >= 7 and self.sat_stock > self.week_sales:
        self.scrap_qty = self.sat_stock - self.week_sales  # Amount scrapped
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.inv_pos = max(0, self.inv_pos - self.scrap_qty)
        self.sat_stock = self.inv_pos
    elif self.day_of_week == 6 and self.count_days >= 7:
        self.week_sales = self.week_sales + units_sold - self.units_sold_day[self.day_of_week]  # last 7 days sales updated
        self.sat_stock = self.inv_level  # New day stock (same TODO as above)
    # The episode ends once all periods have been simulated.
    if self.current_period > self.n_period:
        terminate = True
    else:
        terminate = False
    return self.state, reward, terminate
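The listing tracks shelf life by comparing each weekday's recorded stock with the last seven days of sales. A different and more compact way to model the same 7-day limit is to keep stock as first-in, first-out batches with an age counter. The sketch below is an alternative technique, not the logic used in this article, and the class `PerishableStock` and its method names are hypothetical.

```python
from collections import deque

class PerishableStock:
    """FIFO stock in which each received batch expires after `shelf_life` days."""

    def __init__(self, shelf_life=7):
        self.shelf_life = shelf_life
        self.batches = deque()          # entries are [age_in_days, quantity], oldest first

    @property
    def level(self):
        return sum(qty for _, qty in self.batches)

    def receive(self, qty):
        if qty > 0:
            self.batches.append([0, qty])

    def sell(self, demand):
        """Serve demand from the oldest batches first; return units actually sold."""
        remaining = demand
        while remaining > 0 and self.batches:
            batch = self.batches[0]
            taken = min(batch[1], remaining)
            batch[1] -= taken
            remaining -= taken
            if batch[1] == 0:
                self.batches.popleft()
        return demand - remaining

    def end_of_day(self):
        """Age every batch by one day and return the quantity scrapped."""
        scrapped = 0
        for batch in self.batches:
            batch[0] += 1
        while self.batches and self.batches[0][0] >= self.shelf_life:
            scrapped += self.batches.popleft()[1]
        return scrapped

stock = PerishableStock(shelf_life=7)
stock.receive(42)
units_sold = stock.sell(3)
scrap_today = stock.end_of_day()
```

Inside an environment's `step`, `end_of_day()` would supply the scrap quantity and `level` the on-hand inventory after expiry.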
Currently I'm very into machine learning and the power of data... discovering a new way of predicting through patterns and creating algorithms.
Nice one Bruno Fink !
Wow Bruno, this is impressive! Do share your thinking more with the team ! Very interesting!