Agrawal, S., Avadhanula, V., Goyal, V., & Zeevi, A. (2017, June). Thompson sampling for the MNL-bandit. In Conference on Learning Theory (pp. 76-78). PMLR.
Agrawal, S., Avadhanula, V., Goyal, V., & Zeevi, A. (2019). MNL-bandit: A dynamic learning approach to assortment selection. Operations Research, 67(5), 1453-1485.
Caro, F., & Gallien, J. (2007). Dynamic assortment with demand learning for seasonal consumer goods. Management Science, 53(2), 276-292.
Chapelle, O., & Li, L. (2011). An empirical evaluation of Thompson sampling. Advances in Neural Information Processing Systems, 24, 2249-2257.
Chen, B., & Chao, X. (2020). Dynamic inventory control with stockout substitution and demand learning. Management Science, 66(11), 5108-5127.
Chen, W., Wang, Y., & Yuan, Y. (2013, February). Combinatorial multi-armed bandit: General framework and applications. In International Conference on Machine Learning (pp. 151-159). PMLR.
Chen, X., & Wang, Y. (2018). A note on a tight lower bound for capacitated MNL-bandit assortment selection models. Operations Research Letters, 46(5), 534-537.
Gopalan, A., Mannor, S., & Mansour, Y. (2014, January). Thompson sampling for complex online problems. In International Conference on Machine Learning (pp. 100-108). PMLR.
Kök, A. G., & Fisher, M. L. (2007). Demand estimation and assortment optimization under substitution: Methodology and application. Operations Research, 55(6), 1001-1021.
Mahajan, S., & van Ryzin, G. J. (1999). Retail inventories and consumer choice. In Quantitative models for supply chain management (pp. 491-551). Springer, Boston, MA.
McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Frontiers in econometrics (pp. 105-142). Academic Press, New York.
Powell, W. B. (2019). A unified framework for stochastic optimization. European Journal of Operational Research, 275(3), 795-821.
Rusmevichientong, P., Shen, Z. J. M., & Shmoys, D. B. (2010). Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Operations Research, 58(6), 1666-1680.
Thompson, W. R. (1933). On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3/4), 285-294.