Reinforcement Learning from Scarce Experience via Policy Search