Train your robot to learn your coffee preferences!
Controls how strongly the robot adjusts its value estimates after each piece of feedback.
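
As a rough illustration, here is a minimal sketch of one common way such a setting can work, assuming an incremental update in which the estimate moves toward the feedback by a fraction of the gap. The names `update_value` and `learning_rate`, and the encoding of "liked" as a reward of 1.0, are hypothetical and not taken from the app itself.

```python
def update_value(value: float, reward: float, learning_rate: float) -> float:
    """Move the estimate toward the observed reward by a fraction learning_rate."""
    return value + learning_rate * (reward - value)


# Hypothetical example: the estimate starts at 0.0 and the user liked the coffee,
# which is assumed here to be encoded as a reward of 1.0.
small_step = update_value(0.0, 1.0, learning_rate=0.1)
large_step = update_value(0.0, 1.0, learning_rate=0.9)
```

Under this assumed rule, a small setting changes the estimate only slightly per answer, while a large setting lets a single answer nearly overwrite it.
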
Controls the balance between exploring and exploiting. Higher values make the robot pick its current best option more often and try other options less.
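
One setting that behaves this way is the inverse temperature of a softmax choice rule. The sketch below assumes that interpretation; the app may use a different rule, and `choice_probabilities`, `beta`, and the example values are hypothetical.

```python
import math


def choice_probabilities(values: list[float], beta: float) -> list[float]:
    """Softmax over value estimates; beta is an assumed inverse-temperature setting."""
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical value estimates for three coffee options.
values = [0.2, 0.5, 0.8]
low_beta = choice_probabilities(values, beta=0.0)    # every option equally likely
high_beta = choice_probabilities(values, beta=10.0)  # mass concentrates on the best option
```

With `beta` at zero the choice is uniform, and as `beta` grows the highest-valued option is chosen almost every time.
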
The robot chose:
Did you like the coffee?