|
Code number : 2 |
|
Presenter : Jooyoung Park (박주영) |
|
Affiliation : Korea University |
|
Department : Control and Instrumentation Engineering |
|
Position : Professor |
|
Session time : |
|
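Presenter biography : |
1993 - present : Professor, Department of Control and Instrumentation Engineering, Korea University
1992 : Ph.D., Electrical and Computer Engineering, University of Texas at Austin
1983 : Bachelor's degree, Electrical Engineering, Seoul National University |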
|
|
Lecture summary : |
Deep reinforcement learning is one of the most actively researched areas of modern artificial intelligence. It is advancing rapidly as reinforcement learning, control theory, and deep learning combine to produce synergistic effects.
This lecture reviews the main topics that make up the past and present of deep reinforcement learning, including Controlled Ito Process, Stochastic Optimal Control, Hamilton-Jacobi-Bellman Equation, Markov Decision Process, Model-based & Model-free Reinforcement Learning, Deep Learning, and AlphaGo Zero, and considers the direction of related future technologies. |
|
|
Online venue : |
|
|