Policy Iterations Without Selection Property

10 pages • Published: October 11, 2018

Abstract

In this paper, we propose a modified policy iterations algorithm that does not rely on the selection property. The selection property is the key argument for guaranteeing an improvement at each step of policy iterations: a new policy is computed as an optimal solution of a minimization problem. However, in some cases it may be difficult to prove that an optimal solution exists. To overcome this issue, the new policy is instead computed as a guaranteed sub-optimal solution of the minimization problem. A suitable choice of the perturbation parameters preserves the advantages of the original policy iterations algorithm, such as the computation of a post-fixed point at each step and convergence to a fixed point.
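The idea can be illustrated on a toy problem. The sketch below is not the algorithm from the paper; it is a minimal scalar example built on assumptions chosen for illustration. The policy set is the interval (0, 1], the policy map is f(a, x) = 0.5*x + 1 + a, and F(x) is its infimum over a, which is never attained, so the selection property fails. A post-fixed point is taken here to mean F(x) <= x. The rule eps = (x - F(x)) / 2 for the perturbation parameter is a hypothetical choice that happens to keep the iterates decreasing in this example.

```python
def F(x):
    # Infimum over a in (0, 1] of f(a, x); no policy a attains it.
    return 0.5 * x + 1.0


def f(a, x):
    # Policy map for a fixed policy a (illustrative assumption).
    return 0.5 * x + 1.0 + a


def fixed_point(a):
    # Unique solution of x = f(a, x) for this affine toy map.
    return 2.0 * (1.0 + a)


def policy_iterations(a0=1.0, tol=1e-9, max_iter=200):
    """Policy iterations with eps-suboptimal policy selection (toy sketch)."""
    a = a0
    xs = []
    for _ in range(max_iter):
        x = fixed_point(a)      # value step: solve x = f(a, x)
        xs.append(x)
        gap = x - F(x)          # non-negative when x is a post-fixed point
        if gap <= tol:
            break
        eps = gap / 2.0         # hypothetical perturbation rule
        # Improvement step: any a with f(a, x) <= F(x) + eps is acceptable.
        # For this toy map, a = eps satisfies it with equality.
        a = min(eps, 1.0)
    return xs


if __name__ == "__main__":
    xs = policy_iterations()
    print(len(xs), xs[0], xs[-1])
```

In this example every iterate satisfies F(x) <= x, the iterates decrease strictly, and they approach 2, the fixed point of F, even though no single policy achieves the infimum.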

Keyphrases: approximated and guaranteed optimal solutions, policy iterations, verification

In: Matthieu Martel, Nasrine Damouche and Julien Alexandre Dit Sandretto (editors). TNC'18. Trusted Numerical Computations, vol 8, pages 1-10.

Links:	https://easychair.org/publications/paper/LNf8
	https://doi.org/10.29007/9rn9

BibTeX entry

@inproceedings{TNC'18:Policy_Iterations_Without_Selection,
  author    = {Assale Adje},
  title     = {Policy Iterations Without Selection Property},
  booktitle = {TNC'18. Trusted Numerical Computations},
  editor    = {Matthieu Martel and Nasrine Damouche and Julien Alexandre Dit Sandretto},
  series    = {Kalpa Publications in Computing},
  volume    = {8},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2515-1762},
  url       = {https://easychair.org/publications/paper/LNf8},
  doi       = {10.29007/9rn9},
  pages     = {1--10},
  year      = {2018}}
