Machine learning tree trimming for faster Markov reward game solutions

dc.authorid: Özkaya, Murat / 0000-0001-7241-4710
dc.contributor.author: İzgi, Burhaneddin
dc.contributor.author: Özkaya, Murat
dc.contributor.author: Üre, Nazım Kemal
dc.contributor.author: Perc, Matjaz
dc.date.accessioned: 2026-02-03T12:02:42Z
dc.date.available: 2026-02-03T12:02:42Z
dc.date.issued: 2025
dc.department: Çanakkale Onsekiz Mart Üniversitesi
dc.description.abstract: Existing methods for solving Markov reward games mostly rely on state-action frameworks and iterative algorithms. However, these approaches often impose significant computational burdens, particularly when applied to large-scale games, due to their inherent complexity and the need for extensive iterative calculations. In this paper, we propose a new neural network architecture for solving Markov reward games in the form of a decision tree with relatively large state and action sets, such as 2-actions-3-stages, 3-actions-3-stages, and 4-actions-3-stages, by trimming the decision tree. In this context, we generate datasets of Markov reward games with sizes ranging from 10^3 to 10^5 using the holistic matrix norm-based solution method and obtain the necessary components, such as the payoff matrices and the corresponding solutions of the games, for training the neural network. We then propose a vectorization process to prepare the outcomes of the matrix norm-based solution method and adapt them for training the proposed neural network. The neural network is trained using both the vectorized payoff and transition matrices as input, and the prediction system generates the optimal strategy set as output. In the model, we approach the problem as a classification task by labeling the optimal and non-optimal branches of the decision tree with ones and zeros, respectively, to identify the most rewarding paths of each game. The result is a neural network architecture that solves Markov reward games in real time, which enhances its practicality for real-world applications. The results reveal that the system efficiently predicts the optimal paths for each decision tree, with F1-scores slightly greater than 0.99, 0.99, and 0.97 for Markov reward games with 2-actions-3-stages, 3-actions-3-stages, and 4-actions-3-stages, respectively.
dc.description.sponsorship: Scientific and Technological Research Council of Turkey, Turkey [121E394]
dc.description.sponsorship: Slovenian Research and Innovation Agency [P1-0403]
dc.description.sponsorship: This work is supported by the Scientific and Technological Research Council of Turkey, Turkey (in Turkish: TUBITAK) under grant agreement 121E394. M.P. was supported by the Slovenian Research and Innovation Agency (Javna agencija za znanstvenoraziskovalno in inovacijsko dejavnost Republike Slovenije) (Grant No. P1-0403). The authors would like to thank the anonymous referees and the editor for their valuable suggestions and comments that helped improve the article's content.
dc.identifier.doi: 10.1016/j.jocs.2025.102726
dc.identifier.issn: 1877-7503
dc.identifier.issn: 1877-7511
dc.identifier.scopus: 2-s2.0-105019211828
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.1016/j.jocs.2025.102726
dc.identifier.uri: https://hdl.handle.net/20.500.12428/34831
dc.identifier.volume: 92
dc.identifier.wos: WOS:001595259200001
dc.identifier.wosquality: Q1
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.ispartof: Journal of Computational Science
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WOS_20260130
dc.subject: Machine learning
dc.subject: Convolutional neural networks
dc.subject: Matrix norm-based method
dc.subject: Markov decision process
dc.subject: Markov reward games
dc.title: Machine learning tree trimming for faster Markov reward game solutions
dc.type: Article
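
The abstract describes labeling optimal and non-optimal branches of the decision tree with ones and zeros, vectorizing the matrices for network input, and scoring predictions with the F1-score. The following is a minimal Python sketch of those three steps for a 2-actions-3-stages tree. It is not the authors' implementation: the additive `path_reward` function, the `stage_rewards` values, and all function names are hypothetical stand-ins, and the matrix norm-based solver and the neural network itself are not reproduced here.

```python
from itertools import product


def enumerate_paths(n_actions, n_stages):
    """List every root-to-leaf action sequence of the decision tree."""
    return list(product(range(n_actions), repeat=n_stages))


def path_reward(path, stage_rewards):
    """Toy reward (assumption): sum of per-stage rewards along the path."""
    return sum(stage_rewards[stage][action] for stage, action in enumerate(path))


def label_branches(rewards):
    """Mark maximal-reward paths with 1 and all other paths with 0."""
    best = max(rewards)
    return [1 if r == best else 0 for r in rewards]


def vectorize(matrices):
    """Flatten a list of matrices (lists of rows) into one input vector."""
    return [x for matrix in matrices for row in matrix for x in row]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (1) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 2-actions-3-stages game: stage_rewards[stage][action].
stage_rewards = [[1, 3], [2, 0], [5, 4]]
paths = enumerate_paths(n_actions=2, n_stages=3)
rewards = [path_reward(p, stage_rewards) for p in paths]
labels = label_branches(rewards)        # classification target per branch
features = vectorize([stage_rewards])   # flat input vector for a model
```

With these toy values the tree has 2^3 = 8 branches, and exactly one of them (actions 1, 0, 0) receives the label 1. A 4-actions-3-stages tree would have 4^3 = 64 branches under the same enumeration.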

Files

Original bundle
Name:
Murat Ozkaya_Makale.pdf
Size:
2.03 MB
Format:
Adobe Portable Document Format