Java : เขียน AI ด้วย Minimax Algorithm – 1

Minimax Algorithm คือขั้นตอนวิธีในการหลีกเลี่ยงโอกาสที่จะทำให้เกิดความสูญเสียมากที่สุดในการเล่นเกมเชิงตรรกะที่มีผู้เล่นสองคน เช่นหมากรุก หมากฮอส หรือ โอเอกซ์ โดยมีเป้าหมายเพื่อให้ผู้เล่น A สามารถเลือกเส้นทางที่มีโอกาสมากที่สุดที่จะทำให้ผู้เล่น B ได้เปรียบน้อยที่สุดในแต่ละรอบของการเล่น โดยในขั้นตอนวิธีนี้ผู้เล่น A จะถูกเรียกว่าผู้เล่นหาค่าสูงสุด ส่วนผู้เล่น B จะถูกเรียกว่าผู้เล่นหาค่าต่ำสุด เพราะว่าตัวแปรของค่าเสียโอกาสจะเพิ่มขึ้นเมื่อผู้เล่น A ได้เปรียบ และจะลดลงเมื่อผู้เล่น B ได้เปรียบตามทฤษฎีเกมแบบ Combinatorial Game Theory ของจอห์น ฮอร์ตัน คอนเวย์ (John Horton Conway)

ที่มาของขั้นตอนวิธีมาจากทฤษฎีเกมของจอห์น ฟอน นอยมันน์ (John von Neumann) ซึ่งเกิดจากการศึกษาเกมผลรวมเป็นศูนย์ หรือ เกมที่ต้องมีแพ้-ชนะ ( zero–sum game) เพื่อหาวิธีการลดความเสี่ยงจากการแพ้

หลักการและวิธีค้นหาใช้หลักการของการค้นแบบจำกัดความลึก (หมายถึงจำกัดความลึกของการจำลองการเล่นในกรณีที่ทางเลือกมีมากมหาศาล เช่น การเล่นหมากรุก) โดยเก็บค่าตัวแปรของค่าเสียโอกาสไว้ในแต่ละโหนด (node) ของโครงสร้างการค้นหาแบบต้นไม้ (Search Tree) ค่ามากกว่าบ่งบอกว่าผู้เล่น A ได้เปรียบผู้เล่น B อยู่ ในทางกลับกันค่าติดลบก็บ่งบอกว่าผู้เล่น B ได้เปรียบผู้เล่น A อยู่นั่นเอง (ที่มา : https://th.wikipedia.org/wiki/ขั้นตอนวิธีมินิแมกซ์)

จากรูปภาพด้านล่างเป็นการจำลองการเล่นที่ลึกลงไป 4 ระดับโดยเริ่มต้นที่ระดับชั้น 0 ซึ่ง A เป็นผู้เล่น เมื่อผู้เล่น A เล่นแล้วจะผลลัพธ์ส่งลงไปยังระดับถัดไป เกิดเป็นกิ่งสาขาของโครงสร้างการค้นหาแบบต้นไม้ตามจำนวนวิธีที่สามารถทำได้ และสลับไปเป็นรอบของผู้เล่น B ซึ่งเป็นผู้เล่นในระดับความลึกที่ 1 การเล่นจะสลับกันไปแบบนี้จนพบการชนะ หรือ เสมอ หรือถึงระดับชั้นที่กำหนดไว้ซึ่งในที่นี่คือความลึกในระดับที่ 4 โหนดสุดท้ายในระดับที่ 4 ซึ่งเป็นผลลัพธ์จากการเล่นโดยผู้เล่น B และเป็นรอบของผู้เล่น A ที่นี้สมมุติว่ามีการรู้ผลแพ้ชนะแล้วจะทำการคำนวนค่าเสียโอกาสของแต่ละผลลัพธ์และส่งค่าเสียโอกาสที่คำนวนได้ของทุกโหนดย้อนกลับมาที่ระดับบนซึ่งก็คือระดับที่ 3 ตามเส้นทางของกิ่งของโครงสร้างการค้นหาแบบต้นไม้ของแต่ละโหนก (จากรูป ลูกศรสีแดงหมายถึงค่าที่ถูกเลือกจากค่าทั้งหมดที่ถูกส่งขึ้นมาตามลำดับชั้น) ซึ่งในระดับ 3 เป็นสิทธ์การเล่นโดย ผู้เล่น B ดังนั้นผู้เล่น B ซึ่งเป็นผู้เล่นที่หาค่าต่ำสุดจะเลือกคะแนนที่ต่ำที่สุดที่ส่งต่อขี้นไปยังระดับบนซึ่งในที่นี้ระดับที่ 2 ในระดับที่ 2 การเลือกที่ส่งมาจากผู้เล่น B จะถูกเลือกโดยผู้เล่น A ซึ่งผู้เล่นที่หาค่าสูงสุดก็จะเลือกคะแนนที่สูงที่สุดที่ถูกส่งขี้นมาและสลับกันไปแบบนี้จะถึงระดับชั้น 0 ซึ่งผู้เล่น A จะเลือกค่าสูงสุดที่ถูกส่งขึ้นมา

เพื่อให้เห็นภาพในรูปแบบการเล่นโอเอกซ์ เราสามารถแสดงโครงสร้างได้ดังภาพด้านล่าง จากสถานะล่าสุดยังมีที่ว่างอยู่ 3 ที่ซึ่งผู้เล่น A จะเลือกเล่นได้ 3 แบบ เกิดเป็นกิ่งสาขาของโครงสร้างการค้นหาแบบต้นไม้ขึ้นมา 3 กิ่งสาขา จากนั้นเป็นรอบของผู้เล่น B ซึ่งเลือกเล่นได้ 2 แบบเพราะเหลือช่องว่างแค่ 2 ช่องจึงเกิดกิ่งสาขาแยกออกมาอีก 2 กิ่งสาขาจากกิ่งสาขาบน และเมื่อมาถึงรอบของผู้เล่น A เมื่อเล่นแล้วก็จะไม่มีช่องว่างเหลือแล้วจึงมีการคำนวนค่าเสียโอกาสและส่งกลับต่อๆกันขึ้นมา

จากความเข้าใจด้านบนเราสามารถนำเอา Minimax Algorithm มาใช้ในการเล่นเกมส์ Tic Tac To หรือที่เรามักจะเรียกมันว่าเกมโอเอกซ์ ได้ดังตัวอย่างด้านล่าง ตัวอย่างโปรแกรมนี้จะกำหนดให้คอมพิวเตอร์เป็นผู้เล่นก่อนด้วยตัว X และมนุษย์ใช้ O โดยไม่มีการดักจับข้อผิดพลาด เช่น เลือกตาเดินที่ใช้ไปแล้ว หรือ กำหนดค่าที่เป็นไปไม่ได้ ทั้งนี้เพื่อให้โปรแกรมสั้นกระชับเพราะเราจะเน้นอธิบายการใช้ Minimax Algorithm เป็นหลัก

เพื่อไม่ให้มันยาวจนเกินไป ติดตามต่อได้ที่ Java : เขียน AI ด้วย Minimax Algorithm – 2

/* To simplify the step and focus on minimax algorithm :
this program will always start with computer(AI) as "X" and human will be "O".
this program does not check correctness selection number from user.
this program does not check available posotion selection.
*/

package com.company;

import java.util.*;

public class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        // prepare game
        String[] board = new String[] {"0", "1", "2", "3", "4", "5", "6", "7", "8"};
        String ai = "X"; // define ai to use X
        String user = "O"; // define human to use O
        System.out.println("please select number to mark");

        // start playing
        boolean loop = true; // flag for game loop
        boolean turn = true; // flag for player turn
        while (loop) {  // game loop until finish (loop = false)
            showGrid(board); // show board status
            if (turn) { // ai turn
                loop = getAndMark(board, ai, scanner); // get position and mark
                turn = false; // turn to other opponent
            } else { // human turn
                loop = getAndMark(board, user, scanner); // get position and mark
                turn = true; // turn to other opponent
            }
        }
    }

    public static void showGrid (String[] board) { // show board
        String[][] grid = new String[3][3];
        int count = 0;
        for (int i = 0; i < grid.length; i++) { // loop to convert flat board to grid
            for (int j = 0; j < grid[0].length; j++) {
                grid[i][j] = board[count];
                count++;
            }
        }
        System.out.println("---------");
        for (int i = 0; i < grid.length; i++) { // loop to show grid
            System.out.print("| ");
            for (int j = 0; j < grid[0].length; j++) {
                System.out.print(grid[i][j]+" ");
            }
            System.out.print("|\n");
        }
        System.out.println("---------");
    }

    public static boolean getAndMark(String[] board, String player, Scanner scanner) { // get position and mark
        boolean loop = true;
        if (player.equals("X")) { // get position from minimax algorithm and mark
            System.out.println("playing by AI");
            int[] bestMove = new int[1];
            minimax(board, player,bestMove);
            board[bestMove[0]] = "X";
            if (winning(board,"X")) { // check if X win then game over
                showGrid(board);
                System.out.println("X win.");
                loop = false;
            }
        } else { // get position and mark from user
            System.out.print("Enter the number: ");
            int move = scanner.nextInt();
            scanner.nextLine();
            board[move] = "O";
            if (winning(board,"O")) { // check if O win then game over
                showGrid(board);
                System.out.println("O win.");
                loop = false;
            }
        }
        int[] available = emptyIndexies(board);
        if (available.length == 0) { // check if no available position then game over
            showGrid(board);
            System.out.println("Draw");
            loop = false;
        }
        return loop; // return boolean as true in case game over
    }

    public static boolean winning(String[]board, String player){ // check if someone win
        if (
                (board[0].equals(player) && board[1].equals(player) && board[2].equals(player)) ||
                        (board[3].equals(player) && board[4].equals(player) && board[5].equals(player)) ||
                        (board[6].equals(player) && board[7].equals(player) && board[8].equals(player)) ||
                        (board[0].equals(player) && board[3].equals(player) && board[6].equals(player)) ||
                        (board[1].equals(player) && board[4].equals(player) && board[7].equals(player)) ||
                        (board[2].equals(player) && board[5].equals(player) && board[8].equals(player)) ||
                        (board[0].equals(player) && board[4].equals(player) && board[8].equals(player)) ||
                        (board[2].equals(player) && board[4].equals(player) && board[6].equals(player))
        ) {
            return true;
        } else {
            return false;
        }
    }

    public static int[] emptyIndexies(String[] board){ // check available position
        int count = 0;
        for (int i = 0; i < board.length; i++) { // count available position
            if (!board[i].equals("X") && !board[i].equals("O")) {
                count++;
            }
        }
        int[] emptyIndexies = new int[count]; // define array with size according to available position
        count = 0;
        for (int i = 0; i < board.length; i++) { // keep position index in array
            if (!board[i].equals("X") && !board[i].equals("O")) {
                emptyIndexies[count] = Integer.parseInt(board[i]);
                count++;
            }
        }
        return emptyIndexies;
    }

    public static int minimax (String[] newBoard, String player, int[] bestMove) { // minimax algorithm
        int[] availSpots = emptyIndexies(newBoard); // define array to keep available position index
        if (winning(newBoard, "O" )){ // check if game over due to someone win or draw and return score
            return -10;
        } else if (winning(newBoard, "X" )){
            return 10;
        } else if (availSpots.length == 0){
            return 0;
        }
        int[][] score = new int[availSpots.length][2]; // define ato keep mapping index and score
        for (int i = 0; i < availSpots.length; i++) { // loop through all index for next simulation move
            score[i][0] = availSpots[i]; // keep index
            newBoard[availSpots[i]] = player; // try mark
            if (player.equals("X")) { // turn player for next simulation move
                player = "O";
            } else {
                player = "X";
            }
            score[i][1] = minimax(newBoard, player, bestMove); // simulate next move in recursive until end path and get score
            if (player.equals("X")) {
                player = "O";
            } else {
                player = "X";
            }
            newBoard[availSpots[i]] = String.valueOf(score[i][0]); // roll back board status for another move
        }
        int bestScore = -1; // define variable tp keep best score
        if (player.equals("X")){
            bestScore = -10000; // assign lowest value as initial value
            for(var i = 0; i < score.length; i++){ // loop check all available index to get +10 score
                if(score[i][1] > bestScore){
                    bestScore = score[i][1];
                    bestMove[0] = score[i][0];
                }
            }
        } else {
            bestScore = 10000; // assign highest value as initial value
            for(var i = 0; i < score.length; i++){ // loop check all index available index to get -10 score
                if(score[i][1] < bestScore){
                    bestScore = score[i][1];
                    bestMove[0] = score[i][0];
                }
            }
        }
        return bestScore;
    }
}

ผลการเล่นจะเป็นเสมอ (draw) ตลอดเนื่องจากตาเดินถูกคำนวนล่วงหน้าแล้วว่าดีที่สุด ดังนั้นเราจะไม่มีทางชนะได้เลย

please select number to mark

| 0 1 2 |
| 3 4 5 |
| 6 7 8 |

playing by AI

| X 1 2 |
| 3 4 5 |
| 6 7 8 |

Enter the number: 4

| X 1 2 |
| 3 O 5 |
| 6 7 8 |

playing by AI

| X X 2 |
| 3 O 5 |
| 6 7 8 |

Enter the number: 2

| X X O |
| 3 O 5 |
| 6 7 8 |

playing by AI

| X X O |
| 3 O 5 |
| X 7 8 |

Enter the number: 3

| X X O |
| O O 5 |
| X 7 8 |

playing by AI

| X X O |
| O O X |
| X 7 8 |

Enter the number: 8

| X X O |
| O O X |
| X 7 O |

playing by AI

| X X O |
| O O X |
| X X O |

Draw

MARUPATNOTE

ใหนๆก็เขียนให้ตัวเองอ่านกันลืมแล้ว เลยเอามาแปะไว้ เผื่อจะเป็นประโยชน์กับผู้อื่นบ้าง

Java : เขียน AI ด้วย Minimax Algorithm – 1

แบ่งปันสิ่งนี้:

Related