iT邦幫忙

2023 iThome Ironman Contest

DAY 27
Software Development

Happily Writing PHPUnit series, part 27

Day 27. Testing the Interaction Between Two Objects - PHPVCR


In the previous post we changed the mock target to ClientInterface. So can we use PHPVCR in place of mocking ClientInterface?

Switching to PHPVCR raises two problems:

  1. The HTML cannot be modified freely.
  2. A board's article list has too many pages.
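
Both constraints follow from how record and replay works. The sketch below is a conceptual illustration only, not php-vcr's actual implementation: `Cassette`, `get()`, `size()`, and the fetch closure are hypothetical names. The first request for a URL goes to the real fetcher and is stored; later requests are answered from the stored copy, so the HTML is frozen at recording time and every page the crawler visits ends up in the recording.

```php
<?php

declare(strict_types=1);

// Hypothetical illustration of record-and-replay; NOT php-vcr's real implementation.
final class Cassette
{
    /** @var array<string, string> recorded responses keyed by URL */
    private array $recordings = [];

    // $fetch performs the real request (an assumption made for this sketch).
    public function __construct(private Closure $fetch)
    {
    }

    public function get(string $url): string
    {
        // Record on the first request, replay on every later one.
        return $this->recordings[$url] ??= ($this->fetch)($url);
    }

    public function size(): int
    {
        return count($this->recordings);
    }
}

$realRequests = 0;
$cassette = new Cassette(function (string $url) use (&$realRequests): string {
    $realRequests++;

    return "<html>{$url} #{$realRequests}</html>";
});

$first = $cassette->get('https://www.ptt.cc/bbs/Gossiping/index.html');
$second = $cassette->get('https://www.ptt.cc/bbs/Gossiping/index.html');
```

Here the second call never reaches the fetch closure, which is why a replayed test always sees exactly the HTML that was recorded.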

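The first problem, that the HTML cannot be modified freely, has no solution. But can we control how many pages of a board's article list are fetched? In the Board code, fetch has a second parameter, so we could change the program to: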

<?php
// src/PttCrawler.php

namespace Recca0120\Ithome30;

use Generator;
use Recca0120\Ithome30\Crawlers\Home;
use Recca0120\Ithome30\Crawlers\Board;

class PttCrawler
{
    public function __construct(private Home $home, private Board $board)
    {
    }

    public function all(): Generator
    {
        foreach ($this->home->all() as $board) {
            // Fetch only one page
            foreach ($this->board->fetch($board, 1) as $paginator) {
                yield $paginator;
            }
        }
    }
}

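But is this reasonable? Of course not! With this change we would have to modify the code again before releasing to production, so we cannot do it this way. What should we do instead? Notice that Board is passed in through injection: we only need to configure Board to fetch just the first page before it is injected.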

So we first adjust the Board code:

<?php

namespace Recca0120\Ithome30\Crawlers;

use Generator;
use GuzzleHttp\Psr7\Request;
use Recca0120\Ithome30\Paginator;
use Psr\Http\Client\ClientInterface;

class Board
{
    // Add a $take property that stores the maximum number of pages to fetch
    public function __construct(private ClientInterface $httpClient, private ?int $take = null)
    {
    }

    public function fetch(array $board): Generator
    {
        $url = $board['url'];

        $page = 0;
        do {
            $page++;

            $html = $this->sendRequest($url);
            $rows = array_map(
                fn (string $row) => $this->parseCols($row, $board),
                $this->parseRows($html)
            );

            yield $paginator = new Paginator($html, $rows, $page);

            if ($this->take !== null && $paginator->currentPage >= $this->take) {
                break;
            }

            $url = $paginator->meta['prev'];
        } while ($paginator->hasMorePage());
    }

    private function sendRequest($url)
    {
        $request = new Request('GET', $url, [
            'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
            'Accept-Encoding' => 'gzip, deflate, br',
            'Accept-Language' => 'zh-TW,zh;q=0.8',
            'Cache-Control' => 'max-age=0',
            'Cookie' => 'over18=1',
            'Referer' => 'https://www.ptt.cc/bbs/Gossiping/index.html',
            'Sec-Ch-Ua' => '"Brave";v="117", "Not;A=Brand";v="8", "Chromium";v="117"',
            'Sec-Ch-Ua-Mobile' => '?0',
            'Sec-Ch-Ua-Platform' => '"macOS"',
            'Sec-Fetch-Dest' => 'document',
            'Sec-Fetch-Mode' => 'navigate',
            'Sec-Fetch-Site' => 'same-origin',
            'Sec-Fetch-User' => '?1',
            'Sec-Gpc' => '1',
            'Upgrade-Insecure-Requests' => '1',
            'User-Agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36',
        ]);
        $response = $this->httpClient->sendRequest($request);
        $html = (string)$response->getBody();

        return $html;
    }

    private function parseCols($row, $board)
    {
        preg_match_all('/<div class="(?<name>(nrec|title|author|date))"[^>]*>(?<value>.*?)<\/div>/s', $row, $matches);

        $cols = [
            'board_name' => $board['name'],
            'board_class' => $board['class'],
        ];

        foreach (array_keys($matches[0]) as $index) {
            $cols[$matches['name'][$index]] = trim($matches['value'][$index]);
        }
        $cols['nrec'] = strip_tags($cols['nrec']);

        preg_match('/href="(.*)"/', $cols['title'], $matched);
        $cols['url'] = 'https://www.ptt.cc' . $matched[1];

        preg_match('/\[(.+)\](.+)/', strip_tags($cols['title']), $matched);
        $cols['type'] = trim($matched[1]);
        $cols['title'] = trim($matched[2]);

        return $cols;
    }

    private function parseRows($html)
    {
        preg_match_all('/class="r-ent">.+<div class="mark">(.+)<\/div>/sU', $html, $matches);

        return $matches[0];
    }
}
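
The effect of `$take` can be tried without any network access. The sketch below reproduces only the loop guard from `fetch()`; `PageWalker` and its in-memory `$pages` array are hypothetical stand-ins for `Board` and the PTT pages, not part of the project.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for Board: walks in-memory pages instead of sending HTTP requests.
final class PageWalker
{
    /** @param list<string> $pages fake page bodies, newest first */
    public function __construct(private array $pages, private ?int $take = null)
    {
    }

    public function fetch(): Generator
    {
        $page = 0;
        foreach ($this->pages as $body) {
            $page++;

            // The page number is used as the key, the body as the value.
            yield $page => $body;

            // Same guard as in Board::fetch(): stop once $take pages were yielded.
            if ($this->take !== null && $page >= $this->take) {
                break;
            }
        }
    }
}

$pages = ['page A', 'page B', 'page C'];

// Without a limit every page is walked; with a limit of 2 the loop stops early.
$unlimited = iterator_to_array((new PageWalker($pages))->fetch());
$limited = iterator_to_array((new PageWalker($pages, 2))->fetch());
```

Leaving `$take` as `null` keeps the original behavior, so existing callers that construct the object with a single argument are unaffected.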

Of course, the test has to be modified accordingly:

<?php

namespace Recca0120\Ithome30\Tests\Crawlers;

use Mockery;
use GuzzleHttp\Client;
use PHPUnit\Framework\TestCase;
use Psr\Http\Client\ClientInterface;
use Recca0120\Ithome30\Crawlers\Board;

class BoardTest extends TestCase
{
    public function test_fetch_board_articles_list()
    {
        \VCR\VCR::turnOn();
        \VCR\VCR::insertCassette('ptt_gossiping.yaml');

        /** @var Mockery\Mock|ClientInterface $httpClient */
        $httpClient = Mockery::spy(new Client());

        $crawler = new Board($httpClient, 2);
        $records = iterator_to_array($crawler->fetch([
            'name' => 'Gossiping',
            "nuser" => '8803',
            'class' => '綜合',
            'title' => '[八卦] 亞運李智凱、許皓鋐奪金!',
            'url' => 'https://www.ptt.cc/bbs/Gossiping/index.html'
        ]));

        self::assertCount(23, $records[0]);
        self::assertEquals([
            'board_name' => 'Gossiping',
            'board_class' => '綜合',
            'nrec' => '4',
            'type' => '問卦',
            'title' => '司機夫人真的有去卡地亞血拚$1.1M嗎?',
            'author' => 'uwmtsa',
            'date' => '10/06',
            'url' => 'https://www.ptt.cc/bbs/Gossiping/M.1696537444.A.1A5.html',
        ], $records[0][0]);

        \VCR\VCR::eject();
        \VCR\VCR::turnOff();
    }
}

Once that test is green, we go on to adjust PttCrawlerTest:

<?php

namespace Recca0120\Ithome30\Tests;

use GuzzleHttp\Client;
use PHPUnit\Framework\TestCase;
use Recca0120\Ithome30\PttCrawler;
use Recca0120\Ithome30\Crawlers\Home;
use Recca0120\Ithome30\Crawlers\Board;

class PttCrawlerTest extends TestCase
{
    public function test_fetch_board_page()
    {
        \VCR\VCR::turnOn();
        \VCR\VCR::insertCassette('ptt.yaml');

        // Limit Board to the first page before injecting it
        $crawler = new PttCrawler(new Home(new Client()), new Board(new Client(), 1));
        $generator = $crawler->all();
        $paginator = $generator->current();

        self::assertEquals([
            'board_name' => 'Gossiping',
            'board_class' => '綜合',
            'nrec' => '',
            'type' => '問卦',
            'title' => 'JFFT床哥是誰?',
            'author' => 'nicky1245',
            'date' => '10/12',
            'url' => 'https://www.ptt.cc/bbs/Gossiping/M.1697122126.A.6F5.html',
        ], $paginator[0]);

        \VCR\VCR::eject();
        \VCR\VCR::turnOff();
    }
}

After these adjustments we run the tests again, and they are still green.

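Writing code with dependency injection lets us use the way objects are constructed and composed to adjust behavior without modifying product code, so we can develop with PHPUnit and verify that the program is correct.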
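
As a rough, self-contained sketch of that idea: `LimitedBoard` and `MiniCrawler` below are hypothetical miniatures of `Board` and `PttCrawler`, and the board names and page counts are made up for illustration. The crawler class is identical in both wirings; only the object injected into it differs.

```php
<?php

declare(strict_types=1);

// Hypothetical miniature of Board: the page limit is fixed at construction time.
final class LimitedBoard
{
    public function __construct(private int $totalPages, private ?int $take = null)
    {
    }

    public function fetch(string $board): Generator
    {
        $last = $this->take === null ? $this->totalPages : min($this->take, $this->totalPages);
        for ($page = 1; $page <= $last; $page++) {
            yield "{$board}#{$page}";
        }
    }
}

// Hypothetical miniature of PttCrawler: it never mentions the limit.
final class MiniCrawler
{
    /** @param list<string> $boards */
    public function __construct(private array $boards, private LimitedBoard $board)
    {
    }

    public function all(): Generator
    {
        foreach ($this->boards as $name) {
            yield from $this->board->fetch($name);
        }
    }
}

$boards = ['Gossiping', 'Stock'];

// Production wiring: no limit, every page of every board is crawled.
$production = new MiniCrawler($boards, new LimitedBoard(3));
// Test wiring: same crawler class, but the injected board stops after one page.
$underTest = new MiniCrawler($boards, new LimitedBoard(3, 1));

// preserve_keys is false because each inner generator restarts its keys at 0.
$all = iterator_to_array($production->all(), false);
$firstPages = iterator_to_array($underTest->all(), false);
```

The limit lives in the composition of objects rather than in the crawler itself, so switching between the two wirings requires no change to the crawler class.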

