[C++] [ITSA競賽題目][57-4] 尋找字串中最常出現的樣板長度及內容

c++ itsa

小碼農米爾 2017-10-10 15:53:26 ‧ 12895 瀏覽

分享至

題目出自ITSA競賽，解答僅供參考

輸入:

3
QweTghJreDfeT
QweTghJreDfeQweEsazFredQweqw
dcBadcBadbdcBa

解答:

#include <stdio.h>
#include <stdlib.h>
#include <cstring>
using namespace std;

int search(char *str, int length, int start, int end);

int main(void)
{
    int n;
    scanf("%d", &n);

    while (n > 0)
    {
        char str[600];
        int length = 0;

        scanf("%s", str);
        while (str[length] != '\0')
        {
            length++;
        }

        char styles[500][600]; //樣式集合
        int index = 0;         //樣式集合的索引
        int max = 0;           //最高次數

        //for迴圈繞出所有樣式
        for (int i = 0; i < length - 1; i++)
        {
            for (int j = i + 1; j < length; j++)
            {
                //樣式長度超過字串長度的一半
                if (j - i + 1 > length / 2)
                    continue;
                    
                int count = search(str, length, i, j);
                if (count >= max)
                {
                    if (count > max)
                    {
                        //初始化樣式集合
                        index = 0;
                        max = count;
                    }

                    //將樣式加入樣式集合
                    int offset = 0;
                    for (int k = i; k <= j; k++)
                    {
                        styles[index][offset] = str[k];
                        offset++;
                    }
                    styles[index][offset] = '\0';
                    index++;
                }
            }
        }

        //氣泡排序
        char temp[600];
        for (int i = 0; i < index - 1; i++)
        {
            for (int j = i + 1; j < index; j++)
            {
                int offset = 0;
                while (true)
                {
                    //前面字元小於後面字元，或前面字元先遇到結尾
                    if (styles[i][offset] < styles[j][offset] || 
                        styles[i][offset] == '\0')
                    {
                        break;
                    }
                    //後面字元小於前面字元，或後面字元先遇到結尾，交換位置
                    if (styles[i][offset] > styles[j][offset] || 
                        styles[j][offset] == '\0')
                    {
                        strcpy(temp, styles[i]);
                        strcpy(styles[i], styles[j]);
                        strcpy(styles[j], temp);
                        break;
                    }
                    offset++;
                }
            }
        }

        //印出結果
        printf("%d ", max);
        for (int i = 0; i < index; i++)
        {
            if (i != 0)
                printf(" ");
            printf("%s", styles[i]);
        }
        printf("\n");

        n--;
    }

    system("pause");
    return 0;
}

//return 返回樣式出現幾次
//str    字串
//length 字串長度
//start  要比對的樣式的開始索引
//end    要比對的樣式的結束索引
int search(char *str, int length, int start, int end)
{
    int count = 0;
    for (int i = start; i < length; i++)
    {
        //找到第一個符合的字元
        if (str[i] == str[start])
        {
            bool isMatch = true;
            int offset = 0;
            for (int j = start; j <= end; j++)
            {
                //start到end之間只要有一個字元不符，或遇到結尾
                if (str[i + offset] == '\0' || str[i + offset] != str[j])
                {
                    isMatch = false;
                    break;
                }
                offset++;
            }
            if (isMatch)
            {
                count++;
                //i加上比對的樣式長度
                i += offset - 1;
            }
        }
    }
    return count;
}

說明:
先用 for 迴圈將樣式繞出來，再丟進 search 函數算出樣式在字串中出現的次數，中間跳過超過字串長度一半的樣式，字串的繞法如下，以 abcdef 為例:
ab abc bc bcd cd cde de def ef

for (int i = start; i < length; i++)

search 函數內的第一個 for 迴圈 i 從 start 開始，因為我發現如果字串內有重覆的樣式，例如 abab，會重覆搜尋 ab 兩次並都返回2，main 取 max 時會取到重覆的樣式，所以偷吃步把 i=0 改成 i=start，第二次的 ab 就會返回1而不是2，main 就不會取到重覆的樣式，因為後面的次數一定會比前面的少。

i += offset - 1;

search 函數內 offset 的作用是，如果樣式符合就要跳過整個樣式，不重覆搜尋，樣式與樣式不重疊，例如 ababac 使用 aba 搜尋，會返回1而不是2，中間的 a 在第一次符合後就會因為 offset 而被跳過。

結語:
這題覺得沒有寫得很好，重覆的樣式會被重覆搜尋，會造成效能上不必要的損耗，有想過在進入 search 函數之前先擋掉，不過後來不想增加閱讀複雜度所以沒有加，styles 陣列本來想做成動態的，後來因為要處理記憶體釋放問題，程式變得很醜所以放棄用最原始寫死的方式，這些都是可以繼續優化的地方。
氣泡排序那邊本來寫了一大堆程式處理字串調換，後來想到可以用 strcpy 來做，就變得很簡潔 XD

熱門推薦

{{ item.channelVendor }} | {{ item.webinarstarted }} |

直播中

尚未有邦友留言

立即登入留言

15th鐵人賽 16th鐵人賽 13th鐵人賽 14th鐵人賽 17th鐵人賽 12th鐵人賽 11th鐵人賽鐵人賽 2019鐵人賽 javascript 2018鐵人賽 python 2017鐵人賽 windows php c# linux windows server css react

ChatGPT Business & Codex 如何從零開始?

IT邦幫忙

[C++] [ITSA競賽題目][57-4] 尋找字串中最常出現的樣板長度及內容

尚未有邦友留言

標記使用者