Python學習筆記: 批次刪除word docx 檔之頁首及頁尾（頁碼）

python word 批次移除頁首頁尾

mackuo 2023-05-31 08:16:22 ‧ 1479 瀏覽

分享至

本文同步發表於小弟自架網站（非釣魚也無廣告，純分享）：

微確幸資訊站

在實做本篇前，如果你的word檔案是舊版doc附檔名，
要先參考前一篇，將所有word舊版doc檔轉換成docx檔：
Python學習筆記: 批次將word舊版 doc 檔轉換成 docx 檔

在Python批次處理Word檔案的過程，可以使用docx這個模組，

例如收了20幾個word檔案，想要將word個別檔案的頁首和頁碼等資料去除後，
再來批次轉成pdf檔案編頁碼。

但這個模組只能處理Word的docx副檔名。

以下為示範檔案的下載連結：
https://drive.google.com/file/d/189VuhPyHImQV88592M9uwjSMcJgm6Bwu/view?usp=share_link

先看一下資料夾中有5個示範檔案，有些有頁首，有些有頁碼，也有些二者皆有：

打開「附件2-1_示範檔案3_有頁首_有頁碼.docx」檔案，
看一下頁首是一張相片，頁碼是P.1，P.2。

接下來用程式處理這些示範檔案：

import os
from docx import Document

# 設定目標資料夾
file_path = "d:\\temp\\test"
 
# 找出所有doc檔做成list
files = [x for x in os.listdir(file_path) if x.endswith(".docx")]

for file in files:
    docx_file = os.path.join(file_path, file)
    edited_docx_file = os.path.join(file_path, os.path.splitext(file)[0] + "_edited" + ".docx")
    document = Document(docx_file)
    for section in document.sections:
        section.different_first_page_header_footer = False
        section.header.is_linked_to_previous = True # 如果設定為False，則保留頁首
        section.footer.is_linked_to_previous = True # 如果設定為False，則保留頁尾
        print(f'正在處理{docx_file}')
        print('----------------------')
        
    document.save(edited_docx_file)
     
print('所有word檔移除頁首、頁尾：已完成')