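Text Counter - Counting Words, Lines, and Characters in a File
Today we'll build our first practical command-line tool: a text counter. It resembles the Unix wc command and reports the number of lines, words, and characters in text files. Along the way we'll work with Rust's file handling, string operations, and command-line argument parsing.
Project Goals
We want to build a tool that can:
Count the lines in a file
Count the words in a file
Count the characters in a file
Process multiple files in one run
Print the results in a clear format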
cargo new word_counter
cd word_counter
src/main.rs
use std::env;
use std::fs::File;
use std::io::{BufRead, BufReader, Result};
use std::path::Path;
#[derive(Debug, Default)]
struct FileStats {
    lines: usize,
    words: usize,
    chars: usize,
    bytes: usize,
}
impl FileStats {
    fn new() -> Self {
        // All counters start at zero, which is what the derived Default provides.
        Self::default()
    }
    fn add(&mut self, other: &FileStats) {
        self.lines += other.lines;
        self.words += other.words;
        self.chars += other.chars;
        self.bytes += other.bytes;
    }
}
fn count_file_stats<P: AsRef<Path>>(filename: P) -> Result<FileStats> {
    let file = File::open(&filename)?;
    let mut reader = BufReader::new(file);

    let mut stats = FileStats::new();
    let mut line = String::new();

    // read_line keeps the line terminator, so chars and bytes stay exact
    // for "\r\n" endings and for a final line without a trailing newline.
    while reader.read_line(&mut line)? > 0 {
        stats.lines += 1;
        stats.chars += line.chars().count();
        stats.bytes += line.len();

        // Count words by splitting on whitespace;
        // split_whitespace never yields empty strings.
        stats.words += line.split_whitespace().count();

        line.clear();
    }

    Ok(stats)
}
fn print_stats(stats: &FileStats, filename: Option<&str>) {
    match filename {
        Some(name) => {
            println!("{:>8} {:>8} {:>8} {:>8} {}", 
                     stats.lines, 
                     stats.words, 
                     stats.chars, 
                     stats.bytes,
                     name);
        }
        None => {
            println!("{:>8} {:>8} {:>8} {:>8} total", 
                     stats.lines, 
                     stats.words, 
                     stats.chars, 
                     stats.bytes);
        }
    }
}
fn print_header() {
    println!("{:>8} {:>8} {:>8} {:>8} {}", 
             "lines", "words", "chars", "bytes", "filename");
    println!("{}", "-".repeat(50));
}
fn main() -> Result<()> {
    let args: Vec<String> = env::args().collect();
    
    if args.len() < 2 {
        eprintln!("Usage: {} <file1> [file2] [file3] ...", args[0]);
        std::process::exit(1);
    }
    
    let filenames = &args[1..];
    let mut total_stats = FileStats::new();
    let mut processed_files = 0;
    
    // Print a header when more than one file is given
    if filenames.len() > 1 {
        print_header();
    }
    
    for filename in filenames {
        match count_file_stats(filename) {
            Ok(stats) => {
                print_stats(&stats, Some(filename));
                total_stats.add(&stats);
                processed_files += 1;
            }
            Err(e) => {
                eprintln!("Error processing file '{}': {}", filename, e);
            }
        }
    }
    
    // Print the total when more than one file was processed
    if processed_files > 1 {
        println!("{}", "-".repeat(50));
        print_stats(&total_stats, None);
    }
    
    Ok(())
}
#[cfg(test)]
mod tests {
    use super::*;
    use std::io::Write;
    use tempfile::NamedTempFile;
    #[test]
    fn test_count_simple_file() -> Result<()> {
        let mut temp_file = NamedTempFile::new()?;
        writeln!(temp_file, "Hello world")?;
        writeln!(temp_file, "Rust is awesome")?;
        
        let stats = count_file_stats(temp_file.path())?;
        
        assert_eq!(stats.lines, 2);
        assert_eq!(stats.words, 5); // "Hello world Rust is awesome"
        assert_eq!(stats.chars, 28); // includes the two newline characters
        
        Ok(())
    }
    #[test]
    fn test_empty_file() -> Result<()> {
        let temp_file = NamedTempFile::new()?;
        
        let stats = count_file_stats(temp_file.path())?;
        
        assert_eq!(stats.lines, 0);
        assert_eq!(stats.words, 0);
        assert_eq!(stats.chars, 0);
        assert_eq!(stats.bytes, 0);
        
        Ok(())
    }
}
The tests use the tempfile crate, so we add it as a dev-dependency in Cargo.toml:
[package]
name = "word_counter"
version = "0.1.0"
edition = "2021"
[dependencies]
[dev-dependencies]
tempfile = "3.8"
The FileStats struct holds the counts for a single file:
#[derive(Debug, Default)]
struct FileStats {
    lines: usize,    // number of lines
    words: usize,    // number of words
    chars: usize,    // number of characters
    bytes: usize,    // number of bytes
}
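The chars and bytes fields only agree for pure ASCII text. Rust strings are UTF-8, so len() returns the byte length while chars().count() counts Unicode scalar values. A minimal sketch to illustrate the difference (char_and_byte_counts is a hypothetical helper, not part of the project):

```rust
/// Hypothetical helper (not part of word_counter): returns
/// (character count, byte count) for a string slice.
fn char_and_byte_counts(text: &str) -> (usize, usize) {
    // chars() iterates Unicode scalar values; len() is the UTF-8 byte length.
    (text.chars().count(), text.len())
}

fn main() {
    for sample in ["Rust", "你好"] {
        let (chars, bytes) = char_and_byte_counts(sample);
        println!("{sample}: {chars} chars, {bytes} bytes");
    }
}
```

For "Rust" both counts are 4, while "你好" has 2 characters stored in 6 bytes, which is why the tool reports the two numbers separately.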
Usage
Create a test file:
echo "Hello Rust World
This is a test file
Let's count some words" > test.txt
Build and run:
cargo build --release
./target/release/word_counter test.txt
Output:
       3       12       60       60 test.txt
Processing multiple files:
./target/release/word_counter file1.txt file2.txt file3.txt
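To experiment with the counting rules without creating files, a per-line approach similar to the one above can be run over any BufRead source, such as an in-memory byte slice. The following is only a sketch under that assumption: count_reader is a hypothetical helper, not a function from the project, and it returns a plain tuple instead of FileStats to stay self-contained.

```rust
use std::io::{BufRead, Result};

/// Hypothetical helper: returns (lines, words, chars, bytes)
/// for any BufRead source.
fn count_reader<R: BufRead>(mut reader: R) -> Result<(usize, usize, usize, usize)> {
    let (mut lines, mut words, mut chars, mut bytes) = (0, 0, 0, 0);
    let mut line = String::new();

    // read_line keeps the trailing newline, so it is included in chars and bytes.
    while reader.read_line(&mut line)? > 0 {
        lines += 1;
        words += line.split_whitespace().count();
        chars += line.chars().count();
        bytes += line.len();
        line.clear();
    }
    Ok((lines, words, chars, bytes))
}

fn main() -> Result<()> {
    // Same content as test.txt, including the final newline that echo appends.
    let text = "Hello Rust World\nThis is a test file\nLet's count some words\n";
    // &[u8] implements BufRead, so no file is needed.
    let (lines, words, chars, bytes) = count_reader(text.as_bytes())?;
    println!("{lines:>8} {words:>8} {chars:>8} {bytes:>8} (in memory)");
    Ok(())
}
```

Taking a generic reader also makes the logic straightforward to unit test with string literals, without relying on temporary files.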