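Word Counter - Counting Words, Lines, and Characters in a File
Today we will build our first practical command-line tool: a word counter. It is similar to the Unix wc command and reports the number of lines, words, and characters in a text file. Along the way we will practice Rust's file handling, string manipulation, and command-line argument parsing.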
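Project Goals
We want to build a tool that can:
Count the lines in a file
Count the words in a file
Count the characters in a file
Process several files in one run
Print the results in a clear format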
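Before touching the file system, the three counts listed above can be tried on an in-memory string. This is only a minimal sketch of the idea, not the final program; the helper name `count_text` is hypothetical and does not appear in the project code.

```rust
/// Returns (lines, words, chars) for a string slice.
/// `count_text` is a hypothetical helper used only for this illustration.
fn count_text(text: &str) -> (usize, usize, usize) {
    let lines = text.lines().count();            // lines ended by "\n" or "\r\n"
    let words = text.split_whitespace().count(); // runs of non-whitespace characters
    let chars = text.chars().count();            // Unicode scalar values, not bytes
    (lines, words, chars)
}

fn main() {
    let (lines, words, chars) = count_text("Hello world\nRust is awesome\n");
    // Prints: 2 lines, 5 words, 28 chars
    println!("{lines} lines, {words} words, {chars} chars");
}
```

The file-based version has to do the same work, but it reads the input one line at a time instead of holding the whole text in memory.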
cargo new word_counter
cd word_counter
src/main.rs
use std::env;
use std::fs::File;
use std::io::{BufRead, BufReader, Result};
use std::path::Path;
/// Counters collected for one file, or summed across several files.
#[derive(Debug, Default)]
struct FileStats {
    lines: usize,
    words: usize,
    chars: usize,
    bytes: usize,
}

impl FileStats {
    /// Creates a stats value with every counter set to zero.
    fn new() -> Self {
        Self::default()
    }

    /// Adds another file's counters to this one; used for the total row.
    fn add(&mut self, other: &FileStats) {
        self.lines += other.lines;
        self.words += other.words;
        self.chars += other.chars;
        self.bytes += other.bytes;
    }
}
fn count_file_stats<P: AsRef<Path>>(filename: P) -> Result<FileStats> {
    let file = File::open(&filename)?;
    let mut reader = BufReader::new(file);
    let mut stats = FileStats::new();
    let mut line = String::new();

    // read_line keeps the line terminator in the buffer, so the char and
    // byte counts stay exact for "\r\n" endings and for a final line
    // that has no trailing newline.
    while reader.read_line(&mut line)? > 0 {
        stats.lines += 1;
        stats.chars += line.chars().count();
        stats.bytes += line.len();
        // Count words by splitting on whitespace; split_whitespace never
        // yields empty strings, so no extra filter is needed.
        stats.words += line.split_whitespace().count();
        line.clear();
    }

    Ok(stats)
}
fn print_stats(stats: &FileStats, filename: Option<&str>) {
    match filename {
        Some(name) => {
            println!("{:>8} {:>8} {:>8} {:>8} {}",
                     stats.lines,
                     stats.words,
                     stats.chars,
                     stats.bytes,
                     name);
        }
        None => {
            println!("{:>8} {:>8} {:>8} {:>8} total",
                     stats.lines,
                     stats.words,
                     stats.chars,
                     stats.bytes);
        }
    }
}

fn print_header() {
    println!("{:>8} {:>8} {:>8} {:>8} {}",
             "lines", "words", "chars", "bytes", "filename");
    println!("{}", "-".repeat(50));
}
fn main() -> Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.len() < 2 {
        eprintln!("Usage: {} <file1> [file2] [file3] ...", args[0]);
        std::process::exit(1);
    }

    let filenames = &args[1..];
    let mut total_stats = FileStats::new();
    let mut processed_files = 0;

    // Print the header only when more than one file was given.
    if filenames.len() > 1 {
        print_header();
    }

    for filename in filenames {
        match count_file_stats(filename) {
            Ok(stats) => {
                print_stats(&stats, Some(filename));
                total_stats.add(&stats);
                processed_files += 1;
            }
            Err(e) => {
                // Report the error and keep going with the remaining files.
                eprintln!("Error processing file '{}': {}", filename, e);
            }
        }
    }

    // Print the total row only when more than one file was processed successfully.
    if processed_files > 1 {
        println!("{}", "-".repeat(50));
        print_stats(&total_stats, None);
    }

    Ok(())
}
#[cfg(test)]
mod tests {
    use super::*;
    use std::io::Write;
    use tempfile::NamedTempFile;

    #[test]
    fn test_count_simple_file() -> Result<()> {
        let mut temp_file = NamedTempFile::new()?;
        writeln!(temp_file, "Hello world")?;
        writeln!(temp_file, "Rust is awesome")?;

        let stats = count_file_stats(temp_file.path())?;
        assert_eq!(stats.lines, 2);
        assert_eq!(stats.words, 5); // "Hello world Rust is awesome"
        assert_eq!(stats.chars, 28); // includes the two newline characters
        Ok(())
    }

    #[test]
    fn test_empty_file() -> Result<()> {
        let temp_file = NamedTempFile::new()?;

        let stats = count_file_stats(temp_file.path())?;
        assert_eq!(stats.lines, 0);
        assert_eq!(stats.words, 0);
        assert_eq!(stats.chars, 0);
        assert_eq!(stats.bytes, 0);
        Ok(())
    }
}
The tests rely on the tempfile crate, so we add it as a dev-dependency in Cargo.toml:
[package]
name = "word_counter"
version = "0.1.0"
edition = "2021"
[dependencies]
[dev-dependencies]
tempfile = "3.8"
The FileStats struct holds the four counters for a file:
#[derive(Debug, Default)]
struct FileStats {
    lines: usize, // number of lines
    words: usize, // number of words
    chars: usize, // number of characters
    bytes: usize, // number of bytes
}
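The struct tracks chars and bytes separately because the two differ as soon as the text contains non-ASCII characters: a Rust String is UTF-8, so `len()` returns bytes while `chars().count()` returns Unicode scalar values. A minimal sketch of the difference follows; the helper `chars_and_bytes` is hypothetical and exists only for this illustration.

```rust
/// Returns (chars, bytes) for a string slice.
/// Hypothetical helper, not part of the project code.
fn chars_and_bytes(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}

fn main() {
    // Pure ASCII: every character is exactly one byte.
    println!("{:?}", chars_and_bytes("Rust")); // (4, 4)

    // Each of the two CJK characters takes three bytes in UTF-8.
    println!("{:?}", chars_and_bytes("Rust 語言")); // (7, 11)
}
```

For ASCII-only files the two columns in the output are identical, which is why they match in the simple examples in this post.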
Usage
Create a test file:
echo "Hello Rust World
This is a test file
Let's count some words" > test.txt
Build and run:
cargo build --release
./target/release/word_counter test.txt
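Output (the columns are lines, words, chars, bytes, and filename; no header is printed for a single file):
       3       12       60       60 test.txt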
Process multiple files:
./target/release/word_counter file1.txt file2.txt file3.txt
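When more than one file is processed successfully, the program ends with a total row that sums the per-file counters. The sketch below shows that accumulation in isolation. The numbers are made up for illustration, the struct is repeated so the block compiles on its own, and `sum_stats` is a hypothetical helper rather than a function from the project.

```rust
/// Copy of the post's FileStats, repeated so this sketch is self-contained.
#[derive(Debug, Default, PartialEq)]
struct FileStats {
    lines: usize,
    words: usize,
    chars: usize,
    bytes: usize,
}

impl FileStats {
    /// Adds another file's counters to this one.
    fn add(&mut self, other: &FileStats) {
        self.lines += other.lines;
        self.words += other.words;
        self.chars += other.chars;
        self.bytes += other.bytes;
    }
}

/// Hypothetical helper: folds per-file stats into a single total.
fn sum_stats(per_file: &[FileStats]) -> FileStats {
    let mut total = FileStats::default();
    for stats in per_file {
        total.add(stats);
    }
    total
}

fn main() {
    // Made-up numbers standing in for the results of two files.
    let per_file = [
        FileStats { lines: 3, words: 12, chars: 60, bytes: 60 },
        FileStats { lines: 2, words: 5, chars: 28, bytes: 28 },
    ];
    let total = sum_stats(&per_file);
    println!("{:>8} {:>8} {:>8} {:>8} total",
             total.lines, total.words, total.chars, total.bytes);
}
```

Files that fail to open are reported on stderr and skipped, so they contribute nothing to the total.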