2025 iThome Ironman Challenge
「 Flutter: Building a Buddha-Chanting App in 30 Days, Cross-Platform from Mobile to VR, Bringing the Pure Land into View! 」
Day 15
「 Flutter Speech Recognition in Practice: Brave Souls Living on Earth, Have You Heard of Amitabha? (6) 」
Yesterday we took a first look at on-device speech-to-text and how it differs from cloud-based speech-to-text.
Today we take a closer look at 「 sherpa_onnx 」
and walk through its example code to see how to implement offline, real-time speech-to-text in a Flutter app!
Day 15 table of contents:
I. sherpa_onnx
II. Offline real-time speech-to-text
III. Switching to a Chinese-only model
I. sherpa_onnx
1. Overview
sherpa-onnx is an open-source, on-device speech recognition toolkit.
It lets us deploy speech models on many different platforms (phones, desktops, and so on),
and because everything runs directly on the device, features such as offline speech-to-text work without any network connection.
2. Features
Speech-to-text, speech synthesis (text-to-speech), source separation
Speaker identification, speaker diarization, speaker verification (checking whether two recordings are the same person)
Language identification, audio classification, voice activity detection
Keyword spotting, punctuation restoration, speech enhancement/denoising
3. Supported platforms and languages
4. Projects using sherpa-onnx
(1) BreezeApp from MediaTek Research
(2) TMSpeech, a live-caption tool for Tencent Meeting
(3) Flutter-EasySpeechRecognition
(4) Open-XiaoAI KWS
5. License
| Key point | Details |
|---|---|
| Free for commercial use | You can use sherpa-onnx in a commercial app without paying fees or obtaining a special license. |
| Modification / redistribution allowed | You may modify the code and release modified versions, but you must keep the license notice and state your changes. |
| Patent grant | Using it does not expose you to patent claims (within the scope the licensors are able to grant). |
| LICENSE/NOTICE must be kept | If you bundle sherpa-onnx into an app, include the Apache 2.0 license notice in your documentation or About page. |
| Disclaimer of liability | The original authors are not liable for losses caused by bugs in the software. |
II. Offline real-time speech-to-text
Example
https://github.com/Jason-chen-coder/Flutter-EasySpeechRecognition
lib/
  download_model.dart   # download/unzip progress and the model name
  online_model.dart     # assembles the downloaded onnx files into an OnlineModelConfig
  streaming_asr.dart    # main screen + recording + sherpa-onnx streaming decoding
  utils.dart            # helpers: download, unzip, PCM-to-Float32 conversion, etc.
  main.dart             # entry point; injects the Provider and loads StreamingAsrScreen
  widgets/
    download_progress_dialog.dart
1. pubspec.yaml
dependencies:
  flutter:
    sdk: flutter
  sherpa_onnx: ^1.10.45
  record: ^5.1.0         # streaming audio recording
  archive: ^4.0.3        # decompression
  http: ^1.3.0           # model download
  provider: ^6.1.2       # simple state management
  path_provider: ^2.1.3  # common directories
  path: ^1.9.0
  url_launcher: ^6.2.6

flutter:
  uses-material-design: true
  assets:
    - assets/  # reserved: lets you ship a model with the app so users don't have to download it at first launch
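If you do pre-bundle a model under assets/, it still has to land on the file system before sherpa-onnx can open it. The helper below is only a minimal sketch of copying a bundled file into the documents directory at first launch; the function name and asset paths are placeholders, not part of the example repo.

import 'dart:io';
import 'package:flutter/services.dart' show rootBundle;
import 'package:path/path.dart';
import 'package:path_provider/path_provider.dart';

/// Copy one bundled asset into the app documents directory if it is not there yet.
Future<void> copyAssetToDocuments(String assetPath, String relativeOutPath) async {
  final dir = await getApplicationDocumentsDirectory();
  final outFile = File(join(dir.path, relativeOutPath));
  if (await outFile.exists()) return; // already copied on a previous launch
  final data = await rootBundle.load(assetPath);
  await outFile.create(recursive: true);
  await outFile.writeAsBytes(
      data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes));
}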
2. Permission setup
(1) iOS (ios/Runner/Info.plist)
<key>NSMicrophoneUsageDescription</key>
<string>Microphone access is required for speech recognition</string>
(2) Android (android/app/src/main/AndroidManifest.xml)
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
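One extra note (my addition, not part of the original example): the one-time model download also needs network access, so release Android builds should declare `<uses-permission android:name="android.permission.INTERNET"/>` in AndroidManifest.xml as well; Flutter only adds it automatically to the debug/profile manifests. The microphone permission itself is requested at runtime through the record package's hasPermission(), as you will see in _start() later.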
3. Download and unzip progress
(1) Downloading the model
import 'package:flutter/cupertino.dart';

class DownloadModel with ChangeNotifier {
  String _modelName =
      "sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20";
  String get modelName => _modelName;
  void setModelName(String value) {
    _modelName = value;
    notifyListeners();
  }

  double _progress = 0;
  double get progress => _progress;
  void setProgress(double value) {
    if (value >= 1.0) {
      _progress = 1;
    } else {
      _progress = value;
    }
    notifyListeners();
  }

  double _unzipProgress = 0;
  double get unzipProgress => _unzipProgress;
  void setUnzipProgress(double value) {
    if (value >= 1.0) {
      _unzipProgress = 1;
    } else {
      _unzipProgress = value;
    }
    notifyListeners();
  }
}
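The directory listing above says main.dart injects this ChangeNotifier via Provider before loading StreamingAsrScreen. That file is not reproduced in this post, so the following is only a rough sketch of what the wiring might look like (the widget layout is an assumption):

import 'package:flutter/material.dart';
import 'package:provider/provider.dart';

void main() {
  runApp(
    ChangeNotifierProvider(
      create: (_) => DownloadModel(),
      child: MaterialApp(home: StreamingAsrScreen()),
    ),
  );
}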
Future<void> downloadModelAndUnZip(BuildContext context, String modelName) async {
  final url =
      'https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$modelName.tar.bz2';
  final downloadModel = Provider.of<DownloadModel>(context, listen: false);
  final dir = await getApplicationDocumentsDirectory();
  final modelDir = join(dir.path, modelName);
  final bz2Path = join(dir.path, '$modelName.tar.bz2');

  if (await Directory(modelDir).exists()) return; // already extracted

  if (!await File(bz2Path).exists()) { // not downloaded yet
    final confirmed = await _showDownloadConfirmationDialog(context);
    if (!confirmed) return;

    // Show the progress dialog
    showDialog(
        context: context,
        barrierDismissible: false,
        builder: (_) => DownloadProgressDialog());
    try {
      final req = http.Request('GET', Uri.parse(url));
      final resp = await http.Client().send(req);
      final total = resp.contentLength ?? 0;
      var received = 0;
      final sink = File(bz2Path).openWrite();
      await resp.stream.forEach((chunk) {
        sink.add(chunk);
        received += chunk.length;
        downloadModel.setProgress(total > 0 ? received / total : 0);
      });
      await sink.flush();
      await sink.close();
      await _unzipDownloadedFile(bz2Path, dir.path, context);
    } catch (e) {
      if (Navigator.canPop(context)) Navigator.of(context).pop();
      // Remove the partial download; ignore errors if the file never got created
      try {
        await File(bz2Path).delete();
      } catch (_) {}
      showDialog(
          context: context,
          builder: (_) => AlertDialog(
                title: const Text('Download Failed'),
                content: Text(e.toString()),
                actions: [
                  TextButton(
                      onPressed: () => Navigator.pop(context),
                      child: const Text('OK'))
                ],
              ));
    }
  }
}
(2) Unzipping the model
Future<void> _unzipDownloadedFile(
    String bz2Path, String dstDir, BuildContext context) async {
  final m = Provider.of<DownloadModel>(context, listen: false);
  m.setUnzipProgress(0.1);

  // bzip2 → tar
  final bzBytes = File(bz2Path).readAsBytesSync();
  final tarBytes = BZip2Decoder().decodeBytes(bzBytes);
  final tar = TarDecoder().decodeBytes(tarBytes);
  m.setUnzipProgress(0.4);

  final total = tar.files.length;
  var done = 0;
  for (final f in tar.files) {
    final out = join(dstDir, f.name);
    if (f.isFile) {
      File(out)
        ..createSync(recursive: true)
        ..writeAsBytesSync(f.content as List<int>);
    } else {
      Directory(out).createSync(recursive: true);
    }
    done++;
    m.setUnzipProgress(0.4 + 0.6 * done / total);
  }

  if (Navigator.canPop(context)) Navigator.of(context).pop();
  _showSuccessDialog(context); // notify completion
}
(3) Converting Int16 PCM audio to Float32 (the format sherpa-onnx expects)
Float32List convertBytesToFloat32(Uint8List bytes, [Endian endian = Endian.little]) {
  final data = ByteData.view(bytes.buffer);
  final out = Float32List(bytes.length ~/ 2);
  for (var i = 0; i < bytes.length; i += 2) {
    final s = data.getInt16(i, endian);
    out[i ~/ 2] = s / 32768.0; // normalize 16-bit samples to [-1.0, 1.0)
  }
  return out;
}
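As a quick sanity check of the scaling (my example, not from the repo): a little-endian 16-bit sample of 16384 (0x4000) should map to exactly 0.5.

import 'dart:typed_data';

void main() {
  final bytes = Uint8List.fromList([0x00, 0x40]); // low byte first → 16384
  final f = convertBytesToFloat32(bytes);
  assert((f[0] - 0.5).abs() < 1e-6); // 16384 / 32768 = 0.5
}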
4. Building the OnlineModelConfig from the downloaded model
Future<sherpa_onnx.OnlineModelConfig> getModelConfigByModelName({
  required String modelName,
}) async {
  final dir = await getApplicationDocumentsDirectory();
  final root = join(dir.path, modelName);
  switch (modelName) {
    case "sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20":
      return sherpa_onnx.OnlineModelConfig(
        transducer: sherpa_onnx.OnlineTransducerModelConfig(
          encoder: '$root/encoder-epoch-99-avg-1.int8.onnx',
          decoder: '$root/decoder-epoch-99-avg-1.onnx',
          joiner: '$root/joiner-epoch-99-avg-1.onnx',
        ),
        tokens: '$root/tokens.txt',
        modelType: 'zipformer',
      );
    default:
      throw ArgumentError('Unsupported modelName: $modelName');
  }
}
5. Key members and initialization
// Create the recognizer
Future<sherpa_onnx.OnlineRecognizer> createOnlineRecognizer(String modelName) async {
  final model = await getModelConfigByModelName(modelName: modelName);
  final cfg = sherpa_onnx.OnlineRecognizerConfig(model: model, ruleFsts: '');
  return sherpa_onnx.OnlineRecognizer(cfg);
}
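The example keeps OnlineRecognizerConfig at its defaults apart from ruleFsts. If the automatic sentence breaks (endpointing) feel too eager or too slow, the config also exposes endpoint-related fields you can tune. The field names below are assumptions based on sherpa-onnx's config struct; verify them against the sherpa_onnx package version you actually use.

// Hedged sketch: endpoint tuning; field names to be verified in your sherpa_onnx version.
final cfg = sherpa_onnx.OnlineRecognizerConfig(
  model: model,
  ruleFsts: '',
  enableEndpoint: true,          // split sentences on pauses
  rule1MinTrailingSilence: 2.4,  // silence (s) required when nothing has been decoded yet
  rule2MinTrailingSilence: 1.2,  // silence (s) required after some speech was decoded
  rule3MinUtteranceLength: 20.0, // force an endpoint after this many seconds of speech
);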
class _StreamingAsrScreenState extends State<StreamingAsrScreen> {
  late final TextEditingController _controller;  // shows the recognition result
  late final AudioRecorder _audioRecorder;       // recorder
  String _last = '';                             // previously finalized sentences
  int _index = 0;                                // sentence counter
  bool _isInitialized = false;                   // sherpa-onnx initialization state
  sherpa_onnx.OnlineRecognizer? _recognizer;     // sherpa-onnx recognizer
  sherpa_onnx.OnlineStream? _stream;             // its audio stream
  int _sampleRate = 16000;                       // sample rate
  StreamSubscription<RecordState>? _recordSub;   // record-state listener
  RecordState _recordState = RecordState.stop;   // current record state (start/stop/pause)

  @override
  void initState() {
    super.initState();
    // Initialize the recorder
    _audioRecorder = AudioRecorder();
    // Initialize the text controller
    _controller = TextEditingController();
    // Listen for record-state changes
    _recordSub = _audioRecorder.onStateChanged().listen(_updateRecordState);
  }
}
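The excerpt registers _updateRecordState but never shows it, and the matching cleanup is also omitted. Here is a minimal sketch of both; the repo's actual code may differ.

void _updateRecordState(RecordState state) {
  setState(() => _recordState = state); // keep the UI (e.g. the mic button) in sync
}

@override
void dispose() {
  _recordSub?.cancel();
  _audioRecorder.dispose();
  _stream?.free();
  _recognizer?.free();
  _controller.dispose();
  super.dispose();
}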
6. Starting speech-to-text
Future<void> _start() async {
  final dm = context.read<DownloadModel>();
  final name = dm.modelName;

  // Still downloading/unzipping → show the progress dialog and return
  final downloading = dm.progress > 0 && dm.progress < 1;
  final unzipping = dm.unzipProgress > 0 && dm.unzipProgress < 1;
  if (downloading || unzipping) {
    showDialog(
        context: context,
        barrierDismissible: false,
        builder: (_) => DownloadProgressDialog());
    return;
  }

  // Nothing on disk yet → download; only the archive exists → unzip
  if (await needsDownload(name)) {
    downloadModelAndUnZip(context, name);
    return;
  }
  if (await needsUnZip(name)) {
    unzipModelFile(context, name);
    return;
  }

  if (!_isInitialized) {
    sherpa_onnx.initBindings();
    _recognizer = await createOnlineRecognizer(name);
    _stream = _recognizer!.createStream();
    _isInitialized = true;
  }

  // Start the recording stream
  if (await _audioRecorder.hasPermission()) {
    const cfg = RecordConfig(
        encoder: AudioEncoder.pcm16bits, sampleRate: 16000, numChannels: 1);
    final stream = await _audioRecorder.startStream(cfg);
    stream.listen((bytes) {
      final f32 = convertBytesToFloat32(Uint8List.fromList(bytes));
      _stream!.acceptWaveform(samples: f32, sampleRate: 16000);

      // Decode while enough audio has accumulated
      while (_recognizer!.isReady(_stream!)) {
        _recognizer!.decode(_stream!);
      }
      final text = _recognizer!.getResult(_stream!).text;
      var toShow = _last;
      if (text.isNotEmpty) {
        toShow = (_last.isEmpty) ? '$_index: $text' : '$_index: $text\n$_last';
      }

      // At the end of an utterance (the speaker pauses), keep the result and reset the stream
      if (_recognizer!.isEndpoint(_stream!)) {
        _recognizer!.reset(_stream!);
        if (text.isNotEmpty) {
          _last = toShow;
          _index += 1;
        }
      }

      _controller.value = TextEditingValue(
        text: toShow,
        selection: TextSelection.collapsed(offset: toShow.length),
      );
    });
  }
}
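_start() relies on needsDownload(), needsUnZip(), and unzipModelFile() from utils.dart, which are not reproduced above. A rough sketch of the two checks, assuming the same on-disk layout as downloadModelAndUnZip() (the repo's actual implementation may differ):

// Uses dart:io, package:path, and package:path_provider like the snippets above.
Future<bool> needsDownload(String modelName) async {
  final dir = await getApplicationDocumentsDirectory();
  final hasModelDir = await Directory(join(dir.path, modelName)).exists();
  final hasArchive = await File(join(dir.path, '$modelName.tar.bz2')).exists();
  return !hasModelDir && !hasArchive; // nothing on disk yet → download
}

Future<bool> needsUnZip(String modelName) async {
  final dir = await getApplicationDocumentsDirectory();
  final hasModelDir = await Directory(join(dir.path, modelName)).exists();
  final hasArchive = await File(join(dir.path, '$modelName.tar.bz2')).exists();
  return !hasModelDir && hasArchive; // archive present but not extracted → unzip
}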
7. Stopping speech-to-text
Future<void> _stop() async {
  _stream!.free();
  _stream = _recognizer!.createStream(); // create a fresh stream for the next session
  await _audioRecorder.stop();
}
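For completeness, a small UI sketch (my addition, not from the repo) showing how the _recordState tracked above can drive a single start/stop button:

FloatingActionButton(
  onPressed: _recordState == RecordState.stop ? _start : _stop,
  child: Icon(_recordState == RecordState.stop ? Icons.mic : Icons.stop),
)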
Documentation:
https://k2-fsa.github.io/sherpa/onnx/flutter/asr/app.html
III. Switching to a Chinese-only model
1. Change the model name
class DownloadModel with ChangeNotifier {
  String _modelName =
      // Change this line: the folder name of the Chinese (WenetSpeech) streaming model
      "icefall-asr-zipformer-streaming-wenetspeech-20230615";
  ...
}
2. Add a matching case in getModelConfigByModelName()
Future<sherpa_onnx.OnlineModelConfig> getModelConfigByModelName({
  required String modelName,
}) async {
  final directory = await getApplicationDocumentsDirectory();
  final modulePath = join(directory.path, modelName);
  switch (modelName) {
    // Chinese-only model
    case "icefall-asr-zipformer-streaming-wenetspeech-20230615":
      final m = modulePath;
      return sherpa_onnx.OnlineModelConfig(
        transducer: sherpa_onnx.OnlineTransducerModelConfig(
          encoder: '$m/exp/encoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx',
          decoder: '$m/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx',
          joiner: '$m/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx',
        ),
        tokens: '$m/data/lang_char/tokens.txt',
        modelType: 'zipformer2',
      );
    // Original bilingual model (kept)
    case "sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20":
      final m2 = modulePath;
      return sherpa_onnx.OnlineModelConfig(
        transducer: sherpa_onnx.OnlineTransducerModelConfig(
          encoder: '$m2/encoder-epoch-99-avg-1.int8.onnx',
          decoder: '$m2/decoder-epoch-99-avg-1.onnx',
          joiner: '$m2/joiner-epoch-99-avg-1.onnx',
        ),
        tokens: '$m2/tokens.txt',
        modelType: 'zipformer',
      );
    default:
      throw ArgumentError('Unsupported modelName: $modelName');
  }
}
| Key point | Notes |
|---|---|
| sherpa_onnx | Open-source on-device speech toolkit |
| Offline real-time speech-to-text | Core implementation steps |
| Switching to a Chinese-only model | Add a case to getModelConfigByModelName() |