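Because the teaching material I ultimately want to produce is a set of static HTML files, the data should ideally be converted into a format that JavaScript can handle easily, namely JSON or a .js file.
Since Microsoft provides the OpenXML SDK, using it to read the pptx file, pull out the information, and convert it to JSON may be a relatively quick approach. So let's try the OpenXML SDK first.
My work computer does have Visual Studio, but that is the company's machine... I use a Mac myself, so I'll start with the development tools available on the Mac. Microsoft's .NET Core runs on OSX, but unfortunately... many packages don't support it yet, including the OpenXML SDK, so I'll use Mono for now. Xamarin Studio can develop Mono projects directly and comes with the NuGet package manager, code completion, and other conveniences, so that's what I'll use.
I'll skip the installation steps. Create a Console project, install the OpenXML SDK through NuGet, and you can start trying out what the OpenXML SDK offers.
The original motivation was that, given a pptx file, this prober would show the object structure used inside it. It isn't actually finished, but... because it recurses downward from an object's properties, once you give it a root object it uses Reflection to print every property along with the objects on the next level down. So if you hand it the PresentationPart object, it just keeps going down.
Let's look at the code. First is the Console main program, Program.cs: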
using System;
using SlideProbe.Utils;

namespace SlideProbe
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            // Both a mode and a pptx file path are required.
            if (args.Length < 2)
            {
                Console.WriteLine("Usage: mono SlideProbe.exe [mode] [pptx file path]");
                Help();
                return;
            }

            string mode = args[0];
            string file = args[1];

            try
            {
                switch (mode)
                {
                    case "presentation":
                        // Run every probe that PresentationInfo exposes.
                        PresentationInfo pi = new PresentationInfo(file);
                        string[] keys = pi.AllKeys();
                        foreach (string key in keys)
                        {
                            pi.GetInfo(key);
                        }
                        break;
                    case "slide":
                        Console.WriteLine("[slide] mode not done yet.");
                        break;
                    case "layout":
                        Console.WriteLine("[layout] mode not done yet.");
                        break;
                    case "master":
                        Console.WriteLine("[master] mode not done yet.");
                        break;
                    case "theme":
                        Console.WriteLine("[theme] mode not done yet.");
                        break;
                    default:
                        Console.WriteLine("Error: Provided mode not supported.");
                        Help();
                        break;
                }
            }
            catch (Exception ex)
            {
                // For example, the file could not be opened.
                Console.WriteLine(ex.Message);
                Console.WriteLine("-----------");
                Console.WriteLine("Usage: mono SlideProbe.exe [mode] [pptx file path]");
                Help();
            }
        }

        private static void Help()
        {
            Console.WriteLine("");
            Console.WriteLine("Supported mode:");
            Console.WriteLine("[presentation] provide info about the presentation file.");
            Console.WriteLine("[slide] provide info about every slides.");
            Console.WriteLine("[layout] provide info about slide layout used by each slide.");
            Console.WriteLine("[master] provide info about slide master used by each slide layout.");
            Console.WriteLine("[theme] provide info about the theme used by each slide layout.");
        }
    }
}
The probing code lives in the Utils namespace. First comes a simple base class, InfoBase.cs:
using System;
using System.Reflection;
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;

namespace SlideProbe.Utils
{
    public class InfoBase
    {
        // Full names of the types that have already been expanded.
        public HashSet<string> cache;
        public string indent;

        public InfoBase()
        {
            cache = new HashSet<string>();
            indent = " ";
        }

        // Looks up the public method "Get<key>Info" on the concrete class and calls it.
        public void GetInfo(string key)
        {
            Type type = GetType();
            string member = "Get" + key + "Info";
            try
            {
                MethodInfo mi = type.GetMethod(member);
                if (mi == null)
                {
                    Console.WriteLine("Error(1): Method \"" + member + "\" of class " + type.FullName + " not found.");
                    return;
                }
                mi.Invoke(this, BindingFlags.Default, null, null, null);
            }
            catch (Exception ex)
            {
                // Exceptions thrown inside the invoked method arrive wrapped,
                // so prefer the inner message when there is one.
                if (ex.InnerException != null)
                {
                    Console.WriteLine("Error: " + ex.InnerException.Message);
                }
                else
                {
                    Console.WriteLine("Error: " + ex.Message);
                }
            }
        }

        public void ShowProperties(object o)
        {
            ShowProperties(o, "");
        }

        protected void ShowProperties(object o, string prefix)
        {
            if (null == o)
            {
                return;
            }
            Type type = o.GetType();

            // Header line: "type : base type", white on blue.
            Console.BackgroundColor = ConsoleColor.Blue;
            Console.ForegroundColor = ConsoleColor.White;
            Console.Write(prefix + type.FullName + " : " + type.BaseType.FullName);
            Console.BackgroundColor = ConsoleColor.Black;
            Console.ResetColor();

            // Expand each type only once; later occurrences are only marked.
            if (!cache.Contains(type.FullName))
            {
                cache.Add(type.FullName);
                Console.WriteLine("");
            }
            else
            {
                Console.Write(" (type probed.)");
                Console.WriteLine("");
                return;
            }

            foreach (PropertyInfo pi in type.GetProperties())
            {
                if (pi.CanRead)
                {
                    int l = pi.GetIndexParameters().Length;
                    if (l > 0)
                    {
                        // Indexed property: label it, do not evaluate it.
                        Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                        Console.ForegroundColor = ConsoleColor.Gray;
                        Console.Write("(Indexed Property)");
                        Console.ResetColor();
                    }
                    else
                    {
                        var value = pi.GetValue(o, null);
                        if (null != value)
                        {
                            Type vt = value.GetType();
                            if (vt.FullName.Contains("DocumentFormat.OpenXml"))
                            {
                                if (vt.FullName.Contains("DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer"))
                                {
                                    // The "Parts" property: walk each related part.
                                    Console.BackgroundColor = ConsoleColor.Black;
                                    Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                                    var pairs = value as IEnumerable<IdPartPair>;
                                    if (pairs != null)
                                    {
                                        Console.WriteLine("");
                                        foreach (var pair in pairs)
                                        {
                                            if (pair.OpenXmlPart != null)
                                            {
                                                ShowDetail(o, pair.OpenXmlPart, prefix);
                                            }
                                            else
                                            {
                                                Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                                            }
                                        }
                                    }
                                }
                                else
                                {
                                    // Any other OpenXML object: recurse into it.
                                    // The prefix restarts from "" here, as in the original code.
                                    Console.BackgroundColor = ConsoleColor.Black;
                                    Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                                    ShowDetail(o, value, "");
                                }
                            }
                            else
                            {
                                // Non-OpenXML value: print it as-is.
                                Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                                Console.ForegroundColor = ConsoleColor.Gray;
                                Console.Write(value);
                                Console.ResetColor();
                            }
                        }
                        else
                        {
                            // Null value: print the name only.
                            Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                        }
                    }
                }
                else
                {
                    // Write-only property: print the name only.
                    Console.Write(prefix + indent + Decorate(pi.Name) + ": ");
                }
                Console.WriteLine("");
            }
        }

        protected void ShowDetail(object o, object value, string prefix)
        {
            if (value != null)
            {
                Console.ForegroundColor = ConsoleColor.Gray;
                // Check the value itself (the original checked the parent object o).
                Type sub = value.GetType();
                if (sub.IsPrimitive)
                {
                    Console.Write(value);
                    Console.ResetColor();
                }
                else
                {
                    Console.ResetColor();
                    ShowProperties(value, prefix + indent);
                }
            }
        }

        protected string Decorate(string name)
        {
            return "[" + name + "]";
        }
    }
}
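To keep the program from running into a stack overflow... it has some limits built in. Judging from the code above: each type is expanded only once (a type seen again is marked "(type probed.)" and skipped), only values whose type name contains DocumentFormat.OpenXml are followed, and indexed properties are labeled rather than evaluated.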
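The once-per-type guard is easier to see in isolation. Below is a minimal sketch that does not use the OpenXML SDK at all: the Demo namespace, the Node class, and GuardedProbe are hypothetical names made up for this illustration, and the "Demo." prefix check stands in for the DocumentFormat.OpenXml check in InfoBase.cs. It collects lines instead of writing to the console so the result is easy to inspect.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

namespace Demo
{
    // Hypothetical type whose instances can point at each other in a cycle.
    public class Node
    {
        public string Name { get; set; }
        public Node Next { get; set; }
    }

    public static class GuardedProbe
    {
        // Returns the report as a list of lines.
        public static List<string> Probe(object root)
        {
            var lines = new List<string>();
            Walk(root, "", new HashSet<string>(), lines);
            return lines;
        }

        private static void Walk(object o, string prefix, HashSet<string> seen, List<string> lines)
        {
            if (o == null) return;
            Type type = o.GetType();

            // HashSet.Add returns false when the type name is already present,
            // so a type that was expanded before is only marked, not expanded again.
            if (!seen.Add(type.FullName))
            {
                lines.Add(prefix + type.FullName + " (type probed.)");
                return;
            }
            lines.Add(prefix + type.FullName);

            foreach (PropertyInfo pi in type.GetProperties())
            {
                // Skip write-only and indexed properties instead of evaluating them.
                if (!pi.CanRead || pi.GetIndexParameters().Length > 0) continue;

                object value = pi.GetValue(o, null);
                // Recurse only into types from the namespace we care about.
                bool ours = value != null &&
                    value.GetType().FullName.StartsWith("Demo.", StringComparison.Ordinal);
                if (ours)
                {
                    lines.Add(prefix + "  [" + pi.Name + "]:");
                    Walk(value, prefix + "    ", seen, lines);
                }
                else
                {
                    lines.Add(prefix + "  [" + pi.Name + "]: " + value);
                }
            }
        }
    }

    public static class Program
    {
        public static void Main()
        {
            var a = new Node { Name = "a" };
            var b = new Node { Name = "b", Next = a };
            a.Next = b; // a -> b -> a: without the guard the recursion would not end

            foreach (string line in GuardedProbe.Probe(a))
                Console.WriteLine(line);
        }
    }
}
```

The trade-off is that the guard works per type, not per object: the second Node is never expanded, so its Name does not show up in the report. For mapping out which classes exist that is acceptable, but it means the output is not a complete dump of the document.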
Next is the class that actually gets called. Since probing starts from PresentationPart, it is called PresentationInfo.cs:
using System;
using System.Reflection;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Presentation;
using DocumentFormat.OpenXml.Packaging;

namespace SlideProbe.Utils
{
    public class PresentationInfo : InfoBase
    {
        private string file;
        public PresentationDocument ppt;

        public PresentationInfo(string _file)
        {
            file = _file;
            // Open the package read-only.
            ppt = PresentationDocument.Open(file, false);
        }

        // Each key maps to a Get<key>Info method below.
        public string[] AllKeys()
        {
            return new string[] {
                "PresentationPart"/*,
                "Presentation"*/
            };
        }

        public void GetPresentationPartInfo()
        {
            ShowProperties(ppt.PresentationPart);
        }

        public void GetPresentationInfo()
        {
            ShowProperties(ppt.PresentationPart.Presentation);
        }
    }
}
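Because the main work is already done in InfoBase.cs, there is only a little code here. Also, Presentation can be reached through PresentationPart.Presentation anyway, so I simply dropped it as a separate key.
Running the program looks like this:
The last line of the output:
I also ran the output through wc -l: counting blank lines, it came to 5243 lines in total...
The lines with white text on a blue background are class information, and the names wrapped in square brackets are property names. There is also one special property, usually named "Parts", which holds the list of relationships to other Part documents. InfoBase.cs handles it specially.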
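The program does not list methods... there were too many duplicates, so I left them out. There is one method, Descendants(), that is best known about from the start; I'll come back to it later.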
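From these observations, the structure of the OpenXML SDK looks roughly like this: in PresentationPart.Presentation, the PresentationPart is a DocumentFormat.OpenXml.Packaging.PresentationPart, and its Presentation property is a DocumentFormat.OpenXml.Presentation.Presentation.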
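Later I'd also like to try doing this with node.js, although that would mean working with the xml files directly XD
I've also packaged the source files, so you can download them and try them out: SlideProbe.zip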