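It contains the data they collected from 2004 to 2018, all as MIDI files. You can pick whichever pieces you want to use; I picked 1281 of them myself. During processing we ignore control change messages and only care about note_on. (Ideally control changes should be included in training, but that makes the data processing much more troublesome, and I was under time pressure back then, so I left them out XD. Instead, they are added randomly at generation time.)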
import os
from mido import MidiFile

paths = []
songs = []
# append every MIDI file path under your_music_folder to paths
for r, d, f in os.walk("./your_music_folder"):
    for file in f:
        if '.mid' in file:
            paths.append(os.path.join(r, file))
# for each path, create a mido MidiFile object and append it to songs
for path in paths:
    mid = MidiFile(path, type=1)
    songs.append(mid)
del paths
from math import sqrt
import numpy as np

seq_len = 256
gen_len = int(sqrt(seq_len))
notes = []
dataset = []
chunk = []
# for each MIDI object in the list of songs
for i in range(len(songs)):
    for msg in songs[i]:
        # filtering out meta messages
        if not msg.is_meta:
            # filtering out control changes (and everything else but note_on)
            if msg.type == 'note_on':
                notes.append(msg.note)
    # note: j starts at 1, so the first note of each song is skipped
    for j in range(1, len(notes)):
        chunk.append(notes[j])
        # save each chunk of seq_len (256) notes
        if j % seq_len == 0:
            dataset.append(chunk)
            chunk = []
    print(f"Processed song {i}")
    # drop the leftover partial chunk and reset for the next song
    chunk = []
    notes = []
del chunk
del notes
train_data = np.array(dataset)
# note: np.save appends ".npy", so the file on disk is preprocess_data.npz.npy
np.save("preprocess_data.npz", train_data)
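If the chunking step is hard to picture, here is a minimal sketch of the same idea on a toy list. The helper name `split_into_chunks` and the toy numbers are made up for illustration and are not part of the original code. Unlike the loop above, it starts from the very first note, but it likewise drops the leftover notes that do not fill a whole chunk.

```python
def split_into_chunks(notes, seq_len):
    """Split a list of notes into consecutive chunks of exactly seq_len items.

    Leftover notes at the end that do not fill a whole chunk are dropped.
    (Hypothetical helper, for illustration only.)
    """
    chunks = []
    for start in range(0, len(notes) - seq_len + 1, seq_len):
        chunks.append(notes[start:start + seq_len])
    return chunks


# Toy input: 10 fake MIDI note numbers, split into chunks of 4.
toy_notes = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76]
toy_chunks = split_into_chunks(toy_notes, seq_len=4)
print(toy_chunks)  # [[60, 62, 64, 65], [67, 69, 71, 72]]
```

With `seq_len = 256`, a song with 1000 note_on messages would give 3 chunks this way, and the remaining 232 notes would be thrown away.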
That wraps up the data preprocessing. Tomorrow we can use this data for training!
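One thing to watch out for when loading the file back: `np.save` appends `.npy` to any filename that does not already end with it, so saving to `preprocess_data.npz` actually produces `preprocess_data.npz.npy`. Below is a small sketch of that round trip. It uses a fake dataset and a temporary directory (both are assumptions for the demo, not the real data) so it can run without any MIDI files.

```python
import os
import tempfile

import numpy as np

# Stand-in for the real dataset: 3 chunks of 256 fake MIDI note numbers.
seq_len = 256
fake_dataset = [[60 + (i + j) % 12 for j in range(seq_len)] for i in range(3)]

with tempfile.TemporaryDirectory() as tmp_dir:
    requested_path = os.path.join(tmp_dir, "preprocess_data.npz")
    np.save(requested_path, np.array(fake_dataset))

    # np.save added ".npy" because the requested name did not end with it.
    saved_path = requested_path + ".npy"
    file_exists = os.path.exists(saved_path)

    # Load the array back from the name that was actually written.
    loaded = np.load(saved_path)

print(file_exists)
print(loaded.shape)  # (3, 256): one row per chunk, one column per note
```

So for the real data, the file to load is `preprocess_data.npz.npy`, and its shape should be (number of chunks, 256).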