今天來refactor DB schema以及ETL有關這兩個的review可以看昨天的文章[Day 8] Food Delivery Platform Code review (二)
昨天有提到Schema的設計上缺少了foreign key還有正規化與命名一致的問題,在不修改大架構的情況下refactor的方向為:
Table restaurant {
  id SERIAL [pk, increment]
  name TEXT [not null]
  cash_balance NUMERIC [not null]
}
Table menu {
  id SERIAL [pk, increment]
  dish_name TEXT [not null]
  price NUMERIC [not null]
  restaurant_id INT [ref: > restaurant.id, not null]
}
Table open_hours {
  id SERIAL [pk, increment]
  week TEXT [not null]
  open TIME [not null]
  close TIME [not null]
  restaurant_id INT [ref: > restaurant.id, not null]
}
Table users {
  id INT [pk]
  name TEXT [not null]
  cash_balance NUMERIC [not null]
}
Table purchase_history {
  id SERIAL [pk, increment]
  menu_id INT [ref: > menu.id, not null]
  transaction_amount NUMERIC [not null]
  transaction_date TIMESTAMP [not null]
  user_id INT [ref: > users.id, not null]
}


ETL 優化的重點為
# ... [其他不變的部分]
def til_days(first_day, end_day, open_time_open, open_time_close, restaurant_id):
    insert_data = []
    # 這部分邏輯保持不變
    # ...
    return insert_data
def init_db_restaurant(conn, log):
    with open("../data/restaurant_with_menu.json", 'r') as load_f:
        load_dict = json.load(load_f)
    cur = conn.cursor()
    for restaurant in load_dict:
        # ... [其他不變的部分]
        
        # insert openhours table
        open_days = restaurant.get('openingHours').split('/')
        statement = 'INSERT INTO openhours (week, open, close, restaurant_id) VALUES (%s, %s, %s, %s)'
        insert_data = []
        for day in open_days:
            # 使用函數重構的部分邏輯來獲取insert_data
            day_data = parse_day(day, restaurant_id)
            insert_data.extend(day_data)
            
        cur.executemany(statement, insert_data)
    
    conn.commit()
    cur.close()
# ... [其他不變的部分]
def init_db_users(conn, log):
    with open("../data/users_with_purchase_history.json", 'r') as load_f:
        load_dict = json.load(load_f)
    cur = conn.cursor()
    for user in load_dict:
        # ... [其他不變的部分]
        
        cur.executemany(statement, insert_data)
    conn.commit()
    cur.close()
# ... [其他不變的部分]