iT邦幫忙

2024 iThome Ironman Contest (鐵人賽)

DAY 23
Python

Python Self-Study Series, Part 23

DAY23: Data Scraping and Analysis Features


scraper.py

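import requests
from bs4 import BeautifulSoup

def scrape_data(url):
    """Fetch url and return the name and description of every div.item."""
    # A timeout keeps a slow site from blocking the caller indefinitely
    response = requests.get(url, timeout=10)
    # Raise on 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    data = []
    for item in soup.find_all('div', class_='item'):
        name = item.find('h2')
        description = item.find('p')
        # Skip items missing either tag; .text on None would raise AttributeError
        if name is None or description is None:
            continue
        data.append({
            'name': name.get_text(strip=True),
            'description': description.get_text(strip=True),
        })
    return data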
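The title mentions analysis, but the post itself only shows the scraping step. As a minimal sketch of what an analysis step could look like, the function below summarizes a list shaped like the return value of `scrape_data`. The name `summarize_items`, the chosen statistics, and the sample data are all assumptions made for illustration, not part of the original project.

```python
import re
from collections import Counter

def summarize_items(items):
    """Return simple statistics for a list of {'name', 'description'} dicts.

    Hypothetical helper: the statistics chosen here are illustrative only.
    """
    words = Counter()
    total_length = 0
    for item in items:
        description = item.get('description', '')
        total_length += len(description)
        # Count lowercase ASCII words; other scripts would need a different tokenizer
        words.update(re.findall(r'[a-z]+', description.lower()))
    count = len(items)
    return {
        'count': count,
        'avg_description_length': total_length / count if count else 0.0,
        'top_words': words.most_common(3),
    }

# Made-up sample in the same shape scrape_data returns
sample = [
    {'name': 'Item 1', 'description': 'A red apple'},
    {'name': 'Item 2', 'description': 'A green apple'},
]
summary = summarize_items(sample)
print(summary['count'], summary['avg_description_length'])
```

Keeping the summary in a separate function means it can be tested without any network access, and the Flask route could return it alongside the raw items if desired.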

Update app.py

from flask import Flask, jsonify, request, abort
from scraper import scrape_data  # import our scraper

app = Flask(__name__)

# Simulated database records
data_store = [
    {"id": 1, "name": "Item 1", "description": "This is item 1"},
    {"id": 2, "name": "Item 2", "description": "This is item 2"},
    {"id": 3, "name": "Item 3", "description": "This is item 3"}
]

# Root route, returns the home page
@app.route('/')
def index():
    return "Welcome to the Flask API"

# Get all items
@app.route('/api/items', methods=['GET'])
def get_items():
    return jsonify(data_store)

# Get a single item
@app.route('/api/items/<int:item_id>', methods=['GET'])
def get_item(item_id):
    item = next((item for item in data_store if item['id'] == item_id), None)
    if item is None:
        abort(404)
    return jsonify(item)

# Create a new item
@app.route('/api/items', methods=['POST'])
def create_item():
    # Reject requests without a JSON body or without a 'name' field
    if not request.json or 'name' not in request.json:
        abort(400)
    new_item = {
        'id': data_store[-1]['id'] + 1 if data_store else 1,
        'name': request.json['name'],
        'description': request.json.get('description', "")
    }
    data_store.append(new_item)
    return jsonify(new_item), 201

# Update an item
@app.route('/api/items/<int:item_id>', methods=['PUT'])
def update_item(item_id):
    item = next((item for item in data_store if item['id'] == item_id), None)
    if item is None:
        abort(404)
    if not request.json:
        abort(400)
    item['name'] = request.json.get('name', item['name'])
    item['description'] = request.json.get('description', item['description'])
    return jsonify(item)

# Delete an item
@app.route('/api/items/<int:item_id>', methods=['DELETE'])
def delete_item(item_id):
    item = next((item for item in data_store if item['id'] == item_id), None)
    if item is None:
        abort(404)
    data_store.remove(item)
    return jsonify({'result': True})

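# New API route that scrapes the URL given in the request body
@app.route('/api/scrape', methods=['POST'])
def scrape():
    # Reject requests without a JSON body or without a 'url' field
    if not request.json or 'url' not in request.json:
        abort(400)
    url = request.json['url']
    try:
        scraped_data = scrape_data(url)
    except Exception as e:
        # Network and parsing failures are reported as a server error
        abort(500, description=f"Error during scraping: {e}")
    return jsonify(scraped_data), 200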

if __name__ == '__main__':
    app.run(debug=True)

POST /api/scrape

curl -X POST -H "Content-Type: application/json" -d '{"url": "http://example.com"}' http://127.0.0.1:5000/api/scrape
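The same request can also be sent from Python. The sketch below uses only the standard library and assumes the Flask app is running locally on port 5000, as in the curl command above. The helper names `build_scrape_request` and `call_scrape` are hypothetical, introduced here for illustration.

```python
import json
import urllib.request

def build_scrape_request(target_url, api_base='http://127.0.0.1:5000'):
    """Build the POST request for /api/scrape (hypothetical helper)."""
    payload = json.dumps({'url': target_url}).encode('utf-8')
    return urllib.request.Request(
        f'{api_base}/api/scrape',
        data=payload,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )

def call_scrape(target_url, timeout=10):
    """Send the request and decode the JSON list returned by the API."""
    req = build_scrape_request(target_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode('utf-8'))

# Usage, with the Flask server already running:
#   items = call_scrape('http://example.com')
```

Splitting request construction from sending makes it possible to inspect the URL, method, and body without a running server.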

