iT邦幫忙

2024 iThome 鐵人賽

DAY 19
0
Software Development

Django 2024: 從入門到SaaS實戰系列 第 23

Django REST framework: 打造高效 API-流量限制、分頁與過濾

  • 分享至 

  • xImage
  •  

在客戶端透過API請求中,如果客戶端想要提升查找數據的精確性與效率?

而站在服務端的角度,透過認證與權限的設計來達到提升安全性的效果,那如果用戶的確具備對應的權限與身份,又有哪些方式能防止API被濫用?

又或是想要減輕服務端的壓力,都應該如何更近一步優化我們的API?

我們今天來探討在Django REST framework(DRF)中可以透過哪些方式來進行API的優化:流量限制、過濾與分頁

今日重點:

  • DRF中設置流量限制
    • DRF中限流的基礎原理
    • DRF中不同類別的限流實踐
    • 配置限流
  • 在DRF中使用分頁
  • 過濾與排序

DRF中設置流量限制

所謂的限制流量(限流),也就是如果單一客戶端的請求頻率超過了允許的限制,便會給予限制,實務上通常會返回一個429 Too Many Requests響應

而在DRF中,會根據幾種方式來找到API請求的來源,來判定客戶端的請求是否超出限制:

  • AnonRateThrottle:應用於所有匿名用戶,根據客戶端的IP位址來追蹤請求
  • UserRateThrottle:應用於已認證的用戶。根據用戶ID來追蹤請求
  • ScopedRateThrottle:這種類型的限流允許你根據視圖,也就是API endpoint本身進行限流

DRF中限流的基礎原理

上述提到的DRF提供的限流類別,都是繼承了BaseThrottle ,我們來看一下他是具備什麼方法

class BaseThrottle:
    """
    Rate throttling of requests.
    """

    def allow_request(self, request, view):
        """
        Return `True` if the request should be allowed, `False` otherwise.
        """
        raise NotImplementedError('.allow_request() must be overridden')

    def get_ident(self, request):
        """
        Identify the machine making the request by parsing HTTP_X_FORWARDED_FOR
        if present and number of proxies is > 0. If not use all of
        HTTP_X_FORWARDED_FOR if it is available, if not use REMOTE_ADDR.
        """
        xff = request.META.get('HTTP_X_FORWARDED_FOR')
        remote_addr = request.META.get('REMOTE_ADDR')
        num_proxies = api_settings.NUM_PROXIES

        if num_proxies is not None:
            if num_proxies == 0 or xff is None:
                return remote_addr
            addrs = xff.split(',')
            client_addr = addrs[-min(num_proxies, len(addrs))]
            return client_addr.strip()

        return ''.join(xff.split()) if xff else remote_addr

    def wait(self):
        """
        Optionally, return a recommended number of seconds to wait before
        the next request.
        """
        return None

  • allow_request:決定是否允許請求,因為是抽象方法,在子類別中需要自定義相關邏輯。當我們想要自定義限流時通常會改寫這個方法
  • wait:返回客戶端在下一個請求前應該等待的時間,單位為秒
  • get_ident:默認情況下是去判斷請求的ip位址,並且根據是否有代理伺服器,盡可能準確辨認初始客戶端ip

這時候可能會想,那這樣DRF要怎麼記住每一個客戶端的資料?

這時候我們可以看SimpleRateThrottle類別,該類別繼承BaseThrottle,並且被AnonRateThrottle、UserRateThrottle、ScopedRateThrottle所繼承

class SimpleRateThrottle(BaseThrottle):
    """
    A simple cache implementation, that only requires `.get_cache_key()`
    to be overridden.

    The rate (requests / seconds) is set by a `rate` attribute on the Throttle
    class.  The attribute is a string of the form 'number_of_requests/period'.

    Period should be one of: ('s', 'sec', 'm', 'min', 'h', 'hour', 'd', 'day')

    Previous request information used for throttling is stored in the cache.
    """
    cache = default_cache
    timer = time.time
    cache_format = 'throttle_%(scope)s_%(ident)s'
    scope = None
    THROTTLE_RATES = api_settings.DEFAULT_THROTTLE_RATES

    def __init__(self):
        if not getattr(self, 'rate', None):
            self.rate = self.get_rate()
        self.num_requests, self.duration = self.parse_rate(self.rate)

    def get_cache_key(self, request, view):
        """
        Should return a unique cache-key which can be used for throttling.
        Must be overridden.

        May return `None` if the request should not be throttled.
        """
        raise NotImplementedError('.get_cache_key() must be overridden')

    def get_rate(self):
        """
        Determine the string representation of the allowed request rate.
        """
        if not getattr(self, 'scope', None):
            msg = ("You must set either `.scope` or `.rate` for '%s' throttle" %
                   self.__class__.__name__)
            raise ImproperlyConfigured(msg)

        try:
            return self.THROTTLE_RATES[self.scope]
        except KeyError:
            msg = "No default throttle rate set for '%s' scope" % self.scope
            raise ImproperlyConfigured(msg)

    def parse_rate(self, rate):
        """
        Given the request rate string, return a two tuple of:
        <allowed number of requests>, <period of time in seconds>
        """
        if rate is None:
            return (None, None)
        num, period = rate.split('/')
        num_requests = int(num)
        duration = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400}[period[0]]
        return (num_requests, duration)

    def allow_request(self, request, view):
        """
        Implement the check to see if the request should be throttled.

        On success calls `throttle_success`.
        On failure calls `throttle_failure`.
        """
        if self.rate is None:
            return True

        self.key = self.get_cache_key(request, view)
        if self.key is None:
            return True

        self.history = self.cache.get(self.key, [])
        self.now = self.timer()

        # Drop any requests from the history which have now passed the
        # throttle duration
        while self.history and self.history[-1] <= self.now - self.duration:
            self.history.pop()
        if len(self.history) >= self.num_requests:
            return self.throttle_failure()
        return self.throttle_success()

    def throttle_success(self):
        """
        Inserts the current request's timestamp along with the key
        into the cache.
        """
        self.history.insert(0, self.now)
        self.cache.set(self.key, self.history, self.duration)
        return True

    def throttle_failure(self):
        """
        Called when a request to the API has failed due to throttling.
        """
        return False

    def wait(self):
        """
        Returns the recommended next request time in seconds.
        """
        if self.history:
            remaining_duration = self.duration - (self.now - self.history[-1])
        else:
            remaining_duration = self.duration

        available_requests = self.num_requests - len(self.history) + 1
        if available_requests <= 0:
            return None

        return remaining_duration / float(available_requests)
  • 首先來看屬性的部分:
    • cache與cache_format:這就能解釋我們前面的疑問,DRF是透過儲存快取來知道對應的庫戶端的相關資料
    • scope:獲取設置的節流速率
    • THROTTLE_RATES:從DRF設置中獲取的默認節流速率
  • __init__:如果沒設置scope,則透過get_rate方法拿到對應的速率,接著透過parse_rate方法,返回在特定的時間窗口內允許的請求數
  • allow_request:在這個核心的方法中,先把已經不再時間窗口(self.duration)內的紀錄刪除,再來判斷時間窗口內的請求數量是否超標
  • throttle_success:當允許請求時,調用該方法來紀錄客戶端資料到快取中

根據上面的方法我們了解到,我們需要先設置默認或是特定視圖中允許請求的頻率,當客戶端發起請求時,根據快取的資料以及設置的請求頻率來決定是否要接受請求。如果超標則不接受,反之則紀錄客戶端資料到快取

DRF中不同類別的限流實踐

現在我們知道基礎的現流實現方式,我們就可以來看各類別的區別

AnonRateThrottle

如果用戶有註冊過則不需進行限流,反之則紀錄客戶端ip

class AnonRateThrottle(SimpleRateThrottle):
    """
    Limits the rate of API calls that may be made by a anonymous users.

    The IP address of the request will be used as the unique cache key.
    """
    scope = 'anon'

    def get_cache_key(self, request, view):
        if request.user and request.user.is_authenticated:
            return None  # Only throttle unauthenticated requests.

        return self.cache_format % {
            'scope': self.scope,
            'ident': self.get_ident(request)
        }

UserRateThrottle

如果是註冊的用戶,使用用戶id當作快取的key,反之則紀錄ip位址

class UserRateThrottle(SimpleRateThrottle):
    """
    Limits the rate of API calls that may be made by a given user.

    The user id will be used as a unique cache key if the user is
    authenticated.  For anonymous requests, the IP address of the request will
    be used.
    """
    scope = 'user'

    def get_cache_key(self, request, view):
        if request.user and request.user.is_authenticated:
            ident = request.user.pk
        else:
            ident = self.get_ident(request)

        return self.cache_format % {
            'scope': self.scope,
            'ident': ident
        }

ScopedRateThrottle

比較不一樣的地方是,在初始化時不會去判斷限流頻率,而是在調用視圖時才進行紀錄,並且紀錄方式與UserRateThrottle相同

class ScopedRateThrottle(SimpleRateThrottle):
    """
    Limits the rate of API calls by different amounts for various parts of
    the API.  Any view that has the `throttle_scope` property set will be
    throttled.  The unique cache key will be generated by concatenating the
    user id of the request, and the scope of the view being accessed.
    """
    scope_attr = 'throttle_scope'

    def __init__(self):
        # Override the usual SimpleRateThrottle, because we can't determine
        # the rate until called by the view.
        pass

    def allow_request(self, request, view):
        # We can only determine the scope once we're called by the view.
        self.scope = getattr(view, self.scope_attr, None)

        # If a view does not have a `throttle_scope` always allow the request
        if not self.scope:
            return True

        # Determine the allowed request rate as we normally would during
        # the `__init__` call.
        self.rate = self.get_rate()
        self.num_requests, self.duration = self.parse_rate(self.rate)

        # We can now proceed as normal.
        return super().allow_request(request, view)

    def get_cache_key(self, request, view):
        """
        If `view.throttle_scope` is not set, don't apply this throttle.

        Otherwise generate the unique cache key by concatenating the user id
        with the `.throttle_scope` property of the view.
        """
        if request.user and request.user.is_authenticated:
            ident = request.user.pk
        else:
            ident = self.get_ident(request)

        return self.cache_format % {
            'scope': self.scope,
            'ident': ident
        }

配置限流

如果要設置限流,可以直接在settings.py中配置全局變數,設置想要配置的限流類別與頻率

REST_FRAMEWORK = {
    'DEFAULT_THROTTLE_CLASSES': [
        'rest_framework.throttling.AnonRateThrottle',
        'rest_framework.throttling.UserRateThrottle',

    ],
    'DEFAULT_THROTTLE_RATES': {
        'anon': '2/min',
        'user': '10/min'
    }
}

在視圖中便可以直接調用

from rest_framework.throttling import UserRateThrottle

class WorkspaceViewSet(viewsets.ModelViewSet):
    throttle_classes = [UserRateThrottle]

那如果我們想要修改全局的頻率,可以新建throttles.py在app目錄下

from rest_framework.throttling import UserRateThrottle, ScopedRateThrottle

class WorkspaceThrottle(UserRateThrottle):
    THROTTLE_RATES = {"user": "5/day"}
    
# 如果是要配置scope的話,可以這樣寫
class DocumentThrottle(ScopedRateThrottle):
	  THROTTLE_RATES = {"document": "10/day"}

最後在視圖中調用即可

在DRF中使用分頁

分頁在API中扮演重要的角色:

  • 性能優化:通過限制每次請求返回的資料量來減輕服務端的負載壓力,也能提升用戶體驗
  • 資源管理:在處理大量資料時能夠更有效的管理伺服器資源
  • 靈活性:可以支持客戶端根據需求來響應資料量

在DRF中有幾種類

  • PageNumberPagination:可以根據頁碼來請求特定的資料頁
  • LimitOffsetPagination:讓客戶端可以指定返回的每頁的資料量與起始位置(offset)
  • CursorPagination:加密分頁器,客戶端不能跳到任意位置,只提供前進與後退控制。但是必須要模型中有created欄位

使用的方法可以全局設定

REST_FRAMEWORK = {
    'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.PageNumberPagination',
    'PAGE_SIZE': 10
}

或是自定義類別來修改參數

from rest_framework.pagination import LimitOffsetPagination

class CustomPagination(LimitOffsetPagination):
    default_limit = 10
    max_limit = 100
    
    
class WorkspaceViewSet(viewsets.ModelViewSet):
    pagination_class = CustomPagination

而請求時則是直接調用想要的資料範圍

http://127.0.0.1:8000/note/workspaces/?page=1 # PageNumberPagination
http://127.0.0.1:8000/note/workspaces/?limit=1&offset=0 # LimitOffsetPagination

過濾與排序

實現過濾的功能,在API請求時也相當重要,除了提高效率之外,也能改善用戶體驗,使客戶端能快速找到所需的資訊。同時能夠直接在調用API時根據特定欄位來進行排序,也能進一步提升客戶端體驗

如果使用DRF原生方法進行過濾,可以分成視圖函式與視圖類別:

此外DRF還提供了不同的API來快速使用過濾功能

from rest_framework import filters

class DocumentViewSet(viewsets.ModelViewSet):
    queryset = Document.objects.all()
    serializer_class = DocumentSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [filters.SearchFilter]
    search_fields = ["title", "content"]
  • 在視圖類別中可以設定filter_backends屬性為filters.SearchFilter
  • search_fields中通常我們會選擇具備text type的欄位進行搜尋
  • 如果是JSON欄位,可以透過雙底線來進行資料的解構
# 來源於官方文檔
search_fields = ['data__breed', 'data__owner__other_pets__0__name']

當我們在視圖中配置filter時,我們就能透過路由來對特定來為進行搜尋

https://ithelp.ithome.com.tw/upload/images/20241005/20161866UxOTTZ8pYe.png

此外我們也能根據外鍵或是多對多的欄位進行搜索,例如以我們的Document model來說,我們也能透過workspace欄位拿到owner的名稱

class DocumentViewSet(viewsets.ModelViewSet):
    queryset = Document.objects.all()
    serializer_class = DocumentSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [filters.SearchFilter]
    search_fields = ["workspace__owner__username"]# 透過雙底線來解構

https://ithelp.ithome.com.tw/upload/images/20241005/20161866B7o5d9xDY1.png

而在排序方面,DRF則是提供OrderingFilter類別快速設定排序功能

class DocumentViewSet(viewsets.ModelViewSet):
    queryset = Document.objects.all()
    serializer_class = DocumentSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [filters.SearchFilter, filters.OrderingFilter] # 一起配置不衝突
    search_fields = ["workspace__owner__username"]
    ordering_fields = ["created_at", "updated_at", "id"] # 設定想要排序的欄位

路由配置

# 正冪排序
http://127.0.0.1:8000/note/documents/?ordering=created_at

# 降冪排序
http://127.0.0.1:8000/note/documents/?ordering=-created_at

# 多個欄位排序時,可以根據順序
http://127.0.0.1:8000/note/documents/?ordering=-id,created_at

今日總結

在這個章節中,我們針對提升API效能、客戶端體驗與減輕服務端壓力來進行介紹

除了認證與權限之外,透不不同方式的限流也能避免API被濫用

此外提供分頁與過濾的功能,也能避免傳遞過多不必要的資料,讓客戶端能夠更精準的取得所需的資料範圍

參考資料

  • throttling:https://www.django-rest-framework.org/api-guide/throttling/
  • pagination:https://www.django-rest-framework.org/api-guide/pagination/
  • filter:https://www.django-rest-framework.org/api-guide/filtering/

上一篇
Django REST framework: 掌握 JWT、CORS 和 Cookie 處理技巧
下一篇
Django REST framework: 最後一哩路-自動生成API文件
系列文
Django 2024: 從入門到SaaS實戰31
圖片
  直播研討會
圖片
{{ item.channelVendor }} {{ item.webinarstarted }} |
{{ formatDate(item.duration) }}
直播中

尚未有邦友留言

立即登入留言