Normally, Python runs as a single process with a single thread and a single routine.
We can use some modules to control each of them:
multiprocessing to control processes
threading to control threads
asyncio to control coroutines
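As a quick orientation, here is a minimal sketch (not part of this article's examples; the function names are placeholders) showing how each module starts one unit of work:
import multiprocessing
import threading
import asyncio

def work():
    print('hello from a worker')

async def async_work():
    print('hello from a coroutine')

if __name__ == '__main__':  # guard required by multiprocessing on spawn-based platforms
    p = multiprocessing.Process(target=work)
    p.start()
    p.join()  # runs in a separate process

    t = threading.Thread(target=work)
    t.start()
    t.join()  # runs in a separate thread

    asyncio.run(async_work())  # runs a coroutine on an event loop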
The following article introduces some applications of coroutines:
A simple example using time.sleep, called five times with a sleep time of 1 sec.
It takes 5 sec in total.
import time

start_time = time.time()

def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    time.sleep(sec)
    print('end at: ', time.time() - start_time)

def main():
    for i in range(5):
        sleep_sec(1)
    print('end of main: ', time.time() - start_time)

main()
start at: 0.0001671314239501953
end at: 1.0058751106262207
start at: 1.005995512008667
end at: 2.006303310394287
start at: 2.0064895153045654
end at: 3.0076138973236084
start at: 3.007782220840454
end at: 4.0089030265808105
start at: 4.0090556144714355
end at: 5.009898662567139
end of main: 5.010054588317871
Using asyncio, the total time becomes about 1 sec, the cost of a single sleep, because the five coroutines run concurrently.
import time
import asyncio

start_time = time.time()

async def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    await asyncio.sleep(sec)
    print('end at: ', time.time() - start_time)

async def main():
    tasks = list()
    for i in range(5):
        # create_task schedules sleep_sec to run concurrently on the event loop
        tasks.append(asyncio.create_task(sleep_sec(1)))
    for task in tasks:
        await task
    print('end of main: ', time.time() - start_time)

await main()  # top-level await works in a notebook; in a script, use asyncio.run(main())
start at: 9.965896606445312e-05
start at: 0.0002307891845703125
start at: 0.0002894401550292969
start at: 0.00034356117248535156
start at: 0.000396728515625
end at: 1.0007236003875732
end at: 1.0008800029754639
end at: 1.000946283340454
end at: 1.0010099411010742
end at: 1.001070261001587
end of main: 1.001145601272583
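As a side note, the create_task loop above is often written with asyncio.gather, which schedules the coroutines and waits for all of them at once; a minimal equivalent sketch:
import time
import asyncio

start_time = time.time()

async def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    await asyncio.sleep(sec)
    print('end at: ', time.time() - start_time)

async def main():
    # gather schedules all five coroutines concurrently and waits for them all
    await asyncio.gather(*(sleep_sec(1) for _ in range(5)))
    print('end of main: ', time.time() - start_time)

asyncio.run(main())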
Another similar example.
Use a task to run the coroutine in the background while the following code keeps running at the same time.
There is no need to use await task
unless you need the output of the task.
import time
import asyncio

start_time = time.time()

async def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    print('input sec: ', sec)
    await asyncio.sleep(sec)
    print('end at: ', time.time() - start_time)

async def main():
    task = asyncio.create_task(sleep_sec(1))
    print('asyncio.sleep(3)')
    await asyncio.sleep(3)
    # await task
    print('end of main: ', time.time() - start_time)

await main()
asyncio.sleep(3)
start at: 0.0004246234893798828
input sec: 1
end at: 1.0014164447784424
end of main: 3.002758264541626
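The 1-second task finishes in the background because main awaits asyncio.sleep(3) anyway. When you do need the task's output, awaiting the task returns the coroutine's return value; a minimal sketch (sleep_and_return is a made-up helper):
import asyncio

async def sleep_and_return(sec):
    await asyncio.sleep(sec)
    return sec * 2

async def main():
    task = asyncio.create_task(sleep_and_return(1))
    # ... other work can run here while the task sleeps ...
    result = await task  # awaiting the task yields its return value
    print('result:', result)  # prints: result: 2

asyncio.run(main())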
Most of the time, coroutines are used for web crawling, because sending requests sequentially is very inefficient when the responses are slow or the requests are numerous.
Here is a very simple example that sends a requests.get five times.
import time
import requests

start_time = time.time()

def req_get():
    url = "https://www.google.com"
    print('start at: ', time.time() - start_time)
    r = requests.get(url)
    print('end at: ', time.time() - start_time)

def main():
    for i in range(5):
        req_get()
    print('end of main: ', time.time() - start_time)

main()
start at: 0.00021576881408691406
end at: 0.0635523796081543
start at: 0.06368589401245117
end at: 0.13397932052612305
start at: 0.13453364372253418
end at: 0.20851564407348633
start at: 0.2091212272644043
end at: 0.28153157234191895
start at: 0.2821779251098633
end at: 0.35675716400146484
end of main: 0.35712122917175293
With asyncio, as the output below shows, the requests are sent one after another without waiting for each response before sending the next. So the total time drops to about a quarter of the sequential version above.
import time
import requests
import asyncio

start_time = time.time()

async def req_get():
    url = "https://www.google.com"
    print('start at: ', time.time() - start_time)
    loop = asyncio.get_running_loop()  # requests.get blocks, so run it in the default thread pool
    r = await loop.run_in_executor(None, requests.get, url)
    print('end at: ', time.time() - start_time)

async def main():
    tasks = list()
    for i in range(5):
        tasks.append(asyncio.create_task(req_get()))
    for task in tasks:
        await task
    print('end of main: ', time.time() - start_time)

await main()
start at: 0.0001125335693359375
start at: 0.0011854171752929688
start at: 0.004603147506713867
start at: 0.006804466247558594
start at: 0.009021997451782227
end at: 0.06898927688598633
end at: 0.08475136756896973
end at: 0.08681845664978027
end at: 0.08908772468566895
end at: 0.09105420112609863
end of main: 0.09117269515991211
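Passing None to run_in_executor uses the event loop's default ThreadPoolExecutor; you can also pass your own executor to cap how many blocking requests run at once. A minimal sketch, assuming the same google.com URL:
import asyncio
import concurrent.futures
import requests

async def main():
    urls = ["https://www.google.com"] * 5
    loop = asyncio.get_running_loop()
    # at most two blocking requests.get calls run at the same time
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [loop.run_in_executor(pool, requests.get, url) for url in urls]
        responses = await asyncio.gather(*futures)
    print([r.status_code for r in responses])

asyncio.run(main())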
Update
Besides the requests module, there is another useful module: aiohttp.
# import nest_asyncio
# nest_asyncio.apply()
import time
import aiohttp
import asyncio

start_time = time.time()

async def fetch(client):
    async with client.get("https://www.google.com") as resp:
        assert resp.status == 200
        return await resp.text()

async def req_get():
    print('start at: ', time.time() - start_time)
    async with aiohttp.ClientSession() as s:
        r = await fetch(s)
    print('end at: ', time.time() - start_time)

tasks = [req_get() for _ in range(5)]
loop = asyncio.get_event_loop()
# asyncio.wait() no longer accepts bare coroutines on Python 3.11+, so use gather
loop.run_until_complete(asyncio.gather(*tasks))
start at: 0.0004086494445800781
start at: 0.007526874542236328
start at: 0.008102178573608398
start at: 0.008558034896850586
start at: 0.009565353393554688
end at: 0.06927990913391113
end at: 0.07057762145996094
end at: 0.07248950004577637
end at: 0.0732574462890625
end at: 0.07469344139099121
aiohttp is a little faster than requests.
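One caveat: the example above opens a new ClientSession per request, while the aiohttp docs recommend sharing a single session across requests. A minimal sketch of that pattern:
import asyncio
import aiohttp

async def fetch(session, url):
    async with session.get(url) as resp:
        return resp.status

async def main():
    # one shared session reuses the underlying connection pool
    async with aiohttp.ClientSession() as session:
        statuses = await asyncio.gather(
            *(fetch(session, "https://www.google.com") for _ in range(5))
        )
    print(statuses)

asyncio.run(main())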