performance - Python multiprocessing efficiency
I use a timer function to measure the time cost of multi_core_1 and multi_core_2:
multi_core_1:

```python
results = p.map_async(deal, urls)
```

multi_core_2:

```python
for url in urls: results = p.map_async(deal, url)
```
Code:

```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import time
import logging
from functools import wraps
from multiprocessing.dummy import Pool, Queue, Manager, freeze_support
import requests

urls = [
    'http://www.baidu.com',
    'http://home.baidu.com/',
    # ... 100 urls in total
]

def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        t = time.time()
        result = func(*args, **kwargs)
        logging.warning('%s cost %s' % (func.__name__, (time.time() - t)))
        return result
    return wrapper

def deal(url):
    return requests.get(url).status_code

@timer
def multi_core_1():
    freeze_support()
    p = Pool(8)
    results = p.map_async(deal, urls)
    p.close()
    p.join()

@timer
def multi_core_2():
    freeze_support()
    p = Pool(8)
    for url in urls:
        results = p.map_async(deal, url)
    p.close()
    p.join()

if __name__ == '__main__':
    multi_core_1()
    multi_core_2()
```
Result:

```
> python test.py
WARNING:root:multi_core_1 cost 1.3149404525756836
WARNING:root:multi_core_2 cost 0.2142746448516845
```
Question:

So I wonder how multi_core_2() can be faster than multi_core_1()?
Answer:

In the second function you're using map_async wrong.

map_async takes a function to apply and an iterable. When you pass a string as the iterable, it treats each character of the string as an element. So for each url in your list it tries to apply the deal function to each character individually ('h', 't', 't', and so on). Every one of those calls fails outright inside requests.get, and because it fails it never has to load a page, which is why it's faster; it's broken code that doesn't work, though.
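A quick way to see the character splitting (a minimal sketch of my own, not from the question's code; str.upper stands in for deal so nothing has to hit the network):

```python
from multiprocessing.dummy import Pool

p = Pool(2)
# The string itself is the iterable, so each character becomes a task.
print(p.map(str.upper, 'http'))  # -> ['H', 'T', 'T', 'P']
p.close()
p.join()
```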
On top of that, since you're reassigning results on each loop iteration, it gets overwritten with every new url and in the end only refers to the last url string.
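If you really wanted one map_async call per URL, you would have to keep every AsyncResult and wait on each of them instead of rebinding results. A sketch of that idea (the one-element list wrapper and the shortened urls list are my additions):

```python
from multiprocessing.dummy import Pool
import requests

def deal(url):
    return requests.get(url).status_code

urls = ['http://www.baidu.com', 'http://home.baidu.com/']

pool = Pool(8)
# Wrap each url in a list so map_async iterates over whole URLs,
# and keep every AsyncResult instead of overwriting one name.
async_results = [pool.map_async(deal, [url]) for url in urls]
pool.close()
pool.join()
# .get() re-raises any worker exception and returns the one-element list.
status_codes = [r.get()[0] for r in async_results]
print(status_codes)
```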
Before measuring a function's performance, make sure the function works as intended.
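One way to check that (a sketch reusing the timer, deal, and urls definitions from the question): call .get() on the AsyncResult, or just use the blocking map, so worker exceptions surface and you actually receive the status codes the timer is supposed to be measuring.

```python
from multiprocessing.dummy import Pool

@timer
def multi_core_fixed():
    p = Pool(8)
    # .get() blocks until every task has finished and re-raises any
    # exception from the workers, so silent failures can't hide.
    results = p.map_async(deal, urls).get()
    p.close()
    p.join()
    return results
```

With that in place, the timing reflects completed work rather than a loop of calls that fail silently.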