python - improving heapq efficiency -


So I find myself relying heavily on heapq in my calculations. However, the problem I'm working on runs slowly because the heap gets pretty big.

I thought I had an option to speed things up: rather than creating one giant heap, make a heap of heaps. But, surprisingly to me, the "more efficient" code is slower. There is a bit more overhead in the "more efficient" code, but I thought it would still win by a lot. Having stripped the problem down, I've got two functions that perform the same net calculation. f1 is the "naive" (and faster) version; f2 is the "improved" (but slower) version. There is random number generation in both, but they use the same seed, so they do the same thing.

import random
import heapq

def f1():
    random.seed(1)
    q = [0]
    while q:
        value = heapq.heappop(q)
        #print value
        if value < 0.5:
            for counter in range(16):
                heapq.heappush(q, value + 0.1 + (random.random() / 1000))
    print value

def f2():
    random.seed(1)
    q = [[0]]
    while q:
        subq = heapq.heappop(q)
        value = heapq.heappop(subq)
        #print value
        if subq:
            heapq.heappush(q, subq)
        newq = []
        if value < 0.5:
            for counter in range(16):
                newq.append(value + 0.1 + (random.random() / 1000))
            heapq.heapify(newq)
            heapq.heappush(q, newq)
    print value
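For a quick end-to-end comparison of the two versions, something like the following is enough (just a rough sketch using the standard library's time module, not part of the original problem):

import time

# Rough wall-clock comparison of the two versions defined above.
for fn in (f1, f2):
    start = time.time()
    fn()
    print("%s took %.2f seconds" % (fn.__name__, time.time() - start))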

Why does the heap of heaps (f2) run slower? It should call heappush the same number of times and heappop twice as many times, but the individual heaps should be much smaller, so I expected it to run faster.

So it turns out I wasn't pushing the code hard enough. Here's the modified code. Once the sub-heaps are made large enough, the benefit finally appears.

def f1(m, n):
    random.seed(1)
    q = [0]
    for counter in range(m):
        value = heapq.heappop(q)
        #print value
        for newcounter in range(n):
            heapq.heappush(q, random.random())
    print value  # should be the same for both methods, as a test

def f2(m, n):
    random.seed(1)
    q = [[0]]
    for counter in range(m):
        subq = heapq.heappop(q)
        value = heapq.heappop(subq)
        #print value
        if subq:
            heapq.heappush(q, subq)
        newq = []
        for newcounter in range(n):
            newq.append(random.random())
        heapq.heapify(newq)
        heapq.heappush(q, newq)
    print value  # should be the same for both methods, as a test
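For reference, the numbers below come from profiling runs along these lines (a sketch assuming the standard cProfile module, which matches the output format; the exact driver isn't important):

import cProfile

# Profile both versions with the same arguments and compare the
# per-function totals (heappush, heappop, heapify, list.append).
cProfile.run('f1(1000000, 10)')
cProfile.run('f2(1000000, 10)')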

When I profile f1(1000000,10) and f2(1000000,10), I get run times of 10.7 seconds and 14.8 seconds respectively. The relevant details are:

f1:

ncalls    tottime  percall  cumtime  percall  filename:lineno(function)
1000000   1.793    0.000    1.793    0.000    {_heapq.heappop}
10000000  3.856    0.000    3.856    0.000    {_heapq.heappush}

f2:

ncalls    tottime  percall  cumtime  percall  filename:lineno(function)
1000000   1.095    0.000    1.095    0.000    {_heapq.heapify}
2000000   2.628    0.000    2.628    0.000    {_heapq.heappop}
1999999   2.245    0.000    2.245    0.000    {_heapq.heappush}
10000000  1.114    0.000    1.114    0.000    {method 'append' of 'list' objects}

So on net, f2 loses out because of heappop, heapify, and append, even though it does better on heappush.

But when I go back and stress it with a much larger internal loop, running f1(1000,100000) and f2(1000,100000), I get:

f1:

ncalls     tottime  percall  cumtime  percall  filename:lineno(function)
1000       0.015    0.000    0.015    0.000    {_heapq.heappop}
100000000  28.730   0.000    28.730   0.000    {_heapq.heappush}

f2:

ncalls     tottime  percall  cumtime  percall  filename:lineno(function)
1000       19.952   0.020    19.952   0.020    {_heapq.heapify}
2000       0.011    0.000    0.011    0.000    {_heapq.heappop}
1999       0.006    0.000    0.006    0.000    {_heapq.heappush}
100000000  6.977    0.000    6.977    0.000    {method 'append' of 'list' objects}

Here f2 again does better on heappush, and this time it's enough that f2 runs faster overall (69 seconds versus 75 seconds).

So it turns out I hadn't pushed the code sufficiently hard. I needed things to be large enough that the many calls to heappush became more expensive than the calls to heapify on the sub-heaps.
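The underlying reason is the usual complexity argument: pushing n items onto a heap one at a time costs O(n log n) comparisons, while heapify on a list of n items is O(n). A small sketch (hypothetical helper names, not part of the code above) that isolates just that difference:

import heapq
import random
import timeit

def build_by_push(values):
    # n individual pushes: O(n log n) comparisons in total
    h = []
    for v in values:
        heapq.heappush(h, v)
    return h

def build_by_heapify(values):
    # single bottom-up heapify: O(n) overall
    h = list(values)
    heapq.heapify(h)
    return h

values = [random.random() for _ in range(100000)]
print("push:    %.3fs" % timeit.timeit(lambda: build_by_push(values), number=10))
print("heapify: %.3fs" % timeit.timeit(lambda: build_by_heapify(values), number=10))

For small sub-heaps the constant overhead of the extra pops, appends, and heapify calls dominates, which is exactly why f2 only wins once n gets large.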

