The execution time of the code below differs greatly when n is large.
import spotfire.sbdf as sb
import random
import pandas as pd
import numpy as np

n = 100000
rand = [random.normalvariate(mu=0, sigma=1) for _ in range(n)]
sb.export_data(rand, "d:/tmp/slow.sbdf")
With n=100,000 it only takes about 2 seconds.
With n=1,000,000 it takes more than 14 minutes, and neither CPU usage (about 5% on a machine with 20 logical processors) nor memory usage (about 140 MB) is high.
The execution time of random.normalvariate() does not change much, so sb.export_data() is contributing most of the time.
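For reference, a minimal sketch of how the two steps can be timed separately (same n and output path as above):

import random
import time
import spotfire.sbdf as sb

n = 1_000_000
t0 = time.perf_counter()
rand = [random.normalvariate(mu=0, sigma=1) for _ in range(n)]   # data generation
t1 = time.perf_counter()
sb.export_data(rand, "d:/tmp/slow.sbdf")                         # export of a plain list
t2 = time.perf_counter()
print(f"generate: {t1 - t0:.1f} s, export_data: {t2 - t1:.1f} s")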
If rand is cast to a pandas DataFrame/Series or a NumPy array in advance, the export finishes in about a second:
rand = np.array([random.normalvariate(mu=0, sigma=1) for _ in range(n)])
The above took 0.7 seconds.
rand = pd.DataFrame([random.normalvariate(mu=0, sigma=1) for _ in range(n)])
The above took 0.4 seconds (fastest).
rand = pd.Series([random.normalvariate(mu=0, sigma=1) for _ in range(n)])
The above took 1.4 seconds (slowest).
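As a workaround until this is improved, wrapping any plain list in a DataFrame right before exporting avoids the slow path. A sketch (the helper name, column name, and output path are just examples):

import random
import pandas as pd
import spotfire.sbdf as sb

def export_list(values, path, column="x"):
    # Wrap the plain Python list in a DataFrame so export_data takes the fast path.
    sb.export_data(pd.DataFrame({column: values}), path)

export_list([random.normalvariate(mu=0, sigma=1) for _ in range(1_000_000)], "d:/tmp/fast.sbdf")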
It took me quite a while to figure this out, and the problem is easy to miss because sb.export_data() does accept a plain Python list as its first argument.
Please improve this.