Computing persistence diagrams in parallel

cai.507@...
 

Hi Dmitriy,

I am trying to compute persistence diagrams for different graphs in parallel in Python 2, using joblib for the multiprocessing. Here is a toy example:

def computePD(i):
    import dionysus as d
    import numpy as np
    np.random.seed(42)
    f1 = d.fill_rips(np.random.random((i+10, 2)), 2, 1)
    m1 = d.homology_persistence(f1)
    dgms1 = d.init_diagrams(m1, f1)
    return dgms1[1]
def get_dgms(n_jobs=1):
    from joblib import delayed, Parallel
    return Parallel(n_jobs=n_jobs)(delayed(computePD)(i) for i in range(10))
result = get_dgms(1)

With n_jobs=1 this works. However, when I change n_jobs to 2 (result = get_dgms(2)), I get a weird result:
[Diagram with 0 points,
 Diagram with 6822207245174201887 points,
 Diagram with 0 points,
 Diagram with 6822207245174201887 points,
 Diagram with 0 points,
 Diagram with 6822207245174201887 points,
 Diagram with 6148914691236517774 points,
 Diagram with 0 points,
 Diagram with 6822207245174201887 points,
 Diagram with 0 points]
 
The correct result, which is what I get with n_jobs=1, should be something like:
[Diagram with 2 points,
 Diagram with 2 points,
 Diagram with 2 points,
 Diagram with 1 points,
 Diagram with 2 points,
 Diagram with 2 points,
 Diagram with 2 points,
 Diagram with 2 points,
 Diagram with 2 points,
 Diagram with 4 points]

At first, I thought I was not using joblib correctly, so I changed computePD to a simple function like the following:
def computePD(i):
    return i ** 2
Then I called get_dgms with different values of n_jobs. The result was the same for every value of n_jobs, just as expected.

So I am not sure whether the problem lies in Dionysus or in my use of joblib. Any suggestions would be appreciated.
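One thing I could try in the meantime is returning plain Python data from the worker instead of the Diagram object. This is only a sketch, and I have not confirmed it addresses the cause. It assumes that joblib sends worker results back to the parent process by pickling them, and that the Diagram object does not survive that step. The helper names to_pairs and survives_pickle are my own, and Point is a stand-in for a Dionysus diagram point (as far as I know, the real points expose .birth and .death).

```python
import pickle
from collections import namedtuple

# Hypothetical stand-in for a Dionysus diagram point, so this runs without Dionysus.
Point = namedtuple("Point", ["birth", "death"])


def to_pairs(diagram):
    """Copy a diagram into a list of plain (birth, death) float tuples."""
    return [(float(pt.birth), float(pt.death)) for pt in diagram]


def survives_pickle(obj):
    """Return True if obj compares equal to itself after a pickle round trip."""
    return pickle.loads(pickle.dumps(obj)) == obj


# A fake diagram with one finite point and one point that never dies.
fake_diagram = [Point(0.1, 0.4), Point(0.2, float("inf"))]
pairs = to_pairs(fake_diagram)
print(pairs)
print(survives_pickle(pairs))
```

In the toy example, computePD would then end with return to_pairs(dgms1[1]) instead of return dgms1[1], so that only lists of floats cross the process boundary.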

Best,
Chen

 
 
