[FEATURE] Add Multiprocess Capabilities! :) #78

windowshopr · 2021-12-25T20:15:26Z

I know the documentation, or an article I read (I can't remember which), said that PyGAD didn't gain enough from multiprocessing to warrant adding it as a feature. However, I have a GREAT need for it with a lot of the fitness functions that I create using PyGAD. It would be awesome to see it implemented as an optional feature that can be enabled before running a GA search.

I envision something like adding the parameters use_multiprocessing = True and num_workers = multiprocessing.cpu_count(). If those are enabled, a process pool is started for the current population and each chromosome is submitted to it as its own task. When the generation is done, the pool is closed, and when the next generation starts, a new pool fires up for the new population. Pseudo-code would look something like:

import concurrent.futures

if use_multiprocessing:
    with concurrent.futures.ProcessPoolExecutor(max_workers=num_workers) as executor:
        # Submit one task per solution and remember which index each future belongs to
        futures = {executor.submit(fitness_func, solution, solution_idx): solution_idx
                   for solution_idx, solution in enumerate(current_population)}
        for f in concurrent.futures.as_completed(futures):
            solution_idx = futures[f]  # futures complete out of order
            ind_solution_result = f.result() #[0]
            # Logic for what to do with the individual solution stuff here
    # Leaving the with-block shuts the pool down and waits for the workers
else:
    pass  # ...the rest of the default PyGAD behaviour

...I recognize this COULD be a big undertaking, but doing it this way would let each generation's population of chromosomes be evaluated much more quickly than going through the solutions one at a time when more CPU cores are available.

You COULD also create several ga_instances to run simultaneously, yes, but I think being able to get through the generations themselves quicker is a better idea.

Would love to see this get implemented as I love PyGAD and don't really want to switch to DEAP as PyGAD is much easier to control/use IMO.

Stoops-ML · 2021-12-30T09:33:26Z

You can parallelise the solutions in each generation as documented in PyGAD's documentation here

windowshopr · 2021-12-30T18:12:33Z

That’s a great tutorial and I’ve read it before, but that’s pretty specific for PyTorch models/assuming each solution is a new set of model weights, which doesn’t apply at all to what I’m using it for. Would be cool to see that kind of behaviour implemented in the base PyGAD class so it’s a little more extensible?

windowshopr · 2021-12-30T19:14:09Z

I took a stab at creating what I needed, untested as of now, but will be checking on it in the next week or so. If it's working, I'll create a PR

windowshopr · 2021-12-30T19:38:01Z

See #80

Stoops-ML · 2021-12-31T12:24:44Z

> That’s a great tutorial and I’ve read it before, but that’s pretty specific for PyTorch models/assuming each solution is a new set of model weights, which doesn’t apply at all to what I’m using it for. Would be cool to see that kind of behaviour implemented in the base PyGAD class so it’s a little more extensible?

The tutorial is not PyTorch specific and can be implemented for PyGAD using one set of model weights. In the tutorial the author overrides the cal_pop_fitness() method so that all solutions within a generation are run in parallel using multiprocessing.Pool.map().
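
The pattern described above can be sketched roughly as follows. This is not the tutorial's code. To keep it runnable without PyGAD installed, `TinyGA` is a hypothetical stand-in for `pygad.GA`, and the fitness function is purely illustrative. Only the `cal_pop_fitness()` method name, the `fitness_wrapper` helper, and the use of `multiprocessing.Pool.map()` come from this thread.

```python
import multiprocessing


def fitness_func(solution, solution_idx):
    # Illustrative fitness: higher is better, best when every gene is 0.
    return -sum(gene * gene for gene in solution)


def fitness_wrapper(solution):
    # Pool.map() passes one argument per item, so adapt to the
    # two-argument fitness signature with a placeholder index.
    return fitness_func(solution, 0)


class TinyGA:
    """Hypothetical stand-in for pygad.GA that scores solutions serially."""

    def __init__(self, population):
        self.population = population

    def cal_pop_fitness(self):
        return [fitness_wrapper(solution) for solution in self.population]


class PooledGA(TinyGA):
    """Overrides cal_pop_fitness() to score a whole generation in parallel."""

    def __init__(self, population, pool):
        super().__init__(population)
        self.pool = pool

    def cal_pop_fitness(self):
        # map() preserves input order, so result i belongs to solution i.
        return self.pool.map(fitness_wrapper, self.population)


if __name__ == "__main__":
    # The guard matters on Windows, where worker processes re-import this module.
    population = [[1, 2], [0, 0], [3, -1]]
    with multiprocessing.Pool(processes=2) as pool:
        print(PooledGA(population, pool).cal_pop_fitness())
```

Reusing one pool across generations, as here, avoids paying the process start-up cost on every call. Whether that outweighs the pickling overhead depends on how expensive the fitness function is.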

windowshopr · 2021-12-31T18:51:27Z

So I've read over the article again, and see what you're saying, however it isn't working on my Windows machine.

I get the freeze_support() error message because the code isn't wrapped in an if __name__ == "__main__": guard. After adding the guard, I get this error instead:

Traceback (most recent call last):
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "C:\Users\chalu\OneDrive\Desktop\Python_Scripts\Stock_RL_2021\stablebaselines_pygad.py", line 350, in fitness_wrapper
    return fitness_func(solution, 0)
  File "C:\Users\chalu\OneDrive\Desktop\Python_Scripts\Stock_RL_2021\stablebaselines_pygad.py", line 293, in fitness_func
    env = SubprocVecEnv([make_env(env, i) for i in range(num_cpu)])
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 106, in __init__
    process.start()
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\multiprocessing\process.py", line 110, in start
    'daemonic processes are not allowed to have children'
AssertionError: daemonic processes are not allowed to have children
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "stablebaselines_pygad.py", line 403, in <module>
    ga_instance.run()
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\site-packages\pygad\pygad.py", line 1251, in run
    self.last_generation_fitness = self.cal_pop_fitness()
  File "stablebaselines_pygad.py", line 358, in cal_pop_fitness
    pop_fitness = pool.map(fitness_wrapper, self.population)
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\multiprocessing\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\chalu\AppData\Roaming\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
AssertionError: daemonic processes are not allowed to have children

I'm assuming this is because I'm using stablebaselines3's SubprocVecEnv to create a subprocessed environment, even though I'm only setting the number of CPUs to 1 in that section anyway. I will keep tweaking, or remove that part of the stable baselines code, and see how I make out. Thanks!
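
A small sketch of the likely cause, under the assumption that the error comes from `multiprocessing.Pool` itself rather than from the number of CPUs requested: `Pool` marks its workers as daemonic, and a daemonic process is not allowed to start children, which is what `SubprocVecEnv` tries to do inside the fitness function. The helper name `worker_is_daemonic` is made up for illustration. The second half checks the daemon flag of `concurrent.futures.ProcessPoolExecutor` workers, which are expected to be regular processes. Treat that as something to verify on your own Python version rather than a guaranteed fix.

```python
import concurrent.futures
import multiprocessing


def worker_is_daemonic(_):
    # Report the daemon flag of whichever process runs this call.
    return multiprocessing.current_process().daemon


if __name__ == "__main__":
    # Pool workers are created as daemonic processes, so any attempt to
    # start a child process inside them raises the AssertionError above.
    with multiprocessing.Pool(processes=2) as pool:
        print("Pool workers daemonic:", pool.map(worker_is_daemonic, range(2)))

    # Assumption to check locally: executor workers are non-daemonic.
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        flags = list(executor.map(worker_is_daemonic, range(2)))
        print("Executor workers daemonic:", flags)
```

If the executor workers report a non-daemonic flag, running the fitness evaluations through the executor instead of `Pool` may let the environment spawn its own subprocesses.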

ahmedfgad · 2022-01-01T18:07:47Z

@windowshopr, supporting parallel processing internally in PyGAD would indeed be a very good feature!

As @Stoops-ML said, the tutorial might be helpful.

Because the bottleneck is in the fitness function most of the time (mutation is not worth parallelizing), this could be supported internally.

Thanks for your suggestions!

## PyGAD 2.17.0

Release Date: 8 July 2022

1. An issue is solved when the `gene_space` parameter is given a fixed value, e.g. `gene_space=[range(5), 4]`. The second gene's value is static (4), which causes an exception.
2. Fixed the issue where the `allow_duplicate_genes` parameter did not work when mutation is disabled (i.e. `mutation_type=None`). This is by checking for duplicates after crossover directly. #39
3. Solve an issue in the `tournament_selection()` method as the indices of the selected parents were incorrect. #89
4. Reuse the fitness values of the previously explored solutions rather than recalculating them. This feature only works if `save_solutions=True`.
5. Parallel processing is supported. This is by the introduction of a new parameter named `parallel_processing` in the constructor of the `pygad.GA` class. Thanks to [@windowshopr](https://github.com./windowshopr) for opening the issue [#78](#78) at GitHub. Check the [Parallel Processing in PyGAD](https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#parallel-processing-in-pygad) section for more information and examples.

windowshopr closed this as completed Dec 30, 2021

ahmedfgad added the enhancement New feature or request label Feb 25, 2023
