Template Class ProcessGroupManager

Class Documentation

template<typename CombiDataType = double>
class ProcessGroupManager

The ProcessGroupManager is part of a ProcessManager and is responsible for communication with a single process group.

Through the use of non-blocking operations in the ProcessGroupManager, the ProcessManager can instruct multiple process groups at once.
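
The effect of non-blocking signals can be illustrated with a small mock. This is a sketch under stated assumptions, not the library's implementation: MockGroup, MockStatus and runTimeStepOnAll are hypothetical names, and the simulated "work" is only a status flag rather than MPI communication.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical status values; the real StatusType may differ.
enum class MockStatus { Wait, Busy };

// Hypothetical stand-in for one process group as seen by its manager.
class MockGroup {
 public:
  // Non-blocking signal: marks the group busy and returns immediately.
  bool runnext() {
    if (status_ == MockStatus::Busy) return false;  // previous step still running
    status_ = MockStatus::Busy;
    return true;
  }
  // Non-blocking status query.
  MockStatus getStatus() const { return status_; }
  // Blocking in a real setting; here the work is assumed to finish at once.
  MockStatus waitStatus() {
    status_ = MockStatus::Wait;
    return status_;
  }

 private:
  MockStatus status_ = MockStatus::Wait;
};

// Signal every group first, then wait for all of them.
// Returns how many groups were busy at the same time.
std::size_t runTimeStepOnAll(std::vector<MockGroup>& groups) {
  for (auto& g : groups) g.runnext();
  const auto busyAtOnce = static_cast<std::size_t>(
      std::count_if(groups.begin(), groups.end(), [](const MockGroup& g) {
        return g.getStatus() == MockStatus::Busy;
      }));
  for (auto& g : groups) g.waitStatus();
  assert(busyAtOnce <= groups.size());
  return busyAtOnce;
}
```

Because the signalling loop completes before any wait, all groups work concurrently; waiting inside the first loop would serialize them.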

Public Functions

explicit ProcessGroupManager(RankType pgroupRootID)

Constructor

Parameters:

pgroupRootID – the rank of this group's root process in the global communicator; this is the same as the process group number, because the global communicator contains one member of each process group

bool runfirst(Task<CombiDataType> *t)

signal the process group to initialize and run the task t

bool runnext()

signal the process group to run a time step on all of its tasks

bool initDsgus()

signal the process group to initialize its sparse grid data structures

bool exit()

signal the process group to exit

inline StatusType getStatus()

non-blocking call to retrieve status of process group

inline StatusType waitStatus()

blocks until the process group has finished its computation

inline const TaskContainer<CombiDataType> &getTaskContainer() const

get a collection of all tasks currently assigned to this group

inline void removeTask(Task<CombiDataType> *t)

remove a task from the process group

this does not change the state in the workers!

bool combine()

signal to perform a system-wide combination

bool combineThirdLevel(const ThirdLevelUtils &thirdLevel, CombiParameters &params, bool isSendingFirst)

signal to perform a widely-distributed combination

based on TCP/socket setup with third level manager. cf. ProcessManager::combineThirdLevel

bool combineThirdLevelFileBased(const std::string &filenamePrefixToWrite, const std::string &writeCompleteTokenFileName, const std::string &filenamePrefixToRead, const std::string &startReadingTokenFileName)

signal to perform a whole widely-distributed combination

based on file-exchange mechanism (w/o third level manager); equivalent to calling combineThirdLevelFileBasedWrite and combineThirdLevelFileBasedReadReduce in succession
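
The two-phase structure described above can be sketched with plain files. Everything here is an assumption made for illustration: the function names, the raw binary layout, the use of full file names instead of prefixes, and the summation as the reduction. The library itself uses its own format and signals the workers rather than doing the I/O in the manager.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Phase 1 (hypothetical): write the local values, then create the token file
// that tells the other system the data file is complete.
bool writeWithToken(const std::vector<double>& local, const std::string& dataFile,
                    const std::string& tokenFile) {
  {
    std::ofstream out(dataFile, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(local.data()),
              static_cast<std::streamsize>(local.size() * sizeof(double)));
    if (!out) return false;
  }  // data file is closed before the token appears
  return static_cast<bool>(std::ofstream(tokenFile));
}

// Phase 2 (hypothetical): once the other system's token exists, read its
// values and add them to the local ones.
bool readReduceWithToken(std::vector<double>& local, const std::string& dataFile,
                         const std::string& tokenFile) {
  if (!std::ifstream(tokenFile)) return false;  // a real system would wait or retry
  std::ifstream in(dataFile, std::ios::binary);
  std::vector<double> remote(local.size());
  in.read(reinterpret_cast<char*>(remote.data()),
          static_cast<std::streamsize>(remote.size() * sizeof(double)));
  if (!in) return false;
  for (std::size_t i = 0; i < local.size(); ++i) local[i] += remote[i];
  return true;
}

// The combined call is the two phases in succession.
bool exchangeFileBased(std::vector<double>& local, const std::string& fileToWrite,
                       const std::string& writeToken, const std::string& fileToRead,
                       const std::string& readToken) {
  return writeWithToken(local, fileToWrite, writeToken) &&
         readReduceWithToken(local, fileToRead, readToken);
}
```

Splitting the exchange into two calls lets the caller do other work between writing its own data and reducing the remote data.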

bool combineThirdLevelFileBasedWrite(const std::string &filenamePrefixToWrite, const std::string &writeCompleteTokenFileName)

signal to start a widely-distributed combination

based on file-exchange mechanism (w/o third level manager)

bool combineThirdLevelFileBasedReadReduce(const std::string &filenamePrefixToRead, const std::string &startReadingTokenFileName)

signal to reduce the results of a widely-distributed combination

based on file-exchange mechanism (w/o third level manager)

bool pretendCombineThirdLevelForWorkers(CombiParameters &params)

signal to pretend to perform a widely-distributed combination

based on TCP/socket setup with third level manager; for testing the widely-distributed combination between the workers in and outside the third level process group

bool combineSystemWide()

signal to perform the first part of a system-wide combination, namely hierarchization and reduction (but not dehierarchization)

based on file-exchange mechanism (w/o third level manager)

bool updateCombiParameters(CombiParameters &params)

send new CombiParameters to the process group

bool isGroupFault()

Check if a group fault occurred at this combination step, using the fault simulator.

bool addTask(Task<CombiDataType>*)

assign a task to the process group

bool refreshTask(Task<CombiDataType>*)

bool resetTasksWorker()

signal to delete all tasks in the process group

does not change the state in this ProcessGroupManager!
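
Several functions in this class touch only one side of the manager/worker pair, so the two views of a group's tasks can diverge. The mock below sketches that distinction. MockWorkers and MockGroupManager are hypothetical, tasks are reduced to IDs, and the assumption that addTask updates both sides is made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the task state held by the worker processes.
struct MockWorkers {
  std::vector<std::size_t> taskIDs;
};

// Hypothetical stand-in for the manager-side bookkeeping of one group.
class MockGroupManager {
 public:
  explicit MockGroupManager(MockWorkers& workers) : workers_(workers) {}

  // Assumed here to update both the bookkeeping and the workers.
  void addTask(std::size_t id) {
    tasks_.push_back(id);
    workers_.taskIDs.push_back(id);
  }
  // Manager side only: the workers still hold the task afterwards.
  void removeTask(std::size_t id) {
    tasks_.erase(std::remove(tasks_.begin(), tasks_.end(), id), tasks_.end());
  }
  // Worker side only: the bookkeeping still lists the tasks afterwards.
  void resetTasksWorker() { workers_.taskIDs.clear(); }

  bool hasTask(std::size_t id) const {
    return std::find(tasks_.begin(), tasks_.end(), id) != tasks_.end();
  }

 private:
  std::vector<std::size_t> tasks_;
  MockWorkers& workers_;
};
```

A caller that wants both sides to agree therefore has to pair the calls itself, for example removeTask for every task followed by resetTasksWorker.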

bool recompute(Task<CombiDataType> *t)

assign a task to the process group, and let the workers run a time step

used for fault tolerance; the task will be re-initialized from the current sparse grid solution

bool recoverCommunicators()

signal to recover the communicator ranks of the process group

bool parallelEval(const LevelVector &leval, const std::string &filename)

signal to interpolate the current solution from all component grids on this group at resolution level leval and write the results to a binary file readable with Paraview

void writeSparseGridMinMaxCoefficients(const std::string &filename)

signal the group to write minimum and maximum subspace coefficients to a file

Parameters:

filename – the filename to write to

void doDiagnostics(size_t taskID)

signal to perform diagnostics on the task with the given ID

can only be used with Tasks that implement the doDiagnostics method

void getLpNorms(int p, std::map<size_t, double> &norms)

signal to compute the Lp norm of the current component grids, and gather them

Parameters:

p – the p in Lp norm

norms – output parameter; filled with a map from task ID to Lp norm

std::vector<double> evalAnalyticalOnDFG(const LevelVector &leval)

signal to interpolate the analytical solution at the given resolution level leval

std::vector<double> evalErrorOnDFG(const LevelVector &leval)

signal to interpolate the analytical solution at the given resolution level leval and compute the difference to the current solution

Returns:

the Lp norms of the error: maximum norm, l1 norm, l2 norm
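
As a reading aid, the three returned values can be computed for two plain vectors as follows. This is a minimal sketch: errorNorms is a hypothetical helper, and it assumes unweighted discrete norms over matching grid points, which may differ from the scaling the library applies.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns {maximum norm, l1 norm, l2 norm} of (computed - analytical).
// Assumption: plain sums without any grid-spacing or point-count weights.
std::vector<double> errorNorms(const std::vector<double>& computed,
                               const std::vector<double>& analytical) {
  assert(computed.size() == analytical.size());
  double maxNorm = 0.0;
  double l1 = 0.0;
  double sumOfSquares = 0.0;
  for (std::size_t i = 0; i < computed.size(); ++i) {
    const double diff = std::abs(computed[i] - analytical[i]);
    maxNorm = std::max(maxNorm, diff);
    l1 += diff;
    sumOfSquares += diff * diff;
  }
  return {maxNorm, l1, std::sqrt(sumOfSquares)};
}
```

The order of the entries follows the description above: maximum norm first, then l1, then l2.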

void interpolateValues(const std::vector<real> &interpolationCoordsSerial, std::vector<CombiDataType> &values, MPI_Request *request = nullptr, const std::string &filenamePrefix = "")

signal to interpolate the current component grids at the given interpolationCoordsSerial

non-blocking; the results will be written to values after the requests have completed

void writeInterpolatedValuesPerGrid(const std::vector<real> &interpolationCoordsSerial, const std::string &filenamePrefix)

signal to interpolate the current component grids at the given interpolationCoordsSerial and write results; one file per grid.

bool rescheduleAddTask(Task<CombiDataType> *task)

Adds a task to the process group. To be used for rescheduling.

Parameters:

task – The task to add.

Returns:

true if the task was successfully added.

Task<CombiDataType> *rescheduleRemoveTask(const LevelVector &lvlVec)

Removes a task from the process group. To be used for rescheduling.

Parameters:

lvlVec – The level vector of the task to remove.

Returns:

The removed task if successful, or nullptr if no task with the given level vector is found.
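
The two rescheduling calls are naturally used as a pair: remove a task from one group by its level vector, then add it to another. The sketch below mocks that pattern. MockTask, MockReschedGroup and moveTask are hypothetical, LevelVector is replaced by a std::vector<int> alias, and no communication with workers is modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using MockLevelVector = std::vector<int>;  // hypothetical stand-in for LevelVector

struct MockTask {
  MockLevelVector level;
};

// Hypothetical stand-in for a process group's rescheduling interface.
class MockReschedGroup {
 public:
  bool rescheduleAddTask(MockTask* task) {
    if (task == nullptr) return false;
    tasks_.push_back(task);
    return true;
  }
  // Returns the removed task, or nullptr if no task has this level vector.
  MockTask* rescheduleRemoveTask(const MockLevelVector& lvlVec) {
    auto it = std::find_if(tasks_.begin(), tasks_.end(),
                           [&](const MockTask* t) { return t->level == lvlVec; });
    if (it == tasks_.end()) return nullptr;
    MockTask* removed = *it;
    tasks_.erase(it);
    return removed;
  }
  std::size_t numTasks() const { return tasks_.size(); }

 private:
  std::vector<MockTask*> tasks_;  // non-owning pointers
};

// Moves the task with the given level vector from one group to another.
bool moveTask(MockReschedGroup& from, MockReschedGroup& to,
              const MockLevelVector& lvlVec) {
  MockTask* task = from.rescheduleRemoveTask(lvlVec);
  if (task == nullptr) return false;  // nothing to move
  if (to.rescheduleAddTask(task)) return true;
  from.rescheduleAddTask(task);  // put it back so the task is not lost
  return false;
}
```

Checking the returned pointer before adding avoids handing a nullptr to the target group when the level vector is not found.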

inline bool hasTask(size_t taskID)

returns true if the group is currently assigned the task with the given taskID

bool writeDSGsToDisk(const std::string &filenamePrefix)

signal to write the group’s sparse grid data structures to disk

using custom binary format with MPI-IO (and compression, if enabled)

Parameters:

filenamePrefix – the prefix of the filename to write to

bool readDSGsFromDisk(const std::string &filenamePrefix)

signal to read the group’s sparse grid data structures from disk

using custom binary format with MPI-IO (and decompression, if enabled)

Parameters:

filenamePrefix – the prefix of the filename to read from

void storeTaskReference(Task<CombiDataType> *t)

store the task for this process group

Does not change the state in the workers!