Use dynamic work assignment
Certain classes of problems result in load imbalances even if data is evenly distributed among tasks:
Sparse arrays - some tasks will have actual data to work on while others have mostly "zeros".
Adaptive grid methods - some tasks may need to refine their mesh while others don't.
N-body simulations - some particles may migrate from their original task domain to another task's, and the particles owned by some tasks may require more work than those owned by other tasks.
When the amount of work each task will perform is intentionally variable, or cannot be predicted, it may be helpful to use a scheduler-task pool approach. As each task finishes its work, it queues to get a new piece of work.
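The task pool idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed implementation: threads stand in for parallel tasks, a shared queue plays the role of the scheduler, and the names run_task_pool and uneven_work are hypothetical. Work items are assumed to be hashable so they can serve as result keys.

```python
import queue
import threading


def run_task_pool(work_items, num_workers, process):
    """Hand out work dynamically: each worker pulls the next item when it is free."""
    pool = queue.Queue()
    for item in work_items:
        pool.put(item)

    results = {}                 # item -> computed value (items assumed hashable)
    counts = [0] * num_workers   # how many items each worker ended up processing
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                item = pool.get_nowait()   # queue up for a new piece of work
            except queue.Empty:
                return                     # pool is drained, so this worker stops
            value = process(item)
            with lock:
                results[item] = value
                counts[worker_id] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, counts


def uneven_work(n):
    """Hypothetical workload whose cost grows with n, so items are deliberately unequal."""
    return sum(i * i for i in range(n * 1000))


if __name__ == "__main__":
    results, counts = run_task_pool(range(20), num_workers=4, process=uneven_work)
    print("items processed per worker:", counts)
```

Because no worker is assigned a fixed share in advance, a worker that draws cheap items simply returns to the pool sooner and takes more of them. The per-worker counts will generally differ from run to run.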
It may become necessary to design an algorithm which detects and handles load imbalances as they occur dynamically within the code.
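One possible shape for such an algorithm is sketched below. It assumes that each task can report how long its last step took and that a per-item cost estimate is available. The ratio of maximum to mean task time is used here as the imbalance measure, and the 1.25 threshold and the greedy reassignment are illustrative choices rather than recommendations. All function names are hypothetical.

```python
import heapq
from statistics import mean


def imbalance_factor(task_times):
    """Ratio of the slowest task's time to the mean time; 1.0 means perfectly balanced."""
    avg = mean(task_times)
    if avg == 0:
        return 1.0
    return max(task_times) / avg


def needs_rebalance(task_times, threshold=1.25):
    """Flag an imbalance when the slowest task exceeds the mean by the given factor."""
    return imbalance_factor(task_times) > threshold


def rebalance(item_costs, num_tasks):
    """Greedy reassignment: give the most expensive remaining item to the least-loaded task."""
    loads = [(0.0, task) for task in range(num_tasks)]   # min-heap of (load, task id)
    heapq.heapify(loads)
    assignment = [[] for _ in range(num_tasks)]
    for item, cost in sorted(item_costs.items(), key=lambda kv: kv[1], reverse=True):
        load, task = heapq.heappop(loads)
        assignment[task].append(item)
        heapq.heappush(loads, (load + cost, task))
    return assignment


if __name__ == "__main__":
    times = [4.0, 1.0, 1.0, 2.0]   # measured time per task for the last step
    if needs_rebalance(times):
        print(rebalance({"a": 4, "b": 3, "c": 2, "d": 1}, num_tasks=2))
```

In a real code the check would run periodically, for example every few time steps, and the cost of moving data between tasks would have to be weighed against the time the rebalancing is expected to save.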