To optimize parallel data mining that spans not just multiple computing nodes but also different clouds separated by relatively high latencies, this paper addresses three problems: (a) node determination, i.e., “how many” and “which” computing nodes in federated clouds should be used; (b) synchronized completion, i.e., how to optimally apportion big data across parallelized computation environments to ensure synchronization, where synchronization refers to completing all workload portions at the same time even when resources and inter-networks are heterogeneous and situated in multiple Internet-separated clouds; and (c) data partition determination, i.e., how to serialize different data chunks to computing nodes to avoid overflow or underflow at the nodes.
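
To make the notion of synchronized completion concrete, the following is a minimal sketch under a deliberately simple cost model that is our assumption for illustration, not the method developed in this paper. Each node is assumed to have a fixed latency, a transfer bandwidth, and a compute rate, with transfer and computation not overlapping. The names `Node`, `apportion`, and all parameters are hypothetical. Under this model, a node given x_i MB finishes at l_i + x_i * c_i, where c_i is its per-MB cost; equating all finish times to a common T and requiring the shares to sum to the total D gives T = (D + Σ l_i/c_i) / Σ 1/c_i. Nodes whose latency alone exceeds T are dropped and T is recomputed, which loosely mirrors node determination.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # All fields are illustrative assumptions, not quantities defined in the paper.
    name: str
    latency_s: float       # fixed delay before the node can start working
    bandwidth_mb_s: float  # transfer rate to the node
    compute_mb_s: float    # processing rate at the node

    @property
    def cost_per_mb(self) -> float:
        # Seconds to ship and then process one MB (no overlap assumed).
        return 1.0 / self.bandwidth_mb_s + 1.0 / self.compute_mb_s


def apportion(nodes, total_mb):
    """Split total_mb so that every node that is used finishes at the same time.

    Returns (shares, finish_time), where shares maps node name to MB assigned.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    active = list(nodes)
    while active:
        inv_cost = sum(1.0 / n.cost_per_mb for n in active)
        weighted_latency = sum(n.latency_s / n.cost_per_mb for n in active)
        finish = (total_mb + weighted_latency) / inv_cost
        # A node whose latency is not below the common finish time would
        # receive a non-positive share, so drop it and solve again.
        usable = [n for n in active if n.latency_s < finish]
        if len(usable) == len(active):
            shares = {
                n.name: (finish - n.latency_s) / n.cost_per_mb for n in active
            }
            return shares, finish
        active = usable
    raise ValueError("no usable nodes")


# Hypothetical two-cloud example: a nearby fast node and a remote slower one.
example_nodes = [
    Node("local", latency_s=0.0, bandwidth_mb_s=100.0, compute_mb_s=100.0),
    Node("remote", latency_s=1.0, bandwidth_mb_s=50.0, compute_mb_s=50.0),
]
example_shares, example_finish = apportion(example_nodes, total_mb=1000.0)
```

In this sketch the faster, closer node receives the larger share, and both nodes are predicted to finish at the same instant. A realistic treatment would also need to account for overlapping transfer and computation and for time-varying network conditions.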