ABSTRACT : |
Grid computing has become the de facto platform for scientific computations that incorporate geographically and organizationally dispersed, heterogeneous resources. These data-intensive scientific computations require multiple large datasets to be transferred to data storage and compute nodes. With the rapid growth of communication over the Internet, data transfer has become the major bottleneck for the end-to-end performance of these scientific applications. A practical way of increasing throughput is to use multiple parallel streams. Currently, the GridFTP protocol is designed for point-to-point parallel data transfer. However, the simultaneous transfer of multiple files to multiple locations has not been studied so far. In this paper, we design an optimized meta-scheduler by which multiple files can be transferred simultaneously to their destination compute nodes. An LBLC scheduling algorithm is designed to transfer multiple files to multiple locations simultaneously, following a greedy method at every stage. The proposed optimized model gives better results than non-optimized data transfer.
Keywords: Grid computing, GridFTP, Optimization, Parallel TCP streams, Prediction, Data dictionary.