
Distributed Programming

  • ali@fuzzywireless.com
  • Mar 4, 2022
  • 2 min read

Sakr and Hammoud (2014) defined a distributed system as one in which computers are networked together and communicate through message passing and/or shared memory in order to coordinate actions, offer a service, or solve a problem. The two classical programming models widely used to program distributed systems, shared memory and message passing, are each suited to specific application needs (Sakr & Hammoud, 2014).


Shared-memory models communicate by reading and writing to a shared memory or disk location, and therefore require synchronization mechanisms such as locks, barriers, and semaphores to avoid inconsistency and data corruption (Pulipaka, 2016). In essence, MapReduce is a shared-memory model: it uses HDFS as shared storage and performs only two sequential functions, map and reduce, where the reduce function cannot run until the map function has generated its intermediate output, which is then shuffled, merged, and sorted (Sakr & Hammoud, 2014). GraphLab is another example; it uses no explicit send or receive messages and offers consistency and correctness at the cost of parallelism (Pulipaka, 2016).
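The map-then-reduce sequencing can be illustrated with a minimal single-machine word-count sketch in Python (an illustration of the model only, not Hadoop itself): the reduce step runs only after all map output has been produced, shuffled, and grouped by key.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle/merge: group intermediate values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: runs only once all intermediate map output is grouped.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the cat sat", "the cat ran"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}
```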


Message-passing models exchange explicit send and receive messages between processes and share no memory; they therefore incur communication overhead but offer greater parallelism. MPI is an industry-standard library implementing the message-passing model (Sakr & Hammoud, 2014). Similarly, Pregel is an analytical model that employs the message-passing programming model (Pulipaka, 2016).


However, a programming model for the cloud requires more than sending and receiving messages or selecting computational and architectural models, because cloud programming models must place further emphasis on attributes such as heterogeneity, scalability, communication, synchronization, fault tolerance, and scheduling (Sakr & Hammoud, 2014).


Heterogeneity requires the cloud to run distributed programs across varying hardware, networks, operating systems, and programming models, which is difficult to achieve with simple shared-memory or message-passing models (Sakr & Hammoud, 2014).


Scalability is another key cloud attribute: it enables resources to be added or removed on the fly as utilization changes, which poses a challenge to classical programming models, where programs must be written anew or modified to accommodate the change in resources (Sakr & Hammoud, 2014).


Communication is a principal attribute affecting cloud performance. Although classical programming models provide communication, communicating efficiently while handling enormous volumes of data poses significant challenges (Sakr & Hammoud, 2014).


Synchronization is another important feature in the cloud. Classical programming models offer synchronization through locks and semaphores, but such mechanisms impose significant performance limitations when handling the large data sets hosted in the cloud (Sakr & Hammoud, 2014).
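A minimal Python sketch shows both sides of lock-based synchronization: the lock guarantees a correct count, but it also serializes the threads, which is exactly the kind of performance limitation that grows with data volume.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # The lock serializes updates: correct, but each thread must wait
        # for the others, which is the performance cost noted above.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 -- without the lock, updates could be lost
```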


Fault tolerance in the cloud is provided through hardware and software redundancy to avoid degradation in the event of failure. Classical distributed programming models offer redundancy as well, but still fall short of the cloud's performance requirements (Sakr & Hammoud, 2014).


Scheduling is yet another important cloud attribute, with significant performance impact if not done efficiently. For instance, serial execution of the map and reduce functions, combined with data transfer across nodes, can significantly degrade cloud performance and efficiency (Sakr & Hammoud, 2014).


References


Pulipaka, G. P. (2016). Distributed shared memory programming for Hadoop, MapReduce, and HPC architectures. Retrieved from https://medium.com/@gp_pulipaka/distributed-shared-memory-programming-for-hadoop-mapreduce-and-hpc-357a1b226ff6


Sakr, M. & Hammoud, M. (2014). MapReduce family of large-scale data-processing systems. In S. Sakr & M. Gaber (Eds.), Large scale and big data: Processing and management. Boca Raton, FL: CRC Press.
