priti pawar
Apache Spark

Spark can run on top of the Hadoop Distributed File System (HDFS), but it does not use Hadoop MapReduce. Instead it has its own framework for parallel data processing. Processing starts by loading data into resilient distributed datasets (RDDs), a distributed memory abstraction that lets computations run across large Spark clusters in a fault-tolerant way. Because data is kept in memory (and on disk if necessary), Apache Spark can be much faster and more flexible than Hadoop MapReduce jobs for certain applications.
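
To make the idea of a fault-tolerant, in-memory dataset more concrete, here is a minimal sketch in plain Python. It is not Spark's API: `ToyRDD`, `parallelize`, `lose_partition` and the other names are hypothetical and exist only for this illustration. The assumption it models is that each dataset remembers how it was derived from its parent, so a lost in-memory partition can be rebuilt instead of being restored from a replica. As a simplification, the toy always keeps computed partitions in memory.

```python
from typing import Callable, Dict, Iterable, List


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API).

    It holds a recipe for building each partition plus an in-memory
    copy of the partitions that have already been computed.
    """

    def __init__(self, compute: Callable[[int], List], num_partitions: int):
        self._compute = compute              # how to (re)build partition i
        self.num_partitions = num_partitions
        self._cache: Dict[int, List] = {}    # partitions currently in memory

    @classmethod
    def parallelize(cls, data: Iterable, num_partitions: int) -> "ToyRDD":
        # Split the input round-robin into fixed chunks, one per partition.
        items = list(data)
        chunks = [items[i::num_partitions] for i in range(num_partitions)]
        return cls(lambda i: list(chunks[i]), num_partitions)

    def map(self, fn: Callable) -> "ToyRDD":
        # No data is processed here; only the derivation is recorded.
        return ToyRDD(lambda i: [fn(x) for x in self.partition(i)],
                      self.num_partitions)

    def filter(self, pred: Callable) -> "ToyRDD":
        return ToyRDD(lambda i: [x for x in self.partition(i) if pred(x)],
                      self.num_partitions)

    def partition(self, i: int) -> List:
        # Serve from memory if present, otherwise rebuild from the recipe.
        if i not in self._cache:
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i: int) -> None:
        # Simulate a node failure that wipes one in-memory partition.
        self._cache.pop(i, None)

    def collect(self) -> List:
        return [x for i in range(self.num_partitions)
                for x in self.partition(i)]


numbers = ToyRDD.parallelize(range(1, 11), num_partitions=3)
squares_of_evens = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

before = sorted(squares_of_evens.collect())
squares_of_evens.lose_partition(1)          # pretend a worker died
after = sorted(squares_of_evens.collect())  # partition 1 is rebuilt on demand

print(before)  # [4, 16, 36, 64, 100]
print(after == before)  # True
```

The second `collect()` returns the same result because the missing partition is recomputed from its parent rather than read from a backup copy. That is the property the paragraph above attributes to RDDs, shown here at toy scale.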

 