Daily Term: MapReduce
MapReduce
MapReduce is a programming model for processing large datasets in parallel across a distributed cluster, popularized by Hadoop. It breaks tasks into two phases: Map (transforming data into key-value pairs) and Reduce (aggregating results). For example, to count word frequencies in a large corpus, Map might emit (word, 1) pairs, and Reduce sums them. MapReduce scales well for batch processing, but it’s slow for iterative tasks and less suited for real-time processing.
Date: 2025-12-20