Skip to main content

Command Palette

Search for a command to run...

Make Your Own Map, Filter, Reduce in C++

Create Custom Map, Filter, Reduce Functions in C++

Published
11 min read
Make Your Own Map, Filter, Reduce in C++
D
Terminal Nerd, Web Developer, ML Engineer. I use Neovim BTW

As you know, C++ is an imperative programming language.

Although over the years they have added features of functional programming languages like promises and lambdas, that doesn’t change the core of the language.

However, if you come from a background of languages (like JavaScript, Python, OCaml) with heavy promotion of and decent reliance on immutability, you may find yourself frustrated or disappointed when not seeing built-in support for map, filter and reduce (reduce is also known as fold, fold_left or fold_right) functions.

Although from C++ 17, there comes very powerful and helpful functions like std::reduce, std::inclusive_scan, std::exclusive_scan in C++ that do very well transformations of elements of a data structure and folding a data structure. They are designed for generic values (types), so thus they work with any type and any custom operator.

But this benefit only applies to you if you use C++17 or higher. If you are constrained in an environment where Your compiler only supports C++14 or even lower (I heard few companies still use C++ 5.4 neglecting all the modern type-safe features, weird but it’s their choice), you won’t have access to these utilities.

In this article we will go through possible implementations of map, filter and reduce in C++, modularizing the utilities and hence providing a cleaner approach to solving Problems. For this article, we will be using C++14 all the time.

So, let’s get started!

An Immutable Way of C++

Let’s begin with the map function. First try to make a map function for a vector of integers.

#include <vector>
#include <algorithm>
#include <functional>

std::vector<int> map(std::vector<int>& list,std::function<int(int)> func) {
    std::vector<int> result;
    result.reserve(list.size());
    std::transform(list.begin(), list.end(), std::back_inserter(result), func);
    return result;
}

So, it is quite easy when writing for a specific data type.

And our function works pretty well.

Running The code in an online C++ compiler

Similarly, we can implement filter and reduce for vector<int> also -

Filter function -

#include <vector>
#include <algorithm>
#include <functional>

std::vector<int> filter(const std::vector<int>& list, std::function<bool(const int&)> predicate) {
    std::vector<int> result;
    std::copy_if(list.begin(), list.end(), std::back_inserter(result), predicate);
    return result;
}

Reduce Function -

#include <vector>
#include <numeric> // for std::accumulate

int reduce(const std::vector<int>& list, std::function<int(const int&, const int&)> reducer, int initial) {
    return std::accumulate(list.begin(), list.end(), initial, reducer);
}

Using the reduce and filter functions on an example array in CodeBlocks IDE and printing the results in command prompt (STDOUT)

Well, it seems like we don’t need a reduce function anymore, just use std::accumulate and it would be done, right?

NO! You’re wrong!

Because the above implementation of reduce is FLAWED!
A reduce function requires that its 2nd argument (the binary operator or function) is both cumulative and associative. Because it performs the operations out of order, sometimes even parallelly (if an execution policy is assigned).

Let me clear the concepts first, about folding and reducing -

  1. Folding has to have a direction, either left or right, reducing doesn’t need a direction.

  2. Reducing is a process of accumulating elements of a linear data structure where the binary operation is assumed to be commutative and associative. Hence, if the order of elements in operation and accumulation goes different from the order of elements stored in the data structure, it will have ZERO casualties.

  3. In case of Folding, it doesn’t matter if the order of accumulation is the same as order of elements or say the binary operator is commutative or associative or not. Because the Execution Order of Folding is Strictly sequential, left-to-right or right-to-left.

  4. The result of a fold_left or fold_right is always deterministic regardless the operator is commutative and associative or not. But if the operator or function is NOT associative and commutative, reduce will cause indeterministic results (due to unordered execution).

What I have implemented above is actually a fold_left function. What std::accumulate actually does is a left fold reduction.

So, what the reduce function should actually be is something like this -

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>  // For parallel execution

// Example cumulative and associative function: addition
int add(int a, int b) {
    return a + b;
}

// Cumulative and associative reduce function
int reduce(int* array, size_t length, int (*binary_op)(int, int), int identity) {
    int result = identity;

    // Parallel reduction
    #pragma omp parallel for reduction(+:result)
    for (size_t i = 0; i < length; ++i) {
        result = binary_op(result, array[i]);
    }

    return result;
}

int main() {
    int arr[] = {1, 2, 3, 4, 5};
    size_t length = sizeof(arr) / sizeof(arr[0]);

    // Using the reduce function with addition operation
    int sum = reduce(arr, length, add, 0);

    printf("Sum: %d\n", sum);
    return 0;
}

Here, <omp.h> is the header for using OpenMP (Open Multi-Processing). OpenMP is a library that supports multi-platform shared-memory parallel programming in C, C++, and Fortran.

Although the code is written in C. You can change it into suitable C++ syntax. I am leaving that up to you.

How the Parallel Reduction Works

  • #pragma omp parallel for reduction(+:result): This is the core of the parallel execution using OpenMP.

    • #pragma omp parallel for: This directive tells the OpenMP runtime to parallelize the following for loop across multiple threads. Each thread will execute a portion of the loop iterations concurrently.

    • reduction(+:result): This clause specifies a reduction operation.

      • +: Indicates that the reduction operation is addition. OpenMP will handle the partial sums computed by each thread and combine them safely at the end.

      • result: The variable on which the reduction is performed. OpenMP ensures that updates to result from different threads are properly synchronized to avoid race conditions.

  • After each thread finishes its assigned iterations, OpenMP combines the partial sums from all the threads using the specified operation (addition in this case) and stores the final result in the original result variable in the shared memory. This combining process is done in a thread-safe manner.

Now let’s see how we can define our own fold_left and fold_right.

The Easy Way -

int fold_left(const std::vector<int>& list, std::function<int(const int&, const int&)> reducer, int initial) {
    return std::accumulate(list.begin(), list.end(), initial, reducer);
}

int fold_right(const std::vector<int>& list, std::function<int(const int&, const int&)> reducer, int initial) {
    return std::accumulate(list.rbegin(), list.rend(), initial, reducer);
}

The Hard Way -

template <typename T, typename AccType, typename Func>
AccType fold_left(typename std::vector<T> vec,
                                AccType initial_acc,
                                Func func) {
    AccType accumulator = initial_acc;
    for (auto itr = vec.begin(); itr != vec.end(); ++itr) {
        accumulator = func(accumulator, *itr);
    }
    return accumulator;
}

template <typename T, typename AccType, typename Func>
AccType fold_right(typename std::vector<T> vec,
                                 AccType initial_acc,
                                 Func func) {
    AccType accumulator = initial_acc;
    for (auto ritr = vec.rbegin(); ritr != vec.rend(); ++ritr) {
        accumulator = func(*ritr, accumulator);
    }
    return accumulator;
}

Yes, I used templating syntax for making them reusable for different data types.

You can try out the functions defined like this way -

#include <bits/stdc++.h>

template <typename T, typename AccType, typename Func>
AccType fold_left(typename std::vector<T> vec,
                                AccType initial_acc,
                                Func func) {
    AccType accumulator = initial_acc;
    for (auto itr = vec.begin(); itr != vec.end(); ++itr) {
        accumulator = func(accumulator, *itr);
    }
    return accumulator;
}

template <typename T, typename AccType, typename Func>
AccType fold_right(typename std::vector<T> vec,
                                 AccType initial_acc,
                                 Func func) {
    AccType accumulator = initial_acc;
    for (auto ritr = vec.rbegin(); ritr != vec.rend(); ++ritr) {
        accumulator = func(*ritr, accumulator);
    }
    return accumulator;
}

int main() {
    std::vector<double> vec = {1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5,9.5,0.5};
    double sum = fold_left(vec, 0.0, [](double num, double nyum){
        return (num + nyum);
    });
    double mul = fold_right(vec, 1.0, [](double n, double m)->double{
        return (n * m);
    });
    std::cout << std::fixed << std::setprecision(5) << "The sum would be - " << sum << "\nAnd the multiplication would be - " << mul << std::endl;
    return 0;
}

The code obviously works, and you can see it below -

Trying out the fold_left and fold_right functions with a main function defined in an online C++ compiler

Now as you have understood the possible implementation of map, filter and reduce, let’s write a module encapsulating all the functions we have defined till now with some extras.

Look the idea is that, in OCaml, the standard library provided a List Module which had some incredibly useful functions, from map, filter, reduce to pipeline operator, some more variations of them and also some extras few different ones.

Availability of utility functions like this can save you lot of time by not having to write cluttered code or repetitive specific function definitions.

So, let’s make it clear, we will be defining these functions in our module -

1. map

Application: Tail-recursive transformation of a range of elements using a function.

static std::vector<U> map(begin_itr, end_itr, func);

2. mapi

Application: Index-aware mapping function (not tail-recursive); applies function with access to index and value.

static std::vector<U> mapi(begin_itr, end_itr, func);

3. map2

Application: Applies a binary function elementwise on two vectors.

static std::vector<U> map2(begin1, end1, begin2, end2, func);

4. map2D (Version 1)

Application: Applies a function to each element in a 2D vector (matrix-like structure).

static std::vector<std::vector<U>> map2D(input_2d_vector, func);

5. map2D (Version 2)

Application: Applies an outer and inner transformation over a 2D vector.

static std::vector<std::vector<U>> map2D(input_2d_vector, func_outer, func_inner);

6. filter

Application: Filters elements based on a predicate.

static std::vector<T> filter(begin_itr, end_itr, predicate);

7. reduce

Application: Reduces a vector into a single value using a binary function and an initial accumulator.

static AccType reduce(input_vector, initial_acc, func);

8. fold_left

Application: Folds (reduces) from left to right over a range using a binary function.

static AccType fold_left(begin_itr, end_itr, initial_acc, func);

9. fold_right

Application: Folds (reduces) from right to left using reverse iterators.

static AccType fold_right(begin_ritr, end_ritr, initial_acc, func);

10. apply

Application: Function application (pipeline-style transformation of a single value).

static U apply(value, func);

11. forall

Application: Tests if all elements in a range satisfy a predicate.

static bool forall(begin_itr, end_itr, predicate);

12. split

Application: Splits a vector of pairs into two vectors.

static std::pair<std::vector<T1>, std::vector<T2>> split(input_pairs);

13. combine

Application: Combines two vectors into a vector of pairs.

static std::vector<std::pair<T1, T2>> combine(list1, list2);

14. operator| (free function)

Application: Provides pipeline operator-like syntax for function application.

U operator|(value, func);

Summary

Function NamePurpose
map, mapi, map2, map2D (v1 & v2)Transformation / Mapping
filter, reduce, fold_left, fold_rightFiltering and Aggregation
applyPartial Application / Pipelining
forallLogical test (universal quantification)
split, combineVector decomposition and composition

Let’s Implement

This will be our header file -

#ifndef FUNCTIONAL_LIB_HPP
#define FUNCTIONAL_LIB_HPP

#include <vector>
#include <utility>

namespace functional_lib {

class Functional {
public:
    // map (Tail Recursive)
    template <typename T, typename U, typename Func>
    static std::vector<U> map(typename std::vector<T>::const_iterator begin_itr,
                             typename std::vector<T>::const_iterator end_itr,
                             Func func);

    // mapi (Not Tail Recursive - index aware map)
    template <typename T, typename U, typename Func>
    static std::vector<U> mapi(typename std::vector<T>::const_iterator begin_itr,
                              typename std::vector<T>::const_iterator end_itr,
                              Func func);

    template <typename T1, typename T2, typename U, typename Func>
    static std::vector<U> map2(typename std::vector<T1>::const_iterator begin_itr1,
                              typename std::vector<T1>::const_iterator end_itr1,
                              typename std::vector<T2>::const_iterator begin_itr2,
                              typename std::vector<T2>::const_iterator end_itr2,
                              Func func);

    // map2D - Version 1: Single callback
    template <typename T, typename U, typename Func>
    static std::vector<std::vector<U>> map2D(const std::vector<std::vector<T>>& input_2d_vector, Func func);

    // map2D - Version 2: Two callbacks
    template <typename T, typename U, typename FuncOuter, typename FuncInner>
    static std::vector<std::vector<U>> map2D(const std::vector<std::vector<T>>& input_2d_vector, FuncOuter func_outer, FuncInner func_inner);

    template <typename T, typename Predicate>
    static std::vector<T> filter(typename std::vector<T>::const_iterator begin_itr,
                                 typename std::vector<T>::const_iterator end_itr,
                                 Predicate predicate);

    template <typename T, typename AccType, typename Func>
    static AccType reduce(const std::vector<T>& input_vector, AccType initial_acc, Func func);

    template <typename T, typename AccType, typename Func>
    static AccType fold_left(typename std::vector<T>::const_iterator begin_itr,
                              typename std::vector<T>::const_iterator end_itr,
                              AccType initial_acc,
                              Func func);

    template <typename T, typename AccType, typename Func>
    static AccType fold_right(typename std::vector<T>::const_reverse_iterator begin_ritr,
                               typename std::vector<T>::const_reverse_iterator end_ritr,
                               AccType initial_acc,
                               Func func);

    // Partial application function (what pipeline operator does)
    template <typename T, typename U, typename Func>
    static U apply(T value, Func func);

    template <typename T, typename Predicate>
    static bool forall(typename std::vector<T>::const_iterator begin_itr,
                        typename std::vector<T>::const_iterator end_itr,
                        Predicate predicate);

    template <typename T1, typename T2>
    static std::pair<std::vector<T1>, std::vector<T2>> split(const std::vector<std::pair<T1, T2>>& input_pairs);

    template <typename T1, typename T2>
    static std::vector<std::pair<T1, T2>> combine(const std::vector<T1>& list1, const std::vector<T2>& list2);

private:
    // Helper for tail-recursive map
    template <typename T, typename U, typename Func>
    static void map_helper(typename std::vector<T>::const_iterator itr,
                           typename std::vector<T>::const_iterator end_itr,
                           Func func,
                           std::vector<U>& result_vector);
};

// Pipeline operator equivalent (free function for cleaner syntax)
template <typename T, typename U, typename Func>
U operator|(T value, Func func);


} // namespace functional_lib

#endif // FUNCTIONAL_LIB_HPP

Now, the thing is I won’t just give you the source code (implementation of the functions), Hehe…

Try with all your efforts, don’t just ChatGPT it, try to have little fun, with programming!

In the end, if you think you still want to see my implementation, for comparison purpose, I suppose. Comment it. I will link a GitHub gist in the comment section.

Concluding

So, hey, make sure you try to implement the source file!
I wanted to make sure you just don’t mindlessly read, or copy paste but actually mindfully read and understand the content.

If no one comments about if they successfully implemented all the functions in the header file or not then I will realize no one actually read it, just scrolled through it 😔🥲.

It’s not something very hard, so I expect you to able to do this. And BTW if you are not able to define all of them, it’s okay, just post whatever functions you implemented in a comment.

I will love to read your implementation and maybe find some better techniques so that I can improve my own code.

So, that’s it! Thank you for reading!

If you found this POST helpful, if this blog added some value to your time and energy, please show some love by giving the article some likes and share it with your developer friends. It will encourage me to create more content like this.

Feel free to connect with me at - Twitter, LinkedIn or GitHub :)

Happy Coding 🧑🏽‍💻👩🏽‍💻! Have a nice day ahead! 🚀

S

Insightful

5