Apache Spark is a powerful tool for distributed processing of data. Data, whether represented as RDDs or as DataFrames, is processed using transformations (such as map and filter), which are performed lazily, meaning that they are only executed when their results are needed for some output. In some cases, a particular RDD or DataFrame may be used as the basis for multiple different downstream computations, or a computation may be performed multiple times.
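To make this concrete, here is a minimal sketch (assuming a local SparkSession and an illustrative RDD of numbers, neither of which comes from the post itself) of how transformations only run once an action is called, and how caching avoids recomputing an RDD that feeds several actions:

```scala
import org.apache.spark.sql.SparkSession

object LazyEvaluationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lazy-evaluation-sketch")
      .master("local[*]")
      .getOrCreate()

    // Transformations only build up a lineage; nothing executes yet.
    val numbers = spark.sparkContext.parallelize(1 to 1000000)
    val evens   = numbers.filter(_ % 2 == 0)
    val squares = evens.map(n => n.toLong * n)

    // The same RDD feeds two downstream actions, so cache it to avoid
    // re-running the filter and map for each one.
    squares.cache()

    // Actions trigger execution of the whole lineage.
    println(s"count = ${squares.count()}")
    println(s"sum   = ${squares.sum()}")

    spark.stop()
  }
}
```

Without the cache() call, count() and sum() would each recompute the filter and map over the source data.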
Continue reading
In this post, I explore Akka's message delivery guarantee by example, demonstrating techniques for handling unreliable message delivery.
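As a taste of the kind of technique involved, here is a minimal sketch using Akka classic actors; the Deliver, Ack and Redeliver messages and the ReliableSender actor are illustrative names, not taken from the post. Because Akka's default guarantee is at-most-once delivery, a common pattern is to keep re-sending a message until the receiver acknowledges it:

```scala
import akka.actor.{Actor, ActorRef}
import scala.concurrent.duration._

// Illustrative message types.
final case class Deliver(id: Long, payload: String)
final case class Ack(id: Long)
case object Redeliver

// A sender that re-sends unacknowledged messages, compensating for
// Akka's at-most-once delivery guarantee.
class ReliableSender(target: ActorRef) extends Actor {
  import context.dispatcher

  private var pending = Map.empty[Long, Deliver]

  override def preStart(): Unit =
    context.system.scheduler.scheduleOnce(1.second, self, Redeliver)

  def receive: Receive = {
    case msg: Deliver =>
      pending += msg.id -> msg           // remember until acknowledged
      target ! msg
    case Ack(id) =>
      pending -= id                      // receiver confirmed delivery
    case Redeliver =>
      pending.values.foreach(target ! _) // re-send anything still pending
      context.system.scheduler.scheduleOnce(1.second, self, Redeliver)
  }
}
```

Re-sending turns at-most-once into at-least-once, so the receiving side also needs to deduplicate by id; that trade-off is at the heart of handling unreliable delivery.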
Continue reading
This is my first meaningful blog post and is inspired by this Stack Overflow question. I answered the question but wasn't entirely happy with my answer (and apparently neither was the community!). So I decided to look at the problem in more depth, as I expect to use Akka HTTP in the future, and mocking is an important part of testing components which connect to external services.
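One common approach, sketched below, is to hide the outgoing request behind a small trait so tests can substitute a canned response for the real network call; the HttpClient, AkkaHttpClient and StubHttpClient names are illustrative rather than anything defined in the post or in Akka HTTP itself:

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpEntity, HttpRequest, HttpResponse, StatusCodes}
import scala.concurrent.Future

// Abstract over the outgoing HTTP call so it can be swapped out in tests.
trait HttpClient {
  def sendRequest(request: HttpRequest): Future[HttpResponse]
}

// Production implementation backed by Akka HTTP.
class AkkaHttpClient(implicit system: ActorSystem) extends HttpClient {
  def sendRequest(request: HttpRequest): Future[HttpResponse] =
    Http().singleRequest(request)
}

// Test stub that returns a canned response without touching the network.
class StubHttpClient extends HttpClient {
  def sendRequest(request: HttpRequest): Future[HttpResponse] =
    Future.successful(HttpResponse(StatusCodes.OK, entity = HttpEntity("stubbed response")))
}
```

Code under test depends only on HttpClient, so a test can pass in a StubHttpClient (or a mock built with a mocking library) and assert on behaviour without the external service being available.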
Continue reading
I’ve been trying for a while to find the time and energy to start blogging about programming and computing topics that interest me, but never quite managed to get it off the ground.
Continue reading