Serverless Architecture Moves Ahead with Better Applications

The increased use of serverless architecture is making an impact on big data as it radically changes the way applications are run.

Serverless architecture allows users to deploy code and run applications without being saddled with managing the supporting infrastructure. Instead, that aspect is managed by an outside provider, such as Amazon Web Services (AWS) through its AWS Lambda service. After a stagnant period last year, the number of providers of serverless architecture is increasing and the applications are becoming more enterprise-ready.

Big-data company Qubole has adapted the big-data processing engine Apache Spark to make it more flexible and easier to use. The new implementation allows users to run Spark applications on AWS Lambda.

Apache Spark, a fast, general-purpose big-data processing engine, is one of the most popular of its type. It is designed to execute streaming, machine learning or SQL workloads that require fast and constant access to datasets.

Its latest incarnation, Spark-on-Lambda, is currently available only as a prototype but has already demonstrated a successful scan of 1 TB of data and a sort of 100 GB of data from Amazon Simple Storage Service (S3).

Qubole says the ability to run Spark on Lambda, a serverless compute service that allows users to pay only for the compute power they use without needing to provision servers, makes the platform more elastic and efficient with its resource usage.

The new Spark-on-Lambda service tackles two problems that previously made Spark difficult to use on AWS Lambda. First, Spark could not communicate directly with Lambda, yet it needed that capability to run its executors. Second, Lambda’s runtime resources were limited to a maximum execution duration of five minutes, 1,536 MB of memory and 512 MB of disk space, which is extremely restrictive for a platform like Spark, which uses a large amount of memory.
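To make the second constraint concrete, here is a minimal Python sketch that checks an estimated task against the three limits quoted above. The `TaskEstimate` type and the `limit_violations` function are hypothetical names chosen for illustration; they are not part of Spark, AWS or Qubole's prototype.

```python
from dataclasses import dataclass
from typing import List

# Per-invocation limits as quoted in the article.
MAX_DURATION_S = 5 * 60   # five minutes
MAX_MEMORY_MB = 1536
MAX_DISK_MB = 512


@dataclass
class TaskEstimate:
    """Hypothetical resource estimate for one Spark executor task."""
    duration_s: float
    memory_mb: float
    disk_mb: float


def limit_violations(task: TaskEstimate) -> List[str]:
    """Return the names of the quoted limits that the task would exceed."""
    violations = []
    if task.duration_s > MAX_DURATION_S:
        violations.append("duration")
    if task.memory_mb > MAX_MEMORY_MB:
        violations.append("memory")
    if task.disk_mb > MAX_DISK_MB:
        violations.append("disk")
    return violations
```

A task estimated at 120 seconds, 1,024 MB of memory and 256 MB of disk fits within all three limits, while one needing 600 seconds and 4,096 MB of memory would be flagged for both duration and memory.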

In its new form, the Spark-on-Lambda service runs its executors from within an AWS Lambda invocation, which sidesteps the communication problem. Lambda’s limited runtime resources are addressed by using external storage to avoid local disk size limits.

Spark-on-Lambda makes it possible to offload on-premises applications to public clouds when demand for compute capacity rises, without needing to wait for servers to spin up first. The Spark clusters can also scale automatically, with no input needed from administrators.

Finally, Qubole has made pricing more transparent: because Spark-on-Lambda invokes Lambda functions with a well-defined cost, the exact cost of each Spark workload can be calculated.
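As a rough illustration of how such a per-workload figure could be derived, the Python sketch below assumes a billing model with a charge per GB-second of compute plus a charge per invocation. The function name `lambda_workload_cost` is hypothetical, and the prices in the usage note are placeholders, not actual AWS rates and not Qubole's own calculation.

```python
def lambda_workload_cost(
    invocations: int,
    avg_duration_s: float,
    memory_mb: float,
    price_per_gb_second: float,
    price_per_request: float,
) -> float:
    """Estimate the cost of a workload made up of Lambda invocations.

    Assumes billing by GB-seconds of compute plus a flat per-request fee;
    both prices must be supplied by the caller.
    """
    # Compute consumed: invocations x seconds each x memory in GB.
    gb_seconds = invocations * avg_duration_s * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = invocations * price_per_request
    return compute_cost + request_cost
```

With placeholder prices of 0.00002 per GB-second and 0.0000002 per request, a workload of 1,000 invocations averaging 60 seconds each at 1,536 MB comes to 90,000 GB-seconds, or about 1.80 in whatever currency the prices are expressed in.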