Arun Jijo – Medium

Arun Jijo
in
Javarevisited

Subqueries and CTEs in Spark: Enhancing Data Analysis and Manipulation

In the intricate world of data analytics, the power to craft sophisticated and efficient queries is invaluable. Delving into the realm of…

10 min read · Apr 24, 2024

Arun Jijo
in
Javarevisited

Beefing Up Redshift Performance

MPP is a natural fit for any data warehousing and big data use case. Amazon Redshift outperforms all of its peers in its space due to…

5 min read · Apr 2, 2021

Arun Jijo
in
Javarevisited

Spark 3.0 — New Functions in a Nutshell

The Apache Spark community recently released a preview of Spark 3.0, which includes many significant new features that will help Spark make a…

8 min read · Jun 14, 2020

Arun Jijo
in
DataKare Solutions

Spark SQL — Salient functions in a Nutshell

As the Spark DataFrame becomes the de facto standard for data processing in Spark, it is a good idea to be aware of the key functions of Spark SQL that…

3 min read · Dec 27, 2019

Arun Jijo
in
Javarevisited

Curious case of Island of Isolation

The garbage collector is one of the major primitives in the Java world: the tool that clears unused or unreachable objects from memory…

4 min read · Jun 25, 2019

Arun Jijo
in
DataKare Solutions

Key factors to consider when optimizing Spark Jobs

Developing a Spark application is fairly simple and straightforward, as Spark provides feature-packed APIs. Be that as it may, the tedious…

8 min read · Mar 21, 2019

Arun Jijo
in
DataKare Solutions

Structured Streaming: Essentials

This is the second chapter in the “Structured Streaming” series, which centers on covering all the essential details to set up a…

4 min read · Mar 3, 2019

Arun Jijo
in
DataKare Solutions

Structured Streaming

Introduction

3 min read · Feb 26, 2019

Arun Jijo
in
DataKare Solutions

Structured Streaming: Kafka integration

This article explains how to integrate Structured Streaming, Spark’s new stream processing engine, with Apache Kafka, along with…

5 min read · Feb 10, 2019

Arun Jijo

Data engineer at DataKare Solutions with expertise in Apache NiFi, Kafka, and Spark, and a passion for Java.
