Spark Interview Question 7: When to use Broadcast join()?
Date: March 5th, 2024
Spark Interview Question 6: Optimize transformations and actions? What is Catalyst?
Date: March 5th, 2024
Spark Interview Question 4: What is partitioning? Coalesce() vs Repartition()
Date: March 5th, 2024
Spark Interview Question 3: What is Data Serialization? Java vs Kryo
Date: March 5th, 2024
Spark Interview Question 2: What is file format? AVRO vs Parquet vs ORC
Date: March 5th, 2024
Spark Interview Question 1: Difference Between RDD vs DataFrame?
Date: March 5th, 2024
Spark Question: What happens in memory when you call collect() in Spark?
I know collect() is often not recommended for large datasets (because it can cause a Java out-of-memory error), but this is for learning…
Date: Feb 1, 2024
System Design Key Concepts in Data Architecture and Data Infra
Hi everyone, in this post, I will summarize things I recommend knowing and understanding for data architecture and data infra system…
Date: Jan 30, 2024
PostgreSQL Interview Question: What is string_agg() and when do you use it?
Hi everyone, this post is about a useful Postgres function. Postgres offers a variety of useful functions thanks to committers and I am…
Date: Jan 21, 2024