Data Cat · Spark Interview Question 6: optimize transformations and actions? What is Catalyst? · March 5th, 2024
Data Cat · Spark Interview Question 4: What is partitioning? Coalesce() vs Repartition() · March 5th, 2024
Data Cat · Spark Interview Question 3: What is Data Serialization? Java vs Kryo · March 5th, 2024
Data Cat · Spark Interview Question 2: What is file format? AVRO vs Parquet vs ORC · March 5th, 2024
Data Cat · Spark Interview Question 1: Difference Between RDD vs DataFrame? · March 5th, 2024
Data Cat · Spark Question: What happens in the memory when you collect() in Spark? · I know collect() is often not recommended for large datasets (because it can cause a Java out-of-memory error), but this is for learning… · Feb 11
Data Cat · System Design Key Concepts in Data Architecture and Data Infra · Hi everyone, in this post, I will summarize things I recommend knowing and understanding for data architecture and data infra system… · Jan 30
Data Cat · PostgreSQL Interview Question: what is string_agg() and when do you use it? · Hi everyone, this post is about a useful Postgres function. Postgres offers a variety of useful functions thanks to committers, and I am… · Jan 21