Posts

Showing posts with the label Sqoop

TOP PYSPARK INTERVIEW QUESTION 2023

What is Apache Spark and how does it differ from Hadoop?
What are the benefits of using Spark over MapReduce?
What is a Spark RDD and what operations can be performed on it?
How does Spark handle fault-tolerance and data consistency?
Explain the difference between Spark transformations and actions.
What is a Spark DataFrame and how is it different from an RDD?
What is Spark SQL and how does it work?
How can you optimize a Spark job to improve its performance?
How does Spark handle memory management and garbage collection?
Explain the role of Spark Driver and Executors.
What is PySpark and how does it differ from Apache Spark?
How do you create a SparkContext in PySpark? What is the purpose of SparkContext?
What is RDD (Resilient Distributed Dataset)? How is it different from DataFrame and Dataset?
What are the different ways to create RDD in PySpark?
What is the use of persist() method in PySpark? How does it differ from cache() method?
What is the use of broadcast variables in PySpark

How to handle NULL Value during sqoop Import/Export

In this post we discuss how to handle NULL values during Sqoop import/export. If a value in the source table is NULL, Sqoop by default writes it to HDFS as the string "null". That causes problems when we use NULL conditions in Hive queries.

For example, let's insert a NULL value into the MySQL table "cities":

mysql> insert into cities values(6,7,NULL);
mysql> select * from cities;

Now let's run the import and see what happens:

sqoop import --connect jdbc:mysql://localhost:3306/sqoop --username sqoop -P --table cities --hive-import --hive-overwrite --hive-table vikas.cities -m 1

After running the import command, we verify the imported data in HDFS. As we can see, the value is the string "null", not NULL, so querying the table in Hive returns the string "null". So, after including the conditions "is not null" or "is null" in the Hive query, we will not ge
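
The excerpt stops before showing a fix, so the following is only a sketch of one common approach, not necessarily the one the post goes on to use. Sqoop's import tool accepts --null-string and --null-non-string, and setting both to '\\N' makes Sqoop write \N, which Hive's default text format reads as NULL. The connection string, user, and table names are copied from the command above; the check for sqoop on the PATH and the fallback that only prints the command are additions for illustration.

```shell
# Build the argument list once so it can be either executed or just printed.
# '\\N' is passed to Sqoop literally; Sqoop unescapes it to \N, the marker
# that Hive's default text SerDe treats as NULL.
set -- import \
  --connect jdbc:mysql://localhost:3306/sqoop \
  --username sqoop -P \
  --table cities \
  --hive-import --hive-overwrite \
  --hive-table vikas.cities \
  --null-string '\\N' \
  --null-non-string '\\N' \
  -m 1

if command -v sqoop >/dev/null 2>&1; then
  # Sqoop is installed: run the import (prompts for the password because of -P).
  sqoop "$@"
else
  # Sqoop is not installed: show the command that would have been run.
  printf '%s\n' "sqoop $*"
fi
```

With the data re-imported this way, "is null" and "is not null" conditions in Hive should match the rows that held NULL in MySQL. For exports in the other direction, Sqoop provides the corresponding --input-null-string and --input-null-non-string options.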
