Connect to any data source the same consistent way

DataFrames and SQL provide a common way to access a variety of data sources, including Hive, Avro, Parquet, ORC, JSON, and JDBC. You can even join data across these sources.
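As a rough illustration of joining data across sources, the sketch below reads one JSON dataset and one Parquet dataset through the same `spark.read` interface and joins them on a shared column. The helper name `join_across_sources`, the file paths, and the `user_id` column are hypothetical and chosen only for this example; `spark` is assumed to be a `pyspark.sql.SparkSession`.

```python
from typing import Any


def join_across_sources(spark: Any, json_path: str, parquet_path: str, key: str):
    """Join a JSON dataset with a Parquet dataset on a shared column.

    `spark` is expected to be a pyspark.sql.SparkSession.
    """
    left = spark.read.json(json_path)         # schema is inferred from the JSON records
    right = spark.read.parquet(parquet_path)  # schema is stored in the Parquet files
    return left.join(right, on=key, how="inner")


def main() -> None:
    # Requires PySpark; imported here so the helper above can be loaded without it.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("multi-source-join").getOrCreate()
    # Hypothetical paths and column name, used only for illustration.
    joined = join_across_sources(spark, "people.json", "orders.parquet", "user_id")
    joined.show()
    spark.stop()
```

Calling `main()` runs the join against the placeholder paths. The same pattern should carry over to the other formats listed above by swapping the reader call, for example `spark.read.orc(...)`.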

Hive Integration: run SQL or HiveQL queries on existing warehouses.

Spark SQL supports HiveQL syntax and can use existing Hive metastores, SerDes, and UDFs, so you can query existing Hive warehouses without migrating them.
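A minimal sketch of querying an existing Hive table from PySpark might look like the following. It assumes a Spark build with Hive support and a reachable metastore; the helper names `build_hive_session` and `top_rows` and the table name in the usage note are hypothetical.

```python
def build_hive_session(app_name: str = "hive-example"):
    """Create a SparkSession with Hive support enabled."""
    # Requires PySpark built with Hive support; imported lazily for that reason.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # use the Hive metastore, SerDes, and Hive UDFs
        .getOrCreate()
    )


def top_rows(spark, table: str, limit: int = 10):
    """Run a simple HiveQL query against `table` and return the result DataFrame.

    `table` is interpolated directly, so it must come from a trusted source.
    """
    query = f"SELECT * FROM {table} LIMIT {int(limit)}"
    return spark.sql(query)
```

For instance, `top_rows(build_hive_session(), "sales", 5).show()` would print the first five rows of a Hive table named `sales`, if such a table exists in the metastore.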

Standard Connectivity: connect through JDBC or ODBC.

A server mode provides industry-standard JDBC and ODBC connectivity for business intelligence tools.
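Clients usually reach the server mode through a HiveServer2-style JDBC URL. The sketch below only builds such a URL; it assumes the Thrift JDBC/ODBC server was started from a Spark distribution (for example with `sbin/start-thriftserver.sh`) and is listening on its default port 10000. The helper name `thrift_jdbc_url` is hypothetical.

```python
def thrift_jdbc_url(host: str = "localhost", port: int = 10000, database: str = "default") -> str:
    """Build a HiveServer2-style JDBC URL for Spark's Thrift JDBC/ODBC server."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:hive2://{host}:{port}/{database}"


print(thrift_jdbc_url())  # jdbc:hive2://localhost:10000/default
```

A BI tool or the `beeline` client can then be pointed at the resulting URL, with whatever credentials the deployment requires.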

Last updated 5 years ago
