Discussion:
Reduce side Join with SQL problem
Or Raz
2018-11-02 16:15:36 UTC
I am running Hadoop 2.9.1 and doing a reduce-side join, where I want to use
a reduce function that performs the local join using SQL, but I am getting
this error (for MySQL):

java.sql.SQLException: No suitable driver found for
jdbc:mysql://localhost:3306/acm_ex

It is thrown by this line:

Connection connection =
DriverManager.getConnection("jdbc:mysql://localhost:3306/acm_ex", "root",
"root");

*Each computer on the cluster has MySQL installed with the database acm_ex.


I have a Maven project with the SQL dependencies as follows:

<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.39</version>
</dependency>

<dependency>
    <groupId>com.microsoft.sqlserver</groupId>
    <artifactId>mssql-jdbc</artifactId>
    <version>7.0.0.jre8</version>
</dependency>


I compile and make a jar from the project and try to run it with the
following reduce function:

public void reduce(TextPair key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException
{
    try { Class.forName("com.mysql.jdbc.Driver").newInstance(); }
    catch (Exception e) { System.out.println(e.toString()); }

    try {
        Connection connection =
            DriverManager.getConnection("jdbc:mysql://localhost:3306/acm_ex", "root", "root");
        Statement statement =
            connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
        LOG.info("SQL- connection: " + connection + " statement: " + statement);
        //create 3 tables names
        .
        .
        .
    } //try
}//reduce


The code in the reduce function works perfectly when I run it locally with
Eclipse (user and password are "root"), but it fails with the error above
when the same code runs inside Hadoop's reduce function.

I have tried adding the mysql-connector-java jar to the classpath,
although Maven has already done so, and it didn't help.
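
A minimal sketch of how the classpath question could be checked from inside the task, assuming the problem is that the driver class is not visible to the JVM running the reducer (this is only one of the possibilities). `DriverCheck` and its methods are hypothetical helper names, not part of the job above; the sketch uses only `Class.forName` and `DriverManager.getDrivers()`.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not part of the original job.
class DriverCheck {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the class names of all JDBC drivers currently registered with DriverManager.
    static List<String> registeredDrivers() {
        List<String> names = new ArrayList<>();
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            names.add(d.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("com.mysql.jdbc.Driver loadable: "
                + isLoadable("com.mysql.jdbc.Driver"));
        System.out.println("registered drivers: " + registeredDrivers());
    }
}
```

Logging the two results at the start of reduce (for example through LOG.info) would show whether the driver jar actually reached the reduce container, as opposed to only being on the classpath of the machine that submitted the job.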

I am not sure whether this is a permissions problem on port 3306 for the
reduce container, a Maven problem, or even a hostname problem.

Does anyone know how to solve this particular issue, or know another way
to do a reduce-side join with SQL (I am familiar with MySQL, but I can
switch if you believe it makes a difference)?

*Using Hive or a map-side join is not an option, and a naive nested for
loop works but is, of course, not as fast as SQL.
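
For reference, the naive approach mentioned above is roughly the following sketch. It assumes each record is a comma-separated string whose first field is the join key; `NaiveJoin`, its `join` method, and the record layout are hypothetical and only illustrate the nested-loop idea, not the actual job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a nested-loop join inside a reducer.
class NaiveJoin {

    // Joins two lists of comma-separated records on their first field.
    // Every left record is compared with every right record, so the cost is O(n * m).
    static List<String> join(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        for (String l : left) {
            String leftKey = l.split(",", 2)[0];
            for (String r : right) {
                String rightKey = r.split(",", 2)[0];
                if (leftKey.equals(rightKey)) {
                    out.add(l + "," + r);
                }
            }
        }
        return out;
    }
}
```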
