Or Raz
2018-11-02 16:15:36 UTC
I am running Hadoop 2.9.1 and doing a reduce-side join. I want the reduce
function to do the local join using SQL, but I am getting this error
(for MySQL):
java.sql.SQLException: No suitable driver found for
jdbc:mysql://localhost:3306/acm_ex
It is thrown by this line of code:
Connection connection =
    DriverManager.getConnection("jdbc:mysql://localhost:3306/acm_ex", "root",
        "root");
*Each computer in the cluster has MySQL installed, with the database acm_ex.
I have a Maven project with the SQL dependencies as follows:
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.39</version>
</dependency>
<dependency>
    <groupId>com.microsoft.sqlserver</groupId>
    <artifactId>mssql-jdbc</artifactId>
    <version>7.0.0.jre8</version>
</dependency>
I compile the project, build a jar from it, and try to run it with the
following reduce function:
public void reduce(TextPair key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException
{
    // Load the MySQL driver explicitly; a failure here is only printed to stdout
    try { Class.forName("com.mysql.jdbc.Driver").newInstance(); }
    catch (Exception e) { System.out.println(e.toString()); }
    try {
        Connection connection =
            DriverManager.getConnection("jdbc:mysql://localhost:3306/acm_ex", "root",
                "root");
        Statement statement =
            connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
                ResultSet.CONCUR_UPDATABLE);
        LOG.info("SQL- connection: " + connection + " statement: " + statement);
        // create 3 table names
        .
        .
        .
    } // try
    catch (SQLException e) { LOG.error("SQL error in reduce", e); }
} // reduce
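Since the first try block above only prints a driver-loading failure to stdout, it is easy to miss in the task logs. Below is a minimal sketch of a check that could be logged at the start of reduce to show whether the driver class is visible inside the reduce container at all. The class and method names (DriverCheck, isClassAvailable, registeredDrivers) are hypothetical helpers, not part of Hadoop or JDBC; only Class.forName and DriverManager.getDrivers are standard calls.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; names are illustrative only.
class DriverCheck {

    /** Returns true if the named class can be found by the current class loader. */
    static boolean isClassAvailable(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = DriverCheck.class.getClassLoader();
        }
        try {
            // initialize=false: only test visibility, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the class names of all JDBC drivers currently registered with DriverManager. */
    static List<String> registeredDrivers() {
        List<String> names = new ArrayList<>();
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            names.add(d.getClass().getName());
        }
        return names;
    }
}
```

For example, logging DriverCheck.isClassAvailable("com.mysql.jdbc.Driver") and DriverCheck.registeredDrivers() from inside reduce would show whether the connector jar actually reached the task's classpath. If the first call returns false there but true in Eclipse, the problem is likely in how the jar is shipped to the cluster rather than in MySQL itself.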
The code for the reduce function works perfectly when I run it locally
with Eclipse (user and password are "root"), but it fails when I run the
same code inside Hadoop's reduce function.
I have tried adding the mysql-connector-java jar to the classpath, although
Maven has already done that, and it didn't help.
I am not sure whether it is a permissions problem on port 3306 for the
reduce container, a Maven problem, or even a hostname problem.
Therefore, does anyone know how to solve this particular issue, or know
another way to do a reduce-side join with SQL (I am familiar with MySQL,
but I can switch if you believe it makes a difference)?
*Using Hive or a map-side join is not an option, and doing the join with
naive for loops works but is of course not as fast as SQL.
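For comparison with the naive for loops mentioned above, here is a minimal sketch of an in-memory hash join, a different technique from the SQL approach in the question. It assumes both sides of the join for one reduce key fit in memory and that each row is a two-element String array whose first element is the join column. HashJoin and join are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: rows are {joinKey, value} pairs.
class HashJoin {

    /** Joins left and right on element 0 and returns {key, leftValue, rightValue} rows. */
    static List<String[]> join(List<String[]> left, List<String[]> right) {
        // Build phase: index the left rows by join key
        Map<String, List<String[]>> index = new HashMap<>();
        for (String[] row : left) {
            index.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
        }
        // Probe phase: look up each right row instead of scanning all left rows
        List<String[]> out = new ArrayList<>();
        for (String[] r : right) {
            List<String[]> matches = index.get(r[0]);
            if (matches == null) {
                continue;
            }
            for (String[] l : matches) {
                out.add(new String[] {l[0], l[1], r[1]});
            }
        }
        return out;
    }
}
```

The build and probe passes each touch every row once, so the cost grows with the input size plus the output size rather than with the product of the two sides as in nested loops.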