Discussion:
get a spark job full log via yarn api (instead of yarn cli)
Lian Jiang
2018-11-26 22:02:45 UTC
On HDP3, I cannot get the full log of a failing Spark job using the YARN API:

curl -k -u guest:"" -X GET https://myhost.com/gateway/ui/resourcemanager/v1/cluster/apps/{applicationId}

This means the job owner has to ssh to the cluster and run the "yarn logs"
command to get the full log. Is this expected? How can I get the full Spark
log without sshing to the cluster? I appreciate your help.
Lian Jiang
2018-11-27 22:11:19 UTC
Any ideas? Or should I ask another user group? Thanks.
GERARD Nicolas
2018-11-30 10:19:38 UTC
If you are using Spark, your best option is described here:
https://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact

By default, you can access the logs via:
* the web UI
* the "yarn logs" command
* or the file system

As far as I know, there is no simple REST API call for this.

The implementation of YARN's logs CLI can be found here if you want to have a
look:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java
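
To make the difference concrete, here is a rough sketch of the two routes discussed in this thread. The host and gateway path are copied from the curl command in the question; rm_app_url and fetch_app_report are hypothetical helper names, and the credentials are the ones from the original command. Treat it as an illustration under those assumptions, not a tested recipe for any particular cluster.

```shell
#!/bin/sh
# Sketch only: the host and the gateway path are taken from the curl command
# in the question; rm_app_url and fetch_app_report are hypothetical helpers.

# Build the ResourceManager URL for one application behind the gateway.
rm_app_url() {
    # ${1%/} drops a single trailing slash from the base URL, if present.
    printf '%s/gateway/ui/resourcemanager/v1/cluster/apps/%s\n' "${1%/}" "$2"
}

# Fetch the application report as JSON (state, diagnostics, and so on).
# This is a summary of the application, not the aggregated container logs.
fetch_app_report() {
    curl -k -u guest:"" -H 'Accept: application/json' "$(rm_app_url "$1" "$2")"
}

# The full aggregated logs still come from the CLI on a cluster node, e.g.:
#   yarn logs -applicationId <applicationId>
```

For example, fetch_app_report https://myhost.com application_1543270000000_0001 (a made-up application ID) would request the report for that application.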


Met vriendelijke groeten, Regards, Cordialement,

Nicolas GERARD