Tuesday, September 11, 2018

Hive Query on Amazon S3 fails intermittently

SYMPTOM

A Hive query fails intermittently. The application log or hiveserver2.log shows errors like the following while task attempts are running or while data is being moved to storage:

2016-03-02 13:28:23,459 INFO  [HiveServer2-Background-Pool: Thread-52002]: SessionState (SessionState.java:printInfo(824)) - Map 1: 2(+6)/16    Map 4: 0(+2)/16 Map 5: 9(+0)/17 Reducer 2: 0/1009       Reducer 3: 0/1009
2016-03-02 13:28:23,642 ERROR [HiveServer2-Background-Pool: Thread-51679]: exec.Task (SessionState.java:printError(833)) - Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(org.apache.http.NoHttpResponseException: The target server failed to respond)'
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.http.NoHttpResponseException: The target server failed to respond
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1031)
        at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:650)
        at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:655)
        at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:655)
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:403)
        ...
Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
        ...
        at org.jets3t.service.StorageService.copyObject(StorageService.java:871)
        at org.jets3t.service.StorageService.copyObject(StorageService.java:916)
        at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.copy(Jets3tNativeFileSystemStore.java:323)
        at sun.reflect.GeneratedMethodAccessor203.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at org.apache.hadoop.fs.s3native.$Proxy52.copy(Unknown Source)
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.rename(NativeS3FileSystem.java:717)

        at org.apache.hadoop.hive.ql.exec.Utilities.renameOrMoveFiles(Utilities.java:1566)
        at org.apache.hadoop.hive.ql.exec.Utilities.mvFileToFinalPath(Utilities.java:1806)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1027)


ROOT CAUSE
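Intermittent Amazon S3 access failure. In the trace above, the S3 copy issued by NativeS3FileSystem.rename during job commit received no HTTP response (NoHttpResponseException: The target server failed to respond).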

RESOLUTION
Work with Amazon to resolve the access issue. Include the complete error message and stack trace from hiveserver2.log or the YARN application log in the report.
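
To gather the complete error for the report, something like the following Python sketch may help. It is not part of Hive or Hadoop. The function name extract_error_records and the marker default are hypothetical, and it assumes every log record starts with a timestamp in the format shown in the SYMPTOM section, with stack-trace lines following as continuation lines.

```python
import re
from typing import Iterable, List

# Assumption: a new log record starts with a timestamp such as
# "2016-03-02 13:28:23,642 " (the format seen in the excerpt above).
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def extract_error_records(
    lines: Iterable[str], marker: str = "NoHttpResponseException"
) -> List[str]:
    """Return every log record (header line plus stack trace) containing marker."""
    records: List[str] = []
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if RECORD_START.match(line):
            # A timestamp closes the previous record and opens a new one.
            if current:
                records.append("\n".join(current).rstrip())
            current = [line]
        elif current:
            # Continuation line (stack frame, "Caused by:", blank line).
            current.append(line)
    if current:
        records.append("\n".join(current).rstrip())
    return [record for record in records if marker in record]
```

A possible use is to open the log file and print each returned record, for example: for record in extract_error_records(open("hiveserver2.log", errors="replace")): print(record). The file name here is only a placeholder for wherever the log is stored.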