Problems Encountered Implementing Kerberos Authentication and Impersonation in Hive 0.11's Hive Server (Part 3)

In the Thrift processor's process() method, a finally block closes every FileSystem instance bound to the clientUgi:
[java]
finally {
  if (clientUgi != null) {
    // Clean up the FileSystem instances cached for this clientUgi
    try {
      FileSystem.closeAllForUGI(clientUgi);
    } catch (IOException exception) {
      LOG.error("Could not clean up file-system handles for UGI: " + clientUgi, exception);
    }
  }
}
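Two Hadoop behaviors explain what this cleanup actually destroys: FileSystem.get() caches instances per UGI, and FileSystem.close() deletes every path registered via deleteOnExit(), which is how Hive marks its scratch directories for removal. A minimal sketch of both (paths and user names are illustrative):
[java]
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiFsCacheDemo {
    public static void main(String[] args) throws Exception {
        final Configuration conf = new Configuration();
        UserGroupInformation ugi = UserGroupInformation.createProxyUser(
            "endUser", UserGroupInformation.getLoginUser());

        FileSystem fs = ugi.doAs(new PrivilegedExceptionAction<FileSystem>() {
            public FileSystem run() throws Exception {
                return FileSystem.get(conf); // cached under this UGI
            }
        });

        Path scratch = new Path("/tmp/hive-scratch-demo");
        fs.mkdirs(scratch);
        fs.deleteOnExit(scratch); // Hive registers scratch dirs like this

        // Closes every FileSystem cached for this UGI; close() also removes
        // all deleteOnExit paths, so the scratch directory disappears and a
        // later fetch has nothing left to read from.
        FileSystem.closeAllForUGI(ugi);
    }
}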
It is precisely because the first call (execute) runs FileSystem.closeAllForUGI(clientUgi) in its finally block, closing the related FileSystem objects and with them the bound scratch directory, that the second call (fetchN) finds no data left to read. So why doesn't Hive Server 2, which also implements Kerberos authentication and impersonation, hit this problem? With impersonation enabled (set hive.server2.enable.doAs=true), Hive Server 2 performs impersonation at the Hive session level rather than the Thrift processor level, so it never cleans up FileSystems in the processor's finally block:
[java]
// In Hive Server 2 useProxy is false, so the else branch runs
if (useProxy) {
  clientUgi = UserGroupInformation.createProxyUser(
      endUser, UserGroupInformation.getLoginUser());
  remoteUser.set(clientUgi.getShortUserName());
  returnCode = clientUgi.doAs(new PrivilegedExceptionAction<Boolean>() {
    public Boolean run() {
      try {
        return wrapped.process(inProt, outProt);
      } catch (TException te) {
        throw new RuntimeException(te);
      }
    }
  });
} else {
  remoteUser.set(endUser);
  return wrapped.process(inProt, outProt);
}
Instead, HiveSessionProxy (which proxies HiveSessionImplwithUGI) executes every session method inside ugi.doAs:
[java]
public Object invoke(Object arg0, final Method method, final Object[] args)
    throws Throwable {
  try {
    return ShimLoader.getHadoopShims().doAs(ugi,
        new PrivilegedExceptionAction<Object>() {
          @Override
          public Object run() throws HiveSQLException {
            try {
              return method.invoke(base, args);
            } catch (InvocationTargetException e) {
              if (e.getCause() instanceof HiveSQLException) {
                throw (HiveSQLException) e.getCause();
              } else {
                throw new RuntimeException(e.getCause());
              }
            } catch (IllegalArgumentException e) {
              throw new RuntimeException(e);
            } catch (IllegalAccessException e) {
              throw new RuntimeException(e);
            }
          }
        });
  } catch (UndeclaredThrowableException e) {
    Throwable innerException = e.getCause();
    if (innerException instanceof PrivilegedActionException) {
      throw innerException.getCause();
    } else {
      throw e.getCause();
    }
  }
}
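To make the pattern concrete, here is a self-contained sketch of the same dynamic-proxy idea; the Session interface and all names below are hypothetical stand-ins, not Hive's actual API:
[java]
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsProxyDemo {
    // Hypothetical stand-in for the HiveSession interface.
    public interface Session {
        void executeStatement(String sql) throws Exception;
    }

    // Wraps a Session so each call runs under ugi.doAs, mirroring what
    // HiveSessionProxy does for HiveSessionImplwithUGI.
    public static Session wrap(final Session base, final UserGroupInformation ugi) {
        return (Session) Proxy.newProxyInstance(
            Session.class.getClassLoader(),
            new Class<?>[] { Session.class },
            new InvocationHandler() {
                public Object invoke(Object proxy, final Method method, final Object[] args)
                        throws Throwable {
                    return ugi.doAs(new PrivilegedExceptionAction<Object>() {
                        public Object run() throws Exception {
                            return method.invoke(base, args);
                        }
                    });
                }
            });
    }
}
Because the proxy intercepts every interface method, no session operation can accidentally run as the server's own login user.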
After the client calls HiveConnection.close(), the server eventually invokes HiveSessionImplwithUGI.close(), which closes the FileSystem objects associated with the session's UGI:
[java]
public void close() throws HiveSQLException {
  try {
    acquire();
    ShimLoader.getHadoopShims().closeAllForUGI(sessionUgi);
    cancelDelegationToken();
  } finally {
    release();
    super.close();
  }
}
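From the client's point of view this is just the ordinary JDBC lifecycle; closing the connection is what ultimately drives the server-side cleanup above. A minimal sketch (host, port, and principal are illustrative):
[java]
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Hs2CloseDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
            "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM");
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("select 1");
        while (rs.next()) {
            // consume the result set before closing anything
        }
        rs.close();
        stmt.close();
        conn.close(); // server side: HiveSessionImplwithUGI.close() runs here
    }
}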
Solutions
Once the underlying mechanism is understood, there are three ways to fix this:
1. Disable the FileSystem cache when starting the Hive Server (see the sketch after this list):
[plain]
$HIVE_HOME/bin/hive --service hiveserver2 --hiveconf fs.hdfs.impl.disable.cache=true --hiveconf fs.file.impl.disable.cache=true
2. Call setHDFSCleanup(false) on the Hive Context so the scratch directory is not removed automatically. This leaves orphaned files behind, however, so a separate scheduled script has to be deployed to delete them.
3. In the Thrift processor, use each function call's return value to decide whether to close the FileSystem, and proactively close it when the connection is finally closed.
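For option 1, the flags follow Hadoop's fs.<scheme>.impl.disable.cache pattern. With the cache disabled, every FileSystem.get() returns a fresh instance, so closing all instances for one UGI can no longer invalidate a handle that another call still holds; the cost is creating a new instance per call. A quick sketch of the effect:
[java]
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class DisableCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setBoolean("fs.hdfs.impl.disable.cache", true);
        conf.setBoolean("fs.file.impl.disable.cache", true);
        FileSystem a = FileSystem.get(conf);
        FileSystem b = FileSystem.get(conf);
        // With the cache disabled these are distinct instances:
        System.out.println(a == b); // false
    }
}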
We ultimately adopted the third approach; the changes to Hive are here: https://github.com/lalaguozhe/hive/commit/b32eeee2498b679
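The gist of that change, as a rough illustration only (the helper below is hypothetical and much simpler than the actual patch): keep the UGI's FileSystems alive while a call's return value says results are still pending, and close them otherwise or when the connection itself closes:
[java]
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class ConditionalUgiCleanup {
    // 'resultsStillPending' is an assumed predicate derived from the RPC's
    // return value, e.g. an execute() whose result set a later fetchN()
    // still has to read.
    public static void maybeCloseFileSystems(UserGroupInformation clientUgi,
                                             boolean resultsStillPending) throws IOException {
        if (clientUgi == null || resultsStillPending) {
            return; // keep the cached FileSystems (and scratch dir) alive
        }
        // Safe to release now, or again when the connection is torn down.
        FileSystem.closeAllForUGI(clientUgi);
    }
}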