HBase Source Code Analysis (1): HMaster Startup Flow
2019-02-12 13:43:30
Tags: HBase, source code, analysis, HMaster, startup, flow

This article analyzes the HMaster startup flow based on HBase 0.94.1.

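1. Overview of starting HMaster from the command line

The HMaster startup flow can be summarized as follows:

The user's "hbase-daemon.sh start master" operation is wrapped in an HMasterCommandLine object (a Tool instance) and handed to the static method run(conf, tool, args) of org.apache.hadoop.util.ToolRunner, where args is "start". The detailed flow is as follows: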

When the master is started with $HBASE_HOME/bin/hbase-daemon.sh start master, the daemon script invokes $HBASE_HOME/bin/hbase with the command name first and the action second:

$HBASE_HOME/bin/hbase master start
The relevant line of $HBASE_HOME/bin/hbase is:

"$JAVA" -XX:OnOutOfMemoryError="kill -9 %p" $JAVA_HEAP_MAX $HBASE_OPTS -classpath "$CLASSPATH" $CLASS "$@"
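For the master command, $CLASS is org.apache.hadoop.hbase.master.HMaster and "$@" carries the remaining argument, so this amounts to calling: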

org.apache.hadoop.hbase.master.HMaster.main("start")
HMaster's main method creates an HMasterCommandLine object and calls its doMain(args) method.

  /**
   * @see org.apache.hadoop.hbase.master.HMasterCommandLine
   */
  public static void main(String [] args) throws Exception {
    VersionInfo.logVersion();
    new HMasterCommandLine(HMaster.class).doMain(args);
  }
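ServerCommandLine, the parent class of HMasterCommandLine, implements the Tool interface and runs commands such as start and stop through Hadoop's ToolRunner mechanism. Its doMain method is: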
  /**
   * Parse and run the given command line. This may exit the JVM if
   * a nonzero exit code is returned from <code>run()</code>.
   */
  public void doMain(String args[]) throws Exception {
    int ret = ToolRunner.run(
      HBaseConfiguration.create(), this, args);
    if (ret != 0) {
      System.exit(ret);
    }
  }

2. HMaster startup uses the ToolRunner mechanism

ToolRunner's run method does the following:

(1). Wraps conf and args in a GenericOptionsParser object, parser, and obtains toolArgs from it
(2). Returns tool.run(toolArgs);
public static int run(Configuration conf, Tool tool, String[] args)
    throws Exception{
    if(conf == null) {
      conf = new Configuration();
    }
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    //set the configuration back, so that Tool can configure itself
    tool.setConf(conf);
    
    //get the args w/o generic hadoop args
    String[] toolArgs = parser.getRemainingArgs();
    return tool.run(toolArgs);
  }
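The two steps of ToolRunner.run can be illustrated without a Hadoop dependency. The following is only a simplified sketch: MiniTool, MiniToolRunner and RecordingTool are hypothetical stand-ins invented for this example, not Hadoop classes, and the parsing only understands the "-D key=value" form, whereas the real GenericOptionsParser supports more generic options.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.util.Tool (not the Hadoop interface).
interface MiniTool {
    void setConf(Map<String, String> conf);
    int run(String[] args);
}

// Hypothetical stand-in for ToolRunner: generic options go into the
// configuration, everything else is forwarded to the tool.
class MiniToolRunner {
    static int run(Map<String, String> conf, MiniTool tool, String[] args) {
        if (conf == null) {
            conf = new HashMap<>();
        }
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            boolean isGenericOption = "-D".equals(args[i])
                && i + 1 < args.length
                && args[i + 1].contains("=");
            if (isGenericOption) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv[1]);      // consumed by the runner
            } else {
                remaining.add(args[i]);      // left for the tool
            }
        }
        tool.setConf(conf);                  // let the tool configure itself
        return tool.run(remaining.toArray(new String[0]));
    }
}

// Hypothetical tool that only records what it was given.
class RecordingTool implements MiniTool {
    Map<String, String> conf;
    String[] seenArgs;

    @Override
    public void setConf(Map<String, String> conf) {
        this.conf = conf;
    }

    @Override
    public int run(String[] args) {
        this.seenArgs = args;
        return 0;
    }
}
```

Calling MiniToolRunner.run with the arguments "-D", "hbase.master.port=60000", "start" hands only "start" to the tool, which is the same shape as the toolArgs that reach the command-line class in the real flow.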
During HMaster startup, tool.run(toolArgs) is HMasterCommandLine.run(toolArgs), whose code is:

public int run(String args[]) throws Exception {
    .......
    if ("start".equals(command)) {
      return startMaster();
    } else if ("stop".equals(command)) {
      return stopMaster();
    } else {
      usage("Invalid command: " + command);
      return -1;
    }
  }
That is, it executes HMasterCommandLine's startMaster() method:
private int startMaster() {
    Configuration conf = getConf();
    try {
      // If 'local', defer to LocalHBaseCluster instance.  Starts master
      // and regionserver both in the one JVM.
      if (LocalHBaseCluster.isLocal(conf)) {
          ....
      } else {
        HMaster master = HMaster.constructMaster(masterClass, conf);
        ...
        master.start();
        master.join();
        ...
      }
    } catch (Throwable t) {
      ...
    }
    return 0;
  }
This calls HMaster.constructMaster(masterClass, conf) to build a master thread, then calls master.start() and master.join(), so startMaster() does not return until the master thread exits.
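The start()/join() pairing is what keeps the launching thread alive for as long as the master runs. A minimal sketch of that shape, assuming a plain java.lang.Thread in place of the master; MiniMaster and MiniLauncher are hypothetical names used only for illustration:

```java
// Hypothetical stand-in for the master thread; not HBase code.
class MiniMaster extends Thread {
    volatile boolean finished = false;

    @Override
    public void run() {
        // A real master would loop here until it is asked to stop.
        finished = true;
    }
}

class MiniLauncher {
    // Starts the master thread and blocks until it has exited.
    static int startMaster(MiniMaster master) {
        master.start();
        try {
            master.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt status
            return -1;
        }
        return 0;
    }
}
```

Because join() only returns after run() has completed, the exit code is produced once the master thread is no longer alive.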

At this point we have mapped the command that starts HMaster to the code that starts it.

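3. Internal details of HMaster startup

First look at the HMaster constructor. What it does can be summarized as:
1. Initialize the configuration 2. Initialize the rpcServer 3. Initialize the ZooKeeper watcher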
public HMaster(final Configuration conf)
  throws IOException, KeeperException, InterruptedException {
    this.conf = new Configuration(conf); // 1. Configuration setup
    // (1.1) The block cache must be disabled on the HMaster side.
    this.conf.setFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY, 0.0f);
    // (1.2) Set how many times to retry talking to another server over HConnection.
    HConnectionManager.setServerSideHConnectionRetries(this.conf, LOG);
    //2. Initialize the hostname parameters
    ...
    //3. Initialize rpcServer, wrapping HMaster itself as an RPC server.
    ...
    this.rpcServer = HBaseRPC.getServer(this,
      new Class<?>[]{HMasterInterface.class, HMasterRegionInterface.class},
        initialIsa.getHostName(), // This is bindAddress if set else it's hostname
        initialIsa.getPort(),
        numHandlers,
        0, // we dont use high priority handlers in master
        conf.getBoolean("hbase.rpc.verbose", false), conf,
        0); // this is a DNC w/o high priority handlers
   
    //4. ZooKeeper security
 
    ZKUtil.loginClient(this.conf, "hbase.zookeeper.client.keytab.file",
      "hbase.zookeeper.client.kerberos.principal", this.isa.getHostName());
    ...
    //5. Adjust the master's configuration to add replication-related features.
    Replication.decorateMasterConfiguration(this.conf);

    if (this.conf.get("mapred.task.id") == null) {
      this.conf.set("mapred.task.id", "hb_m_" + this.serverName.toString());
    }
    //6. Start the ZooKeeper client and the RPC server
    this.zooKeeper = new ZooKeeperWatcher(conf, MASTER + ":" + isa.getPort(), this, true);
    this.rpcServer.startThreads();
    this.metrics = new MasterMetrics(getServerName().toString());
    //7. Start the health-check thread
    ...
  }
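Step 1 of the constructor copies the caller's configuration before overriding a value, so the master-only setting does not leak back into the object that was passed in. The sketch below shows that copy-then-override pattern with a plain map. MiniMasterConfig and the key name are hypothetical and chosen only for this example; they are not HBase identifiers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "copy the configuration, then override locally".
class MiniMasterConfig {
    // Made-up key used only in this sketch.
    static final String BLOCK_CACHE_SIZE_KEY = "example.block.cache.size";

    final Map<String, String> conf;

    MiniMasterConfig(Map<String, String> callerConf) {
        this.conf = new HashMap<>(callerConf);       // private copy of the caller's settings
        this.conf.put(BLOCK_CACHE_SIZE_KEY, "0.0");  // override that applies to the copy only
    }
}
```

With this shape, a caller that reuses its configuration object for another component still sees its original value.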

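Next, look at HMaster's run method. It first starts the infoServer and then blocks in becomeActiveMaster().

(PS: once the current HMaster has successfully changed from a backup master to the active master, it performs finishInitialization.)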

 @Override
  public void run() {
    MonitoredTask startupStatus =
      TaskMonitor.get().createStatus("Master startup");
    startupStatus.setDescription("Master startup");
    masterStartTime = System.currentTimeMillis();
    try {
      this.registeredZKListenersBeforeRecovery = this.zooKeeper.getListeners();

      //1. Put up info server.
      int port = this.conf.getInt("hbase.master.info.port", 60010);
      ...
      this.infoServer.start();
      
      //2. Try to become the active master. A backup master stays blocked inside becomeActiveMaster() until it wins the election or is stopped.

      becomeActiveMaster(startupStatus);
      //3. We are either the active master or we were asked to shut down

      if (!this.stopped) {
        // Now the active master: finish master initialization
        finishInitialization(startupStatus, false);
        // Enter the main loop
        loop();
      }
    } catch (Throwable t) {
    } finally {
      startupStatus.cleanup();
      ...
    }
  }

becomeActiveMaster constructs an ActiveMasterManager from the current HMaster and calls its blockUntilBecomingActiveMaster(startupStatus) method, which blocks until this master becomes the active master.

  private boolean becomeActiveMaster(MonitoredTask startupStatus)
  throws InterruptedException {
    ...
    this.activeMasterManager = new ActiveMasterManager(zooKeeper, this.serverName,
        this);
    this.zooKeeper.registerListener(activeMasterManager);
    // Blocks until this master becomes active (returns false if it is stopped first).
    return this.activeMasterManager.blockUntilBecomingActiveMaster(startupStatus);
  }

Code fragment from ActiveMasterManager:

  /**
   * Block until this master becomes the active master.
   */
  boolean blockUntilBecomingActiveMaster(MonitoredTask startupStatus) {
    while (true) {
      ...
      try {
        // 1. Build the backup znode path, by default /hbase/backup-masters/${SERVER-NAME}
        String backupZNode = ZKUtil.joinZNode(
          this.watcher.backupMasterAddressesZNode, this.sn.toString());
        // 2. Try to create the master znode, by default /hbase/master.
        if (ZKUtil.createEphemeralNodeAndWatch(this.watcher,
          this.watcher.masterAddressZNode, this.sn.getVersionedBytes())) {
          // Creation succeeded, so this HMaster is now the active master.
          // 2.1 Delete the znode this HMaster created under /hbase/backup-masters/.
          ZKUtil.deleteNodeFailSilent(this.watcher, backupZNode);
          ...
          this.clusterHasActiveMaster.set(true);
          ...
          return true;
        }
        // 3. This HMaster could not create /hbase/master because another master
        // already holds it, so the cluster has an active master: set the flag to true.
        this.clusterHasActiveMaster.set(true);
        // 4. Since this HMaster did not become active, create a znode under
        // /hbase/backup-masters to register itself as a backup master.
        ZKUtil.createEphemeralNodeAndWatch(this.watcher, backupZNode,
          this.sn.getVersionedBytes());
        // 5. Read the data of the current active master's znode.
        String msg;
        byte [] bytes =
          ZKUtil.getDataAndWatch(this.watcher, this.watcher.masterAddressZNode);
        if (bytes == null) {
          // (5.1) The znode data is null: the active master went down.
          ...
        } else {
          // (5.2) The active master's znode is present.
          ServerName currentMaster = ServerName.parseVersionedServerName(bytes);
          if (ServerName.isSameHostnameAndPort(currentMaster, this.sn)) {
            // (5.2.1) The active master's address equals this HMaster's address, so
            // the master was probably just restarted. Delete the stale znode so that
            // all backup masters keep competing.
            ZKUtil.deleteNode(this.watcher, this.watcher.masterAddressZNode);
          } else {
            // (5.2.2) Another master is active and healthy.
            ...
          }
        }
        LOG.info(msg);
        startupStatus.setStatus(msg);
      } catch (KeeperException ke) {
        ...
        return false;
      }
      // 6. Synchronize on clusterHasActiveMaster. While it is true and this HMaster
      // has not been stopped, wait() releases the lock until another thread notifies.
      synchronized (this.clusterHasActiveMaster) {
        // Note: nodeCreated, nodeDeleted and stop may wake this method up.
        while (this.clusterHasActiveMaster.get() && !this.master.isStopped()) {
          try {
            this.clusterHasActiveMaster.wait();
          } catch (InterruptedException e) {
          }
        }
        if (clusterShutDown.get()) {
          this.master.stop("...");
        }
        if (this.master.isStopped()) {
          return false;
        }
        // Try to become active master again now that there is no active master
      }
    }
  }
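The core of the election is "try to create the node; if that fails, wait until the node disappears, then try again". The sketch below reproduces only that loop with an in-memory object in place of ZooKeeper. MiniMasterNode and MiniElection are hypothetical names, there is no backup znode, no restart detection and no stop flag, and a synchronized wait/notifyAll pair stands in for the ZooKeeper watch callbacks, so treat it as an illustration of the control flow rather than of the real implementation.

```java
// Hypothetical in-memory stand-in for the master znode; not the ZooKeeper API.
class MiniMasterNode {
    private String owner;  // guarded by this

    // Like creating an ephemeral node: succeeds only if nobody holds it.
    synchronized boolean tryCreate(String serverName) {
        if (owner != null) {
            return false;
        }
        owner = serverName;
        return true;
    }

    // Like the node being deleted when the active master goes away:
    // clear the owner and wake up every waiting candidate.
    synchronized void delete() {
        owner = null;
        notifyAll();
    }

    synchronized String owner() {
        return owner;
    }

    // Blocks while some master holds the node.
    synchronized void awaitDeleted() throws InterruptedException {
        while (owner != null) {
            wait();
        }
    }
}

class MiniElection {
    // Returns true once serverName holds the node, false if interrupted while waiting.
    static boolean blockUntilActive(MiniMasterNode node, String serverName) {
        while (true) {
            if (node.tryCreate(serverName)) {
                return true;              // we created the node: we are active
            }
            try {
                node.awaitDeleted();      // another master is active: wait for it to go away
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            // The node is gone: loop and compete again.
        }
    }
}
```

Checking the owner inside the same monitor that delete() notifies on avoids a lost wakeup: a candidate either sees the node already gone or is guaranteed to be woken when it is removed.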
That completes the analysis of the entire HMaster startup flow.


