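HDFS eventually chose option 3 and added an append operation. Unfortunately, the append implementation turned out to be quite buggy, and the developers eventually planned to remove it. In the meantime, some developers had built sync (I believe it was written during this period, though I really don't understand why sync was put on the append branch, or why sync depends on append). Sync can guarantee durability for HBase (which, I think, is what made option 2 feasible), but to use sync you first have to set dfs.support.append. In other words, if you want sync, you have no choice but to enable the flawed append.

On top of that, append and sync, two important features, lived on the Hadoop branch "0.20.append", which was based on branch "0.20.2". A second branch, "0.20.security", which contained the security features, was also based on "0.20.2". This led one developer to say, "there has been an 18 month period where there has been no one Apache release that had all the committed features of Apache Hadoop" (I'm speechless; branches are supposed to be merged back into trunk). The chart below shows the state of things at the time. No wonder another developer said "HDFS sync has a colorful history".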
[quote]The work on branch-20-append was to support sync, for durable HBase WALs, not append. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list like this. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

For v1.x let's:
- Always enable the sync path (currently only enabled if dfs.support.append is set)
- Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back

For 2.x let's:
- Always enable the hsync/hflush path
- The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default.[/quote]
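For reference, the setting the quote talks about would look roughly like this on a 0.20-append or 1.x cluster. This is only a sketch of the single property; I am assuming it goes in hdfs-site.xml alongside whatever else your cluster already configures, and, as the quote points out, turning it on also enables the buggy append path as a side effect.

```xml
<!-- hdfs-site.xml (assumed location): enables the sync path that HBase WALs rely on -->
<!-- Side effect on 0.20-append / 1.x: also enables append itself -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```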
In fact, HDFS's hsync/hflush went through its own twists and turns (HDFS sync has a colorful history), but that is another story.