Mysql源码学习――没那么简单的Hash(二)

2014-11-24 13:40:47 · 作者: · 浏览: 4
C_ARRAY结构体)。

动态HASH函数

我们首先来看下hash函数的定义:

static inline char*

my_hash_key(const HASH *hash, const uchar *record, size_t *length,

my_bool first)

{

if (hash->get_key)

return (char*) (*hash->get_key)(record,length,first);

*length=hash->key_length;

return (char*) record+hash->key_offset;

}

static uint my_hash_mask(my_hash_value_type hashnr, size_t buffmax,

size_t maxlength)

{

if ((hashnr & (buffmax-1)) < maxlength) return (hashnr & (buffmax-1));

return (hashnr & ((buffmax >> 1) -1));

}

my_hash_key参数 说明
hash HASH链表结构
record 带插入的元素的值
length 带插入元素的值长度
first 辅助参数

"3">

my_hash_mask参数 说明
hashnr my_hash_key的计算结果
buffmax hash结构体中的blength
maxlength 实际桶的个数

你可能要问我怎么有两个?其实这和我们平时使用的差不多,第一个函数my_hash_key是根据我们的值进行Hash Key计算,一般我们在计算后,会对hash key进行一次模运算,以便计算结果在我们的bucket中。即my_hash_key的结果作为my_hash_mask的第一个输入参数。其实到这里都是非常好理解的,唯一让我蛋疼的是my_hash_mask的实现,其计算结果是和第二和第三个参数有关,即Hash结构体中的blength和records有关。动态变化的,我去..

看到这里我迷惑了,我上网经过各种百度,谷歌,终于让我找到了一封Mysql Expert的回信:

Hi!

"Yan" == Yan Yu writes:

Yan> Dear MySQL experts:

Yan> Thank you so much for your reply to my previous Qs, they are very

Yan> helpful!

Yan> Could someone please help me understand function my_hash_insert()

Yan> in mysys/hash.cc

Yan> what are lines 352 -429 trying to achieve Are they just some

Yan> optimization to shuffle existing

Yan> hash entries in the table (since the existing hash entries may be in

Yan> the wrong slot due to chaining

Yan> in the case of hash collision)

The hash algorithm is based on dynamic hashing without empty slots.

This means that when you insert a new key, in some cases a small set

of old keys needs to be moved to other buckets. This is what the code

does.

Regards,

Monty

红色注释的地方是重点,dynamic hash,原来如此,动态hash,第一次听说,在网上下了个《Dynamic Hash Tables》的论文,下面图解下基本原理。

image

动态Hash的本质是Hash函数的设计,图中给出的动态hash函数只是论文中提到的一个例子。下面就具体解读下Mysql中的hash插入——my_hash_insert

my_hash_insert非深入解析

首先给出my_hash_insert的源代码,代码在mysys/hash.c中。

my_bool my_hash_insert(HASH *info, const uchar *record)

{

int flag;

size_t idx,halfbuff,first_index;

my_hash_value_type hash_nr;

uchar *UNINIT_VAR(ptr_to_rec),*UNINIT_VAR(ptr_to_rec2);

HASH_LINK *data,*empty,*UNINIT_VAR(gpos),*UNINIT_VAR(gpos2),*pos;

if (HASH_UNIQUE & info->flags)

{

uchar *key= (uchar*) my_hash_key(info, record, &idx, 1);

if (my_hash_search(info, key, idx))

return(TRUE); /* Duplicate entry */

}

flag=0;

if (!(empty=(HASH_LINK*) alloc_dynamic(&info->array)))

return(TRUE); /* No more memory */

data=dynamic_element(&info->array,0,HASH_LINK*);

halfbuff= info->blength >> 1;

idx=first_index=info->records-halfbuff;

if (idx != info->records) /* If some records */

{

do

{

pos=data+idx;

hash_nr=rec_hashnr(info,pos->data);

if (flag == 0) /* First loop; Check if ok */

if (my_hash_mask(hash_nr, info->blength, info->records) != first_index)

break;

if (!(hash_nr & halfbuff))

{ /* Key will not move */

if (!(flag & LOWFIND))

{