设为首页 加入收藏

TOP

为SAP Sybase IQ 15.1 Demo库建大数据表并验证其性能(一)
2014-11-23 19:16:02 】 浏览:515
Tags:SAP Sybase 15.1 Demo 建大 数据 验证 性能

接上一节,我们有了demo数据库,可惜里边的数据集相对都比较小。如果你没有合适的测试数据集,也没有可用的生产环境为你提供数据集,怎么办?

可以自己去造一张大表,生成随机数据。这是许多DBA或者开发人员尤其是研究系统性能的开发人员常用的办法。

第一步,建目标表

建一张表,字段足够多,如,表名为BIG,有43个字段:

定义如下:

CREATE TABLE "BIG" (
    "H0" VARchar(1),
    "ID" VARchar(18),
    "H02" VARchar(1),
    "H031" VARchar(2),
    "H032" VARchar(2),
    "H041" VARchar(2),
    "H042" VARchar(2),
    "H051" VARchar(2),
    "H052" VARchar(2),
    "H061" VARchar(2),
    "H062" VARchar(2),
    "H071" VARchar(1),
    "H072" VARchar(1),
    "H081" VARchar(1),
    "H082" VARchar(1),
     "H09" VARchar(2),
    "H10" VARchar(3),
    "H11" VARchar(1),
    "H12" VARchar(1),
    "H13" VARchar(4),
    "H14" VARchar(1),
    "H15" VARchar(1),
    "H16" VARchar(1),
    "H17" VARchar(1),
    "H18" VARchar(1),
    "H19" VARchar(1),
    "H20" VARchar(1),
    "H21" VARchar(1),
    "H22" VARchar(1),
    "H23" VARchar(1),
    "HA0" VARchar(1),
    "HA1" VARchar(2),
    "HA2" VARchar(2),
    "HA3" VARchar(1),
    "HA4" VARchar(1),
    "HA5" VARchar(1),
    "HA6" VARchar(1),
    "HA7" VARchar(1),
    "HA8" VARchar(1),
    "HA9" VARchar(1),
    "HA10" VARchar(3),
    "HA11" VARchar(1),
    "HA20" VARchar(2)
);

2. 生成数据

我们使用比较直接而且笨的办法,插入随机数据,但是经过实测,发现性能极其低下,最后我已经无法忍受了。其脚本如下:

BEGIN
DECLARE i INT;
SET i = 1;
WHILE i<=1000000 LOOP
    INSERT INTO "BIG" VALUES (
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48),
        char( cast (rand() * 10 as int) + 48)
    );   
    SET i = i + 1;
    if ( i % 2000 = 0) then
        COMMIT;
    END IF;
END LOOP;
END;
1000000条数据,几个小时能弄完。

有没有快捷的方式呢?

可以写一小段程序,生成一个数据文件,再后再用INPUT或者LOAD来加载

这一小段程序如下, 用C实现:

#include "stdafx.h"
#include 
#include 


int getN()
{
	return rand() % 10;
}

static void gen_data(FILE* f, int n, int colCount)
{
	// file : d:\work\demo\BIG.txt, default colCount is 43
	for (int i=0; i 
 
只用10秒左右即生成所有数据。

3. 加载数据

如果采用INPUT命令,
INPUT INTO BIG from 'd:\\asa120\\BIG.txt' format ascii escapes on escape character '\\' delimited by ',' encoding 'GBK';实测速度比较慢,一秒钟大概也就300多条。文档中说了:

The LOAD TABLE statement adds rows into a table; it doesn't replace them.

Loading data using the LOAD TABLE statement (without the WITH ROW LOGGING and WITH CONTENT LOGGING options) is considerably faster than using the INPUT statement.

看来,INPUT操作比LOAD操作,多了些LOGGING的操作,所以费时间。

痛苦的LOAD TABLE命令开始了,

试了好几个用法:

最后,下述命令通过, 大概花了10来秒钟完成100万条数据的加载,速度非常快。

LOAD TABLE BIG (H0',',
ID',',
H02',',
H031',',
H032',',
H041',',
H042',',
H051',',
H052',',
H061',',
H062',',
H071',',
H072',',
H081',',
H082',',
H09',',
H10',',
H11',',
H12',',
H13',',
H14',',
H15',',
H16',',
H17',',
H18',',
H19',',
H20',',
H21',',
H22',',
H23',',
HA0',',
HA1',',
HA2',',
HA3',',
HA4',',
HA5',',
HA6',',
HA7',',
HA8',',
HA9',',
HA10',',
HA11',',
HA20'\X0A')
from 'd:\\asa120\\BIG.txt' 
ESCAPES OFF
QUOTES OFF
NOTIFY 100000
WITH CHECKPOINT ON

要说明的是,在上边,如果没有各列后边的分隔符说明,如果quotes设成默认值 ON,该命令都会执行失败。

我估计load table命令在这个版本里可能功能不是很完善。

比如,一个简单的数据文件内容如下:

'123','456'
'222','111'

使用INPUT命令,很容易就载入表abc成功。
input into abc from 'd:\\asa120\\abc.txt' format ascii escapes on escape character '\\' delimited by ',' encoding 'GBK';

可是用load table就失败,默认值也出错。

truncate table abc;

load table abc(col1 ',', col2 '0x0A') from 'd:\\asa120\\abc.txt' escapes off;
结果报错:Non-space text found after ending quote character for an enclosed field ......

4. 验证查询

select ID,
sum(case when ID<>'0' then cnt end) c1,
sum(case when ha3='1'   then cnt end) c2,
sum(case when ha3='2'   then cnt end) c4,
sum(case when ha3='3'   then cnt end) c6,
sum(case when ha3='4'   then cnt end) c8
from (
sele
首页 上一页 1 2 下一页 尾页 1/2/2
】【打印繁体】【投稿】【收藏】 【推荐】【举报】【评论】 【关闭】 【返回顶部
上一篇sybase常用指令 下一篇Sybase开发工具包安装及字符集修..

最新文章

热门文章

Hot 文章

Python

C 语言

C++基础

大数据基础

linux编程基础

C/C++面试题目