[20231121]oracle SYS_GUID的组成.txt

–//看了链接:http://ksun-oracle.blogspot.com/2023/08/oracle-sysguid-composition.html
–//函数SYS_GUID,不知道作者如何猜测函数SYS_GUID的组成的,自己按照作者的测试,自己重复看看.

SYS_GUID generates and returns a globally unique identifier (RAW value) made up of 16 bytes.
On most platforms, the generated identifier consists of a host identifier, a process or thread identifier of the process
or thread invoking the function, and a nonrepeating value (sequence of bytes) for that process or thread.

The Oracle Database SYS_GUID function does not return a standard UUID since the generated GUID is not a random number.
(see: IETF RFC 4122 version 4 UUID, and Oracle dbms_crypto.randombytes and Enhancement Suggestion).

SYS_GUID生成并返回一个由16个字节组成的全局唯一标识符(RAW值)。在大多数平台上,生成的标识符由主机标识符、调用该函数的进程
或线程的进程或线程标识符以及该进程或线程的非重复值(字节序列)组成。

Oracle数据库SYS_GUID函数不返回一个标准的UUID,因为生成的GUID不是一个随机数。(请参见: IETF RFC 4122版本4的UUID,以及
Oracle dbms_crypto.randombytes和增强建议)。

–//注:实际上我第一次在windows测试,发现没有任何规律可言,作者的猜测或者分析根本不适合windows系统!!
–//顺便提一下我个人反对使用SYS_GUID做为主键的,小量使用尚可接受,大量使用我在以前文章提到过,大量使用消耗CPU资源,我看过2套
–//使用它作为主键的系统,一般不会使用raw类型,而是使用varchar2(32),因为使用raw类型要在写sql语句时使用hextoraw函数转换,这
–//样占用空间多占1倍,索引键值也会变得很大,许多开发会讲这样不就消耗一点磁盘空间吗?现在的磁盘太便宜,我遇到一个表有1X个字
–//段都是这个类型,一个查询看到的满屏都是类似ascii码的东西,不知道是怎么感觉,这样的系统索引,表增加异常地快,意味着日志量也
–//很大.

–//在linux下测试看看.

1.环境:
SCOTT@book> @ver1
PORT_STRING VERSION BANNER
—————————— ————– ——————————————————————————–
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 – 64bit Production

2.分析:
–//建立脚本,便于重复执行:
$ cat guid.sql
column seq_inc_by_1 format a12
column PROC_SPID fromat a20
with sq as (select sys_guid() gid, s.logon_time, p.spid, s.sid, s.serial#, p.pid from v$session s, v$process p
where s.paddr=p.addr and (s.sid = (select sid from v$mystat where rownum=1)))
select substr(gid, 1, 12) seq_inc_by_1
,substr(gid, 13, 4)||'(=’||to_number(substr(gid, 13, 4), ‘XXXXXX’)||’)’ proc_spid
,substr(gid, 17, 4) unknown_1
,substr(gid, 21, 8) unix_host_id
,substr(gid, 29, 4) unknown_2
,sq.*
from sq;

SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ———- ———- ——-
0B6B3CC961E4 7E19(=32281) E063 4E64A8C0 88E6 0B6B3CC961EA7E19E0634E64A8C088E6 2023-12-01 11:19:32 32281 18 805 25

SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ———- ———- ——-
0B6B3CC961EB 7E19(=32281) E063 4E64A8C0 88E6 0B6B3CC961F17E19E0634E64A8C088E6 2023-12-01 11:19:32 32281 18 805 25

–//我的测试即使密集的执行SEQ_INC_BY_1也不是按1递增。

$ hostid
a8c04e64

$ hostid | od -t x4| xxd -r -p
0c8a46e4
–//od -t x4 转换的结果还是不对,但是反转过来读就可以跟前面的UNIX_HOST_ID对上。

$ hostid | od -t x4 | xxd -r -p | strings| rev
4e64a8c0

$ hostid | od -t x4 | cut -c8-| xxd -r -p | rev
4e64a8c0
–//这样倒是能对上。

–//按照输出可以大致推测,来自:http://ksun-oracle.blogspot.com/2023/08/oracle-sysguid-composition.html
SEQ_INC_BY_1 ( 1-12): session own Sequence Number, increase 1 per sys_guid call, initiated by a number related to
v$session logon_time
PROC_SPID (13-16): v$process.spid
UNKNOWN_1 (17-20): (E063 or E064)
UNIX_HOST_ID (21-28): hostid command output (Linux little endian, 4 bytes reverse order)
UNKNOWN_2 (29-32) :

–//退出会话重复执行:
SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ———- ———- ——-
0B6BA3598106 8037(=32823) E063 4E64A8C0 B041 0B6BA359810C8037E0634E64A8C0B041 2023-12-01 11:48:14 32823 18 807 25

SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ———- ———- ——-
0B6BA359810D 8037(=32823) E063 4E64A8C0 B041 0B6BA35981138037E0634E64A8C0B041 2023-12-01 11:48:14 32823 18 807 25

SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ———- ———- ——-
0B6BA3598114 8037(=32823) E063 4E64A8C0 B041 0B6BA359811A8037E0634E64A8C0B041 2023-12-01 11:48:14 32823 18 807 25

–//似乎SEQ_INC_BY_1 按+7递增。
–//6 =6 D = 13 14 = 20
–//E4 = 228 EB = 235

–//0B6BA359810D = 12556929958157
–//0B6BA3598114 = 12556929958164

with sq1 as (select /*+ materialize */ level nr, substr(sys_guid(), 1, 12) guid_12 from dual connect by level <= 1e6)
,sq2 as (select /*+ materialize */ level + 1*1e6 nr, substr(sys_guid(), 1, 12) guid_12 from dual connect by level <= 1e6)
,sq3 as (select /*+ materialize */ level + 2*1e6 nr, substr(sys_guid(), 1, 12) guid_12 from dual connect by level <= 1e6)
select min(nr) min_nr, max(nr) max_nr
,min(guid_12) min_guid_12, max(guid_12) max_guid_12
,count(*) nr_count
,to_number(max(guid_12), ‘xxxxxxxxxxxx’) – to_number(min(guid_12), ‘xxxxxxxxxxxx’) + 1 nr_count
from
(select * from sq1
union
select * from sq2
union
select * from sq3);

MIN_NR MAX_NR MIN_GUID_12 MAX_GUID_12 NR_COUNT NR_COUNT
———- ———- ———————— ———————— ———- ———-
1 3000000 0BE7D7EAE1FF 0BE7D818A8BE 3000000 3000000
–//这样执行SEQ_INC_BY_1确实按照+1增加的。

–//按照作者介绍,SEQ_INC_BY_1
–//来自:http://ksun-oracle.blogspot.com/2023/08/oracle-sysguid-composition.html
So SEQ_INC_BY_1 is a Sequence Number, increasing 1 per sys_guid call, initiated by a number related to epoch time of
v$session logon_time (probably cached in each v$process.spid).

SEQ_INC_BY_1 is 12 hex digits, with maximum decimal value:

SCOTT@book> set numw 15
SCOTT@book> select to_number(lpad(‘F’, 12, ‘F’), lpad(‘X’, 12, ‘X’)) n20 from dual;
N20
—————
281474976710655

The last 6 digits represents a pure calling sequence number, the rest prefix digits are UNIX epoch seconds. So the
maximum seconds is:

281474976

Since 281474976 seconds is about 3258 days (281474976/86400) or about 9 years, sys_guid is wrapped on overflow about
each 9 years. The first 10 reset datetime can be projected as follows:

–//281474976/86400/365 = 8.92551293759512937595

select level NR#
,to_date(‘1970-01-01 00:00:00’, ‘yyyy-mm-dd hh24:mi:ss’) + level*281474976/86400 datetime
from dual connect by level <= 10;

NR# DATETIME
————— ——————-
1 1978-12-02 19:29:36
2 1987-11-03 14:59:12
3 1996-10-04 10:28:48
4 2005-09-05 05:58:24
5 2014-08-07 01:28:00
6 2023-07-08 20:57:36
7 2032-06-08 16:27:12
8 2041-05-10 11:56:48
9 2050-04-11 07:26:24
10 2059-03-13 02:56:00
10 rows selected.

16 bytes sys_guid is a 32 long raw hex value. In each interval, 10/16 of them are starting with number 0-9, 6/16 with
A-F.

Given a sys_guid, we can estimate its datetime by:

SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ———- ———- ——-
0B6BA3598114 8037(=32823) E063 4E64A8C0 B041 0B6BA359811A8037E0634E64A8C0B041 2023-12-01 11:48:14 32823 18 807 25

–//代入0B6BA359811A8037E0634E64A8C0B041。
with sq as (select to_number(substr(‘0B6BA359811A8037E0634E64A8C0B041’, 1, 12), lpad(‘X’, 12, ‘X’))/1e6 epoch_reminder from dual)
select epoch_reminder
,to_date(‘2023*JUL*08 20:57:36’, ‘YYYY*MON*DD hh24:mi:ss’) + epoch_reminder/86400 estimated_datetime
from sq;

EPOCH_REMINDER ESTIMATED_DATETIME
————— ——————-
12556929.95817 2023-12-01 04:59:46

$ xdate ‘2023-12-01 11:48:14’ 2
1701402494.000000000

$ xdate ‘2023-12-01 04:59:46’ 2
1701377986.000000000

–//相差 1701402494-1701377986 = 24508

SCOTT@book> @ guid
SEQ_INC_BY_1 PROC_SPID UNKNOWN_ UNIX_HOST_ID UNKNOWN_ GID LOGON_TIME SPID SID SERIAL# PID
———— ——————– ——– —————- ——– ——————————– ——————- —— ————— ————— ——-
0BE7D818A905 2FFC(=12284) E063 4E64A8C0 FA2B 0BE7D818A90B2FFCE0634E64A8C0FA2B 2023-12-07 15:59:10 12284 36 7121 26

with sq as (select to_number(substr(‘0BE7D818A90B2FFCE0634E64A8C0FA2B’, 1, 12), lpad(‘X’, 12, ‘X’))/1e6 epoch_reminder from dual)
select epoch_reminder
,to_date(‘2023*JUL*08 20:57:36’, ‘YYYY*MON*DD hh24:mi:ss’) + epoch_reminder/86400 estimated_datetime
from sq;

EPOCH_REMINDER ESTIMATED_DATETIME
————— ——————-
13090390.845707 2023-12-07 09:10:47

$ xdate ‘2023-12-07 15:59:10’ 2
1701935950.000000000

$ xdate ‘2023-12-07 09:10:47’ 2
1701911447.000000000

–//1701935950-1701911447 = 24503
–//放弃!!不再探究..