I. Introduction

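As distributed systems grow more complex and data volumes keep rising, splitting databases and tables vertically and horizontally has become standard practice. Each record or MQ message then needs a globally unique ID, and a database auto-increment ID clearly no longer meets that need. Depending on the scenario, a distributed ID has to satisfy some or all of the following conditions: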

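Global uniqueness: no duplicate IDs may ever be issued.
Trend-increasing: MySQL's InnoDB engine uses a clustered index, and most RDBMSs store index data in a B-tree structure, so the primary key should be ordered wherever possible to keep write performance high.
Monotonically increasing: the next ID is always greater than the previous one. This matters for special requirements such as distributed transaction version numbers, incremental IM messages, and sorting.
Information security: for sensitive business data such as orders, IDs should follow no visible pattern, so that traffic volume and similar information cannot be inferred from them.
Distributed ID generation generally falls into a handful of algorithms, and most open-source projects implement and optimize one of them: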

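UUID: generated locally, so performance is extremely high. However, the ID is long (16 bytes, 128 bits), usually has to be stored as a string, and is unordered. That makes it unsuitable for many scenarios, including use as a MySQL primary key or index (MySQL officially recommends keeping primary keys as short as possible, and for InnoDB an unordered key can cause frequent data relocation that severely hurts performance).
Database auto-increment ID: every ID requires an I/O round trip to the database, so the database is under heavy load and throughput is low. A database outage is devastating for the services that depend on it, although a database cluster can be deployed for high availability.
Database segment algorithm: an optimization of the auto-increment approach that fetches a whole segment of IDs at a time and only returns to the database when the segment is used up, which greatly reduces database load. The longer the segment, the higher the throughput. If the database goes down while a segment still has IDs left, the service can keep running for a short time. (Meituan's Leaf, Didi's TinyId)
Snowflake algorithm: Twitter's open-source snowflake builds the ID from timestamp + machine + incrementing sequence. IDs are roughly trend-increasing and generation is very fast. Because it depends heavily on the machine clock, clock rollback has to be handled: the machine time may move backwards after a correction, which can produce duplicate IDs. (Baidu's uid-generator, Meituan's Leaf)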
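The segment idea above can be sketched in a few lines. This is only an illustrative sketch and not the implementation used by Leaf or TinyId: the class name SegmentIdAllocator, the segmentSource callback, and the in-memory AtomicLong standing in for the database are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

/** Hypothetical in-memory segment allocator; all names are illustrative. */
class SegmentIdAllocator {
    private final int step;                         // segment length
    private final LongUnaryOperator segmentSource;  // reserves `step` IDs, returns the new max ID
    private long current;                           // last ID handed out
    private long max;                               // last ID of the current segment

    SegmentIdAllocator(int step, LongUnaryOperator segmentSource) {
        this.step = step;
        this.segmentSource = segmentSource;
    }

    synchronized long nextId() {
        if (current >= max) {
            // Segment exhausted: a single round trip reserves the next `step` IDs.
            max = segmentSource.applyAsLong(step);
            current = max - step;
        }
        return ++current;
    }

    public static void main(String[] args) {
        // Stand-in for an atomic "max_id = max_id + step" update in a real database.
        AtomicLong fakeDb = new AtomicLong(0);
        AtomicLong roundTrips = new AtomicLong(0);
        SegmentIdAllocator allocator = new SegmentIdAllocator(1000, step -> {
            roundTrips.incrementAndGet();
            return fakeDb.addAndGet(step);
        });
        long last = 0;
        for (int i = 0; i < 2500; i++) {
            last = allocator.nextId();
        }
        // prints "last=2500, roundTrips=3"
        System.out.println("last=" + last + ", roundTrips=" + roundTrips.get());
    }
}
```

With a step of 1000, 2500 IDs cost only three simulated database round trips, which is where the reduced database load comes from.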
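The snowflake algorithm and the database segment algorithm are the most widely used. This article analyzes how the snowflake algorithm works and discusses ways to handle clock rollback.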

II. The snowflake algorithm

1. Basic definition

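The idea behind snowflake is simple: generate a 64-bit (long) globally unique ID. The standard layout is 1 unused sign bit + 41-bit timestamp + 10-bit machine ID + 12-bit sequence number. Apart from the sign bit, the widths of the other three fields can be adjusted to fit the actual situation: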

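The 41-bit timestamp covers (1L << 41) / (1000L * 3600 * 24 * 365) ≈ 69 years.
The 10-bit machine ID can represent 1024 machines. If IDCs need to be distinguished, the 10 bits can be split into 5 bits for the IDC and 5 bits for the worker machine, giving 32 IDCs with 32 machines each.
The 12-bit auto-increment sequence allows 2^12 = 4096 IDs per millisecond, so in theory a single snowflake node can issue about 4.096 million IDs per second.
Note: all fields count from 0.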
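The arithmetic above can be checked with a short program. This is a sketch under the standard 41/5/5/12 layout described here; the class name SnowflakeLayoutDemo and its helper methods are made up for illustration.

```java
/** Illustrative check of the 1 + 41 + 5 + 5 + 12 bit layout; names are hypothetical. */
class SnowflakeLayoutDemo {
    static final int TIMESTAMP_BITS = 41;
    static final int DATA_CENTER_BITS = 5;
    static final int WORKER_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    /** Largest value that fits in the given number of bits. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Packs the four fields into one long, timestamp in the high bits. */
    static long compose(long deltaMillis, long dataCenterId, long workerId, long sequence) {
        return (deltaMillis << (DATA_CENTER_BITS + WORKER_BITS + SEQUENCE_BITS))
                | (dataCenterId << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        long years = (1L << TIMESTAMP_BITS) / (1000L * 3600 * 24 * 365);
        long idsPerSecond = (maxValue(SEQUENCE_BITS) + 1) * 1000;
        System.out.println("years=" + years);                              // years=69
        System.out.println("maxDataCenterId=" + maxValue(DATA_CENTER_BITS)); // 31
        System.out.println("maxWorkerId=" + maxValue(WORKER_BITS));          // 31
        System.out.println("maxSequence=" + maxValue(SEQUENCE_BITS));        // 4095
        System.out.println("idsPerSecond=" + idsPerSecond);                  // 4096000
        // 1 ms after the epoch, IDC 2, worker 1, sequence 0
        System.out.println("id=" + compose(1, 2, 1, 0));                     // id=4460544
    }
}
```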

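2. Pros and cons of snowflake
Pros:

The millisecond timestamp sits in the high bits and the auto-increment sequence in the low bits, so the ID as a whole is trend-increasing.
It needs no database or other third-party system and can be deployed as a service, which makes it more stable, and ID generation is very fast.
The bit widths can be allocated to match the characteristics of the business, which makes it very flexible.
Cons:

It depends heavily on the machine clock. If the clock on a machine moves backwards, the generator may issue duplicate IDs or become unavailable.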
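To see why a rollback is dangerous, here is a deliberately unsafe sketch with an injectable clock. It is an assumption-laden toy, not part of any real library: NaiveClockDemo and NaiveGenerator are hypothetical names, the machine bits are omitted for brevity, and the clock is a fixed array of readings rather than System.currentTimeMillis().

```java
import java.util.function.LongSupplier;

/** Toy demonstration of duplicate IDs after a clock rollback; names are hypothetical. */
class NaiveClockDemo {

    /** Builds timestamp + sequence with no rollback check (deliberately unsafe). */
    static final class NaiveGenerator {
        private final LongSupplier clock;
        private long lastTimestamp = -1L;
        private long sequence = 0L;

        NaiveGenerator(LongSupplier clock) {
            this.clock = clock;
        }

        synchronized long nextId() {
            long now = clock.getAsLong();
            if (now == lastTimestamp) {
                sequence = (sequence + 1) & 0xFFF;  // 12-bit sequence
            } else {
                sequence = 0L;                      // any other timestamp restarts at 0
            }
            lastTimestamp = now;
            return (now << 12) | sequence;
        }
    }

    public static void main(String[] args) {
        long[] ticks = {100, 101, 100};  // the third reading jumps back by 1 ms
        int[] cursor = {0};
        NaiveGenerator generator = new NaiveGenerator(() -> ticks[cursor[0]++]);
        long first = generator.nextId();
        long second = generator.nextId();
        long third = generator.nextId();
        // prints "duplicate after rollback: true"
        System.out.println("duplicate after rollback: " + (first == third));
    }
}
```

The third ID repeats the first because the generator sees an "unseen" timestamp and restarts the sequence at 0, which is exactly the case a real implementation has to detect.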

III. Implementing snowflake in Java

import java.util.Date;

public class SnowflakeIdGenerator {

    public static final int TOTAL_BITS = 1 << 6;

    private static final long SIGN_BITS = 1;
    private static final long TIME_STAMP_BITS = 41L;
    private static final long DATA_CENTER_ID_BITS = 5L;
    private static final long WORKER_ID_BITS = 5L;
    private static final long SEQUENCE_BITS = 12L;

    /** Left shift for the timestamp: 22 bits. */
    private static final long TIMESTAMP_LEFT_SHIFT = WORKER_ID_BITS + DATA_CENTER_ID_BITS + SEQUENCE_BITS;
    /** Left shift for the IDC ID: 17 bits. */
    private static final long DATA_CENTER_ID_SHIFT = WORKER_ID_BITS + SEQUENCE_BITS;
    /** Left shift for the worker ID: 12 bits. */
    private static final long WORKER_ID_SHIFT = SEQUENCE_BITS;
    /** Sequence mask, caps the sequence at 4095. */
    private static final long SEQUENCE_MASK = -1L ^ (-1L << SEQUENCE_BITS);
    /** Worker IDs range from 0 to 31, 32 in total. */
    private static final long MAX_WORKER_ID = -1L ^ (-1L << WORKER_ID_BITS);
    /** Data center IDs range from 0 to 31, 32 in total. */
    private static final long MAX_DATA_CENTER_ID = -1L ^ (-1L << DATA_CENTER_ID_BITS);
    /** Maximum timestamp delta: 2199023255551. */
    private static final long MAX_DELTA_TIMESTAMP = -1L ^ (-1L << TIME_STAMP_BITS);

    /** Custom epoch. */
    private final long twepoch;
    private final long workerId;
    private final long dataCenterId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    /**
     * @param workerId     worker ID
     * @param dataCenterId IDC ID
     */
    public SnowflakeIdGenerator(long workerId, long dataCenterId) {
        this(workerId, dataCenterId, null);
    }

    /**
     * @param workerId     worker ID
     * @param dataCenterId IDC ID
     * @param epochDate    start of the custom epoch
     */
    public SnowflakeIdGenerator(long workerId, long dataCenterId, Date epochDate) {
        if (workerId > MAX_WORKER_ID || workerId < 0) {
            throw new IllegalArgumentException("worker Id can't be greater than " + MAX_WORKER_ID + " or less than 0");
        }
        if (dataCenterId > MAX_DATA_CENTER_ID || dataCenterId < 0) {
            throw new IllegalArgumentException("datacenter Id can't be greater than " + MAX_DATA_CENTER_ID + " or less than 0");
        }
        this.workerId = workerId;
        this.dataCenterId = dataCenterId;
        if (epochDate != null) {
            this.twepoch = epochDate.getTime();
        } else {
            // 2010-10-11 00:00:00 (UTC+8)
            this.twepoch = 1286726400000L;
        }
    }

    public long genID() throws Exception {
        return nextId();
    }

    public long getLastTimestamp() {
        return lastTimestamp;
    }

    /**
     * Extracts the sequence by shifting. The sequence occupies bits [0, 11],
     * so shift left by 64 - 12 and then right by 64 - 12 to clear the other bits.
     */
    public long getSequence2(long id) {
        return (id << (TOTAL_BITS - SEQUENCE_BITS)) >>> (TOTAL_BITS - SEQUENCE_BITS);
    }

    /**
     * Extracts the workerId by shifting. The workerId occupies bits [12, 16], with other fields on both sides.
     * Shift left by 41 + 5 + 1 to drop the 41-bit timestamp, the 5-bit IDC and the sign bit,
     * then shift right by 41 + 5 + 1 + 12 to drop the 12-bit sequence as well.
     */
    public long getWorkerId2(long id) {
        return (id << (TIME_STAMP_BITS + DATA_CENTER_ID_BITS + SIGN_BITS))
                >>> (TIME_STAMP_BITS + DATA_CENTER_ID_BITS + SEQUENCE_BITS + SIGN_BITS);
    }

    /**
     * Extracts the IDC ID by shifting. The dataCenterId occupies bits [17, 21], with other fields on both sides.
     * Shift left by 41 + 1 to drop the 41-bit timestamp and the sign bit,
     * then shift right by 41 + 1 + 5 + 12 to drop the 5-bit workerId and the 12-bit sequence.
     */
    public long getDataCenterId2(long id) {
        return (id << (TIME_STAMP_BITS + SIGN_BITS))
                >>> (TIME_STAMP_BITS + WORKER_ID_BITS + SEQUENCE_BITS + SIGN_BITS);
    }

    /**
     * Extracts the timestamp. The sign bit to its left is 0 and can be ignored,
     * so a right shift is enough; then add the epoch twepoch back.
     */
    public long getGenerateDateTime2(long id) {
        return (id >>> (DATA_CENTER_ID_BITS + WORKER_ID_BITS + SEQUENCE_BITS)) + twepoch;
    }

    // The same four getters implemented with shift + mask instead of double shifts.

    public long getSequence(long id) {
        return id & ~(-1L << SEQUENCE_BITS);
    }

    public long getWorkerId(long id) {
        return (id >> WORKER_ID_SHIFT) & ~(-1L << WORKER_ID_BITS);
    }

    public long getDataCenterId(long id) {
        return (id >> DATA_CENTER_ID_SHIFT) & ~(-1L << DATA_CENTER_ID_BITS);
    }

    public long getGenerateDateTime(long id) {
        return ((id >> TIMESTAMP_LEFT_SHIFT) & ~(-1L << TIME_STAMP_BITS)) + twepoch;
    }

    private synchronized long nextId() throws Exception {
        long timestamp = timeGen();
        // 1. Clock rollback detected: throw immediately
        if (timestamp < lastTimestamp) {
            long refusedMillis = lastTimestamp - timestamp;
            // A custom exception class could be used here
            throw new UnsupportedOperationException(
                    String.format("Clock moved backwards. Refusing for %d milliseconds", refusedMillis));
        }
        // 2. Same millisecond as lastTimestamp: use the current sequence + 1
        if (timestamp == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            // Sequence exhausted for this millisecond: wait for the next millisecond
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // 3. Timestamp is greater than lastTimestamp (no rollback): restart the sequence at 0
            this.sequence = 0L;
        }
        lastTimestamp = timestamp;
        return allocate(timestamp - this.twepoch);
    }

    /** Packs the timestamp delta (in milliseconds), IDC ID, worker ID and sequence into one long. */
    private long allocate(long deltaMillis) {
        return (deltaMillis << TIMESTAMP_LEFT_SHIFT)
                | (this.dataCenterId << DATA_CENTER_ID_SHIFT)
                | (this.workerId << WORKER_ID_SHIFT)
                | this.sequence;
    }

    private long timeGen() {
        long currentTimestamp = System.currentTimeMillis();
        // The timestamp delta no longer fits in 41 bits
        if (currentTimestamp - twepoch > MAX_DELTA_TIMESTAMP) {
            throw new UnsupportedOperationException(
                    "Timestamp bits is exhausted. Refusing ID generate. Now: " + currentTimestamp);
        }
        return currentTimestamp;
    }

    /** Spins until the clock reads a millisecond later than lastTimestamp. */
    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    /** Quick test. */
    public static void main(String[] args) throws Exception {
        SnowflakeIdGenerator snowflakeIdGenerator = new SnowflakeIdGenerator(1, 2);
        long id = snowflakeIdGenerator.genID();
        System.out.println("ID=" + id + ", lastTimestamp=" + snowflakeIdGenerator.getLastTimestamp());
        System.out.println("ID in binary: " + Long.toBinaryString(id));
        System.out.println("Parsed ID:");
        System.out.println("Sequence=" + snowflakeIdGenerator.getSequence(id));
        System.out.println("WorkerId=" + snowflakeIdGenerator.getWorkerId(id));
        System.out.println("DataCenterId=" + snowflakeIdGenerator.getDataCenterId(id));
        System.out.println("GenerateDateTime=" + snowflakeIdGenerator.getGenerateDateTime(id));
        System.out.println("Sequence2=" + snowflakeIdGenerator.getSequence2(id));
        System.out.println("WorkerId2=" + snowflakeIdGenerator.getWorkerId2(id));
        System.out.println("DataCenterId2=" + snowflakeIdGenerator.getDataCenterId2(id));
        System.out.println("GenerateDateTime2=" + snowflakeIdGenerator.getGenerateDateTime2(id));
    }
}

IV. Clock rollback: the problem and possible solutions

1. Eliminating clock rollback entirely with a self-incrementing timestamp

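private long sequence = -1L;
private long startTimestamp = 1623947387000L;

private synchronized long nextId2() {
    long sequenceTmp = sequence;
    sequence = (sequence + 1) & SEQUENCE_MASK;
    // sequence == 0 either because the initial -1 was just incremented
    // or because the sequence wrapped past its maximum; exclude the initial case
    if (sequence == 0 && sequenceTmp >= 0) {
        // The sequence has wrapped around: advance the logical timestamp by 1
        startTimestamp += 1;
    }
    // Build the ID from the logical timestamp, never from the machine clock
    return allocate(startTimestamp - twepoch);
}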

2. Mitigating clock rollback by caching historical sequence numbers

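// Number of recent milliseconds whose last sequence is cached (2 s)
private static final int LENGTH = 2000;
// Sequence cache, indexed by timestamp % LENGTH
private long[] sequenceCycle = new long[LENGTH];

private synchronized long nextId() throws Exception {
    long timestamp = timeGen();
    int index = (int) (timestamp % LENGTH);
    // 1. Clock rollback: continue from the cached sequence of that millisecond
    if (timestamp < lastTimestamp) {
        do {
            // >= rather than >: at exactly LENGTH ms the slot already holds lastTimestamp's sequence
            if ((lastTimestamp - timestamp) >= LENGTH) {
                // A custom exception or alert could go here; the node is briefly unavailable,
                // so fail over and forward requests to a healthy machine.
                throw new UnsupportedOperationException("Clock rollback is too large and exceeds the 2000 ms cache");
            }
            long preSequence = sequenceCycle[index];
            long cachedSequence = (preSequence + 1) & SEQUENCE_MASK;
            if (cachedSequence == 0) {
                // The cached sequence for this millisecond is exhausted:
                // move on to the next millisecond and use its cache slot instead
                timestamp = tilNextMillis(timestamp);
                index = (int) (timestamp % LENGTH);
            } else {
                // Update the cache with the sequence actually issued
                sequenceCycle[index] = cachedSequence;
                return allocate(timestamp - this.twepoch, cachedSequence);
            }
        } while (timestamp < lastTimestamp);
        // If the clock caught up while reading the cache, fall through to the normal flow
    }
    // 2. Same millisecond as lastTimestamp: use the current sequence + 1
    if (timestamp == lastTimestamp) {
        sequence = (sequence + 1) & SEQUENCE_MASK;
        // Sequence exhausted for this millisecond: wait for the next millisecond
        if (sequence == 0) {
            timestamp = tilNextMillis(lastTimestamp);
            index = (int) (timestamp % LENGTH);
        }
    } else {
        // 3. Timestamp is greater than lastTimestamp (no rollback): restart the sequence at 0
        this.sequence = 0L;
    }
    // Cache the sequence and update lastTimestamp
    sequenceCycle[index] = this.sequence;
    lastTimestamp = timestamp;
    // Build the ID
    return allocate(timestamp - this.twepoch);
}

// Variant of allocate that takes the sequence explicitly, used on the rollback path
private long allocate(long deltaMillis, long sequence) {
    return (deltaMillis << TIMESTAMP_LEFT_SHIFT)
            | (this.dataCenterId << DATA_CENTER_ID_SHIFT)
            | (this.workerId << WORKER_ID_SHIFT)
            | sequence;
}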

3. Waiting for the clock to catch up

private synchronized long nextId3() {
    long timestamp = timeGen();
    // 1. Clock rollback: if the rollback is small, wait for the clock to catch up
    if (timestamp < lastTimestamp) {
        int sleepCntMax = 2;
        int sleepCnt = 0;
        do {
            long sleepTime = lastTimestamp - timestamp;
            if (sleepCnt > sleepCntMax) {
                // A custom exception class could be used here
                throw new UnsupportedOperationException(
                        String.format("Clock moved backwards. Refusing for %d milliseconds", sleepTime));
            }
            if (sleepTime <= 500) {
                try {
                    Thread.sleep(sleepTime);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag instead of swallowing it
                    Thread.currentThread().interrupt();
                } finally {
                    sleepCnt++;
                    // Spins until the clock passes lastTimestamp, so the loop normally ends here
                    timestamp = tilNextMillis(lastTimestamp);
                }
            } else {
                // Rollback of more than 500 ms: refuse instead of waiting
                throw new UnsupportedOperationException(
                        String.format("Clock moved backwards. Refusing for %d milliseconds", sleepTime));
            }
        } while (timestamp < lastTimestamp);
    }
    // 2. Same millisecond as lastTimestamp: use the current sequence + 1
    if (timestamp == lastTimestamp) {
        sequence = (sequence + 1) & SEQUENCE_MASK;
        // Sequence exhausted for this millisecond: wait for the next millisecond
        if (sequence == 0) {
            timestamp = tilNextMillis(lastTimestamp);
        }
    } else {
        // 3. Timestamp is greater than lastTimestamp (no rollback): restart the sequence at 0
        this.sequence = 0L;
    }
    lastTimestamp = timestamp;
    // Build the ID
    return allocate(timestamp - this.twepoch);
}