SQLite 变长度整型(varint)编码解码方法,sqlitevarint
SQLite为了节省存储空间,可以对64-bits的整型变量进行压缩。压缩后的最后一个byte的第一个bit是0,其他byte的第一个bit都是1,如下:
对于十进制数 182, 它的二进制是 10110110, 有8个bit,但是第一个bit是1,所以对其分割:
分割成两个bytes, 填充这两个bytes,使得他们符合上面提到的规则(加粗部分):
这样十进制整数 182 就被编码为一个2bytes的二进制格式了。
解码只需要把这个过程逆过来进行一遍就可以了。
还有需要注意的一点是SQLite是对64-bits的整型变量的二进制补码进行编码的,这对正数没什么影响,但是对负数编码的时候需要注意一下。
附上我的编码解码代码,我没使用补码,默认数据都是非负数:
unsigned char* IntToVar(unsigned char* dst, uint32_t v)
{
unsigned char* ptr = dst;
static const int B = 128;
if (v < (1 << 7))
{
*(ptr++) = v;
}
else if (v < (1 << 14))
{
*(ptr++) = v | B;
*(ptr++) = v >> 7;
}
else if (v < (1 << 21))
{
*(ptr++) = v | B;
*(ptr++) = (v >> 7) | B;
*(ptr++) = v >> 14;
}
else if (v < (1 << 28))
{
*(ptr++) = v | B;
*(ptr++) = (v >> 7) | B;
*(ptr++) = (v >> 14) | B;
*(ptr++) = v >> 21;
}
else
{
*(ptr++) = v | B;
*(ptr++) = (v >> 7) | B;
*(ptr++) = (v >> 14) | B;
*(ptr++) = (v >> 21)|B;
*(ptr++) = v >> 28;
}
return ptr;
}
uint32_t VarToInt(unsigned char* tail)
{
unsigned char* pTmp = tail;
uint32_t Res=0;
while ((*pTmp) & 128 == 128)
{
Res = Res << 7;
*pTmp = *pTmp ^ 128;
Res = Res | (*pTmp);
pTmp--;
}
Res = Res << 7;
Res = Res | *pTmp;
return Res;
}
再添加一点最近学到的:
0x81 0x81 0x81 0x81 0x01 becomes 0x10204081:
1000 0001 0000 0001
1000 0001 000 0001
1000 0001 变成-> 000 0001
1000 0001 000 0001
0000 0001 000 0001
0x8a 0x91 0xd1 0xac 0x78 becomes 0x12345678:我求出来的是0xA2345678,怀疑注释错了
1000 1010 0000 1010
1001 0001 001 0001
1101 0001 变成-> 101 0001
1010 1100 010 1100
0111 1000 111 1000
源代码中的编码解码函数分别是:
/*
** The variable-length integer encoding is as follows:
**
** KEY:
** A = 0xxxxxxx 7 bits of data and one flag bit
** B = 1xxxxxxx 7 bits of data and one flag bit
** C = xxxxxxxx 8 bits of data
**
** 7 bits - A
** 14 bits - BA
** 21 bits - BBA
** 28 bits - BBBA
** 35 bits - BBBBA
** 42 bits - BBBBBA
** 49 bits - BBBBBBA
** 56 bits - BBBBBBBA
** 64 bits - BBBBBBBBC
*/
/*
** Write a 64-bit variable-length integer to memory starting at p[0].
** The length of data write will be between 1 and 9 bytes. The number
** of bytes written is returned.
**
** A variable-length integer consists of the lower 7 bits of each byte
** for all bytes that have the 8th bit set and one byte with the 8th
** bit clear. Except, if we get to the 9th byte, it stores the full
** 8 bits and is the last byte.
*/
static int SQLITE_NOINLINE putVarint64(unsigned char *p, u64 v);
/*
** Read a 64-bit variable-length integer from memory starting at p[0].
** Return the number of bytes read. The value is stored in *v.
*/
SQLITE_PRIVATE u8 sqlite3GetVarint(const unsigned char *p, u64 *v);