Winse Blog

Stopping and going, amid the hustle and bustle, busy all the while, not knowing what to fear.

Redis Usage Optimization

I recently made two optimizations to our production Redis: scaling Redis out across instances, and optimizing the storage of simple key-value pairs (converting strings into hashes).

Scaling Redis Out

The previous post covered installing Codis. But with long-running Pipeline operations and a large number of connections, connection resets happened frequently. That didn't inspire confidence, and since I don't know Go, I didn't expect to sort this kind of problem out quickly.

So I looked for another way. Earlier on we had written different businesses' data to different Redis instances, partitioned by business. Within a single business, though, keys have to be distributed across instances by hash, and rolling that by hand means wrapping up a pile of plumbing.

The jedis client ships with sharding support (Sharded), which maps each written key to a Redis instance by its hash. An excerpt of the core Sharded code:

public class Sharded<R, S extends ShardInfo<R>> {
...
    private void initialize(List<S> shards) {
        nodes = new TreeMap<Long, S>();

        for (int i = 0; i != shards.size(); ++i) {
            final S shardInfo = shards.get(i);
            if (shardInfo.getName() == null) {
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n),
                            shardInfo);
                }
            } else {
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(
                            this.algo.hash(shardInfo.getName() + "*"
                                    + shardInfo.getWeight() + n), shardInfo);
                }
            }
            resources.put(shardInfo, shardInfo.createResource());
        }
    }
...
    public S getShardInfo(byte[] key) {
        SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
        if (tail.isEmpty()) {
            return nodes.get(nodes.firstKey());
        }
        return tail.get(tail.firstKey());
    }

    public S getShardInfo(String key) {
        return getShardInfo(SafeEncoder.encode(getKeyTag(key)));
    }
...
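The ring lookup above can be reproduced standalone. Below is a minimal sketch of the same idea, not jedis code: 160 virtual nodes per shard as in Sharded, but with an MD5-based hash standing in for jedis' hashing algorithm, and plain shard-name strings instead of ShardInfo.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring mirroring Sharded's TreeMap/tailMap lookup.
public class Ring {
    private final TreeMap<Long, String> nodes = new TreeMap<>();

    public Ring(String[] shards, int virtualNodesPerShard) {
        for (int i = 0; i < shards.length; i++) {
            // jedis uses 160 virtual nodes per unit of weight
            for (int n = 0; n < virtualNodesPerShard; n++) {
                nodes.put(hash("SHARD-" + i + "-NODE-" + n), shards[i]);
            }
        }
    }

    public String shardFor(String key) {
        // first ring position at or after the key's hash...
        SortedMap<Long, String> tail = nodes.tailMap(hash(key));
        if (tail.isEmpty()) {
            // ...wrapping around to the start of the ring
            return nodes.get(nodes.firstKey());
        }
        return tail.get(tail.firstKey());
    }

    // first 8 bytes of MD5 as a long; a stand-in for jedis' hash function
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xffL);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // MD5 always exists in the JDK
        }
    }

    public static void main(String[] args) {
        Ring ring = new Ring(new String[]{"redis-a", "redis-b", "redis-c"}, 160);
        // the same key always resolves to the same shard
        System.out.println(ring.shardFor("user:1001").equals(ring.shardFor("user:1001")));
    }
}
```

The many virtual nodes per shard are what keep the key distribution even: each physical instance owns many small segments of the ring rather than one large one.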

Using it is simple: read and write through ShardedJedis, and most operations behave just like Jedis. Only cluster-wide operations such as keys/scan are unavailable.

  public List<JedisShardInfo> getShards(String sValue) {
    String[] servers = sValue.split(",");

    List<JedisShardInfo> shards = new ArrayList<>();
    for (String server : servers) {
      Pair<String, Integer> hp = parseServer(server);
      shards.add(new JedisShardInfo(hp.getLeft(), hp.getRight(), Integer.MAX_VALUE));
    }
    return shards;
  }

  private ShardedJedisPool createRedisPool(String server) {
    return new ShardedJedisPool(new GenericObjectPoolConfig(), getShards(server));
  }

If you do need keys, you can call getAllShards to obtain every underlying Jedis instance, gather the keys from each, and then process them:

  public Double zscore(String key, String member) {
    try (ShardedJedis redis = getRedis()) {
      return redis.zscore(key, member);
    }
  }
  
  public void expires(List<String> patterns, int seconds) {
    try (ShardedJedis shardedJedis = getRedis()) {
      Set<String> keys = new HashSet<>();

      for (Jedis redis : shardedJedis.getAllShards()) {
        for (String p : patterns) {
          keys.addAll(redis.keys(p)); // run KEYS against each individual instance to collect matching keys
        }
      }

      ShardedJedisPipeline pipeline = shardedJedis.pipelined();
      for (String key : keys) {
        pipeline.expire(key, seconds);
      }
      pipeline.sync();
    }
  }

After splitting across multiple instances, the effect was obvious: writes at peak time are spread out, load is shared evenly, usable memory doubled, and keys end up distributed roughly evenly across instances ( --maxmemory-policy volatile-lru ). Actual production numbers:

[hadoop@hadoop-master1 redis]$ sh stat_cluster.sh 

 * [ ============================================================> ] 4 / 4

hadoop-master1:
# Memory
used_memory:44287785776
used_memory_human:41.25G
used_memory_rss:67458658304
used_memory_peak:67981990576
used_memory_peak_human:63.31G
used_memory_lua:33792
mem_fragmentation_ratio:1.52
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=72729777,expires=11967,avg_ttl=63510023

hadoop-master2:
# Memory
used_memory:50667945344
used_memory_human:47.19G
used_memory_rss:66036752384
used_memory_peak:64424543672
used_memory_peak_human:60.00G
used_memory_lua:33792
mem_fragmentation_ratio:1.30
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=100697581,expires=13426,avg_ttl=63509903

hadoop-master3:
# Memory
used_memory:56763389184
used_memory_human:52.87G
used_memory_rss:66324045824
used_memory_peak:64424546136
used_memory_peak_human:60.00G
used_memory_lua:33792
mem_fragmentation_ratio:1.17
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=94363547,expires=13544,avg_ttl=63505693

hadoop-master4:
# Memory
used_memory:54513952832
used_memory_human:50.77G
used_memory_rss:67257393152
used_memory_peak:64820124928
used_memory_peak_human:60.37G
used_memory_lua:33792
mem_fragmentation_ratio:1.23
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=83297543,expires=12418,avg_ttl=63507046


Finished processing 4 / 4 hosts in 298.89 ms

Storage Optimization

In practice we use a huge number of simple string key-value pairs, which eats memory. Hashes (internally stored as ziplists) use memory much more efficiently.

Note that only ziplist-encoded hashes save memory!! A hashtable-encoded hash actually wastes memory.

Here is the official documentation's comparison of plain key-value pairs versus hashes (ignoring Redis key-specific features): hashes holding small amounts of data get a special optimization.

a few keys use a lot more memory than a single key containing a hash with a few fields.

We use a trick.

But many times hashes contain just a few fields. When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key value pairs. Since we do this only when N is small …

This does not only work well from the point of view of time complexity, but also from the point of view of constant times, since a linear array of key value pairs happens to play very well with the CPU cache (it has a better cache locality than a hash table).

The optimization mainly involves two ziplist parameters, which trade CPU against memory. The entries limit can stay at its default; value is best kept no larger than 254 (when an entry is longer than 254 bytes, the next entry needs 5 bytes instead of 1 to record the previous entry's length).

hash-max-zipmap-entries 512 (hash-max-ziplist-entries for Redis >= 2.6)
hash-max-zipmap-value 64  (hash-max-ziplist-value for Redis >= 2.6)
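Whether a converted hash actually kept the compact encoding can be checked with the OBJECT ENCODING command. The key and field below reuse the md5 example from later in this post and are purely illustrative; note that on Redis 7+ the compact encoding reports as listpack rather than ziplist:

```
redis-cli hset 1356d e078028ddf266c962533760b27c 1469584847
redis-cli object encoding 1356d    # "ziplist" while the hash stays small
```

Once a hash exceeds either limit, Redis converts it to the hashtable encoding, and it never converts back even if it shrinks again.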

A few sample records:

3:0dc46077dfaa4970a1ec9f38cfc29277fa9e1012.ime.galileo.baidu.com  ->  1469584847
3:co4hk52ia0b1.5buzd.com                                          ->  1468859527
1:119.84.110.82_39502                                             ->  1469666877

The original key contents don't have to be kept, and since keys embedding domain names are long, we simply take the md5 of each key. Estimating with 100 million key-value pairs: the first 5 characters of the md5 serve as the hash key (16^5 ≈ 1.05 million hashes, so each hash holds roughly 100 fields), and the remaining 27 characters serve as the field name within the hash.
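The split can be sketched self-contained with the JDK's MessageDigest (the class and method names here are just for illustration; the actual migration code uses commons-codec's DigestUtils):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Split a key's 32-char md5 into a 5-char hash key and a 27-char field,
// spreading ~100M entries over 16^5 = 1,048,576 hashes (~95 fields each,
// comfortably below hash-max-ziplist-entries).
public class KeySplit {
    static String md5Hex(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : d) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // MD5 always exists in the JDK
        }
    }

    // returns {hashKey, field}; the value is stored via HSET hashKey field value
    static String[] split(String originalKey) {
        String m5 = md5Hex(originalKey); // 32 hex characters
        return new String[]{m5.substring(0, 5), m5.substring(5)};
    }

    public static void main(String[] args) {
        String[] kv = split("1:119.84.110.82_39502");
        System.out.println(kv[0].length() + " " + kv[1].length()); // prints "5 27"
    }
}
```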

Scan the original Redis instance and write the converted key-value pairs into a new instance. The Scala conversion code:

import java.util.{List => JList}
import org.apache.commons.codec.digest.DigestUtils
import redis.clients.jedis._
import scala.collection.JavaConversions._

trait RedisUtils {

  def md5(data: String): String = {
    DigestUtils.md5Hex(data)
  }

  def Type(redis: Jedis, key: String) = redis.`type`(key)

  def scan(redis: Jedis)(action: JList[String] => Unit): Unit = {
    import scala.util.control.Breaks._

    var cursor = "0"
    breakable {
      while (true) {
        val res = redis.scan(cursor)

        action(res.getResult())

        cursor = res.getStringCursor
        if (cursor.equals("0")) {
          break
        }
      }
    }
  }
  
  def printInfo(redis: Jedis): Unit = {
    println(redis.info())
  }

  // Verification:
  //   print the **total** number of field-value pairs across all hashes
  //   eval "local aks=redis.call('keys', '*'); local res=0; for i,r in ipairs(aks) do res=res+redis.call('hlen', r) end; return res" 0
  //   print the number of field-value pairs in **each** hash
  //   eval "local aks=redis.call('keys', '*'); local res={}; for i,r in ipairs(aks) do res[i]=redis.call('hlen', r) end; return res" 0
  //

}

object RedisTransfer extends RedisUtils {

  def handle(key: String, value: String, tp: Pipeline): Unit = {
    val m5 = md5(key)
    tp.hset(m5.substring(0, 5), m5.substring(5), value)
  }

  def main(args: Array[String]) {
    val Array(sHost, sPort, tHost, tPort) = args

    val timeout = 60 * 1000
    val source = new Jedis(sHost, sPort.toInt, timeout)
    val sp = source.pipelined()
    val target = new Jedis(tHost, tPort.toInt, timeout)
    val tp = target.pipelined()

    scan(source) { keys =>
      // only string-typed records are handled; GET on other types fails inside the try below
      val requests = for (key <- keys) yield Some((key, sp.get(key)))
      sp.sync()

      for (
        request <- requests;
        (key, resp) <- request
      ) {
        try {
          handle(key, resp.get(), tp)
        } catch {
          case e: Exception => println(s"fetch $key with exception, ${e.getMessage}")
        }
      }
    }

    tp.sync()

    printInfo(target)

    target.close()
    source.close()
  }

}

Since the data was transformed along the way, the comparison isn't exact and I can't state precisely how much space was saved. But with the processing above, an instance that used to take 30G (300+ million keys or so) came down to 15G.

Another Case

I also ran a test on the domain-name instance: 6.4 million key-value pairs occupying 707.29M of memory.

Using the first 4 characters of the md5 as the hash key produces 65536 hashes in total, each holding roughly 100 field-value pairs.
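A quick check of the bucket math above (6.4M is the dataset size quoted in this test):

```java
// 4 hex characters give 16^4 = 65536 possible hash keys; with 6.4M
// entries that averages just under 100 fields per hash.
public class Buckets {
    public static void main(String[] args) {
        int buckets = 16 * 16 * 16 * 16;   // 65536
        long entries = 6_400_000L;
        System.out.println(buckets + " " + entries / buckets); // prints "65536 97"
    }
}
```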

  • Keeping the original key as the field inside the hash:
    • without changing ziplist_value, the conversion yields hashtable-encoded hashes: 939.6M
    • with ziplist_value raised to 1024, ziplist-encoded hashes: 513.78M
  • Using the md5 as the new field name: 344.7M
  • Using the last 28 characters of the md5 as the field name: 259.09M

For example:

MD5:
  3:0dc46077dfaa4970a1ec9f38cfc29277fa9e1012.ime.galileo.baidu.com
  1356de078028ddf266c962533760b27c

1356 -> hash( 3:0dc46077dfaa4970a1ec9f38cfc29277fa9e1012.ime.galileo.baidu.com -> 1469584847 )
1356 -> hash( 1356de078028ddf266c962533760b27c -> 1469584847 )
1356 -> hash( de078028ddf266c962533760b27c -> 1469584847 )

–END
