varnish

缓存

  • 缓存之所以能够生效是程序的运行具有局部性特征:

    • 时间局部性:一个数据被访问过之后,可能很快会被再次访问到;
    • 空间局部性:一个数据被访问时,其周边的数据也有可能被访问到
  • 缓存的是热区数据

    • 时效性:

      • 缓存空间耗尽:LRU,最近最少使用;
      • 过期:缓存清理
  • 缓存命中率:hit/(hit+miss)

    • 页面命中率:基于页面数量进行衡量
    • 字节命中率:基于页面的体积进行衡量
  • 缓存的数据类型

    • 私有数据:private,private cache;
    • 公共数据:public, public or private cache;
  • Cache-related Headers Fields

    The most important caching header fields are:
    
          Expires:过期时间;
              Expires:Thu, 22 Oct 2026 06:34:30 GMT
          Cache-Control:max-age=
    
          Etag
          If-None-Match
    
          Last-Modified
          If-Modified-Since
    
          Vary
          Age
    • 缓存有效性判断机制:

      • 过期时间:Expires

        • HTTP/1.0
          Expires:过期时间
        • HTTP/1.1
          Cache-Control: maxage= 控制公共和私有缓存
          Cache-Control: s-maxage= 控制公共缓存
      • 条件式请求:

        • Last-Modified/If-Modified-Since:基于文件的修改时间戳来判别;
        • Etag/If-None-Match:基于文件的校验码来判别;
      • 示例:

        Expires:Thu, 13 Aug 2026 02:05:12 GMT
          Cache-Control:max-age=315360000
          ETag:"1ec5-502264e2ae4c0"
          Last-Modified:Wed, 03 Sep 2014 10:00:27 GMT
    • 缓存层级:

      • 私有缓存:用户代理附带的本地缓存机制;
      • 公共缓存:反向代理服务器的缓存功能;

        User-Agent <–> private cache <–> public cache <–> public cache 2 <–> Original Server

  • 请求报文用于通知缓存服务如何使用缓存响应请求:

    cache-request-directive = 
          "no-cache",                        
          | "no-store"                         
          | "max-age" "=" delta-seconds        
          | "max-stale" [ "=" delta-seconds ]  
          | "min-fresh" "=" delta-seconds      
          | "no-transform"                    
          | "only-if-cached"                  
          | cache-extension
  • 响应报文用于通知缓存服务器如何存储上级服务器响应的内容:

    cache-response-directive =
          "public"                               
          | "private" [ "=" <"> 1#field-name <"> ] ,仅私有缓存可以响应
          | "no-cache" [ "=" <"> 1#field-name <"> ],可缓存,但响应给客户端之前需要revalidation,即必须发出条件式请求进行缓存有效性验正;
          | "no-store" ,不允许存储响应内容于缓存中;                           
          | "no-transform"                        
          | "must-revalidate"                     
          | "proxy-revalidate"                  
          | "max-age" "=" delta-seconds           
          | "s-maxage" "=" delta-seconds          
          | cache-extension
  • 开源解决方案:

    • squid
      varnish

varnish

  • varnish的配置:

    • 定义进程配置
    • 缓存系统的配置
  • 程序架构:

    • Manager进程
    • Cacher进程,包含多种类型的线程:accept, worker, expiry, …
    • shared memory log:

      • 统计数据:计数器;
      • 日志区域:日志记录;
        varnishlog, varnishncsa, varnishstat…
    • 配置接口:VCL:Varnish Configuration Language,
      vcl complier –> c complier –> shared object

  • varnish的程序环境:

    • /etc/varnish/varnish.params: 配置varnish服务进程的工作特性,例如监听的地址和端口,缓存机制;
    • /etc/varnish/default.vcl:配置各Child/Cache线程的缓存策略;
    • 主程序:/usr/sbin/varnishd
    • CLI interface:/usr/bin/varnishadm
    • Shared Memory Log交互工具:

      • /usr/bin/varnishhist
        /usr/bin/varnishlog
        /usr/bin/varnishncsa
        /usr/bin/varnishstat
        /usr/bin/varnishtop
    • 测试工具程序:/usr/bin/varnishtest
    • VCL配置文件重载程序:/usr/sbin/varnish_reload_vcl
    • Systemd Unit File:

      • varnish服务:
        /usr/lib/systemd/system/varnish.service
      • 日志持久的服务:
        /usr/lib/systemd/system/varnishlog.service
        /usr/lib/systemd/system/varnishncsa.service
  • varnish的缓存存储机制( Storage Types):
    -s [name=]type[,options]

    · malloc[,size]
      内存存储,[,size]用于定义空间大小;重启后所有缓存项失效;
      · file[,path[,size[,granularity]]]
          磁盘文件存储,黑盒;重启后所有缓存项失效;
      · persistent,path,size
          文件存储,黑盒;重启后所有缓存项有效;实验;
  • varnish程序的选项:

    • 程序选项:/etc/varnish/varnish.params文件
      -a address[:port][,address[:port][…],默认为6081端口;
      -T address[:port],默认为6082端口;
      -s [name=]type[,options],定义缓存存储机制;
      -u user
      -g group
      -f config:VCL配置文件;
      -F:运行于前台;
    • 运行时参数:/etc/varnish/varnish.params文件, DEAMON_OPTS
      DAEMON_OPTS=”-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300″

      -p param=value:设定运行参数及其值; 可重复使用多次;
      -r param[,param…]: 设定指定的参数为只读状态;

  • 重载vcl配置文件:
    ~ ]# varnish_reload_vcl

  • varnish的命令行:varnishadm

    ~]# varnishadm -S /etc/varnish/secret -T [ADDRESS:]PORT 
    
      help [<command>]
      ping [<timestamp>]
      auth <response>
      quit
      banner
      status
      start
      stop
      vcl.load <configname> <filename>    
      vcl.inline <configname> <quoted_VCLstring>
      vcl.use <configname>            
      vcl.discard <configname>
      vcl.list                    
      param.show [-l] [<param>]    
      param.set <param> <value>    
      panic.show
      panic.clear
      storage.list
      vcl.show [-v] <configname>        
      backend.list [<backend_expression>]        
      backend.set_health <backend_expression> <state>
      ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
      ban.list
    • 配置文件相关:

      vcl.list :列出配置文件的版本
        vcl.load:装载,加载并编译;
        vcl.use:激活;切换使用配置文件
        vcl.discard:删除;
        vcl.show [-v] <configname>:查看指定的配置文件的详细信息;
    • 运行时参数:

      param.show -l:显示内部可调的参数所有列表;
        param.show <PARAM>
        param.set <PARAM> <VALUE>    设置参数
    • 缓存存储:

      storage.list:存储类型
    • 后端服务器:

      backend.list :列出后端主机
        backend.set_health <backend_expression> <state>        #指定后端主机是健康还是不健康的
  • VCL

    • ”域“专有类型的配置语言;

    • state engine:状态引擎;

    • VCL有多个状态引擎,状态之间存在相关性,但状态引擎彼此间互相隔离;每个状态引擎可使用return(x)指明关联至哪个下一级引擎;每个状态引擎对应于vcl文件中的一个配置段,即为subroutine

      • Client Side

        vcl_recv, vcl_pass, vcl_hit, vcl_miss, vcl_pipe, vcl_purge, vcl_synth, vcl_deliver
        
        vcl_recv:
          hash:vcl_hash
          pass: vcl_pass 
          pipe: vcl_pipe
          synth: vcl_synth
          purge: vcl_hash --> vcl_purge
        
        vcl_hash:
          lookup:
              hit: vcl_hit
              miss: vcl_miss
              pass, hit_for_pass: vcl_pass
              purge: vcl_purge
      • Backend Side:

        vcl_backend_fetch, vcl_backend_response, vcl_backend_error
      • 两个特殊的引擎:Housekeeping

        vcl_init:在处理任何请求之前要执行的vcl代码:主要用于初始化VMODs;
        vcl_fini:所有的请求都已经结束,在vcl配置被丢弃时调用;主要用于清理VMODs;
      • vcl_recv的默认配置:

        sub vcl_recv {
          if (req.method == "PRI") {
              /* We do not support SPDY or HTTP/2.0 */
              return (synth(405));
          }
          if (req.method != "GET" &&
          req.method != "HEAD" &&
          req.method != "PUT" &&
          req.method != "POST" &&
          req.method != "TRACE" &&
          req.method != "OPTIONS" &&
          req.method != "DELETE") {
              /* Non-RFC2616 or CONNECT which is weird. */
              return (pipe);
          }
        
          if (req.method != "GET" && req.method != "HEAD") {
              /* We only deal with GET and HEAD by default */
              return (pass);
          }
          if (req.http.Authorization || req.http.Cookie) {
              /* Not cacheable by default */
              return (pass);
          }
              return (hash);
          }
        }
    • vcl的语法格式:

      (1) VCL files start with vcl 4.0;
      (2) //, # and /* foo */ for comments;
      (3) Subroutines are declared with the sub keyword; 例如sub vcl_recv { ...};
      (4) No loops, state-limited variables(受限于引擎的内建变量);
      (5) Terminating statements with a keyword for next action as argument of the return() function, i.e.: return(action);用于实现状态引擎转换; 
      (6) Domain-specific;
    • The VCL Finite State Machine

      (1) Each request is processed separately;
      (2) Each request is independent from others at any given time;
      (3) States are related, but isolated;
      (4) return(action); exits one state and instructs Varnish to proceed to the next state;
      (5) Built-in VCL code is always present and appended below your own VCL;
    • 三类主要语法:

      sub subroutine {
        ...
      }
      if CONDITION {
        ...
      } else {    
        ...
      }
      return(), hash_data()
    • VCL Built-in Functions and Keywords

      • 内建函数:

        • hash_data():指明哈希计算的数据;减少差异,以提升命中率;
        • regsub(str,regex,sub):把str中被regex第一次匹配到字符串替换为sub;主要用于URL Rewrite
        • regsuball(str,regex,sub):把str中被regex每一次匹配到字符串均替换为sub;
        • return()
        • ban(expression)
        • ban_url(regex):Bans所有的其URL可以被此处的regex匹配到的缓存对象;
        • synth(status,”STRING”):purge操作;
      • Keywords:

        • call subroutine, return(action),new,set,unset
      • 操作符:

        • ==, !=, ~, >, >=, <, <=
          逻辑操作符:&&, ||, !
          变量赋值:=
      • 标志位:

        • (?i):模式匹配时不区分字符大小写
      • 举例:
        obj.hits是内建变量,用于保存某缓存项的从缓存中命中的次数;发送响应报文的过程中,在sub vcl_deliver中定义;

        if (obj.hits>0) {
          set resp.http.X-Cache = "HIT via " + server.ip;
        } else {
          set resp.http.X-Cache = "MISS via " + server.ip;
        }
    • 变量类型:

      • 内建变量:

        req.*:request,表示由客户端发来的请求报文相关;
          req.http.*
              req.http.User-Agent, 
              req.http.Referer, ...
        bereq.*:由varnish发往BE主机的httpd请求相关;
          bereq.http.*
        beresp.*:由BE主机响应给varnish的响应报文相关;
          beresp.http.*
        resp.*:由varnish响应给client相关;
          resp.http.*
        obj.*:存储在缓存空间中的缓存对象的属性;只读;
        • 常用变量:

          bereq.*, req.*:
            bereq.http.HEADERS
            bereq.request:请求方法;
            bereq.url:请求的url;
            bereq.proto:请求的协议版本;
            bereq.backend:指明要调用的后端主机;
          
            req.http.Cookie:客户端的请求报文中Cookie首部的值; 
            req.http.User-Agent ~ "chrome"
          
          beresp.*, resp.*:
            beresp.http.HEADERS
            beresp.status:响应的状态码;
            reresp.proto:协议版本;
            beresp.backend.name:BE主机的主机名;
            beresp.ttl:BE主机响应的内容的余下的可缓存时长;
          
          obj.*
            obj.hits:此对象从缓存中命中的次数;
            obj.ttl:对象的ttl值
          
          server.*
            server.ip
            server.hostname
          client.*
            client.ip
      • 用户自定义:

        • set
          unset
    • 示例1:强制对某类资源的请求不检查缓存:

      vcl_recv {
        if (req.url ~ "(?i)^/(login|admin)") {
            return(pass);
        }
      }
    • 示例2:对于特定类型的资源,例如公开的图片等,取消其私有标识,并强行设定其可以由varnish缓存的时长; 定义在vcl_backend_response中;

      if (beresp.http.cache-control !~ "s-maxage") {
        if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
            unset beresp.http.Set-Cookie;
            set beresp.ttl = 3600s;
        }
      }
    • 示例3:将客户端的IP地址传递到后端主机,定义在vcl_recv中;

      if (req.restarts == 0) {
        if (req.http.X-Forwarded-For) {
            set req.http.X-Forwarded-For = req.http.X-Forwarded-For + "," + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
      }        
      
      当只有一个变量
        变量为字符串:非空为真,空则为假;
        变量为数值型:非0为真,0则为假
    • 缓存对象的修剪:purge, ban

      • purge

        1. 能执行purge操作

          sub vcl_purge {
               return (synth(200,"Purged"));
           }
        2. 何时执行purge操作

          sub vcl_recv {
               if (req.method == "PURGE") {
                   return(purge);
               }
               ...
           }
        • 添加此类请求的访问控制法则:

          acl purgers {
                "127.0.0.0"/8;
                "192.168.10.0"/24;
            }
          
            sub vcl_recv {
                if (req.method == "PURGE") {
                    if (!client.ip ~ purgers) {
                        return(synth(405,"Purging not allowed for " + client.ip));
                    }
                    return(purge);
                }
                ...
            }
      • Banning:

        1. varnishadm:

          ban <field> <operator> <arg>
          
           示例:
               ban req.url ~ ^/javascripts
        2. 在配置文件中定义,使用ban()函数;

          示例:
           if (req.method == "BAN") {
               ban("req.http.host == " + req.http.host + " && req.url == " + req.url);
               # Throw a synthetic page so the request won't go to the backend.
               return(synth(200, "Ban added"));
           }
  • 如何设定使用多个后端主机:

    backend default {
          .host = "172.16.100.6";
          .port = "80";
      }
    
      backend appsrv {
          .host = "172.16.100.7";
          .port = "80";
      }
    
      sub vcl_recv {                
          if (req.url ~ "(?i)\.php$") {
              set req.backend_hint = appsrv;
          } else {
              set req.backend_hint = default;
          }    
    
          ...
      }
    • Director:

      • varnish module;
        使用前需要导入:import directors;

      • 示例:

        import directors;    # load the directors
        
        backend server1 {
          .host = 
          .port = 
        }
        backend server2 {
          .host = 
          .port = 
        }
        
        sub vcl_init {
          new GROUP_NAME = directors.round_robin();
          GROUP_NAME.add_backend(server1);
          GROUP_NAME.add_backend(server2);
        }
        
        sub vcl_recv {
          # send all traffic to the bar director:
          set req.backend_hint = GROUP_NAME.backend();
        }
      • 示例:
        设置使用多个后端主机,采用动静分离的方式,反代至backend server

        backend imgsrv1 {
          .host = "192.168.10.11";
          .port = "80";
        }
        
        backend imgsrv2 {
          .host = "192.168.10.12";
          .port = "80";
        }    
        
        backend appsrv1 {
          .host = "192.168.10.21";
          .port = "80";
        }
        
        backend appsrv2 {
          .host = "192.168.10.22";
          .port = "80";
        }
        
        sub vcl_init {
          new imgsrvs = directors.random();
          imgsrvs.add_backend(imgsrv1,10);
          imgsrvs.add_backend(imgsrv2,20);
        
          new staticsrvs = directors.round_robin();
          appsrvs.add_backend(appsrv1);
          appsrvs.add_backend(appsrv2);
        
          new appsrvs = directors.hash();
          appsrvs.add_backend(appsrv1,1);
          appsrvs.add_backend(appsrv2,1);        
        }
        
        sub vcl_recv {
          if (req.url ~ "(?i)\.(css|js)$" {
              set req.backend_hint = staticsrvs.backend();
          }         
          if (req.url ~ "(?i)\.(jpg|jpeg|png|gif)$" {
              set req.backend_hint = imgsrvs.backend();
          } else {        
              set req.backend_hint = appsrvs.backend(req.http.cookie);
          }
        }
    • 基于cookie的session sticky:
      把来自同一用户的信息发往同一个客户端主机

      sub vcl_init {
        new h = directors.hash();
        h.add_backend(one, 1);   // backend 'one' with weight '1'
        h.add_backend(two, 1);   // backend 'two' with weight '1'
      }
      
      sub vcl_recv {
        // pick a backend based on the cookie header of the client
        set req.backend_hint = h.backend(req.http.cookie);
      }
  • BE Health Check:

    backend BE_NAME {
          .host =  
          .port = 
          .probe = {
              .url= 
              .timeout= 
              .interval= 
              .window=
              .threshold=
          }
      }
    • .probe:定义健康状态检测方法;
      .url:检测时要请求的URL,默认为”/”;
      .request:发出的具体请求;

      .request = 
            "GET /.healthtest.html HTTP/1.1"
            "Host: www.magedu.com"
            "Connection: close"

      .window:基于最近的多少次检查来判断其健康状态;
      .threshold:最近.window中定义的这么次检查中至有.threshhold定义的次数是成功的;
      .interval:检测频度;
      .timeout:超时时长;
      .expected_response:期望的响应码,默认为200;

    • 健康状态检测的配置方式:

      1. probe PB_NAME { }

        backend NAME = {
             .probe = PB_NAME;
             ...
         }
      2. backend NAME {.probe = {}}

        backend NAME  {
             .probe = {
                 ...
             }
         }
    • 示例(1):

      probe check {
            .url = "/.healthcheck.html";
            .window = 5;
            .threshold = 4;
            .interval = 2s;
            .timeout = 1s;
        }
      
        backend default {
            .host = "10.1.0.68";
            .port = "80";
            .probe = check;
        }
    • 示例(2):

      backend appsrv {
            .host = "10.1.0.69";
            .port = "80";
            .probe check {
                .url = "/.healthcheck.html";
                .window = 5;
                .threshold = 4;
                .interval = 2s;
                .timeout = 1s;
            }
        }
    • 设置后端的主机属性:

      backend BE_NAME {
        ...
        .connect_timeout = 0.5s;
        .first_byte_timeout = 20s;
        .between_bytes_timeout = 5s;
        .max_connections = 50;
      }
  • varnish的运行时参数:

    • 线程模型:

      cache-worker
        cache-main
        ban lurker
        acceptor:
        epoll/kqueue:
        ...
    • 线程相关的参数:
      在线程池内部,其每一个请求由一个线程来处理; 其worker线程的最大数决定了varnish的并发响应能力;

      • thread_pools:Number of worker thread pools. 线程池数量最好小于或等于CPU核心数量;
      • thread_pool_max:The maximum number of worker threads in each pool. 每线程池的最大线程数;
      • thread_pool_min:The minimum number of worker threads in each pool. 最少线程数;额外意义为“最大空闲线程数”;

        最大并发连接数=thread_pools * thread_pool_max

      • thread_pool_timeout:Thread idle threshold. Threads in excess of thread_pool_min, which have been idle for at least this long, will be destroyed.空闲线程的超时时长,

      • thread_pool_add_delay:Wait at least this long after creating a thread.

      • thread_pool_destroy_delay:Wait this long after destroying a thread.

    • Timer相关的参数:

      • send_timeout:Send timeout for client connections. If the HTTP response hasn’t been transmitted in this many seconds the session is closed.
      • timeout_idle:Idle timeout for client connections.
      • timeout_req: Max time to receive clients request headers, measured from first non-white-space character to double CRNL.
      • cli_timeout:Timeout for the childs replies to CLI requests from the mgt_param.
    • 设置方式:

      • vcl.param
        param.set
    • 永久有效的方法:

      /etc/varnish/varnish.params
            DEAMON_OPTS="-p PARAM1=VALUE -p PARAM2=VALUE"
  • varnish日志区域:

    • shared memory log

      • 计数器
        日志信息
    1. varnishstat – Varnish Cache statistics

      • -1
        -1 -f FILED_NAME
        -l:可用于-f选项指定的字段名称列表;

        MAIN.cache_hit 
        MAIN.cache_miss
        
        # varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
        # varnishstat -l -f MAIN -f MEMPOOL
    2. varnishtop – Varnish log entry ranking

      • -1 Instead of a continously updated display, print the statistics once and exit.
        -i taglist,可以同时使用多个-i选项,也可以一个选项跟上多个标签;
        -I <[taglist:]regex>
        -x taglist:排除列表
        -X <[taglist:]regex>
    3. varnishlog – Display Varnish logs

    4. varnishncsa – Display Varnish logs in Apache / NCSA combined log format

      开启varnish的日志记录:
           systemctl start varnishncsa.service
           systemctl enable varnishncsa.service
           tail /var/log/varnish/varnishncsa.log

原创文章,作者:s,如若转载,请注明出处:http://www.178linux.com/79368

联系我们

400-080-6560

在线咨询

工作时间:周一至周五,9:30-18:30,节假日同时也值班

QR code