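Web Page Cache:

squid -> varnish
Program execution exhibits locality:
Temporal locality: once a piece of data has been accessed, it is likely to be accessed again soon
Spatial locality: when a piece of data is accessed, the data around it is likely to be accessed as well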

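Cache: hits
Hot spots: a consequence of locality;
Freshness: cached data is only valid for a limited time:
Cache space exhausted: evict by LRU (least recently used)
Expired: the entry is cleaned out of the cache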
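The two removal rules above (LRU eviction when space runs out, cleanup on expiry) can be sketched as a toy cache. This is only an illustration of the idea, not how varnish is implemented; the class name, the `capacity` limit counted in entries, and the explicit `now` argument are all assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache: evicts the least recently used entry when full,
    and drops an entry on access once its TTL has passed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # key -> (value, expires_at)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None  # miss
        value, expires_at = entry
        if now >= expires_at:
            del self._items[key]  # expired: clean it out of the cache
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value  # hit

    def put(self, key, value, ttl, now):
        self._items[key] = (value, now + ttl)
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("/a", "A", ttl=10, now=0)
cache.put("/b", "B", ttl=10, now=0)
cache.get("/a", now=1)                # touch /a, so /b becomes the LRU entry
cache.put("/c", "C", ttl=10, now=2)   # cache is full: /b is evicted
```

After these calls `/b` is gone because it was the least recently used entry, while `/a` stays available until its TTL runs out at time 10.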

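Cache hit ratio: hit/(hit+miss)
Range: (0,1)
Page hit ratio: measured by the number of pages served from cache
Byte hit ratio: measured by the volume (bytes) of the pages served from cache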
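The two ratios can differ a lot for the same traffic. A minimal sketch, assuming each request is recorded as a made-up `(was_hit, size_in_bytes)` pair and that the list is not empty:

```python
def hit_ratios(requests):
    """requests: iterable of (was_hit, size_in_bytes) tuples.
    Returns (page_hit_ratio, byte_hit_ratio)."""
    requests = list(requests)
    hit_sizes = [size for was_hit, size in requests if was_hit]
    total_bytes = sum(size for _, size in requests)
    page_ratio = len(hit_sizes) / len(requests)   # hit / (hit + miss), by count
    byte_ratio = sum(hit_sizes) / total_bytes     # same formula, by volume
    return page_ratio, byte_ratio


# Three small cached pages and one large uncached download (invented numbers).
sample = [(True, 1000), (True, 1000), (True, 1000), (False, 97000)]
page_ratio, byte_ratio = hit_ratios(sample)
print(page_ratio, byte_ratio)
```

Here three of four requests are hits (page hit ratio 0.75), yet only 3000 of 100000 bytes came from cache (byte hit ratio 0.03), which is why both figures are worth tracking.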

To cache or not:
Private data: private; may only be stored in a private cache;
Public data: public; may be stored in a public or a private cache;

Cache-related Header Fields
The most important caching header fields are:

Expires: expiration time;
Expires: Thu, 22 Oct 2026 06:34:30 GMT
Cache-Control: max-age=

ETag
If-None-Match

Last-Modified
If-Modified-Since

Vary
Age

Mechanisms for judging whether a cached copy is still valid:
Expiration time:
HTTP/1.0
Expires
HTTP/1.1
Cache-Control: max-age=
Cache-Control: s-maxage=
Conditional requests:
Last-Modified/If-Modified-Since
ETag/If-None-Match

Expires: Thu, 13 Aug 2026 02:05:12 GMT
Cache-Control: max-age=315360000
ETag: "1ec5-502264e2ae4c0"
Last-Modified: Wed, 03 Sep 2014 10:00:27 GMT
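How a cache might use these four headers can be sketched roughly as follows. This is a simplified model, not a full HTTP implementation: it ignores the Date and Age headers, assumes header names are spelled exactly as shown, and the function names and the response time used in the demo are invented for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Seconds the response may be served without revalidation.
    Cache-Control: max-age takes precedence over Expires."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(0, int((expires - response_time).total_seconds()))
    return 0


def needs_revalidation(headers, response_time, now):
    """True once the cached copy is older than its freshness lifetime."""
    age = (now - response_time).total_seconds()
    return age >= freshness_lifetime(headers, response_time)


def conditional_headers(headers):
    """Request headers for a conditional request built from the cached validators."""
    cond = {}
    if "ETag" in headers:
        cond["If-None-Match"] = headers["ETag"]
    if "Last-Modified" in headers:
        cond["If-Modified-Since"] = headers["Last-Modified"]
    return cond


cached = {
    "Expires": "Thu, 13 Aug 2026 02:05:12 GMT",
    "Cache-Control": "max-age=315360000",
    "ETag": '"1ec5-502264e2ae4c0"',
    "Last-Modified": "Wed, 03 Sep 2014 10:00:27 GMT",
}
received = datetime(2016, 8, 15, tzinfo=timezone.utc)  # assumed time of the response
one_day_later = received + timedelta(days=1)
print(needs_revalidation(cached, received, one_day_later))
print(conditional_headers(cached))
```

While the copy is fresh it is served straight from cache; once it is stale, the cache sends the validators back to the server, which can answer 304 Not Modified instead of resending the body.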

Cache tiers:
Private cache: the local cache built into the user agent;
Public cache: the cache provided by a reverse proxy server;

User-Agent <-> private cache <-> public cache <-> public cache 2 <-> Origin Server

Open-source solutions:
squid:
varnish:

Official varnish site: http://www.varnish-cache.org/
Community
Enterprise

This is Varnish Cache, a high-performance HTTP accelerator.

Program architecture:
Manager process
Cacher process, which contains several types of threads:
accept, worker, expiry, …
shared memory log:
Statistics: counters;
Log area: log records;
varnishlog, varnishncsa, varnishstat…

Configuration interface: VCL
Varnish Configuration Language,
VCL compiler -> C compiler -> shared object

The varnish program environment:
/etc/varnish/varnish.params: configures how the varnish service process works, e.g. the listening address and port and the cache storage mechanism;
/etc/varnish/default.vcl: configures the caching behavior of the threads in the Child/Cache process;
Main program:
/usr/sbin/varnishd
CLI interface:
/usr/bin/varnishadm
Shared Memory Log tools:
/usr/bin/varnishhist
/usr/bin/varnishlog
/usr/bin/varnishncsa
/usr/bin/varnishstat
/usr/bin/varnishtop
Test tool:
/usr/bin/varnishtest
VCL configuration reload program:
/usr/sbin/varnish_reload_vcl
Systemd Unit Files:
/usr/lib/systemd/system/varnish.service
the varnish service itself
/usr/lib/systemd/system/varnishlog.service
/usr/lib/systemd/system/varnishncsa.service
services that persist the logs;

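varnish cache storage mechanisms (Storage Types):
-s [name=]type[,options]

· malloc[,size]
In-memory storage; [,size] defines how much space to use; all cached objects are lost on restart;
· file[,path[,size[,granularity]]]
File storage, a black box; all cached objects are lost on restart;
· persistent,path,size
File storage, a black box; cached objects survive a restart; experimental;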

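varnishd options:
Program options: set in the /etc/varnish/varnish.params file
-a address[:port][,address[:port][…]]: the listening address; port 6081 by default;
-T address[:port]: the management interface address; port 6082 by default;
-s [name=]type[,options]: defines the cache storage mechanism;
-u user
-g group
-f config: the VCL configuration file;
-F: run in the foreground;
…
Runtime parameters: set in the /etc/varnish/varnish.params file, via DAEMON_OPTS
DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"

-p param=value: sets a runtime parameter and its value; may be given multiple times;
-r param[,param…]: makes the specified parameters read-only;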

Reload the VCL configuration file:
~]# varnish_reload_vcl

varnishadm
-S /etc/varnish/secret -T [ADDRESS:]PORT

help [<command>]
ping [<timestamp>]
auth <response>
quit
banner
status
start
stop
vcl.load <configname> <filename>
vcl.inline <configname> <quoted_VCLstring>
vcl.use <configname>
vcl.discard <configname>
vcl.list
param.show [-l] [<param>]
param.set <param> <value>
panic.show
panic.clear
storage.list
vcl.show [-v] <configname>
backend.list [<backend_expression>]
backend.set_health <backend_expression> <state>
ban <field> <operator> <arg> [&& <field> <oper> <arg>]…
ban.list

Configuration-related commands:
vcl.list
vcl.load: load and compile;
vcl.use: activate;
vcl.discard: delete;
vcl.show [-v] <configname>: show the details of the specified configuration;

Runtime parameters:
param.show -l: show the parameter list;
param.show <PARAM>
param.set <PARAM> <VALUE>

Cache storage:
storage.list

Backend servers:
backend.list

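VCL:
A domain-specific configuration language;

state engine:

VCL has several state engines. The states are related to one another, but the engines themselves are isolated; each engine uses return(x) to say which engine handles the request next. Each state engine corresponds to one configuration section in the VCL file, i.e. a subroutine.

vcl_hash -> return(lookup) -> vcl_hit (on a cache hit)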

Client Side:
vcl_recv, vcl_pass, vcl_hit, vcl_miss, vcl_pipe, vcl_purge, vcl_synth, vcl_deliver

vcl_recv:
    hash: vcl_hash
    pass: vcl_pass
    pipe: vcl_pipe
    synth: vcl_synth
    purge: vcl_hash -> vcl_purge

vcl_hash:
    lookup:
        hit: vcl_hit
        miss: vcl_miss
        pass, hit_for_pass: vcl_pass
        purge: vcl_purge
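The transitions listed above can be read as a small lookup table. The sketch below models only the transitions named in these notes (it is not a complete picture of the Varnish state machine), and the function and table names are made up for the illustration.

```python
# return(action) in vcl_recv -> next subroutine, as listed in the notes
RECV_TRANSITIONS = {
    "hash": "vcl_hash",
    "pass": "vcl_pass",
    "pipe": "vcl_pipe",
    "synth": "vcl_synth",
    "purge": "vcl_hash",
}

# outcome of the cache lookup after vcl_hash -> next subroutine
LOOKUP_TRANSITIONS = {
    "hit": "vcl_hit",
    "miss": "vcl_miss",
    "pass": "vcl_pass",
    "hit_for_pass": "vcl_pass",
    "purge": "vcl_purge",
}


def client_path(recv_action, lookup_result=None):
    """List the subroutines a request visits up to the lookup outcome."""
    path = ["vcl_recv"]
    next_sub = RECV_TRANSITIONS[recv_action]
    path.append(next_sub)
    if next_sub == "vcl_hash":
        if recv_action == "purge":
            lookup_result = "purge"  # a purge always ends up in vcl_purge
        path.append(LOOKUP_TRANSITIONS[lookup_result])
    return path


print(client_path("hash", "hit"))
print(client_path("purge"))
```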

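Backend Side:
vcl_backend_fetch, vcl_backend_response, vcl_backend_error

Two special engines:
vcl_init: VCL code executed before any request is processed; mainly used to initialize VMODs;
vcl_fini: called when the VCL configuration is discarded, after all requests using it have finished; mainly used to clean up VMODs;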

VCL syntax:
(1) VCL files start with vcl 4.0;
(2) //, # and /* foo */ for comments;
(3) Subroutines are declared with the sub keyword, e.g. sub vcl_recv { … };
(4) No loops, state-limited variables (built-in variables are restricted to particular engines);
(5) Terminating statements with a keyword for next action as argument of the return() function, i.e.: return(action); this is what drives the transition between state engines;
(6) Domain-specific;

The VCL Finite State Machine
(1) Each request is processed separately;
(2) Each request is independent from others at any given time;
(3) States are related, but isolated;
(4) return(action); exits one state and instructs Varnish to proceed to the next state;
(5) Built-in VCL code is always present and appended below your own VCL;

Three main syntax forms:
sub subroutine {
…
}

if CONDITION {
…
} else {
…
}

return(), hash_data()

VCL Built-in Functions and Keywords
Functions:
regsub(str, regex, sub)
regsuball(str, regex, sub)
ban(boolean expression)
hash_data(input)
synthetic(str)

Keywords:
call subroutine, return(action), new, set, unset

Operators:
==, !=, ~, >, >=, <, <=
Logical operators: &&, ||, !
Assignment: =

Example: obj.hits
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT via " + server.ip;
    } else {
        set resp.http.X-Cache = "MISS via " + server.ip;
    }
}

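Variable types:
Built-in variables:
req.: request; relates to the request message sent by the client;
req.http.
req.http.User-Agent, req.http.Referer, …
bereq.: relates to the HTTP request varnish sends to the BE (backend) host;
bereq.http.
beresp.: relates to the response message the BE host returns to varnish;
beresp.http.
resp.: relates to the response varnish sends to the client;
obj.: attributes of a cached object held in cache storage; read-only;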

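Commonly used variables:
bereq., req.:
bereq.http.HEADERS
bereq.method: the request method;
bereq.url: the requested URL;
bereq.proto: the protocol version of the request;
bereq.backend: the backend host to use;

req.http.Cookie: the value of the Cookie header in the client request;
req.http.User-Agent ~ "chrome"

beresp., resp.:
beresp.http.HEADERS
beresp.status: the response status code;
beresp.proto: the protocol version;
beresp.backend.name: the name of the BE host;
beresp.ttl: the remaining time for which the BE host's response may be cached;

obj.
obj.hits: the number of times this object has been served from cache;
obj.ttl: the object's TTL

server.
server.ip
server.hostname
client.
client.ip

User-defined:
set
unset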

Example 1: force requests for certain resources to bypass the cache:
sub vcl_recv {
    if (req.url ~ "(?i)^/(login|admin)") {
        return(pass);
    }
}

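Example 2: for certain resource types, such as public images, remove the private marker and force the length of time for which varnish may cache them;
sub vcl_backend_response {
    if (beresp.http.cache-control !~ "s-maxage") {
        if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
            unset beresp.http.Set-Cookie;
            set beresp.ttl = 3600s;
        }
    }
}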

Example 3: append the client IP to the X-Forwarded-For header:
sub vcl_recv {
    if (req.restarts == 0) {
        if (req.http.X-Forwarded-For) {
            set req.http.X-Forwarded-For = req.http.X-Forwarded-For + "," + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
}

Trimming cached objects: purge, ban
(1) Make the purge operation available
sub vcl_purge {
    return (synth(200, "Purged"));
}

(2) Decide when to purge
sub vcl_recv {
    if (req.method == "PURGE") {
        return(purge);
    }
    …
}

Add access control rules for such requests:
acl purgers {
    "127.0.0.0"/8;
    "10.1.0.0"/16;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return(synth(405, "Purging not allowed for " + client.ip));
        }
        return(purge);
    }
    …
}
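The decision the ACL makes can be modelled outside varnish to check which clients would be allowed. A rough sketch using Python's standard ipaddress module; the function name and the returned (status, message) tuples are invented for the example, and only the two networks and the status codes come from the VCL above.

```python
import ipaddress

# Same networks as in the purgers ACL above.
PURGERS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.1.0.0/16"),
]


def handle_purge(method, client_ip):
    """Return (status, message) for a PURGE request, or None for other methods."""
    if method != "PURGE":
        return None  # not a purge: normal request processing continues
    addr = ipaddress.ip_address(client_ip)
    if not any(addr in network for network in PURGERS):
        return (405, "Purging not allowed for " + client_ip)
    return (200, "Purged")


print(handle_purge("PURGE", "10.1.0.68"))
print(handle_purge("PURGE", "192.168.1.20"))
```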

How to configure multiple backend hosts:
backend default {
    .host = "172.16.100.6";
    .port = "80";
}

backend appsrv {
    .host = "172.16.100.7";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "(?i)\.php$") {
        set req.backend_hint = appsrv;
    } else {
        set req.backend_hint = default;
    }

    …
}

Director:
A varnish module (VMOD);
It must be imported before use:
import directors;

Example:
import directors; # load the directors

backend server1 {
.host =
.port =
}
backend server2 {
.host =
.port =
}

sub vcl_init {
new GROUP_NAME = directors.round_robin();
GROUP_NAME.add_backend(server1);
GROUP_NAME.add_backend(server2);
}

sub vcl_recv {
    # send all traffic to the GROUP_NAME director:
    set req.backend_hint = GROUP_NAME.backend();
}
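What a round-robin director does on each request can be sketched in a few lines. This is only a model of the selection policy, not the directors VMOD itself; the class below is hypothetical, it assumes at least one backend has been added, and it ignores backend health.

```python
class RoundRobinDirector:
    """Hands out backends in the order they were added, wrapping around."""

    def __init__(self):
        self._backends = []
        self._next = 0

    def add_backend(self, backend):
        self._backends.append(backend)

    def backend(self):
        chosen = self._backends[self._next % len(self._backends)]
        self._next += 1
        return chosen


group = RoundRobinDirector()
group.add_backend("server1")
group.add_backend("server2")
picks = [group.backend() for _ in range(4)]
print(picks)
```

Successive requests alternate between server1 and server2, which is the behavior the vcl_init and vcl_recv sections above set up.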

BE Health Check：
backend BE_NAME {
.host =
.port =
.probe = {
.url=
.timeout=
.interval=
.window=
.threshold=
}
}

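.probe: defines how the health check is performed;
.url: the URL requested by the check; defaults to "/";
.request: the exact request to send;
.request =
"GET /.healthtest.html HTTP/1.1"
"Host: www.magedu.com"
"Connection: close"
.window: how many of the most recent checks are considered when judging health;
.threshold: at least this many of the last .window checks must have succeeded for the backend to count as healthy;
.interval: how often to run the check;
.timeout: how long to wait before a check times out;
.expected_response: the expected response status code; defaults to 200;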
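The interplay of .window and .threshold is easiest to see with a sliding window of results. A minimal sketch, assuming window=5 and threshold=4 as illustrative values; the class is hypothetical and leaves out everything else a real probe does (sending requests, timeouts, initial state).

```python
from collections import deque


class ProbeState:
    """Keeps the last `window` check results; healthy when at least
    `threshold` of them succeeded."""

    def __init__(self, window, threshold):
        self.threshold = threshold
        self._results = deque(maxlen=window)  # old results fall off automatically

    def record(self, ok):
        self._results.append(bool(ok))

    def healthy(self):
        return sum(self._results) >= self.threshold


probe = ProbeState(window=5, threshold=4)
for ok in (True, True, True, True):
    probe.record(ok)
after_four_good = probe.healthy()   # 4 of 4 succeeded
probe.record(False)
after_one_bad = probe.healthy()     # 4 of the last 5 succeeded
probe.record(False)
after_two_bad = probe.healthy()     # only 3 of the last 5 succeeded
print(after_four_good, after_one_bad, after_two_bad)
```

A single failed check does not take the backend out of rotation; a second one within the same window does.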

Two ways to configure health checks:
(1) probe PB_NAME { }
backend NAME {
.probe = PB_NAME;
…
}

(2) backend NAME {
.probe = {
…
}
}

Example:
probe check {
    .url = "/.healthcheck.html";
    .window = 5;
    .threshold = 4;
    .interval = 2s;
    .timeout = 1s;
}

backend default {
    .host = "10.1.0.68";
    .port = "80";
    .probe = check;
}

backend appsrv {
    .host = "10.1.0.69";
    .port = "80";
    .probe = check;
}

varnish runtime parameters:
Thread model:
cache-worker
cache-main
ban lurker
acceptor:
epoll/kqueue:
…

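Thread-related parameters:
Within a thread pool, each request is handled by one thread; the maximum number of worker threads therefore determines how many requests varnish can serve concurrently;

thread_pools: Number of worker thread pools. Ideally less than or equal to the number of CPU cores;
thread_pool_max: The maximum number of worker threads in each pool.
thread_pool_min: The minimum number of worker threads in each pool. It also acts as the "maximum number of idle threads";

Maximum concurrent connections = thread_pools * thread_pool_max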
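A quick worked example of the formula, using thread_pool_max=500 from the DAEMON_OPTS line earlier and an assumed thread_pools=2 (an illustrative value, not a recommendation):

```python
def max_concurrent_connections(thread_pools, thread_pool_max):
    """Upper bound on requests served at once: one worker thread per request."""
    return thread_pools * thread_pool_max


limit = max_concurrent_connections(thread_pools=2, thread_pool_max=500)
print(limit)
```

With these values the ceiling is 1000 concurrent requests; raising either parameter raises the ceiling, at the cost of more threads to schedule.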

thread_pool_timeout: Thread idle threshold. Threads in excess of thread_pool_min, which have been idle for at least this long, will be destroyed.
thread_pool_add_delay: Wait at least this long after creating a thread.
thread_pool_destroy_delay: Wait this long after destroying a thread.

How to set them (at runtime):
vcl.param
param.set

To make them permanent:
varnish.params
DAEMON_OPTS="-p PARAM1=VALUE -p PARAM2=VALUE"

The varnish log area:
shared memory log
counters
log records

1. varnishstat – Varnish Cache statistics
-1
-1 -f FIELD_NAME
-l: list the field names that can be used with the -f option;

MAIN.cache_hit
MAIN.cache_miss

varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss

varnishstat -l -f MAIN -f MEMPOOL
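The two counters are all that is needed to compute the hit ratio. A small sketch that parses text shaped like `varnishstat -1` output; the sample lines and their numbers are made up, and the column layout (name, value, rate, description) is an assumption about the output format rather than something taken from these notes.

```python
# Invented sample in the assumed "name value rate description" layout.
SAMPLE_OUTPUT = """\
MAIN.cache_hit             900         0.45 Cache hits
MAIN.cache_miss            100         0.05 Cache misses
"""


def parse_counters(text):
    """Map counter name -> integer value, taken from the first two columns."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            counters[fields[0]] = int(fields[1])
    return counters


def hit_ratio(counters):
    hit = counters["MAIN.cache_hit"]
    miss = counters["MAIN.cache_miss"]
    return hit / (hit + miss)


counters = parse_counters(SAMPLE_OUTPUT)
print(hit_ratio(counters))
```

In practice the text would come from running the varnishstat command shown above instead of from a hard-coded sample.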

2. varnishtop – Varnish log entry ranking
-1 Instead of a continuously updated display, print the statistics once and exit.
-i taglist: may be given multiple times, and a single option may list several tags;
-I <[taglist:]regex>
-x taglist: exclusion list
-X <[taglist:]regex>

3. varnishlog – Display Varnish logs

4. varnishncsa – Display Varnish logs in Apache / NCSA combined log format

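Built-in functions:
hash_data(): specifies the data fed into the hash calculation; reducing variation in that data improves the hit ratio;
regsub(str,regex,sub): replaces the first match of regex in str with sub; mainly used for URL rewriting
regsuball(str,regex,sub): replaces every match of regex in str with sub;
return():
ban(expression)
ban_url(regex): bans all cached objects whose URL matches regex;
synth(status,"STRING"): generates a synthetic response, e.g. the reply to a purge request;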
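The difference between regsub and regsuball (first match versus every match) can be tried out with Python's re module. This is an analogy only: the two helper functions are stand-ins written for the example, and VCL's regex dialect and replacement syntax are not guaranteed to match Python's in every detail.

```python
import re


def regsub(string, regex, sub):
    """Replace only the first match, like VCL's regsub()."""
    return re.sub(regex, sub, string, count=1)


def regsuball(string, regex, sub):
    """Replace every match, like VCL's regsuball()."""
    return re.sub(regex, sub, string)


url = "//images//logo.png"
first_only = regsub(url, "/+", "/")
all_matches = regsuball(url, "/+", "/")
print(first_only)
print(all_matches)
```

Collapsing repeated slashes this way is one example of the "reduce variation" idea behind hash_data(): URLs that differ only cosmetically end up pointing at the same cached object.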

Summary:
varnish: state engine, vcl
varnish 4.0:
vcl_init
vcl_recv
vcl_hash
vcl_hit
vcl_pass
vcl_miss
vcl_pipe
vcl_waiting
vcl_purge
vcl_deliver
vcl_synth
vcl_fini

vcl_backend_fetch
vcl_backend_response
vcl_backend_error

sub VCL_STATE_ENGINE {
…
}
backend BE_NAME {}
probe PB_NAME {}
acl ACL_NAME {}

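Blog assignment: everything covered above;
Hands-on project: deploy WordPress on two LAMP hosts, reverse-proxy them with Nginx, and run load tests; then deploy varnish caching behind nginx, tune the VCL, and load-test again several times;
Extra practice: (1) monitor varnish service metrics with zabbix;
(2) deploy varnish quickly with ansible;

Load-testing tools: ab, http_load, webbench, siege, jmeter, loadrunner,…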

Original article by shewei. If you repost it, please credit the source: http://www.178linux.com/76700
