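Can someone tell me what happens behind the scenes from the moment I enter a URL in the browser until I see the page? A detailed description of the process would be very helpful.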


Look up the HTTP specification. Or try http://www.jmarshall.com/easy/http/

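First, the computer looks up the destination host. If the host is in the local DNS cache, that information is used. Otherwise, DNS queries are issued until the IP address is found.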
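The cache-then-query behaviour can be sketched roughly as follows. This is a minimal illustration, not how any particular operating system or browser implements it: the `resolve` function, the plain-dict `cache`, and the injectable `lookup` parameter are all hypothetical names introduced here, and real resolvers also honour record TTLs, which this sketch ignores.

```python
import socket


def resolve(host, cache, lookup=socket.gethostbyname):
    """Return an IPv4 address for host, consulting the cache first.

    `cache` is a plain dict standing in for a local DNS cache (assumption:
    no TTL or expiry handling). `lookup` defaults to the OS resolver.
    """
    if host in cache:
        return cache[host]      # cache hit: no DNS query is made
    address = lookup(host)      # cache miss: ask the resolver
    cache[host] = address       # remember the answer for next time
    return address
```

Calling `resolve("example.com", {})` would go through the operating system's resolver; a second call with the same dict would be answered from the cache.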

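The browser then opens a TCP connection to the destination host and sends the request according to HTTP 1.1 (HTTP 1.0 is also possible, but mainstream browsers no longer use it).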
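To make this step concrete, here is a rough sketch of a plain HTTP/1.1 GET sent over a TCP socket. The helper names `build_request` and `fetch` are made up for illustration, and the sketch assumes unencrypted HTTP on port 80 with `Connection: close`, so the server ends the response by closing the connection. A real browser sends many more headers and usually keeps the connection open.

```python
import socket


def build_request(host, path="/"):
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # the Host header is mandatory in HTTP/1.1
        "Connection: close",      # ask the server to close when it is done
        "",                       # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host, port=80, path="/"):
    """Open a TCP connection, send the request, return the raw response."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Something like `fetch("example.com")` should return the status line, headers, and body as one byte string, provided the host still answers plain HTTP.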

The server looks up the requested resource (if it exists) and responds using the HTTP protocol, sending the data to the client (= your browser).

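The browser then uses its HTML parser to reconstruct the document structure, which is later rendered on your screen. If it finds references to external resources such as images, CSS files, or JavaScript files, these are fetched in the same way as the HTML document itself.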
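As a rough illustration of how a parser might spot those external references, the sketch below uses Python's standard `html.parser` module. The `ResourceCollector` class and `find_resources` function are hypothetical names, and the set of tags handled (`img`, `script`, stylesheet `link`) is a deliberate simplification; a real browser recognises many more kinds of subresources.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect URLs of images, scripts, and stylesheets found in a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif (tag == "link" and attrs.get("rel") == "stylesheet"
              and attrs.get("href")):
            self.resources.append(attrs["href"])


def find_resources(html):
    """Return referenced resource URLs in document order."""
    collector = ResourceCollector()
    collector.feed(html)
    return collector.resources
```

Each URL returned would then be requested separately, going through the same lookup, connect, request, and response steps as the page itself.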

Attention: this is an extremely rough and oversimplified sketch, assuming the simplest possible HTTP request (no HTTPS, no HTTP2, no extras), simplest possible DNS, no proxies, single-stack IPv4, one HTTP request only, a simple HTTP server on the other end, and no problems in any step. This is, for most contemporary intents and purposes, an unrealistic scenario; all of these are far more complex in actual use, and the tech stack has become an order of magnitude more complicated since this was written. With this in mind, the following timeline is still somewhat valid:

1. browser checks cache; if requested object is in cache and is fresh, skip to #9
2. browser asks OS for server's IP address
3. OS makes a DNS lookup and replies the IP address to the browser
4. browser opens a TCP connection to server (this step is much more complex with HTTPS)
5. browser sends the HTTP request through TCP connection
6. browser receives HTTP response and may close the TCP connection, or reuse it for another request
7. browser checks if the response is a redirect or a conditional response (3xx result status codes), authorization request (401), error (4xx and 5xx), etc.; these are handled differently from normal responses (2xx)
8. if cacheable, response is stored in cache
9. browser decodes response (e.g. if it's gzipped)
10. browser determines what to do with response (e.g. is it a HTML page, is it an image, is it a sound clip?)
11. browser renders response, or offers a download dialog for unrecognized types
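The status check and the decoding step from the timeline above can be sketched like this. Both `classify_status` and `decode_body` are hypothetical helpers written for illustration; the category labels are informal, and only gzip is handled, whereas real browsers support further content encodings.

```python
import gzip


def classify_status(code):
    """Sort an HTTP status code into the rough groups from the timeline."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect or conditional"
    if code == 401:
        return "authorization required"
    if 400 <= code < 600:
        return "error"
    return "other"


def decode_body(body, content_encoding=""):
    """Undo gzip compression if the Content-Encoding header says so."""
    if content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body   # assumption: any other encoding is passed through untouched
```

For example, a 304 would fall into the "redirect or conditional" group, which is the case where the browser can reuse its cached copy instead of reading a new body.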

Again, discussion of each of these points has filled countless pages; take this only as a summary, abridged for the sake of clarity. Also, there are many other things happening in parallel to this (processing typed-in address, speculative prefetching, adding page to browser history, displaying progress to user, notifying plugins and extensions, rendering the page while it's downloading, pipelining, connection tracking for keep-alive, cookie management, checking for malicious content etc.) - and the whole operation gets an order of magnitude more complex with HTTPS (certificates and ciphers and pinning, oh my!).