URL endings such as .html, .htm, .asp, .aspx, .php, .jsp, .jspx, or a slash can cover all HTML Web resources.
A "?" in a Web site's URLs may cause the crawler to download an infinite number of URLs from the site, which is a spider trap.
A. True
B. False
URL normalization is useful for crawlers to avoid crawling the same resource more than once.
A. True
B. False
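
To make the idea of URL normalization concrete, here is a minimal Python sketch. The function name `normalize_url` and the specific rules it applies (lowercasing the scheme and host, dropping the default port, dropping the fragment, using "/" for an empty path) are illustrative assumptions, not a complete or standard normalizer. It also ignores user info and IPv6 hosts for brevity.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource (assumed set).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a simplified canonical form of `url` (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it differs from the scheme's default.
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"

    # An empty path and "/" refer to the same resource.
    path = parts.path or "/"

    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


seen = set()
for candidate in ["HTTP://Example.COM:80/a/b?x=1#top", "http://example.com/a/b?x=1"]:
    seen.add(normalize_url(candidate))
# Both spellings collapse to one entry, so the page would be fetched once.
```

A crawler that stores only normalized URLs in its "seen" set would treat the two spellings above as the same resource.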
A URL contains no spaces.
A. True
B. False
A path-ascending crawler would normally ascend to every path in each URL that it intends to crawl.
A. True
B. False
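
As a rough illustration of path ascending, the following Python sketch lists the ancestor paths a path-ascending crawler might try for one seed URL. The function name `ascending_urls` and the example host are hypothetical, and the choice to drop the query string and end each ancestor with a slash is an assumption of this sketch.

```python
from urllib.parse import urlsplit, urlunsplit


def ascending_urls(url: str) -> list[str]:
    """List ancestor URLs of `url`, from the deepest directory up to the root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    ancestors = []
    # Drop one trailing segment at a time until only the root remains.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth > 0:
            path += "/"
        ancestors.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return ancestors


for ancestor in ascending_urls("http://example.org/hamster/monkey/page.html"):
    print(ancestor)
```

For the seed above, the sketch yields the directories /hamster/monkey/, /hamster/, and the site root, in that order.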