A "?" in a Web site's URLs may cause the crawler to download an infinite number of URLs from the site, which is a spider trap.
URL normalization is useful for crawlers to avoid crawling the same resource more than once.
A. True
B. False
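
To make the idea of URL normalization concrete, here is a minimal sketch in Python. The function name `normalize_url` and the specific rules chosen (lowercase scheme and host, drop default ports, drop the fragment, use "/" for an empty path) are assumptions made for illustration; real crawlers typically apply a longer rule set.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}

def normalize_url(url: str) -> str:
    """Return a canonical form of `url` (hypothetical helper, simplified rules).

    Note: this sketch discards any user:password part of the URL.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port is not None and DEFAULT_PORTS.get(scheme) != port:
        netloc = f"{host}:{port}"
    else:
        netloc = host
    # An empty path and "/" refer to the same resource.
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))

seen = set()
for candidate in ["HTTP://Example.COM:80/a?x=1#top", "http://example.com/a?x=1"]:
    seen.add(normalize_url(candidate))
# Both spellings collapse to one entry, so the page would be fetched once.
```

Keeping a set of normalized URLs, as in the last lines, is one simple way a crawler could avoid fetching the same resource twice.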
A URL contains no spaces.
A. True
B. False
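
As a small illustration of how spaces are handled in practice, the sketch below percent-encodes a path with Python's standard `urllib.parse.quote`. The path `raw_path` is a made-up example, not taken from the question.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing literal spaces.
raw_path = "/my docs/report 2024.html"

# quote() percent-encodes characters that may not appear literally;
# by default "/" is left untouched.
encoded_path = quote(raw_path)

# unquote() reverses the encoding.
decoded_path = unquote(encoded_path)
```

Here each space in `raw_path` becomes `%20` in `encoded_path`, and `decoded_path` is equal to the original string.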
A path-ascending crawler would normally ascend to every path in each URL that it intends to crawl.
A. True
B. False
For example, when given a seed URL of ****.org/a/b/page., it will attempt to crawl /a/, /b/, and /.
A. True
B. False
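
The path-ascending behavior can be sketched as follows. This is an illustrative sketch only: the function name `ascending_paths` and the seed URL `http://example.org/a/b/page.html` are assumptions, and the code assumes the last path segment names a page rather than a directory.

```python
from urllib.parse import urlsplit, urlunsplit

def ascending_paths(seed_url: str) -> list[str]:
    """List the ancestor directory URLs of `seed_url`, deepest first.

    Hypothetical helper: assumes the final path segment is a page.
    """
    parts = urlsplit(seed_url)
    segments = [s for s in parts.path.split("/") if s]
    dirs = segments[:-1]  # drop the page itself
    urls = []
    # Walk from the deepest directory up to the site root.
    for depth in range(len(dirs), -1, -1):
        path = "/" + "/".join(dirs[:depth])
        if not path.endswith("/"):
            path += "/"
        urls.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return urls

# Hypothetical seed URL used only for illustration.
paths = ascending_paths("http://example.org/a/b/page.html")
```

Under these assumptions, `paths` holds the URLs for `/a/b/`, `/a/`, and `/` on the seed's host, in that order.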