Hi, I just tried crawling WebP images, but every download failed and I don't know why. Other images worked fine before. Could you give me some guidance when you have time? Thank you very much!

import xCrawl from 'x-crawl';

async function getMarketcapRanking() {
  const myXCrawl = xCrawl({maxRetry: 3, intervalTime: {max: 3000, min: 2000}});

  const [res] = await myXCrawl.crawlPage({
    targets: ['https://companiesmarketcap.com/china/largest-companies-in-china-by-market-cap'],
    viewport: {width: 1920, height: 1080},
  });
  const {page} = res.data;

  // Wait for the page to finish loading
  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 3000));

  // Collect the light and dark logo URLs, making relative paths absolute
  let urls = await page.$$eval(`table > tbody > tr > td.name-td > div.logo-container > img`, (imgEls) => {
    return imgEls
      .flatMap((item) => [item.getAttribute('data-img-path') ?? item.src, item.getAttribute('data-img-dark-path') ?? item.src])
      .map((d) => (d.startsWith('https:') ? d : 'https://companiesmarketcap.com' + d));
  });
  await page.close();

  // Remove duplicate URLs before downloading
  urls = [...new Set(urls)];

  await myXCrawl.crawlFile({
    targets: urls,
    storeDirs: './public/logos',
  });
}

getMarketcapRanking();
Answered by
coder-hxl
Jun 1, 2023
Answer selected by
winner106
Add the option extensions: '.webp' to the crawlFile API call to set the file extension manually.
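
Applied to the code in the question, the suggestion might look roughly like the sketch below. The helper name buildLogoDownloadConfig and the example.com URLs are hypothetical and only there for illustration. The targets and storeDirs values come from the question, and extensions: '.webp' comes from the answer. How the option behaves may depend on the x-crawl version in use.

```javascript
// Sketch only: builds the options object passed to crawlFile.
// buildLogoDownloadConfig is a hypothetical helper, not part of x-crawl.
function buildLogoDownloadConfig(urls) {
  return {
    // Deduplicate the URLs, as the question's code does
    targets: [...new Set(urls)],
    storeDirs: './public/logos',
    // Set the file extension manually, as suggested in the answer
    extensions: '.webp',
  };
}

// Hypothetical URLs standing in for the scraped logo paths
const config = buildLogoDownloadConfig([
  'https://example.com/img/logo-a.webp',
  'https://example.com/img/logo-a.webp',
  'https://example.com/img/logo-b.webp',
]);
console.log(config);

// With the x-crawl instance from the question (myXCrawl), the call would be:
// await myXCrawl.crawlFile(config);
```

Keeping the options in one place makes it easy to confirm that extensions is actually being passed before running the crawl.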