There are a couple of things wrong here, but you are on the right path to getting this to work. The main problem is not the try {} catch {} block itself but where `await` is used: `await` is only valid inside a function declared `async`. Inside such a function, a rejected promise is thrown at the `await`, so try/catch works there as usual. See: try/catch blocks with async/await.
In your case, it's totally fine to write everything in one async function. Here is how I would do it:

    const puppeteer = require('puppeteer');

    async function scrapeIfc() {
      const completeData = [];
      const url = 'https://www.ifc.org/wps/wcm/connect/news_ext_content/ifc_external_corporate_site/news+and+events/pressroom/press+releases';
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      // Disable the navigation timeout before navigating, so it applies to goto()
      page.setDefaultNavigationTimeout(0);
      await page.goto(url);
      // Collect the URL of every press release linked on the overview page
      const links = await page.evaluate(() =>
        Array.from(document.querySelectorAll('h3 > a')).map(anchor => anchor.href)
      );
      for (const link of links) {
        const newPage = await browser.newPage();
        await newPage.goto(link);
        // Runs in the browser context; missing elements yield undefined fields
        const data = await newPage.evaluate(() => {
          const titleElement = document.querySelector('td[class="PressTitle"] > h3');
          const contactElement = document.querySelector('center > table > tbody > tr:nth-child(1) > td');
          const txtElement = document.querySelector('center > table > tbody > tr:nth-child(2) > td');
          return {
            source: 'ITC',
            title: titleElement ? titleElement.innerText : undefined,
            contact: contactElement ? contactElement.innerText : undefined,
            txt: txtElement ? txtElement.innerText : undefined,
          };
        });
        completeData.push(data);
        // Wait for the tab to close before opening the next one
        await newPage.close();
      }
      await browser.close();
      return completeData;
    }

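One thing the function above does not do is deal with a single link that fails to load: a rejected `goto` would abort the whole run. Here is a minimal sketch of per-link error handling. `scrapeAll` and `scrapeOne` are hypothetical names that are not part of your code or of Puppeteer; `scrapeOne(link)` stands for whatever async work you do per link.

```javascript
// Hypothetical helper: calls scrapeOne(link) for every link, one at a time,
// and records an error entry instead of aborting when a single link fails.
async function scrapeAll(links, scrapeOne) {
  const results = [];
  for (const link of links) {
    try {
      // A rejected promise is thrown here, at the await
      const data = await scrapeOne(link);
      results.push({ link, data });
    } catch (err) {
      // Keep going with the remaining links
      results.push({ link, error: err.message });
    }
  }
  return results;
}
```

In the scraper, `scrapeOne` would be the body of the `for` loop (open a page, `goto`, `evaluate`), ideally with the page closed in a `finally` block so tabs do not pile up after a failure.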
There are a couple of other things you should note:
- You have a bunch of unused imports (`title`, `link`, `resolve` and `reject`) at the head of your script, which might have been added automatically by your code editor. Get rid of them, as they might shadow the real variables.
- I changed your `document.querySelector` calls to be more specific, as I couldn't select the actual elements from the IFC website with the original selectors. You might need to revise them.
- For local development I use Google's functions-framework, which lets me run and test the function locally before deploying. If you get errors on your local machine, you'll get errors when deploying to Google Cloud as well.
- (Opinion) If you don't need Firebase, I would run this with Google Cloud Functions, Cloud Scheduler and Cloud Firestore. For me, this has been the go-to workflow for periodic web scraping.
- (Opinion) Puppeteer might be overkill for scraping a simple static website, since it runs a full headless browser. Something like Cheerio is much more lightweight and much faster.
Hope I could help. If you encounter other problems, let us know. Welcome to the Stack Overflow community!