feat: add workflow and script to check edit links on docs #3557

Merged: 17 commits, Jan 19, 2025
49 changes: 49 additions & 0 deletions .github/workflows/check-edit-links.yml
@@ -0,0 +1,49 @@
name: Weekly Link Checker

on:
  schedule:
    - cron: '0 0 * * 0' # Runs every week at midnight on Sunday
  workflow_dispatch:

jobs:
  check-links:
    name: Run Link Checker and Notify Slack
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm install

      - name: Run link checker
        id: linkcheck
        run: |
          npm run test:editlinks | tee output.log

      - name: Extract 404 URLs from output
        id: extract-404
        run: |
          ERRORS=$(sed -n '/URLs returning 404:/,$p' output.log)
          echo "errors<<EOF" >> $GITHUB_OUTPUT
          echo "$ERRORS" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Notify Slack
        if: ${{ steps.extract-404.outputs.errors != '' }}
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.WEBSITE_SLACK_WEBHOOK }}
          SLACK_TITLE: 'Edit Links Checker Errors Report'
          SLACK_MESSAGE: |
            🚨 The following URLs returned 404 during the link check:
            ```
            ${{ steps.extract-404.outputs.errors }}
            ```
          MSG_MINIMAL: true
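The `Extract 404 URLs from output` step relies on `sed -n '/URLs returning 404:/,$p'`, which prints everything from the first line matching the marker through the end of the log. The following is a minimal JavaScript sketch of the same extraction, included only to illustrate the behavior. `extractReport` and the sample log are hypothetical and not part of the PR.

```javascript
// Hypothetical helper mirroring: sed -n '/URLs returning 404:/,$p' output.log
// Returns the log from the first line containing the marker to the end,
// or an empty string when the marker never appears.
function extractReport(log, marker = 'URLs returning 404:') {
  const lines = log.split('\n');
  const start = lines.findIndex((line) => line.includes(marker));
  return start === -1 ? '' : lines.slice(start).join('\n');
}

// Sample log text (made up for this example)
const sampleLog = [
  'Processing batch 1/1',
  'URLs returning 404:',
  '- https://example.com/missing.md generated from docs/missing.md',
  'Total invalid URLs found: 1'
].join('\n');

console.log(extractReport(sampleLog));
console.log(JSON.stringify(extractReport('All URLs are valid.')));
```

When the marker is absent the result is empty, which is the case the workflow's `if:` condition appears to rely on to skip the Slack notification.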
4 changes: 4 additions & 0 deletions components/layout/DocsLayout.tsx
@@ -33,6 +33,10 @@ interface IDocsLayoutProps {
  */
 function generateEditLink(post: IPost) {
   let last = post.id.substring(post.id.lastIndexOf('/') + 1);
+
+  if (last.endsWith('.mdx')) {
+    last = last.replace('.mdx', '.md');
+  }
   const target = editOptions.find((edit) => {
     return post.slug.includes(edit.value);
   });
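The added lines map an `.mdx` file name to `.md` before the edit link is built. `String.prototype.replace` with a string pattern replaces only the first occurrence, so the `endsWith` guard is sufficient only as long as `.mdx` does not also appear earlier in the name. A hedged alternative sketch anchors the replacement to the end of the string. `toMarkdownName` is a hypothetical helper, not code from the PR.

```javascript
// Hypothetical helper: rewrite a trailing ".mdx" extension to ".md".
// The $ anchor guarantees only the final extension is touched.
function toMarkdownName(fileName) {
  return fileName.replace(/\.mdx$/, '.md');
}

console.log(toMarkdownName('index.mdx')); // index.md
console.log(toMarkdownName('guide.md')); // guide.md (unchanged)
console.log(toMarkdownName('notes.mdx.backup.mdx')); // notes.mdx.backup.md
```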
4 changes: 2 additions & 2 deletions config/edit-page-config.json
@@ -1,7 +1,7 @@
 [
   {
     "value": "/tools/generator",
-    "href": "https://github.com/asyncapi/generator/tree/master/docs"
+    "href": "https://github.com/asyncapi/generator/tree/master/apps/generator/docs"
   },
   {
     "value": "reference/specification/",
@@ -19,4 +19,4 @@
     "value": "reference/extensions/",
     "href": "https://github.com/asyncapi/extensions-catalog/tree/master/extensions"
   }
-]
+]
11 changes: 8 additions & 3 deletions jest.config.js
@@ -4,7 +4,12 @@ module.exports = {
   coverageReporters: ['text', 'lcov', 'json-summary'],
   coverageDirectory: 'coverage',
   collectCoverageFrom: ['scripts/**/*.js'],
-  coveragePathIgnorePatterns: ['scripts/compose.js', 'scripts/tools/categorylist.js', 'scripts/tools/tags-color.js'],
+  coveragePathIgnorePatterns: [
+    'scripts/compose.js',
+    'scripts/tools/categorylist.js',
+    'scripts/tools/tags-color.js',
+    'scripts/markdown/check-editlinks.js'

Member Author

I added the new script to the Jest coverage ignore list so that CI passes. I will create a good first issue for other contributors to add tests.

Member

I think we should make it a practice to add tests along with the code, so don't add the file here. Instead, add the relevant tests for the file.

Member Author

Okay, will add it.

+  ],
   // To disallow netlify edge function tests from running
-  testMatch: ['**/tests/**/*.test.*', '!**/netlify/**/*.test.*'],
-};
+  testMatch: ['**/tests/**/*.test.*', '!**/netlify/**/*.test.*']
+};
1 change: 1 addition & 0 deletions package.json
@@ -24,6 +24,7 @@
     "generate:tools": "node scripts/build-tools.js",
     "test:netlify": "deno test --allow-env --trace-ops netlify/**/*.test.ts",
     "test:md": "node scripts/markdown/check-markdown.js",
+    "test:editlinks": "node scripts/markdown/check-editlinks.js",
     "dev:storybook": "storybook dev -p 6006",
     "build:storybook": "storybook build"
   },
170 changes: 170 additions & 0 deletions scripts/markdown/check-editlinks.js
@@ -0,0 +1,170 @@
const fs = require('fs').promises;
const path = require('path');
const fetch = require('node-fetch-2');
Member

Why node-fetch-2 instead of the normal fetch API?

Member Author

We had a module problem. I don't remember exactly what it was, but we weren't able to use node-fetch because of a version issue.

Here's the PR:
#3038

const editUrls = require('../../config/edit-page-config.json');

const ignoreFiles = [
  'reference/specification/v2.x.md',
  'reference/specification/v3.0.0-explorer.md',
  'reference/specification/v3.0.0.md'
];

/**
 * Introduces a delay in the execution flow
 * @param {number} ms - The number of milliseconds to pause
 */
async function pause(ms) {
  return new Promise((res) => {
    setTimeout(res, ms);
  });
}

/**
 * Process a batch of URLs to check for 404s
 * @param {object[]} batch - Array of path objects to check
 * @returns {Promise<(object|null)[]>} One entry per input: the path object if its edit link failed, otherwise null
 */
async function processBatch(batch) {
  return Promise.all(
    batch.map(async ({ filePath, urlPath, editLink }) => {
      try {
        if (!editLink || ignoreFiles.some((ignorePath) => filePath.endsWith(ignorePath))) return null;

        const response = await fetch(editLink, { method: 'HEAD' });
        if (response.status === 404) {
          return { filePath, urlPath, editLink };
        }
        return null;
      } catch (error) {
        console.error(`Error checking ${editLink}:`, error.message);
        // Return the same object shape as the 404 case so main() can report it
        return { filePath, urlPath, editLink };
      }
    })
  );
}
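The review asks for tests to accompany this script. One way to make a check like `processBatch` testable without network access is to pass the fetch function in as a parameter. The sketch below is an assumption about how that could look, not the PR's implementation. `checkLinks`, `fakeFetch`, and the sample entries are hypothetical names and data.

```javascript
// Hypothetical variant of processBatch that receives its fetch function,
// so a test can substitute a fake instead of making real HTTP requests.
async function checkLinks(batch, fetchFn) {
  const results = await Promise.all(
    batch.map(async (entry) => {
      try {
        const response = await fetchFn(entry.editLink, { method: 'HEAD' });
        return response.status === 404 ? entry : null;
      } catch (error) {
        // Treat network failures like broken links, keeping the same shape
        return entry;
      }
    })
  );
  return results.filter((entry) => entry !== null);
}

// Fake fetch for demonstration: URLs containing "missing" return 404
async function fakeFetch(url) {
  return { status: url.includes('missing') ? 404 : 200 };
}

const demoBatch = [
  { filePath: 'docs/a.md', editLink: 'https://example.com/a.md' },
  { filePath: 'docs/b.md', editLink: 'https://example.com/missing.md' }
];

const demoResult = checkLinks(demoBatch, fakeFetch);
demoResult.then((broken) => console.log(broken.map((entry) => entry.filePath)));
```

Injecting the dependency keeps the production call site simple (pass the real fetch) while letting a test control every response.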

/**
 * Check all URLs in batches
 * @param {object[]} paths - Array of all path objects to check
 * @returns {Promise<object[]>} Array of path objects whose edit links returned 404
 */
async function checkUrls(paths) {
  const result = [];
  const batchSize = 5;

  for (let i = 0; i < paths.length; i += batchSize) {
    console.log(`Processing batch ${Math.floor(i / batchSize) + 1}/${Math.ceil(paths.length / batchSize)}`);
    const batch = paths.slice(i, i + batchSize);
    const batchResults = await processBatch(batch);
    await pause(1000);

    // Filter out null results and keep the failing entries
    result.push(...batchResults.filter((url) => url !== null));
  }

  return result;
}
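`checkUrls` walks the list in slices of five and pauses one second after each slice, so N paths produce `Math.ceil(N / 5)` batches and at least that many seconds of waiting. The slicing step can be sketched in isolation as follows. `toBatches` is a hypothetical helper used only for illustration.

```javascript
// Hypothetical helper: split an array into consecutive batches
// of at most `size` items, in the same way checkUrls slices `paths`.
function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const batches = toBatches(['a', 'b', 'c', 'd', 'e', 'f', 'g'], 5);
console.log(batches.length); // 2
console.log(batches[1]); // [ 'f', 'g' ]
```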

/**
 * Determines the appropriate edit link based on the URL path and file path
 * @param {string} urlPath - The URL path to generate an edit link for
 * @param {string} filePath - The actual file path
 * @param {object[]} editOptions - Array of edit link options
 * @returns {string|null} The generated edit link or null if no match
 */
function determineEditLink(urlPath, filePath, editOptions) {
  // Remove leading 'docs/' if present for matching
  const pathForMatching = urlPath.startsWith('docs/') ? urlPath.slice(5) : urlPath;

  const target =
    editOptions.find((edit) => pathForMatching.includes(edit.value)) || editOptions.find((edit) => edit.value === '');

  if (!target) return null;

  // Handle the empty value case (fallback)
  if (target.value === '') {
    return `${target.href}/docs/${urlPath}.md`;
  }

  // For other cases with specific targets
  return `${target.href}/${path.basename(filePath)}`;
}
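To make the matching rules concrete, here is a small worked example of the same logic. It is a sketch under assumptions: `resolveEditLink` copies the function above but swaps `path.basename` for a string split so the snippet needs no imports, and the fallback entry's `href` is a placeholder because the real fallback entry is not visible in this diff.

```javascript
// Same matching logic as determineEditLink, with path.basename replaced
// by a string split so the snippet has no imports (a simplification).
function resolveEditLink(urlPath, filePath, editOptions) {
  const pathForMatching = urlPath.startsWith('docs/') ? urlPath.slice(5) : urlPath;
  const target =
    editOptions.find((edit) => pathForMatching.includes(edit.value)) ||
    editOptions.find((edit) => edit.value === '');
  if (!target) return null;
  if (target.value === '') return `${target.href}/docs/${urlPath}.md`;
  return `${target.href}/${filePath.split('/').pop()}`;
}

const sampleOptions = [
  {
    value: 'reference/extensions/',
    href: 'https://github.com/asyncapi/extensions-catalog/tree/master/extensions'
  },
  // Assumed fallback entry; this href is a placeholder, not the real config value
  { value: '', href: 'https://example.com/website/blob/master/markdown' }
];

const specificLink = resolveEditLink(
  'reference/extensions/x-sample',
  '/repo/markdown/docs/reference/extensions/x-sample.md',
  sampleOptions
);
const fallbackLink = resolveEditLink('concepts/channel', '/repo/markdown/docs/concepts/channel.md', sampleOptions);
const noMatch = resolveEditLink('concepts/channel', '/repo/markdown/docs/concepts/channel.md', []);

console.log(specificLink);
console.log(fallbackLink);
console.log(noMatch);
```

Because `includes('')` is true for every string, an entry with an empty `value` matches any path as soon as `find` reaches it, so specific entries need to come before the fallback in the config array.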

/**
 * Recursively processes markdown files in a directory to generate paths and edit links
 * @param {string} folderPath - The path to the folder to process
 * @param {object[]} editOptions - Array of edit link options
 * @param {string} [relativePath=''] - The relative path for URL generation
 * @param {object[]} [result=[]] - Accumulator for results
 * @returns {Promise<object[]>} Array of objects containing file paths and edit links
 */
async function generatePaths(folderPath, editOptions, relativePath = '', result = []) {
  try {
    const files = await fs.readdir(folderPath);

    await Promise.all(
      files.map(async (file) => {
        const filePath = path.join(folderPath, file);
        const relativeFilePath = path.join(relativePath, file);

        // Skip _section.md files
        if (file === '_section.md') {
          return;
        }

        const stats = await fs.stat(filePath);

        if (stats.isDirectory()) {
          // Process directory
          await generatePaths(filePath, editOptions, relativeFilePath, result);
        } else if (stats.isFile() && file.endsWith('.md')) {
          // Process all markdown files (including index.md)
          const urlPath = relativeFilePath.split(path.sep).join('/').replace('.md', '');
          result.push({
            filePath,
            urlPath,
            editLink: determineEditLink(urlPath, filePath, editOptions)
          });
        }
      })
    );

    return result;
  } catch (err) {
    console.error(`Error processing directory ${folderPath}:`, err);
    throw err;
  }
}
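The `urlPath` derivation inside `generatePaths` normalizes platform separators to `/` and drops the `.md` extension. A minimal sketch of that step in isolation follows. `toUrlPath` is a hypothetical helper, and it deliberately deviates from the script by using an anchored regular expression, since `.replace('.md', '')` removes the first `.md` it finds anywhere in the path rather than only the trailing extension.

```javascript
// Hypothetical helper isolating the urlPath derivation from generatePaths.
// `sep` is the platform path separator ('/' on POSIX, '\\' on Windows).
function toUrlPath(relativeFilePath, sep) {
  return relativeFilePath.split(sep).join('/').replace(/\.md$/, '');
}

console.log(toUrlPath('concepts/channel.md', '/')); // concepts/channel
console.log(toUrlPath('tools\\cli\\index.md', '\\')); // tools/cli/index
```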

async function main() {
  const editOptions = editUrls;

  try {
    const docsFolderPath = path.resolve(__dirname, '../../markdown/docs');
    const paths = await generatePaths(docsFolderPath, editOptions);
    console.log('Starting URL checks...');
    const invalidUrls = await checkUrls(paths);

    if (invalidUrls.length === 0) {
      console.log('All URLs are valid.');
      process.exit(0);

Member

We shouldn't call process.exit like this. Restructure the conditional instead, for example by making this if block handle only the invalidUrls case.

    }

    console.log('\nURLs returning 404:\n');
    invalidUrls.forEach((url) => console.log(`- ${url.editLink} generated from ${url.filePath}\n`));
    console.log(`\nTotal invalid URLs found: ${invalidUrls.length}`);

    if (invalidUrls.length > 0) {
      process.exit(1);
    }
  } catch (error) {
    console.error('Failed to check edit links:', error);
    process.exit(1);
  }
}

if (require.main === module) {
  main();
}

module.exports = { generatePaths, determineEditLink, main };
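The review comment on `main()` asks for the conditional to be restructured rather than calling `process.exit(0)` early. One possible shape, sketched here as an assumption and not as the change that was eventually merged, is a reporting function that returns an exit code and leaves the decision to exit to the caller. `reportInvalidUrls` and the sample entry are hypothetical.

```javascript
// Hypothetical reporting helper: prints the results and returns an exit code
// instead of calling process.exit() itself.
function reportInvalidUrls(invalidUrls) {
  if (invalidUrls.length === 0) {
    console.log('All URLs are valid.');
    return 0;
  }
  console.log('\nURLs returning 404:\n');
  invalidUrls.forEach((url) => console.log(`- ${url.editLink} generated from ${url.filePath}`));
  console.log(`\nTotal invalid URLs found: ${invalidUrls.length}`);
  return 1;
}

const okCode = reportInvalidUrls([]);
const failCode = reportInvalidUrls([
  { editLink: 'https://example.com/missing.md', filePath: 'docs/missing.md' }
]);
console.log(okCode, failCode); // 0 1
```

A caller could then assign `process.exitCode = reportInvalidUrls(invalidUrls)`, which lets Node finish pending work and exit with that status, and keeps the function easy to test because it never terminates the process.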