Set up static site generator for DHQ #40

Open
11 of 14 tasks
amclark42 opened this issue May 12, 2023 · 3 comments · May be fixed by #38

@amclark42
Contributor

amclark42 commented May 12, 2023

Replace Apache Cocoon's dynamic transformations with static web pages and resources, compiled through an Apache Ant build file. For additional context, see the DHQ infrastructure meeting notes and the specification document.

Tasks:

  • Add Saxon HE processor and dependencies to the repository
  • Set up article preview mechanism
  • Create XSLT to process Table of Contents
    • Transform article TEI into HTML
    • Apply custom transformations for select articles
    • Generate an Ant file that maps article directories to their static site counterparts
    • Generate volume/issue index pages
    • Generate volume/issue contributor biographies
  • Generate an index for search
    • Add metadata to article HTML
  • Check for missing files in static site copy
  • Test on a Windows computer
  • Compress static site
  • Create documentation

We still need to decide what to do with the editorial section (which requires authentication) and redirected URLs.

@amclark42
Contributor Author

The new stylesheet generate_static_articles.xsl already uses the TOC to generate HTML for every DHQ article. It also generates a single XML file, which attempts to map each article's source directory to its expected home upon publication. Ant cannot use that mapping file directly, so it needs to be replaced.

For Ant to make use of the file mapping, the XSLT must instead generate a new Ant build file, structured like this:

<project name="dhq_articles">
  
  <target name="copyArticleResources">
    <copy todir="${toDir.path}">
      <fileset dir="${basedir}${file.separator}articles"/>
      <firstmatchmapper>
        <regexpmapper from="^000654/(.*)$" to="vol/17/1/000654/\1" handledirsep="true"/>
        <regexpmapper from="^000116/(.*)$" to="vol/7/2/000116/\1" handledirsep="true"/>
      </firstmatchmapper>
    </copy>
  </target>
</project>

Once the derived build file is available, the main build file can run the task to copy article resources into their static directories:

<ant antfile="..${file.separator}${toDir}${file.separator}article-mapper.xml" 
     target="copyArticleResources" inheritRefs="true"/>
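
As a rough illustration of how the XSLT might emit that derived build file, here is a minimal sketch. It is not the project's actual stylesheet. It assumes that toc.xml contains journal elements carrying vol and issue attributes, with descendant item elements whose id attribute is the article ID; those element and attribute names are assumptions and should be adjusted to the real TOC structure.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- ASSUMPTION: toc.xml holds <journal vol="..." issue="..."> elements
       whose descendant <item id="..."/> elements name the articles. -->
  <xsl:template match="/">
    <project name="dhq_articles">
      <target name="copyArticleResources">
        <!-- Doubled braces emit a literal ${...} for Ant to expand later. -->
        <copy todir="${{toDir.path}}">
          <fileset dir="${{basedir}}${{file.separator}}articles"/>
          <firstmatchmapper>
            <!-- One mapper per article: articles/ID/... to vol/V/I/ID/... -->
            <xsl:for-each select="//journal[@vol][@issue]//item[@id]">
              <xsl:variable name="issue" select="ancestor::journal[1]"/>
              <regexpmapper from="^{@id}/(.*)$"
                            to="vol/{$issue/@vol}/{$issue/@issue}/{@id}/\1"
                            handledirsep="true"/>
            </xsl:for-each>
          </firstmatchmapper>
        </copy>
      </target>
    </project>
  </xsl:template>
</xsl:stylesheet>
```

Because firstmatchmapper stops at the first mapper that matches, an article listed in more than one issue would be copied only to the location of whichever entry comes first in the TOC.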

amclark42 linked a pull request on May 12, 2023 that will close this issue
@amclark42
Contributor Author

The Ant build file in the static_site_generation branch has these main targets:

  • previewArticle: Create an HTML preview version of a single article.
    1. Checks for the XML resolver JAR.
    2. If article.id wasn’t provided, requests an article ID for processing.
    3. Creates the dhq-preview directory inside the repository.
    4. Runs template_article.xsl on the specified article, saving the output to dhq-journal/dhq-preview/.
  • zipPreviewArticle: Create a ZIP file which contains the HTML preview for a single article, as well as the separate assets.
    1. Checks for the XML resolver JAR.
    2. If article.id wasn’t provided, requests an article ID for processing.
    3. Runs previewArticle on the specified article.
    4. Compresses article HTML, resources, and DHQ web assets into dhq-journal/dhq-preview/dhq-article-######.zip.
  • generateIssues: Generate static HTML versions of the DHQ issues, using XSLT on toc.xml.
    1. Checks for the XML resolver JAR.
    2. Creates a dhq-static directory next to the journal repository.
    3. Generates issue indexes, contributor bios, and article HTML in the expected dhq/vol/#/#/ directory structure in the static directory.
    4. Generates an Ant build file in the dhq-static directory. (When run, this build file copies article XML and resources from the repository's articles directory into the expected dhq/vol/#/#/###### directories.)
  • generateSite: Generate a full static copy of DHQ intended for the DHQ server. This is NOT a standalone copy. (Also runs generateIssues; you don’t have to run that one separately if you don’t want.)
    1. Checks for the XML resolver JAR.
    2. Runs generateIssues.
    3. Runs the new Ant task in dhq-static/article-map.xml to copy article resources into the expected dhq/vol/#/#/###### directories.
    4. Copies submissions text files and most web assets (excluding lib and tests).
    5. Transforms the test file to create starter.html in the static directory.
    6. Runs template_static_pages.xsl on pages in about, contact, news, people, and submissions folders, saving the output to dhq-static/dhq/.
    7. Compresses the contents of dhq-static/dhq/ into dhq-static/dhq.zip.

The default target is previewArticle. All targets rely on XSLT processing, so they require the XML resolver JAR to be on the classpath when Ant is called. If the JAR file is missing, the build stops and prints instructions for loading the JAR.
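
The JAR check could look something like the following sketch. The target name checkResolver, the property name resolver.present, and in particular the class name are assumptions made for illustration; the class should be replaced with one that actually exists in the resolver JAR shipped in common/lib.

```xml
<target name="checkResolver">
  <!-- ASSUMPTION: this class name is a placeholder for a class provided by
       whichever XML resolver library the project keeps in common/lib. -->
  <available classname="org.xmlresolver.Resolver" property="resolver.present"/>
  <!-- Stop early with instructions if the class is not on Ant's classpath. -->
  <fail unless="resolver.present"
        message="XML resolver JAR not found. Re-run as: ant -lib common/lib TARGET"/>
</target>
```

Other targets would then list checkResolver in their depends attribute so that the check runs before any XSLT step.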

To run an Ant target, use this command: ant -lib common/lib TARGET. For example, this would generate a compressed, standalone preview of article 000600:

ant -lib common/lib zipPreviewArticle -Darticle.id=000600

Because this command provides the value of article.id ahead of time, zipPreviewArticle will not prompt for it.
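
One way to get this prompt-only-when-needed behavior is a guarded target, sketched below. The target name requestArticleId and the prompt wording are hypothetical, and the previewArticle body is reduced to a placeholder echo.

```xml
<!-- Skipped entirely when article.id is already set, e.g. via -Darticle.id=000600 -->
<target name="requestArticleId" unless="article.id">
  <input message="Enter a six-digit DHQ article ID:" addproperty="article.id"/>
</target>

<!-- Placeholder body; the real target would run the XSLT transformation. -->
<target name="previewArticle" depends="requestArticleId">
  <echo message="Previewing article ${article.id}"/>
</target>
```

The unless attribute matters here: the input task never overwrites an existing property, but without the guard it would still ask the question.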

@amclark42
Contributor Author

amclark42 commented Jul 17, 2023

Some guidance on testing static site generation:

  1. Optional: run the generateIssues Ant target. (Useful if you want to examine how this task relates to generateSite.)
  • Did any errors occur?
  • Find the dhq-static directory and poke around in it.
  2. Run the generateSite Ant target.
  • Did any errors occur?
  • Find the dhq-static directory.
  • Examine the files in dhq-static/dhq/.
  • Optional: Load the ZIP into a server so that its contents appear at /dhq. This is useful for seeing at a glance if the web assets are in a comparable state to those on DHQ, and for testing some links.
    • If you can’t do this, it’s okay. Some things to keep in mind:
    • You’ll need to spend a little extra time scrolling past the navigation bar to look at the content of the pages.
    • Most links won’t work as-is; it’ll be best to open files using your OS’s file navigation software rather than clicking around.
  • Check a few “static” pages, such as those in the about directory.
  • Compare the articles listed in some DHQ issue’s index with those listed on the DHQ site.
  • Find an article with images, and make sure they display in the HTML.

For Windows testing, be especially attentive to file structure and paths. Are dhq-static and its contents in a reasonable place? Are issue indexes and article HTML files in the right places? If you examine the HTML, do links use forward slashes (/) and not backslashes (\)? The Windows ZIP should definitely be tested by loading it into a server.
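
The backslash check could be partly automated. The sketch below is an assumption-laden illustration rather than part of the existing build file: the target name, the property names, and the location of the static directory relative to the build file are all hypothetical.

```xml
<target name="checkBackslashLinks">
  <!-- ASSUMPTION: dhq-static sits next to the repository; adjust as needed. -->
  <property name="static.dir" location="../dhq-static/dhq"/>
  <!-- Select HTML files where an href or src value contains a backslash. -->
  <fileset id="bad.links" dir="${static.dir}" includes="**/*.html">
    <containsregexp expression="(href|src)=&quot;[^&quot;]*\\"/>
  </fileset>
  <!-- The property stays unset when no file matches. -->
  <pathconvert refid="bad.links" property="bad.links.list"
               pathsep="${line.separator}" setonempty="false"/>
  <fail if="bad.links.list"
        message="Backslashes found in links:${line.separator}${bad.links.list}"/>
</target>
```

This only catches backslashes inside href and src attributes, so manually checking a few pages on Windows is still worthwhile.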

As you’re going through this, think about maintainability and quality-of-life. Could anything be made more transparent, or easier? Are there additional preventative measures that could be taken to head off errors or mistakes?

3 participants