Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EPUB import #12457

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

EPUB import #12457

wants to merge 4 commits into from

Conversation

InAnYan
Copy link
Collaborator

@InAnYan InAnYan commented Feb 3, 2025

EPUB has some bits of metadata, so why not import it

Mandatory checks

  • I own the copyright of the code submitted and I licence it under the MIT license
  • [?] Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
    - [ ] Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUnit tests are failing. In the area "Some checks were not successful", locate "Tests / Unit tests (pull_request)" and click on "Details". This brings you to the test output.

You can then run these tests in IntelliJ to reproduce the failing tests locally. We offer a quick test running howto in the section Final build system checks in our setup guide.

@InAnYan
Copy link
Collaborator Author

InAnYan commented Feb 4, 2025

I would appreciate it if anyone can give me places where I can download ePUB books and include them in JabRef tests.

I only took books in public domain from Project Gutenberg, but they appear to be strictly in one format.

However, on their websites I found other ePUBs that have different contents of ZIP-archive (This is the reason why I search for any .opf file)

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.

In case of issues with the import order, double check that you activated Auto Import.
You can trigger fixing imports by pressing Ctrl+Alt+O to trigger Optimize Imports.

* Adapted from <a href="https://stackoverflow.com/a/79077999/10037342">...</a>.
*/
public static Path remapZipPath(Path zipPath) throws IOException {
final File tempFile = File.createTempFile("PREFIX" + UUID.randomUUID(), "SUFFIX");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addField(StandardField.URL, identifier);
}

addField(StandardField.AUTHOR, Optional.of(String.join(" and ", authors)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Authorlist parser

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it do? Parsing authors? Why do I need it there?

Authors are specified separately in <dc:author>. Should be...

UPDATE: Ah, crap, I remember seeing sometimes it's not. Should investigate a bit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that there is a fetcher which has a similar schema, take a look at that one

return IntStream
.range(0, nodes.getLength())
.mapToObj(i -> nodes.item(i).getTextContent())
.filter(s -> !(s == null || s.isBlank())) // Just in case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have StringUtil isNullOrBlank somewhere

* @return {@link Optional#empty()} if the {@param string} is empty, otherwise wrap it in {@link Optional}.
*/
public static Optional<String> optionalOfEmpty(String string) {
return Optional.ofNullable(string).filter(s -> !s.isEmpty());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a similar existing method already isNotBlank which takes already an optional


private static Stream<String> invalidFileNames() throws IOException {
// Clashes with ZIP-based formats are inevitable.
Predicate<String> fileName = name -> !name.startsWith("EpubImporterTest") && !name.startsWith("CitaviXmlImporterTest");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you. need to exclude citavixml? isn't the EpubImporterTest already unique enough as prefix

Copy link
Collaborator Author

@InAnYan InAnYan Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nuh huh

This is invalidFileNames. This includes all files that are not EpubImporterTest....

However, those Citavi files, specifically .cvt6back (or something like that) are ZIP-files (headers match), thus I need to exclude them.

Yes, a better way to do this is to match the extension, but if I remember correctly, name does not contain extension, so I had to use prefix.

I'll look into it again if I could match better.

+ Comment needs to be refined

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants