Need to support multiple author affiliations #7135
(Somewhat cross-posted: #5912 (comment)) @kmccurley, we often support a "dual-track" toolset for features in OJS:
For example, the built-in OJS search engine works out of the box but is not very feature-rich and has limited scalability, while the Lucene/SOLR plugin is available for those who need the additional tools and have the capacity to run the necessary service. I'm hesitant to make the "affiliation" field any smarter than it currently is because of some inherent constraints:
I think machine-readability is a precursor to the work you propose, thus working with RORs directly (where you mentioned this before) or possibly via the ORCID API and the author's ORCID record. For your own use case, are RORs or author ORCIDs a workable approach? |
Let's stay focused on the issue at hand: multiple affiliations per author. That's a critical deficiency of the OJS schema for what is stored about an author. The reason I mentioned the other issues is partly because OJS is falling behind, and when you make a schema change, you should anticipate all of the requirements for publishing. Machine-readable metadata is absolutely necessary for any serious publishing platform, because all reputation scores are based upon it, and research funding agencies are increasingly demanding it. As I mentioned earlier, almost every metadata format for publishing now supports multiple affiliations. |
As @asmecher says, we are likely to pursue support for multiple affiliations through extensions to the ROR plugin and/or the ORCID plugin. That's because these approaches offer the possibility to support affiliation disambiguation and machine-readability. Our schema is extensible using plugins, which means that we can store a single affiliation record by default:
And plugins can enrich that by storing additional data alongside author records:
[
  {
    "ror": "03rjyp183",
    "name": {
      "en_US": "University of Bern",
      ".._..": "..."
    }
  },
  {
    "ror": "03y4dt428",
    "name": {
      "en_US": "University of Pisa",
      ".._..": "..."
    }
  }
]
That can then be used to enrich records sent to downstream consumers, like Crossref, DataCite, etc. Keeping the plain text field as a base provides flexibility that's important to satisfy all of the different use cases of our community, as you can see in this example. |
I've spent some time reading the plugin documentation, looking at other plugins, and reading the core code. I've concluded that it's probably not trivial to write a plugin, and may not even be possible without modifying the core code. The problem is that there are many parts of the code that depend upon $author->getAffiliation() returning a string. The whole point would be to capture more sensible metadata, but that means every other plugin that exports metadata would need to be modified. That includes doaj, native, googleScholar, users, ROR, and perhaps others. This would introduce far too many dependencies between existing plugins. I'm surprised that nobody has flagged this before, given the reality of publishing practice. It sounds like it won't happen soon and I should look toward developing our own alternative. |
Heads-up that we're likely going to be implementing multiple affiliation support in the core (OJS, OMP, OPS) as part of the work to integrate ROR support into the applications. The thinking goes like this:
|
This is good news. While you're making a change, it's worth thinking about why you collect affiliations at all. Possible reasons are:
I'd particularly recommend looking over the JATS This is also related to how authors express their funding relationships, which is different from an affiliation. Crossref has announced that they will be transitioning from their Open Funder Registry to ROR, so ROR identifiers will be useful there. |
Hello, here's my proposal for adding multiple affiliations for authors and users in OJS. This includes support for both ROR and non-ROR affiliations. You can view the workflow here: https://youtu.be/FHwF4yBwzEA Some considerations:
@asmecher @GaziYucel @bozana please add more considerations if I have missed any |
I'm not a member of the OJS development team, but I think your UI looks quite nice. The goal of any UI is to help the user complete the task at hand with minimal fuss. There are several side constraints to consider here:
One of our sites uses a dropdown but encourages users to click on something to refine their query. We opted for this because we wanted to tell users how to encode the most accurate information possible into their LaTeX document. I think you could figure out how to merge this into the OJS workflow, but perhaps it's too complicated to drill down on the relationships of an organization. I think it just illustrates how complex the choice is for an author to select an affiliation from the ROR database. |
It's now three years since this issue was opened, and still no progress. As @kmccurley pointed out in his OP, it is a clear requirement due to publishing practice, especially for journals in the medical and science fields (we have many example articles where we need to use a specific separator such as "; " to distinguish multiple affiliations, and then use the separator to split them, e.g. for PubMed export). This issue needs to be relabelled as a major enhancement and a milestone should be set. As long as this is not solved, I can't recommend that my teams install the ROR plugin (although ROR is a fine solution for organization disambiguation). |
@mpbraendle, development on this is currently underway as part of the ROR integration into OJS 3.5. @GaziYucel, maybe you can share a couple of quick details? |
Hi @mpbraendle, thank you for your interest. As @asmecher pointed out above, I am currently working on this. The plan is to release this with OJS 3.5. You can view the workflow here: https://youtu.be/FHwF4yBwzEA The PR where I referenced this issue is part of the ROR / multiple affiliations integration into the core. This PR is solely to get the ROR data dump into the OJS database. This will be used for lookups, because using the ROR API for lookups seems too slow. This will add approximately 40MB to the database, refreshed bi-weekly. If you are interested in the development flow, this is the branch I am working on: https://github.com/GaziYucel/pkp-lib/tree/multiple-author-affiliations We decided to implement the new UI as you can see in the video, which we think is much better than before. This will make the interface more future-proof and more accessible. |
Hi all,
|
Today I analysed the ROR data dump again. Out of 111325 entries, 32641 entries have "no_lang_code". This means that ROR does not know the language for almost a third. This is much worse than I previously thought. This introduces a problem localizing the data. I evaluated several possibilities:
The first option would add more noise to the data dump (thanks Bozana). In my opinion, the second option is something ROR should do for:
Assuming that those entries are in their primary language is reasonable. On our end, I think the last one is the best solution. If anyone is interested, a recent dump can be found here: https://zenodo.org/api/records/14020449/files/v1.55-2024-10-31-ror-data.zip/content |
I too think the last option detailed by @GaziYucel seems like our best bet. |
Hi @GaziYucel, I understand that ror_display_name = the name in ror_display_lang, correct? If we inserted ror_display_name, we might not need to insert the names in no_lang_code. Let's think about whether this would be better (e.g. for performance) or whether we could then miss something... 🤔 And yes, we should not try to change or interpret/guess anything, and just take the ROR DB as it is. |
And when we are speaking about columns that we would like to have locally: what about the column 'active'? Are there any inactive RORs? And what does this exactly mean? |
@bozana I thought about inserting the ror_display_name into the rors table, but this would be a clone of what is in the ror_settings table. And since we are already querying the ror_settings table, I don't think we would gain much in performance, if anything. I think this is correct as it is. According to ROR, the name for ror_display_lang should also be in the names (names.types.label) field. |
Regarding isActive. I forgot to mention this in the questions/notes I wrote in the PR. This field can have three values: active, inactive or withdrawn. See https://ror.readme.io/docs/data-structure#status. I didn't fully implement this, because I couldn't figure out how to use these values for rors tables and the already saved affiliations for authors. Statistics:
Total: 111325 records (2024-10-31 data dump)
@bozana I will update the rors table as it is now: all active records get a "1", all others a "0". On the contributor screen, I will filter out everything that is not active, so only active RORs can be added and "inactive" and "withdrawn" ones are ignored. Affiliations that are already attached to authors will be left alone. This way we follow ROR again, because older publications will stay historically accurate.
During an upgrade from an earlier version, the ROR cache is filled (tables rors and ror_settings). After this, a migration is run for all affiliations currently saved for the authors. First, each affiliation is matched against the ROR names by exact match. Everything that remains is then migrated as a non-ROR affiliation. |
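The status filter and the upgrade migration described in this comment can be sketched roughly as follows. This is an illustration in Python, not the pkp-lib code: the record shape (`ror`, `names`, `is_active`), the function names, and the choice to keep the first ROR when two records share a name are all assumptions. The two ROR IDs are the ones quoted earlier in this thread.

```python
def selectable_rors(ror_cache):
    """RORs offered on the contributor screen: only records flagged active."""
    return [record for record in ror_cache if record["is_active"] == 1]


def migrate_affiliations(plain_text_affiliations, ror_cache):
    """Map legacy free-text affiliations to ROR or non-ROR affiliations.

    An exact name match yields a ROR affiliation; everything else is kept
    as a non-ROR affiliation (ror is None). Assumption: on duplicate names
    the first cached record wins.
    """
    ror_by_name = {}
    for record in ror_cache:
        for name in record["names"]:
            ror_by_name.setdefault(name, record["ror"])

    return [
        {"ror": ror_by_name.get(text), "name": text}
        for text in plain_text_affiliations
    ]


# Hypothetical cache rows; is_active is 1 for "active", 0 for anything else.
ror_cache = [
    {"ror": "03rjyp183", "names": ["University of Bern", "Universität Bern"], "is_active": 1},
    {"ror": "03y4dt428", "names": ["University of Pisa"], "is_active": 0},
]

migrated = migrate_affiliations(["University of Bern", "Univ. of Bern, Dept. X"], ror_cache)
offered = selectable_rors(ror_cache)
```

In this sketch the second affiliation has no exact match, so it survives as free text rather than being guessed at, which mirrors the "do not interpret" stance taken above.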
Hi @GaziYucel, let me check something, then I will come back to you again regarding ROR statuses. |
I write this here for documentation purposes. I will reference this from other places as needed. We have decided that we are not going to use the ror.org API for the lookups. Instead, a data dump of ROR is downloaded and added to the OJS database (tables rors/ror_settings). The refresh will be done bi-weekly. For this, a new Ror object is created in pkp-lib. This and all other new features can be found in this PR #10460. The UI part of this work can be found in this PR pkp/ui-library#417. This is a screen recording of how it works now: For the lookups, a new OJS API endpoint is created, which is The affiliations listed in the contributor screen follow the affiliation schema (pkp-lib/schemas/affiliation.json and pkp-lib/classes/Affiliation). What I do in the UI is as follows:
Schema for a single ror:
Schema for a single affiliation:
|
Apologies if this is already present in another issue or an internal development plan. I searched in issues but could not find anything related to multiple affiliations for an author.
A recent study of 22 million articles published in 2019 showed that "almost one in three publications was (co-)authored by authors with multiple affiliations..." and "the share of authors with multiple affiliations increased from around 10% to 16% since 1996." The fact that OJS does not support multiple affiliations for authors means it is increasingly out of step with the realities of academic publishing, and my organization is reluctant to continue using OJS for this reason (among others).
I believe that an author should be able to specify multiple affiliations for a submission. This meshes nicely with the need to uniquely identify affiliations through the use of ROR identifiers. The use of a free text field alone for affiliations makes it difficult for machines to determine that "UC Berkeley" and "University of California - Berkeley" and "University of California, Berkeley" are in fact the same institution.
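To make the matching problem concrete, here is a small Python sketch. It assumes a simple normalizer (lowercase, strip accents and punctuation); the identifier used at the end is a made-up placeholder, not a real ROR ID.

```python
import re
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


variants = [
    "UC Berkeley",
    "University of California - Berkeley",
    "University of California, Berkeley",
]

# Punctuation differences collapse, but the abbreviation stays distinct.
normalized = {normalize(v) for v in variants}

# With an identifier attached, the spelling no longer matters.
PLACEHOLDER_ID = "example-berkeley-id"  # hypothetical, for illustration only
records = [{"name": v, "ror": PLACEHOLDER_ID} for v in variants]
institutions = {r["ror"] for r in records}
```

String cleanup alone still leaves two distinct institutions here, whereas grouping by identifier leaves one.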
Having just installed the latest version of OJS, I noticed that affiliation information is stored in the underlying database as a row in author_settings using setting_name of affiliation, but the underlying table has a unique key of (author_id, locale, setting_name), which makes it impossible to store multiple affiliations unless the information is encoded in some way within the setting_value field. Our authors are currently listing multiple affiliations with ; to separate them, but this is a bad practice for the future (much like journals entering bogus email addresses when the field was required).
As I mentioned before, the use of multiple affiliations is already an extremely common practice. The listing of affiliations has multiple purposes, including citation analysis to rank institutions, which strongly affects their funding. The attachment of an affiliation to a paper also strongly influences the reputation of the paper itself, and the inability to list multiple affiliations contributes to a "winner-take-all" attribution of credit, which is damaging to second-tier institutions and their authors.
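The effect of that unique key can be reproduced with a throwaway SQLite table. This is a simplified stand-in for author_settings (only the four columns named above, with assumed types), not the actual OJS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE author_settings (
        author_id     INTEGER NOT NULL,
        locale        TEXT    NOT NULL,
        setting_name  TEXT    NOT NULL,
        setting_value TEXT,
        UNIQUE (author_id, locale, setting_name)
    )
    """
)


def add_affiliation(author_id, locale, value):
    """Insert one affiliation row for an author in a given locale."""
    conn.execute(
        "INSERT INTO author_settings VALUES (?, ?, 'affiliation', ?)",
        (author_id, locale, value),
    )


add_affiliation(1, "en_US", "University of Bern")
try:
    # Same author, locale and setting_name: the unique key rejects this row.
    add_affiliation(1, "en_US", "University of Pisa")
    second_insert_ok = True
except sqlite3.IntegrityError:
    second_insert_ok = False

rows = conn.execute(
    "SELECT setting_value FROM author_settings WHERE author_id = 1"
).fetchall()
```

Only the first affiliation is stored, which is why a second one has to be packed into setting_value if the table layout stays as it is.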
Accuracy of affiliations is also important for identifying potential conflict of interest among reviewers.
Multiple affiliations are already supported by the following:
native plugin XSD allows multiple affiliation tags per author, but when you import to OJS it appears to discard this information. Export also fails to report multiple affiliations, since the database only holds one.
Obviously other publishers have already embraced multiple affiliations. ACM has started capturing structured representations in their LaTeX class:
This can be useful in case the affiliation does not have a ROR ID, or the author wishes to define it as within a department or institute of a ROR entity (ROR does not catalogue these).
Unfortunately the schema of having a single affiliation is buried deeply in the codebase for OJS. Obviously the core developers of OJS are best able to understand a path forward for addressing this. One possible interim solution is to define a new field in author_settings with setting_name of affiliationList. Then populate this with a JSON encoding that can have version information inside it. The code that uses $author->getAffiliation() can over time be migrated to $author->getAffiliationList() to return a list of affiliations (perhaps with different locales!). An alternative is to allow author_settings to have multiple values for a given setting_name.
PRs:
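The interim affiliationList idea could look roughly like the Python sketch below. The wrapper keys (`version`, `affiliations`), the function names, and the "; " join used for legacy callers are assumptions for illustration; the entry shape follows the ror/name JSON example posted in this thread.

```python
import json

SCHEMA_VERSION = 1  # hypothetical version marker stored inside the JSON


def encode_affiliation_list(affiliations):
    """Serialize a list of affiliation dicts for the affiliationList setting."""
    return json.dumps(
        {"version": SCHEMA_VERSION, "affiliations": affiliations},
        ensure_ascii=False,
    )


def decode_affiliation_list(setting_value, locale="en_US"):
    """Return a list of affiliation dicts.

    A value that is not the versioned JSON wrapper is treated as a legacy
    plain-text affiliation and returned as a single non-ROR entry.
    """
    if not setting_value:
        return []
    try:
        payload = json.loads(setting_value)
    except json.JSONDecodeError:
        payload = None
    if isinstance(payload, dict) and "affiliations" in payload:
        return payload["affiliations"]
    return [{"ror": None, "name": {locale: setting_value}}]


def get_affiliation(setting_value, locale="en_US", separator="; "):
    """String view for callers that still expect a single affiliation string."""
    names = [
        entry["name"].get(locale, "")
        for entry in decode_affiliation_list(setting_value, locale)
    ]
    return separator.join(name for name in names if name)


stored = encode_affiliation_list([
    {"ror": "03rjyp183", "name": {"en_US": "University of Bern"}},
    {"ror": "03y4dt428", "name": {"en_US": "University of Pisa"}},
])
```

Keeping a string-returning accessor alongside the list accessor is one way export plugins could keep working while they are migrated one at a time.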