persName vs. orgName #18

Open
rettinghaus opened this issue Oct 13, 2018 · 14 comments
Comments

@rettinghaus
Contributor

What to do when it is unclear if the addressee is a person or an organization (which is often the case with some publishers)?
Is usage of <name> preferable in those cases?

@rettinghaus
Contributor Author

And if <name> is used, how do we state clearly that it's a correspondent and not a place?

@peterstadler
Member

> What to do when it is unclear if the addressee is a person or an organization (which is often the case with some publishers)?
> Is usage of <name> preferable in those cases?

That's exactly what I am doing. You'll find several <name> tags in our WeGA-BEACON where I fail to determine (automatically) whether it's a person, an organization, or a group of people (family).

> And if <name> is used, how to state clearly, that it's a correspondent, and not a place?

I don't think it's really necessary for CMIF because for automated processing we rely on the IDs only, not the name strings (don't we?)
That said, by convention we could opt for <name> equals unknown agent and <placeName> equals unknown place?
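
A minimal sketch of what that convention might look like in a CMIF entry. The correspondent name and the ref URLs are hypothetical placeholders, not real authority records:

```xml
<!-- Sketch only: name and ref values below are invented for illustration -->
<correspAction type="received">
    <!-- Unclear whether this is a person, an organization or a family,
         so the generic <name> is used instead of <persName>/<orgName> -->
    <name ref="https://example.org/authority/agent-123">Müller &amp; Co.</name>
    <!-- Places keep their own element, so agent and place stay distinguishable -->
    <placeName ref="https://example.org/authority/place-456">Leipzig</placeName>
</correspAction>
```

Since automated processing would rely on the @ref values, the element name here mainly tells a consumer whether the entry is an agent or a place.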

@rettinghaus
Contributor Author

Makes sense. So <name>, <persName> and <orgName> always refer to a correspondent and <placeName> (obviously) to a place.

@StefanDumont
Contributor

StefanDumont commented Mar 29, 2019

When I look at the WEGA beacon, I can hardly see any doubt for a human reader as to whether the correspondent is a person or an organization. In my opinion, this is actually a processing problem within the edition. In the few cases that actually have an "unknown" sender, I think one can choose <persName> as the default with a clear conscience, unless there are known reasons to the contrary. I would therefore argue for staying with <persName> and <orgName>.

@rettinghaus
Contributor Author

rettinghaus commented Mar 16, 2023

Thinking about this a bit more:
How should an unknown correspondent be encoded?
Having the string "unknown" or "unbekannt" in <persName/> et al. doesn't make much sense, does it?
Furthermore, this isn't compatible with multilingual environments (like @correspSearch).

Also using @ref to point to an "unknown person" makes it very hard to collect/distinguish all these cases (as we can see currently in @correspSearch):

[two images]

In these cases I would propose to keep the respective correspAction free from any correspondent information, so an application like correspSearch can infer that this person/organization is indeed unknown (and may display it correctly).
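
A minimal sketch of that proposal, assuming the sender is unknown while place and date are known. The place, date and ref URL are invented for illustration:

```xml
<!-- Sketch only: place, date and ref are hypothetical -->
<correspAction type="sent">
    <!-- No <persName>, <orgName> or <name> here:
         a consuming application would read the sender as unknown -->
    <placeName ref="https://example.org/authority/place-456">Leipzig</placeName>
    <date when="1820-01-01"/>
</correspAction>
```

This only works while at least one other child (place or date) is available, because an entirely empty correspAction would not be valid.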

@skurzinz
Contributor

> In these cases I would propose to keep the respective correspAction free from any correspondent information, so an application like correspSearch can infer that this person/organization is indeed unknown (and may display it correctly).

From the perspective of validating CMIF (@StefanDumont does that for correspSearch), I would suggest it is easier to infer that the edition was unable to identify the correspondent from a special fixed value ('NN|unknown|…'?) in @ref (which would have to be defined in the schema as an alternative to a URI), or indeed from a @ref pointing to a "known unknown", than from a missing element, which could just as well be a simple encoding error. No element (or even an empty @ref value) does not make the statement "editors do not know" explicit.

@peterstadler
Member

From my perspective it'd be totally ok to have a "missing" element indicating an unknown entity – since we assume that every correspAction (in CMIF) is a communicative event with at least one agent, place and date. So missing information equals unknown. Whether this is considered harder to validate depends on the workflow, imho.

That said, I do not feel strongly about it (i.e. missing elements) but wouldn't want a magic token within @ref and dislike the current <persName>Unbekannt</persName>. So, something I'd prefer would be <name>unknown</name> or, even better, something along the lines of <name>$unknown_name$</name>, because I bet there are pseudonyms like "unknown" out there. (E.g., "Unknown Man" is the pseudonym of Alexander von Dusch.)

@StefanDumont
Contributor

The problem with missing elements is:

  • it is not clear whether it is an intended "unknown" or an error (we had some cases in our own editions where such problems occurred)
  • at the moment correspAction needs at least one element to be valid. Especially on the receiving side, the person/org is often the only information we have.

So I would prefer to have at least a name with some string "unknown" in it.

@skurzinz
Contributor

skurzinz commented May 5, 2023

Just because I had to look it up for one of our editions:

In a similar question, a note in https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-author.html specifies:

> Where an author is unknown or unspecified, this element may contain text such as Unknown or Anonymous.

Here, we don’t have a wrapper element such as <author>, so a <name> or <pers|orgName> with "Unknown" would probably fit the bill best (I halfheartedly withdraw my above suggestion to create a fixed @ref attribute value).

Edge case: Letters to a broader public, where the recipient is unspecified. How many receiving correspActions do we use? (1, with denomination "everyone"; 0, as it's unspecified; or multiple, one for each [recorded] act of reception)

@rettinghaus
Contributor Author

@skurzinz I guess you've added some elements with brackets in your reply? Those disappear if you do not mark them as code.

@skurzinz
Contributor

skurzinz commented May 5, 2023

> @skurzinz I guess you've added some elements with brackets in your reply?

Thanks for notifying, edited.

@rettinghaus
Contributor Author

@skurzinz actually we do have a wrapper, namely correspAction, and plain text isn't allowed there.
Furthermore, as @peterstadler pointed out, we expect at least one agent, place and date there.

The problem with <name>Unknown</name> is not only that it is hard to parse in multilingual environments (like correspSearch); you would also never know whether it is an agent or a place. Take the following:

<correspAction type="sent">
    <name>Unknown</name>
    <name>Unknown</name>
    <date when="2023-05-08" />
</correspAction>

Does this mean that both the sender and the place are unknown, or is it a letter from two unknown persons?

correspAction requires at least one child element, so we can be quite sure that missing information is unknown information.

@skurzinz
Contributor

skurzinz commented May 8, 2023

@rettinghaus you are right, I did not think about that. My bad. <name>Unknown</name> indeed is not useful in any way.

@StefanDumont
Contributor

@skurzinz @rettinghaus I think it would be okay, if noted properly in the documentation. As @rettinghaus stated:

> Makes sense. So <name>, <persName> and <orgName> always refers to a correspondent and <placeName> (obviously) to a place.

Since <placeName> is often missing, especially in the receiver section, I would recommend not encoding unknown placeNames explicitly. But if one does, <placeName>Unknown</placeName> could be used without disadvantages. So, <name>Unknown</name> would only refer to unknown persons, institutions or companies.
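
A minimal sketch of how the earlier ambiguous example might read under this convention, assuming it is documented as such. The date is taken from that example; everything else is illustrative:

```xml
<!-- Sketch only: under the documented convention, <name> is always an agent -->
<correspAction type="sent">
    <!-- Unknown sender (person, institution or company) -->
    <name>Unknown</name>
    <!-- Unknown place: either omitted entirely (recommended above)
         or stated explicitly like this -->
    <placeName>Unknown</placeName>
    <date when="2023-05-08"/>
</correspAction>
```

Read this way, two <name>Unknown</name> children in one correspAction would mean two unknown agents, never an agent plus a place.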
