This repository has been archived by the owner on Aug 17, 2020. It is now read-only.

XPATH expression to filter out correct UTC Timestamp from node date (if value exists) #14

Open
hwidmann opened this issue Mar 27, 2014 · 3 comments

Comments

@hwidmann
Contributor

What I want:
Convert the node date to a UTC timestamp:

  • If the date is empty, don't map anything (tried with if (//dc:date[1]/[text()]), but it doesn't work).
  • Else if it is already in UTC format (e.g. 2007-01-12T11:09:59Z), leave it as it is (I don't know how to do this).
  • Else if it contains '-', take the part before it and append '-07-01T11:59:59Z' (implemented and working).
    (Better would be: else if the first 4 characters form a year YYYY, take them and append ...)
  • Else: just take it as it is (but it may be better to leave it out, because the datepicker works only with UTC format ...)

All in all: convert date/text() to UTC format and, if that is not possible, leave it empty (= no mapping).
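For reference, the rule cascade above can be sketched outside XPath. This is only an illustration in Python, not the mapper's actual code: the function name `to_utc_timestamp` and the constants are hypothetical, and the sketch assumes the stricter variant (leave out anything that does not start with a four-digit year).

```python
import re
from typing import Optional

# Full UTC timestamp such as 2007-01-12T11:09:59Z
UTC_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")
# Suffix used by the current mapping when only a year is known
DEFAULT_SUFFIX = "-07-01T11:59:59Z"


def to_utc_timestamp(date_text: Optional[str]) -> Optional[str]:
    """Return a UTC timestamp string, or None if nothing should be mapped.

    Hypothetical helper; the name and behaviour are assumptions for illustration.
    """
    if date_text is None:
        return None
    text = date_text.strip()
    if not text:
        return None  # empty date: no mapping
    if UTC_RE.fullmatch(text):
        return text  # already in UTC format: keep as it is
    if re.match(r"\d{4}(?!\d)", text):
        # starts with exactly four digits: treat them as the year
        return text[:4] + DEFAULT_SUFFIX
    return None  # anything else: leave it out


print(to_utc_timestamp("2007-01-12T11:09:59Z"))
print(to_utc_timestamp("2004-05"))
print(to_utc_timestamp(""))
```

Returning None for unrecognized values follows the "if not possible, leave it empty" summary rather than the "take it as it is" fallback.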

The current implementation in oai_dc.xml looks like this:

if (//dc:date[1]/[text()]) then ( if(contains(//dc:date[1]/text(), '-')) then concat(substring-before(//dc:date[1]/text(), '-'),'-07-01T11:59:59Z') else concat(//dc:date[1]/text(),'-07-01T11:59:59Z') )

@hwidmann hwidmann changed the title from "XPATH expression to filter out correct Timestamp (if exist)" to "XPATH expression to filter out correct UTC Timestamp from node date (if value exists)" on Mar 27, 2014
@hwidmann
Contributor Author

see SVN commit r3141

@hwidmann
Contributor Author

OK, I have now found an example on GitHub as well, e.g.
md-ingestion/oaitestdata/sdl-oai_dc/testset/xml/25325.xml
The conversion runs, but now you get:
{
"value": "September 2004-07-01T11:59:59Z",
"key": "PublicationTimestamp"
},

This is wrong (only UTC format is allowed) and the upload will fail.

We have to remove the month name or anything else during mapping, or just grep out the UTC format
YYYY-MM-DDThh:mm:ssZ and nothing else ...

I don't know how to do this with XPATH syntax. If it is not possible in general, it would be nice to get rid of 'September' and anything else before the year YYYY!
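One way to picture the "grep out" idea is a regular-expression search over the raw date text. The sketch below is in Python rather than XPath, and it is an assumption about how such a filter could look, not something the mapper does today; `extract_timestamp` and both patterns are made-up names.

```python
import re
from typing import Optional

# A complete UTC timestamp anywhere in the text
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")
# A standalone four-digit year, not part of a longer number
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")


def extract_timestamp(raw: str) -> Optional[str]:
    """Pull a UTC timestamp out of free text, ignoring anything around it.

    Hypothetical helper for illustration only.
    """
    found = TIMESTAMP_RE.search(raw)
    if found:
        return found.group(0)  # keep only the timestamp itself
    year = YEAR_RE.search(raw)
    if year:
        # only a year is known: append the same suffix the mapping uses
        return year.group(0) + "-07-01T11:59:59Z"
    return None  # no usable date found


print(extract_timestamp("September 2004"))
```

With this approach the month name in front of the year is simply skipped, because the search looks for the year anywhere in the string instead of at its start.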

@kjvandelooij

Hi Heinrich,

I discussed this with Lari, and it turns out that we need to do some post-processing to achieve this. Post-processing in this case means that the mapper will have to operate on the key-value pairs after mapping. Not sure when we will be able to do this.
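A minimal sketch of what such a post-processing step might look like, assuming the mapped output is a list of dictionaries with "key" and "value" entries as in the JSON snippet above. The function names are hypothetical and this is not the mapper's implementation; it only drops PublicationTimestamp pairs whose value is not a valid UTC timestamp.

```python
from datetime import datetime
from typing import Dict, List

UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def is_utc_timestamp(value) -> bool:
    """Check whether value parses as YYYY-MM-DDThh:mm:ssZ.

    Note: strptime also accepts months and days without zero padding.
    """
    try:
        datetime.strptime(value, UTC_FORMAT)
        return True
    except (ValueError, TypeError):
        return False


def drop_invalid_timestamps(
    pairs: List[Dict[str, str]], key: str = "PublicationTimestamp"
) -> List[Dict[str, str]]:
    """Keep every pair except those under `key` with a non-UTC value."""
    return [
        pair
        for pair in pairs
        if pair.get("key") != key or is_utc_timestamp(pair.get("value"))
    ]


mapped = [
    {"value": "September 2004-07-01T11:59:59Z", "key": "PublicationTimestamp"},
    {"value": "Some title", "key": "title"},
]
print(drop_invalid_timestamps(mapped))
```

Dropping the pair matches the "no mapping" behaviour requested earlier in the thread; repairing the value instead would be an alternative design.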

Greetings, kj

On Mon, 14 Apr 2014 17:36:00 +0200
hwidmann wrote:

OK, I have now found an example on GitHub as well, e.g.
md-ingestion/oaitestdata/sdl-oai_dc/testset/xml/25325.xml
The conversion runs, but now you get:
{
"value": "September 2004-07-01T11:59:59Z",
"key": "PublicationTimestamp"
},

This is wrong (only UTC format is allowed) and the upload will fail.

We have to remove the month name or anything else during mapping, or just
grep out the UTC format YYYY-MM-DDThh:mm:ssZ and nothing else ...

I don't know how to do this with XPATH syntax. If it is not possible in
general, it would be nice to get rid of 'September' and anything else
before the year YYYY!

