Here's an example of how to do two-tier sorting of nodes. The functions are poorly named for this general case, but start with "publishable" and you reduce the token-level analytical edition to a read/analyzed canonical-level Corpus. NB: it lacks the required feature of identifying the new exemplar by a unique exemplar identifier.
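// Group readings by URN and join each group's readings, in their original
// sequence, into a single space-separated passage string.
def passagesFromTokens(readingPairs: Vector[(CtsUrn, String)]) = {
  // Tag each (urn, reading) pair with its position in the input sequence.
  val triples = readingPairs.zipWithIndex.map( v => TextTriple(v._1._1, v._1._2, v._2))
  val trVect = triples.groupBy(_.urn).toSeq.toVector
  // First tier: sort the readings within each URN group by sequence number.
  trVect.map{ case (k,v) => (k, v.sortBy(_.seq).map(_.reading).mkString(" ") ) }
}
// Look up the index recorded for a URN in urnSeq.
def idxForUrn(u: CtsUrn, urnSeq: Vector[(CtsUrn, Int)]) = {
  urnSeq.filter(_._1 == u)(0)._2
}
def publishable(scholionGroup: String, publType: String) : Vector[CitableNode] = {
  val tkns = tokensForDocument(scholionGroup, publType)
  val passages = passagesFromTokens(tkns)
  // For final sort: index each distinct URN by order of first appearance.
  val urnSeq = tkns.map(_._1).distinct.zipWithIndex
  // Second tier: sort the passages themselves back into that order.
  val sortedFinal = passages.sortBy{ case (k,v) => idxForUrn(k, urnSeq) }
  sortedFinal.map { case (k,v) => CitableNode(k,v) }
}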
E.g., collapse a token-level exemplar to a canonical-level exemplar.
The function should take a parameter for the level to collapse to, and a String value for the new exemplar identifier.
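A minimal sketch of what such a function might look like, assuming a simplified stand-in for the URN type. `SimpleUrn`, `Node`, `collapseTo`, `withExemplar` and `collapse` are all hypothetical names invented for illustration; none of them is the `CtsUrn` or `CitableNode` API. The sketch is written script-style, like the snippet above, and keeps the same two-tier ordering: readings stay in sequence within a passage, and passages stay in order of first appearance.

```scala
// Hypothetical, simplified stand-in for a citable URN (NOT the CtsUrn API).
final case class SimpleUrn(work: String, exemplar: Option[String], passage: Vector[String]) {
  // Keep only the first `level` components of the passage reference.
  def collapseTo(level: Int): SimpleUrn = copy(passage = passage.take(level))
  // Replace the exemplar identifier with the given value.
  def withExemplar(id: String): SimpleUrn = copy(exemplar = Some(id))
}

// Hypothetical stand-in for a citable node: a URN plus its text.
final case class Node(urn: SimpleUrn, text: String)

// Collapse nodes to `level` and label the result with `exemplarId`.
def collapse(nodes: Vector[Node], level: Int, exemplarId: String): Vector[Node] = {
  // Re-key every node by its collapsed, relabelled URN.
  val keyed = nodes.map(n => (n.urn.collapseTo(level).withExemplar(exemplarId), n.text))
  // Order of first appearance of each collapsed URN, as a Map for direct lookup.
  val order: Map[SimpleUrn, Int] = keyed.map(_._1).distinct.zipWithIndex.toMap
  keyed
    .groupBy(_._1)   // groups keep their members in input order
    .toVector
    .sortBy { case (urn, _) => order(urn) }
    .map { case (urn, pairs) => Node(urn, pairs.map(_._2).mkString(" ")) }
}

// Usage: three tokens cited at depth 3, collapsed to depth 2.
val tokens = Vector(
  Node(SimpleUrn("work", Some("tokens"), Vector("1", "1", "1")), "a"),
  Node(SimpleUrn("work", Some("tokens"), Vector("1", "1", "2")), "b"),
  Node(SimpleUrn("work", Some("tokens"), Vector("1", "2", "1")), "c")
)
val collapsed = collapse(tokens, 2, "reading")
```

Here the order lookup uses a `Map` rather than filtering a vector for each key; that is a design choice of this sketch, not something the code above does.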