Removing “e.g.” in the sentence that includes “such as”
Removing accessibility measures. Gabriel said the data should be made available, usable, and meaningful by adding the metadata.
o Action: InKyung to check what was the original rationale for including the term “accessibility” in version 5.1 of GSBPM.
Removing quality characteristics and coefficient of variation. This was agreed, but following the comment by France (Florian), it was suggested that the metadata might be updated at this stage regarding calculated quality metrics, or that the data and metadata could be packaged into a product. A quality report can be made in the Evaluate phase.
Regarding the extra sentence on explanation of methodology, it was suggested this could be something about machine learning (since methodology in general should already be documented in the Design phase).
o Action: Giorgia to develop a sentence or two about methodology for machine learning methods.
Make sub-process 5.1 “integrate” as a separate phase of GSBPM (following the suggestion by the UK)
Joni suggested that having 5.1 as a separate phase might provide a place for sub-processes underneath, dealing with things such as pseudo-anonymisation, or input-privacy methods such as multi-party computation. However, the decision already made about pseudo-anonymisation was to add it to the very last sub-process of the Collect phase.
Gabriel expressed support for the name “acquisition” for Phase 4 to reflect the use of secondary data sources. However, he wanted to keep integration separate from collect/acquire, with integration coming afterward. In his opinion, integration shouldn’t be a separate phase.
Carlo said this would be too big a change, which could be difficult for users to adapt to. Gabriel pointed out that data integration is not something new for countries having population registers.
Geocoding could arguably be considered an example of data integration (e.g. referencing address information to add coordinates to the dataset). Additionally, geocoding can be used as a means of integrating datasets by geographical coordinates. This has relevance to GeoGSBPM (which mentions geocoded data as a means of integration in the Process phase, but mentions geocoding itself in the Collect phase).
o Action: InKyung offered to revisit 5.1 in Geo GSBPM to see if there is anything that should be mentioned in 5.1 for the next version of GSBPM.
Outcome: In general, there was not strong support to make 5.1 a separate phase, though the UK could see if there are further reasons for doing so.
Action: InKyung to talk about whether to rename the Collect phase to Acquire at the next meeting. (This could usefully feed into the revised GSIM.)
Add something on training ML models
There was discussion about whether this should happen in the Design phase (if data is collected repeatedly, but the model is not trained each time). If not, it cannot take place until after data has been collected, although GSBPM is itself not linear. Gabriel pointed out that while survey collection is driven by demand for those data, use of secondary data is based on what is already available, so it is not the data that are designed in advance, but rather the models used to consume those data.
Joni said that training, assessing or building machine learning models could be a part of the Build phase, or the Process phase. But where does hyperparameter optimisation come in? InKyung suggested it could be within Capability Building before being used in production.
Outcome: InKyung suggested including it within the Build phase.
However, geospatial variables are not the only ones that can be used for data integration; other variables might also need to be processed prior to data integration.
Outcome: To add a sentence to para 75 to reflect the fact that some variables could be processed prior to others if they are required for data integration (for example geospatial variables if used for that purpose).
Add in paragraph 77 (introduction of Process phase): “It is also desirable that these phases are conducted simultaneously for the benefit of collection activities.” We believe it is important to reinforce in this way the link between collection activities and microdata analysis, to clarify the interdependence among sub-processes.
It was felt that this extra text is unnecessary given that GSBPM is not linear.
Outcome: To incorporate “processing activities may often be performed during collection” into para 77.
Present: Antti Santaharju, Carlo Vaccari, Christopher Jones, Claudia Brunini, Florian Vucko, Gabriel Gamez, Giorgia Simeoni, Inkyung Choi, Joni Karanka, Mahmoud Jlassi