Generating synthetic data is an attractive way to create open data, enabling health research and education while preserving patient privacy. This work reproduces the research outcomes of two previously published studies, which used private health data, using synthetic data generated with a method used in this publication, called HealthGAN.
Continue reading “Synthesizing Quality Open Data Assets from Private Health Research Studies”
Open Data Quality Dimensions and Metrics: State of the Art and Applied Use Cases
While the economic benefit of open data is undeniable, its use as an asset in industrial processes is still a challenge. The lack of quality is indeed a typical argument for not leveraging open data.
Continue reading “Open Data Quality Dimensions and Metrics: State of the Art and Applied Use Cases”
Models for Arabic Document Quality Assessment
Digital content has been increasing rapidly. This content can be generated, accessed and used by anyone and thus the need for quality assessment of web content before usage becomes an important issue. Devising methods to assess the quality of Arabic digital content is the focus of this paper.
Continue reading “Models for Arabic Document Quality Assessment”
Materia: A Data Quality Control Embedded Domain Specific Language in Python
Current solutions for data quality control (QC) in the environmental sciences are locked within proprietary platforms or reliant on specialized software. This can pose a problem for data users when attempting to integrate QC into their existing workflows.
Continue reading “Materia: A Data Quality Control Embedded Domain Specific Language in Python”
Enhancing the Interactive Visualisation of a Data Preparation Tool from in-Memory Fitting to Big Data Sets
To derive reliable insights or make evidence-based decisions, the starting point is to assess and meet a minimum level of data quality, either by those who publish the data (preferably) or by those who prepare data for analysis and develop specific analytics.
Continue reading “Enhancing the Interactive Visualisation of a Data Preparation Tool from in-Memory Fitting to Big Data Sets”
Analyzing OpenStreetMap Contributions at Scale: Introducing OSM-Interactions Tilesets
This demo shows how OSM-Interactions tilesets can be generated for specific objects such as highways or buildings to support OSM researchers, especially when conducting intrinsic data quality assessment.
Continue reading “Analyzing OpenStreetMap Contributions at Scale: Introducing OSM-Interactions Tilesets”
Technical Usability of Wikidata’s Linked Data
Wikidata is an outstanding data source with potential application in many scenarios. Wikidata provides its data openly in RDF. This study aims to evaluate the usability of Wikidata as a data source for robots operating on the web of data, according to specifications and practices of linked data, the Semantic Web and ontology reasoning.
Continue reading “Technical Usability of Wikidata’s Linked Data”
Semantic Data Integration and Quality Assurance of Thematic Maps in the German Federal Agency for Cartography and Geodesy
This paper presents a new concept of geospatial quality assurance that is planned for implementation in the German Federal Agency for Cartography and Geodesy. Linked open data is being enriched with Semantic Web data in order to create thematic maps relevant to the population.
Continue reading “Semantic Data Integration and Quality Assurance of Thematic Maps in the German Federal Agency for Cartography and Geodesy”
Evaluating the Quantity of Incident-Related Information in an Open Cyber Security Dataset
However, some questions remain over the quality and quantity of such open data. This paper presents the results of a recent case study that considers how feasible it is to answer a common question in Cyber security incident investigations, namely that “in an incident, who did what to which asset or victim, and with what result and impact”, for one such open Cyber security database.
Continue reading “Evaluating the Quantity of Incident-Related Information in an Open Cyber Security Dataset”
Approach to Improving the Quality of Open Data in the Universe of Small Molecules
This study describes an approach to improving the quality and interoperability of open data related to small molecules, such as metabolites, drugs, natural products, food additives, and environmental contaminants.
Continue reading “Approach to Improving the Quality of Open Data in the Universe of Small Molecules”