Issuing and utilization of data
Organizations or entities that hold data can issue it as verifiable credentials (VCs). Issuance can be done individually, in response to requests from data subjects, or in bulk, by entering the subjects' DIDs into an internal admin page. Either way, the issuer must transform its data model so that internal agency data can be released in the form of VCs. Currently, VCs can be issued in two syntactic representations: JSON-LD and JWT. Alternatively, without converting the data to these models, the issuer can encrypt the data file with the data subject's secret key, upload it to distributed storage, and include the file's hash value in the VC.
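The last option, anchoring an encrypted file to a VC by its hash, can be sketched as follows. This is a minimal illustration, not Hippocrat's actual issuance code: the `dataFile` property, the function names, and the example DIDs and storage URI are assumptions. The file is assumed to be encrypted already, and the proof (signature) that makes the credential verifiable is omitted. In a real JSON-LD credential, the custom `dataFile` term would also need to be defined in an additional context.

```python
import hashlib
import json
from datetime import datetime, timezone

# Base context of the W3C Verifiable Credentials Data Model 1.1.
VC_CONTEXT = "https://www.w3.org/2018/credentials/v1"


def build_file_credential(issuer_did, subject_did, encrypted_file, storage_uri,
                          issued_at=None):
    """Build an unsigned credential that references an encrypted file by hash.

    `dataFile` is a hypothetical property used for illustration only.
    """
    digest = hashlib.sha256(encrypted_file).hexdigest()
    issued_at = issued_at or datetime.now(timezone.utc)
    return {
        "@context": [VC_CONTEXT],
        "type": ["VerifiableCredential"],
        "issuer": issuer_did,
        "issuanceDate": issued_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "credentialSubject": {
            "id": subject_did,
            "dataFile": {"uri": storage_uri, "sha256": digest},
        },
    }


def file_matches_credential(credential, encrypted_file):
    """Check that a file fetched from storage matches the hash in the credential."""
    expected = credential["credentialSubject"]["dataFile"]["sha256"]
    return hashlib.sha256(encrypted_file).hexdigest() == expected


if __name__ == "__main__":
    ciphertext = b"already-encrypted bytes"  # placeholder for the encrypted file
    vc = build_file_credential(
        "did:example:issuer", "did:example:patient", ciphertext, "ipfs://example-cid"
    )
    print(json.dumps(vc, indent=2))
```

Because only the hash is placed in the credential, the file itself can stay in distributed storage, and anyone holding the credential can detect whether the stored file was altered after issuance.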
Expanding integration with systems that issue data in this manner is of utmost importance. For example, Raredata, a clinical research data management program, could supply the emerging Hippocrat ecosystem with trustworthy, high-quality data. Medical researchers, the primary users of Raredata, produce a wide range of quality data from selected patients for credible research. In particular, most clinical research involves collaboration between researchers from different institutions, which helps ensure that the data is representative. This collaboration inevitably results in systematic synchronization of data types, terminology, collection methods, and so on. Because the reproducibility of research outcomes is crucial, the incentive to deliberately manipulate data is greatly reduced. Together, these mechanisms support the reliability of both the original data and its distribution via VCs.
Another significant detail included in the VC is the commission rate, which determines the issuing organization's share when the issued data is sold. Until now, data-issuing organizations have had no feasible way to earn revenue from data that leaves the organization. With VCs, however, data circulates only at the patient's decision, and the proceeds from any data traded for a fee with the patient's consent are automatically divided between the patient and the issuing organization. This lets the issuing organization earn from data use in addition to the data issuance fee, which motivates it to prepare and issue more reliable and usable data.
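As a rough illustration of the automatic division, the sketch below splits a sale amount according to a commission rate carried in the VC. The function name, the use of basis points, and the rounding rule (the remainder goes to the patient) are assumptions made for the example. The source does not specify how the rate is encoded or how rounding is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleSplit:
    patient_share: int
    issuer_share: int


def split_sale(amount_units: int, commission_bps: int) -> SaleSplit:
    """Divide a sale between patient and issuer.

    amount_units   -- sale price in the smallest currency or token unit
    commission_bps -- issuer commission in basis points (hypothetical
                      encoding: 10_000 bps = 100%)
    """
    if amount_units < 0:
        raise ValueError("amount_units must be non-negative")
    if not 0 <= commission_bps <= 10_000:
        raise ValueError("commission_bps must be between 0 and 10000")
    # Integer arithmetic avoids floating-point rounding errors;
    # any remainder from the division stays with the patient.
    issuer_share = amount_units * commission_bps // 10_000
    return SaleSplit(patient_share=amount_units - issuer_share,
                     issuer_share=issuer_share)


if __name__ == "__main__":
    print(split_sale(1_000, 1_500))
```

Working in integer units means the two shares always add up exactly to the sale amount.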
Organizations developing AI healthcare solutions, leveraging existing ones to offer data-driven healthcare services, or aiming to screen clinical trial participants, can use Hippocrat to harness trusted, consented data. The first step requires preparing a data wallet and a Decentralized Identifier (DID) for the organization intending to use the data. They must also integrate a software development kit (SDK) tailored to data utilization into their internal services, readying them for data usage.
The process of data enrichment involves linking the individual's DID with the utilizing organization's DID. Typically, the individual is asked to scan a QR code for login, authentication, or connection purposes, at which point they provide their consent. The organization doesn't need to request all data at once. Initially, it can request only the information needed for basic service usage. Further data requiring a higher level of user consent can be requested separately. This method ensures that the right data is available at each user conversion stage.
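The staged approach can be sketched as a small piece of bookkeeping on the organization's side. Everything named here is hypothetical: the stage names, the data scopes, and the `Connection` type are illustrative and are not part of the Hippocrat SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical data scopes needed at each user conversion stage.
STAGE_SCOPES: Dict[str, FrozenSet[str]] = {
    "basic_service": frozenset({"nickname", "diagnosis_name"}),
    "personalized_care": frozenset(
        {"nickname", "diagnosis_name", "medication_history"}
    ),
}


@dataclass
class Connection:
    """Link between an individual's DID and the utilizing organization's DID."""
    individual_did: str
    organization_did: str
    consented_scopes: Set[str] = field(default_factory=set)


def scopes_to_request(connection: Connection, stage: str) -> List[str]:
    """Return only the scopes the stage needs that are not yet consented to."""
    return sorted(STAGE_SCOPES[stage] - connection.consented_scopes)


def record_consent(connection: Connection, scopes: Iterable[str]) -> None:
    """Store the scopes the individual has just consented to."""
    connection.consented_scopes.update(scopes)


if __name__ == "__main__":
    conn = Connection("did:example:patient", "did:example:org")
    first = scopes_to_request(conn, "basic_service")
    record_consent(conn, first)
    # Only the additional scope is requested at the later stage.
    print(scopes_to_request(conn, "personalized_care"))
```

Requesting only the difference keeps each consent prompt short and tied to the service the individual is about to use.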
In order to obtain consent, the organization must clearly communicate to the individuals who they are, what permissions are being requested, and what data will be used under what conditions. This is similar to presenting and obtaining legal notices for data collection and use, like conventional terms and conditions and privacy policies. However, the key difference is that organizations can use standardized terms certified by a governance framework.
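One way to picture such a request is as a record that cannot be shown to the individual until every required disclosure is filled in. The field names below, including `terms_id` as a reference to standardized terms, are assumptions for illustration rather than a defined Hippocrat schema.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ConsentRequest:
    requester_did: str      # who is asking
    requester_name: str     # human-readable name shown to the individual
    permissions: List[str]  # what permissions are being requested
    data_items: List[str]   # what data will be used
    conditions: str         # under what conditions the data will be used
    terms_id: str           # hypothetical ID of governance-certified standard terms


def missing_disclosures(request: ConsentRequest) -> List[str]:
    """List the required fields that are empty, in declaration order."""
    return [f.name for f in fields(request) if not getattr(request, f.name)]


if __name__ == "__main__":
    request = ConsentRequest(
        requester_did="did:example:org",
        requester_name="Example Research Org",
        permissions=["read"],
        data_items=["diagnosis_name"],
        conditions="",  # left empty on purpose
        terms_id="example-terms-v1",
    )
    print(missing_disclosures(request))
```

Referring to standardized terms by identifier, instead of embedding bespoke legal text, is what lets the individual recognize terms they have already reviewed.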
This approach has several advantages. Individuals aren't burdened with fine-tuning legal notices every time, as discussed in the management of consent (signatures) in the data wallet. Furthermore, organizations can adopt licenses that meet specific compliance requirements for different countries and contexts. This significantly reduces the cost and time associated with legal review, making it easier to obtain informed consent.
The features mentioned above could initially be implemented in Rarenote, a service for patients suffering from rare diseases. Along with the use of a data utilization SDK, patients could access healthcare and community services through the data stored in their personal data wallets. This SDK is freely integrable into other for-profit and non-profit products, requiring no contracts.
Rarenote could be the initial use case for the Hippocrat ecosystem, enabling patients to take advantage of their data wallets. Rarenote offers trusted, personalized information, healthcare solutions, and community experiences to patients with rare diseases, incurable cancers, and other conditions with limited treatment options. This is achieved by using data gathered with informed consent. Additionally, patient-derived health data generated during service use, as well as clinical data submitted via the wallet, can be processed into a format that can be used by other organizations that need data, such as pharmaceutical companies. This will allow for value-added data sales. The entire process is based on patient consent, and the revenue generated is shared as compensation between the patient and the data-issuing organization. This promotes sustainable data compensation and usage.
To make this scenario work, the conditions for data use and compensation are included in the patient's consent. Depending on the degree of privacy disclosure, data can be categorized into protected health information, de-identified health information, and limited datasets. Alternatively, it can be classified into identified, pseudonymous, and anonymous health information. Generally, the more identifying the data and the more information requested, the higher the reward. However, this also increases the likelihood that a patient declines the request in order to protect their privacy. Consequently, organizations wishing to use the data have an incentive to request only the essential data, so that patients are more willing to consent. Moreover, since data creation requires significant public resources, the system can be designed to allocate a larger percentage of the reward to public purposes the more pseudonymous or anonymous the information is and the more it is used for scientific research. This mechanism could potentially enhance the quality of public healthcare.
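The public-purpose allocation described above could be expressed as a simple rule. The sketch below is purely illustrative: the percentages are made-up placeholders, not Hippocrat parameters, and the only property taken from the text is that the public share grows as the data becomes less identifying and when it is used for scientific research.

```python
from typing import Dict, Tuple

# Hypothetical public-purpose share per privacy level, in basis points.
PUBLIC_SHARE_BPS: Dict[str, int] = {
    "identified": 500,
    "pseudonymous": 1_500,
    "anonymous": 2_500,
}
# Hypothetical extra public share when the data is used for scientific research.
RESEARCH_BONUS_BPS = 1_000


def public_share_bps(privacy_level: str, scientific_research: bool) -> int:
    """Return the share of the reward allocated to public purposes."""
    bonus = RESEARCH_BONUS_BPS if scientific_research else 0
    return PUBLIC_SHARE_BPS[privacy_level] + bonus


def split_reward(reward_units: int, privacy_level: str,
                 scientific_research: bool) -> Tuple[int, int]:
    """Return (remaining_reward, public_purpose_amount) in integer units."""
    public = reward_units * public_share_bps(privacy_level, scientific_research) // 10_000
    return reward_units - public, public


if __name__ == "__main__":
    print(split_reward(10_000, "anonymous", True))
    print(split_reward(10_000, "identified", False))
```

How the remaining reward is then divided between the patient and the issuing organization would follow the commission rate recorded in the VC.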