How-to select the right profile for your resource
In this how-to, we will guide you through the steps needed to select a Bioschemas profile that you will later use to mark up your own resources.
Last updated
You can find the available Bioschemas profiles at http://bioschemas.org/specifications. There, you will be presented with a list of all the current and stable profiles. You can hover over a profile name to see a quick description. Should you need more detailed information, just click on the profile name.
As seen in Figure 1, each profile shows details such as current version, release date, use cases, crosswalk, tasks and issues, usage examples and live deploys.
Use cases: Used as a basis for the profile as they are the
Crosswalk: Documentation on the brainstorming and process followed by a group in order to come up with a profile specification
Tasks & issues: Link to a GitHub space where you can report issues with a profile, see the assignees, and participate in the discussion
Example: Usage examples for the profile
Live deploy: Link to live deploy for the profile
Which profile is the right one for you will depend on your resource. In the next section, we present some hints on the profiles that have been most broadly used so far, i.e., mainly those customizing generic types rather than corresponding to specific Life Science entities.
A data catalog, also known as a data repository, commonly aggregates more than one dataset. If your resource supports only one dataset, you could still decide to mark it up as both a DataCatalog and a Dataset (this would make things easier if you are thinking of adding more datasets). Whenever more than one dataset is provided, however, it makes sense to mark up your resource as a DataCatalog.
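To make the single-dataset case concrete, here is a minimal sketch that builds JSON-LD for a DataCatalog containing one Dataset. It assumes the schema.org `dataset` property on DataCatalog; the helper name `make_data_catalog`, the resource names and the `example.org` URLs are hypothetical, and the properties shown are only a small subset, so check the profile pages for the ones actually expected.

```python
import json


def make_data_catalog(name, url, datasets):
    """Return a DataCatalog dict that lists its datasets.

    `datasets` is a list of dicts with "name" and "url" keys
    (a hypothetical input shape chosen for this sketch).
    """
    return {
        "@context": "https://schema.org",
        "@type": "DataCatalog",
        "name": name,
        "url": url,
        # schema.org property linking a catalog to the datasets it contains
        "dataset": [
            {"@type": "Dataset", "name": d["name"], "url": d["url"]}
            for d in datasets
        ],
    }


# Hypothetical resource that currently offers a single dataset.
catalog = make_data_catalog(
    name="Example Protein Repository",
    url="https://example.org/",
    datasets=[{"name": "Proteins", "url": "https://example.org/proteins"}],
)

print(json.dumps(catalog, indent=2))
```

Adding a second dataset later only means appending another entry to the `datasets` list, which is the convenience mentioned above.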
If your resource provides data and you can easily identify a common entity type for all the data contained in it, you should probably go for the Dataset profile. Let's clarify what we mean by "common type". Suppose you have chemical compounds including drugs, proteins and cells. If you see them all as the same thing, a chemical compound, you have one Dataset, and you have found the right profile for you. However, if you actually distinguish drugs from proteins from cells and so on, and (maybe even) tailor the information provided for each case, you have a data catalog with multiple datasets, and you should use both: one DataCatalog and multiple Datasets.
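The rule of thumb above can be sketched as a small helper. This is only an illustration of the reasoning, not an official Bioschemas tool; the function name `suggest_profiles` and its return shape are hypothetical.

```python
def suggest_profiles(entity_types):
    """Suggest how many DataCatalog/Dataset descriptions a resource needs.

    `entity_types` lists the entity types the resource itself
    distinguishes, e.g. ["drug", "protein", "cell"].
    """
    distinct = set(entity_types)
    if not distinct:
        raise ValueError("at least one entity type is required")
    if len(distinct) == 1:
        # Everything is treated as the same kind of thing: one Dataset.
        return {"DataCatalog": 0, "Dataset": 1}
    # Several distinguished types: one catalog plus one dataset per type.
    return {"DataCatalog": 1, "Dataset": len(distinct)}


# All entries are seen as chemical compounds.
print(suggest_profiles(["chemical compound"]))

# Drugs, proteins and cells are distinguished from each other.
print(suggest_profiles(["drug", "protein", "cell"]))
```

A resource with a single dataset may still choose to add a DataCatalog, for instance if more datasets are planned; the helper deliberately ignores that option to keep the rule visible.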
We do not officially provide a DataRecord profile, so it will be up to you which properties to use. But please keep reading. DataRecord is a tricky type, one that could be important for some resources but not so much (or not so easy to grasp) for others. DataRecord was born from the need to distinguish information coming from the database model from information inherent to the entity represented in the database. It is getting a bit philosophical, but bear with us. Any record in a database was created at some point in time, and was last modified at some other point. However, cells simply exist in nature; they do not quite correspond to something that is "created" by a conceptualization made by humans. They are a thing/entity, not a model of the thing.
So, if the distinction is important for you, go ahead and use DataRecord to represent the records/entries in your Dataset. Draw the line and use any BioChemEntity profile, e.g., protein, gene or sample, to represent the "thing/entity" that the DataRecord is about. If the distinction is not important for you, or it becomes too complicated to draw the line, use both at the same time to mark up your records, i.e., DataRecord + a BioChemEntity profile.
Disclaimer: It could be that your dataset is not about a BioChemEntity at all. It could well be about TrainingMaterial, Course, or something similar. Still, the same discussion applies to it.
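The two DataRecord options described above can be sketched as JSON-LD built in Python. Since there is no official DataRecord profile, the property choices are assumptions: `mainEntity` is used here to link the record to the entity it is about, and `dateCreated`/`dateModified` carry the record-level information. The function names, identifiers, dates and `example.org` URLs are all hypothetical.

```python
import json


def separate_markup(record_id, entity_id, entity_name, created, modified):
    """Option 1: keep the record and the entity it describes apart."""
    return {
        "@context": "https://schema.org",
        "@type": "DataRecord",
        "@id": record_id,
        # Information about the database entry itself.
        "dateCreated": created,
        "dateModified": modified,
        # Assumed link from the record to the "thing" it is about.
        "mainEntity": {
            "@type": "Protein",
            "@id": entity_id,
            "name": entity_name,
        },
    }


def combined_markup(record_id, entity_name, created, modified):
    """Option 2: one node typed as both DataRecord and Protein."""
    return {
        "@context": "https://schema.org",
        "@type": ["DataRecord", "Protein"],
        "@id": record_id,
        "name": entity_name,
        "dateCreated": created,
        "dateModified": modified,
    }


separate = separate_markup(
    "https://example.org/records/123",
    "https://example.org/records/123#entity",
    "Example protein",
    "2019-01-15",
    "2020-06-30",
)
combined = combined_markup(
    "https://example.org/records/123",
    "Example protein",
    "2019-01-15",
    "2020-06-30",
)

print(json.dumps(separate, indent=2))
print(json.dumps(combined, indent=2))
```

In the first form the dates clearly belong to the database entry, while in the second a consumer cannot tell whether they describe the record or the protein, which is the trade-off discussed above. Replacing Protein with another type, such as one for training material or a course, follows the same pattern.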
Coming soon