Contributing
There are many ways to contribute to BDI-Kit, such as improving the codebase, reporting issues or bugs, enhancing the documentation, reviewing pull requests from other developers, adding new matching methods, or expanding support for additional standards (data models). See the instructions below to get started!
Formatting the Code
We format code using black. The CI runs on every pull request and fails if the code is not properly formatted. To make sure the formatting is correct, follow these steps.
Make sure you have black installed:
$ pip install black
To format the code, run the following command before committing your changes:
$ make format
Or you can use the black command directly:
$ black ./bdikit/ ./tests/ ./scripts/
Adding New Matching Methods
Contributors can add new methods for schema and value matching by following these steps:
Create a Python module inside the corresponding task folder (e.g., bdikit/schema_matching).
Define a class in the module that implements a base class. For schema matching, the base class can be BaseSchemaMatcher or BaseTopkSchemaMatcher, which require you to implement match_schema() or rank_schema_matches(), respectively. For value matching, the base class can be BaseValueMatcher or BaseTopkValueMatcher, which require you to implement match_value() or rank_value_matches(), respectively.
Add a new entry to the Enum class (e.g., SchemaMatchers) in the task folder's matcher_factory.py (e.g., bdikit/schema_matching/matcher_factory.py). Make sure the entry uses the correct import path for your module so that the class can be loaded without errors.
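The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than BDI-Kit's actual code: the BaseSchemaMatcher and SchemaMatchers classes below are local stand-ins, and the match_schema() signature (lists of column names in, a dictionary out), the ExactNameSchemaMatcher class, the exact_name module path, and the layout of the Enum values are all assumptions made for the example. Check the real base class and matcher_factory.py for the signatures and entry format that BDI-Kit expects.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, List


class BaseSchemaMatcher(ABC):
    """Local stand-in for the BDI-Kit base class (signature is an assumption)."""

    @abstractmethod
    def match_schema(self, source: List[str], target: List[str]) -> Dict[str, str]:
        ...


class ExactNameSchemaMatcher(BaseSchemaMatcher):
    """Hypothetical matcher: pairs columns whose normalized names are equal."""

    @staticmethod
    def _normalize(name: str) -> str:
        # Ignore case, surrounding whitespace, and space/underscore differences.
        return name.strip().lower().replace(" ", "_")

    def match_schema(self, source: List[str], target: List[str]) -> Dict[str, str]:
        # Index target columns by normalized name; the first occurrence wins.
        target_by_key: Dict[str, str] = {}
        for column in target:
            target_by_key.setdefault(self._normalize(column), column)

        # Keep only the source columns that have a target with the same key.
        matches: Dict[str, str] = {}
        for column in source:
            key = self._normalize(column)
            if key in target_by_key:
                matches[column] = target_by_key[key]
        return matches


class SchemaMatchers(Enum):
    """Local stand-in for the Enum in matcher_factory.py (value layout assumed)."""

    # Hypothetical entry: a short name plus the import path of the new class.
    EXACT_NAME = (
        "exact_name",
        "bdikit.schema_matching.exact_name.ExactNameSchemaMatcher",
    )

    def __init__(self, matcher_name: str, matcher_path: str):
        self.matcher_name = matcher_name
        self.matcher_path = matcher_path


matcher = ExactNameSchemaMatcher()
result = matcher.match_schema(
    ["Age", "Tumor Stage", "id"],
    ["age", "tumor_stage", "gender"],
)
print(result)
```

In a real contribution, the class would subclass the base class imported from bdikit instead of the stand-in, and the new Enum entry would be added to the existing Enum rather than to a new one.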
Adding New Standards (Data Models)
Contributors can extend BDI-Kit to additional standards (data models) by following these steps:
Create a Python module inside the “standards” folder (bdikit/standards).
Define a class in the module that implements BaseStandard. This class should implement three methods:
get_attributes(): Returns a list of all the attributes (strings) of the standard.
get_attribute_values(): Returns a dictionary where the keys are attribute names and the values are lists of possible values for each attribute.
get_attribute_metadata(): Returns a dictionary where the keys are attribute names and the values are dictionaries containing these mandatory fields for each attribute: attribute_description, value_names, and value_descriptions. Other fields can be included as well, but their values must be strings or lists of strings.
Add a new entry to the class Standards(Enum) in bdikit/standards/standard_factory.py. Make sure to add the correct import path for your module to ensure it can be accessed without errors.
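As a rough illustration of the three methods described above, the sketch below defines a toy standard with two attributes. It is self-contained and does not import BDI-Kit: the BaseStandard class here is a local stand-in, the argument-free method signatures are an assumption, and ToyStandard with its attributes and values is invented purely for the example. Only the three mandatory metadata fields (attribute_description, value_names, value_descriptions) come from the description above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseStandard(ABC):
    """Local stand-in for the BDI-Kit base class (signatures are assumptions)."""

    @abstractmethod
    def get_attributes(self) -> List[str]:
        ...

    @abstractmethod
    def get_attribute_values(self) -> Dict[str, List[str]]:
        ...

    @abstractmethod
    def get_attribute_metadata(self) -> Dict[str, Dict[str, Any]]:
        ...


class ToyStandard(BaseStandard):
    """Hypothetical standard with two categorical attributes."""

    # Invented example data; a real standard would load its own definition.
    _SPEC: Dict[str, Dict[str, Any]] = {
        "vital_status": {
            "attribute_description": "Survival state of the patient.",
            "value_names": ["Alive", "Dead"],
            "value_descriptions": ["Patient is living.", "Patient is deceased."],
        },
        "specimen_type": {
            "attribute_description": "Kind of biological sample collected.",
            "value_names": ["Blood", "Tissue"],
            "value_descriptions": ["Blood draw.", "Solid tissue sample."],
        },
    }

    def get_attributes(self) -> List[str]:
        # All attribute names defined by the standard.
        return list(self._SPEC)

    def get_attribute_values(self) -> Dict[str, List[str]]:
        # Attribute name -> list of allowed values.
        return {name: list(meta["value_names"]) for name, meta in self._SPEC.items()}

    def get_attribute_metadata(self) -> Dict[str, Dict[str, Any]]:
        # Attribute name -> metadata with the three mandatory fields.
        return {name: dict(meta) for name, meta in self._SPEC.items()}


standard = ToyStandard()
attributes = standard.get_attributes()
values = standard.get_attribute_values()
metadata = standard.get_attribute_metadata()
print(attributes)
```

Keeping a single specification dictionary and deriving all three return values from it is one way to prevent the attribute list, the value lists, and the metadata from drifting apart.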