One Python package I enjoyed was glances, and I admired the distribution architecture someone set up for it:
pip install --user 'glances[action,browser,cloud,cpuinfo,docker,export,folders,gpu,graph,ip,raid,snmp,web,wifi]'
I wanted to copy it for my own package, Global-Chem, which contains a lot of code that I need to organize by the scientific field it belongs to. The back end is pure Python:
pip install global-chem
All the tools can be installed with:
pip install global-chem-extensions
The way I wanted it to work:
pip install global-chem[cheminformatics]
I set things up so that each scientific field gets its own directory, and each sub-field is its own package with its own dependencies. In the resulting file structure, each object in Python corresponds to a sub-package.
Each sub-package has its own dependencies, which I needed to deal with. The imports for those dependencies happen one level deeper, only when the sub-package's module actually gets imported. From the user's side, it can still be imported like an ordinary module:
from global_chem import GlobalChem
from global_chem_extensions import GlobalChemExtensions
gc = GlobalChem()
cheminformatics = GlobalChemExtensions().cheminformatics()
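Here is a minimal sketch of that deferred-import idea. It is not the actual GlobalChemExtensions code: the class name `LazyExtensions`, the `_FIELD_MODULES` mapping, and the module names in it are hypothetical, with standard-library stand-ins so the sketch runs anywhere.

```python
import importlib


class LazyExtensions:
    """Hypothetical stand-in for GlobalChemExtensions; not the real class."""

    # Hypothetical mapping: field name -> module that implements it.
    # "json" stands in for an installed sub-package; the second entry
    # stands in for one whose dependencies are missing.
    _FIELD_MODULES = {
        "cheminformatics": "json",
        "bioinformatics": "module_that_is_not_installed",
    }

    def _load(self, field):
        module_name = self._FIELD_MODULES[field]
        try:
            # The import happens here, on first use, not when the
            # top-level package itself is imported.
            return importlib.import_module(module_name)
        except ImportError as error:
            # Point the user at the extra that would pull in the
            # missing dependencies.
            raise ImportError(
                f"'{field}' needs extra dependencies; try: "
                f"pip install 'global-chem-extensions[{field}]'"
            ) from error

    def cheminformatics(self):
        return self._load("cheminformatics")

    def bioinformatics(self):
        return self._load("bioinformatics")
```

Creating `LazyExtensions()` never fails on its own. Only calling a field whose module cannot be imported raises an error, and that error names the extra to install.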
It turns out you can pip install a package without all of its dependencies, and it will still work as long as you never touch the module that depends on the missing software. This is where it gets tricky. I set up the setup.py file with the extras_require option, listing the dependencies for each submodule.
extras_require={
    'cheminformatics': [
        'partialsmiles', 'pysmiles', 'deepsmiles',
        'selfies', 'molvs', 'flask', 'plotly', 'kaleido',
        'bokeh', 'molpdf', 'dimorphite_dl',
        'scaffoldgraph'
    ],
    'bioinformatics': [
        'biopython', 'dna_features_viewer', 'biopandas',
        'pypdb'
    ],
    # No extra dependencies yet; an empty list is cleaner than ['']
    'development_operations': [],
    'quantum_chemistry': ['moly', 'kaleido', 'pyyaml==3.13'],
    'forcefields': [
        'rdkit-pypi', 'partialsmiles', 'pysmiles', 'deepsmiles',
        'selfies', 'molvs'
    ],
    'graphing': ['plotly'],
},
With this in place, the user installs the tools for one field like so:
pip install global-chem-extensions[cheminformatics]
That is not as clean as I would like. What I did next might not be best practice: I changed the global-chem setup file to include the same categories, each of which installs the tools through the matching global-chem-extensions extra.
extras_require={
    'graphing': ['global-chem-extensions[graphing]'],
    'forcefields': ['global-chem-extensions[forcefields]'],
    'bioinformatics': ['global-chem-extensions[bioinformatics]'],
    'cheminformatics': ['global-chem-extensions[cheminformatics]'],
    'quantum_chemistry': ['global-chem-extensions[quantum_chemistry]'],
    'development_operations': ['global-chem-extensions[development_operations]'],
    'all': [
        'global-chem-extensions[graphing]',
        'global-chem-extensions[forcefields]',
        'global-chem-extensions[bioinformatics]',
        'global-chem-extensions[cheminformatics]',
        'global-chem-extensions[quantum_chemistry]',
        'global-chem-extensions[development_operations]',
    ]
},
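Since every entry in that mapping follows the same pattern, it could also be generated instead of written by hand. This is only a sketch of one way to do it, not what the global-chem setup file actually does; the helper name `build_extras` and its `target` parameter are made up for illustration.

```python
# Field names taken from the extras_require mapping shown above.
FIELDS = [
    "graphing",
    "forcefields",
    "bioinformatics",
    "cheminformatics",
    "quantum_chemistry",
    "development_operations",
]


def build_extras(fields, target="global-chem-extensions"):
    """Hypothetical helper: map each field to the same-named extra of `target`."""
    extras = {field: [f"{target}[{field}]"] for field in fields}
    # 'all' pulls in every field's extra, in the order the fields are listed.
    extras["all"] = [req for field in fields for req in extras[field]]
    return extras


# This dict could then be passed to setup(extras_require=...).
extras_require = build_extras(FIELDS)
```

Adding a new field then means adding one name to `FIELDS`, and the 'all' entry stays in sync automatically.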
Which worked out pretty well:
pip install global-chem[cheminformatics] --upgrade
This took some time to figure out and was tricky to implement. I don’t know how others have built distribution architectures for Python packages, but I hope this one is robust enough for now.