Let’s revisit some foundations of medicinal chemistry. Drugs can be administered into the fatty tissue under the skin (subcutaneously), directly into the veins (intravenously), or into the muscle (intramuscularly).
Intravenous injection tolerates a wide range, roughly pH 2–11, while subcutaneous injection has an ideal range of about pH 5–9. The tolerance is narrower under the skin to avoid discomfort. Physiological pH is around 6–8.
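When formulating a drug, the solubility of a compound depends on its state of ionization, which can be estimated with the Henderson-Hasselbalch equation:
pH = pKa + log10([A-]/[HA])
Here [A-] is the concentration of the deprotonated form (conjugate base) and [HA] is the concentration of the protonated form.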
The ionization state of a compound at a given pH can be estimated from its pKa value. Recall the rule of two, which states: if the pH is at least 2.0 units above the pKa, then about 99% of the compound in solution is in its deprotonated (conjugate base) form. Conversely, if the pH is at least 2.0 units below the pKa, the compound is predominantly in its protonated form. This is useful when formulating a compound for solubility, because we want compounds to be ionized to dissolve well.
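The rule of two follows directly from the Henderson-Hasselbalch equation. The snippet below is a minimal sketch to make that concrete; `fraction_deprotonated` is a hypothetical helper written for this post, and the pKa of 5.0 is an arbitrary illustrative value rather than data for a real compound.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of the compound present as the conjugate base A- at a given pH.

    Rearranging pH = pKa + log10([A-]/[HA]) gives [A-]/[HA] = 10**(pH - pKa),
    so the deprotonated fraction is ratio / (1 + ratio).
    """
    ratio = 10 ** (ph - pka)
    return ratio / (1 + ratio)


if __name__ == "__main__":
    pka = 5.0  # arbitrary illustrative value, not taken from any compound
    for offset in (-2, -1, 0, 1, 2):
        frac = fraction_deprotonated(pka + offset, pka)
        print(f"pH = pKa {offset:+d}: {frac:.1%} deprotonated")
```

Two units above the pKa the ratio [A-]/[HA] is 100, so the deprotonated fraction is 100/101, or about 99%. Two units below, it is 1/101, or about 1%.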
The pKa is highly dependent on the functional groups present in the molecule. Knowing the pKa values of different functional groups, can we determine which of them are suitable for injection?
Code
To investigate, let’s clone some code that can estimate or retrieve pKa values from SMILES. One programmer wrote a wrapper around the pubchempy package.
Install the Python package dependencies:
python -m pip install pubchempy
python -m pip install requests
python -m pip install pandas
We are going to investigate two compounds: aniline and phenol.
To retrieve the pKa values:
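import pprint as pp

# pka_lookup_pubchem comes from the cloned wrapper script; import it from wherever you saved that file.
# Rebind print to a pretty-printer so the returned dictionaries are easier to read.
print = pp.PrettyPrinter(indent=1, width=80).pprint

phenol = 'OC1=CC=CC=C1'
print(pka_lookup_pubchem(phenol, "smiles"))

aniline = 'NC1=CC=CC=C1'
print(pka_lookup_pubchem(aniline, "smiles"))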
Which gives us the result:
{'Canonical_SMILES': 'C1=CC=C(C=C1)O',
'IUPAC_Name': 'phenol',
'InChI': 'InChI=1S/C6H6O/c7-6-4-2-1-3-5-6/h1-5,7H',
'InChIKey': 'ISWSIDIOOBJBQZ-UHFFFAOYSA-N',
'Isomeric_SMILES': 'C1=CC=C(C=C1)O',
'Pubchem_CID': '996',
'Substance_CASRN': '108-95-2',
'pKa': '9.99 (at 25 °C)',
'reference': 'SERJEANT,EP & DEMPSEY,B (1979)',
'source': 'Pubchem'}
{'Canonical_SMILES': 'C1=CC=C(C=C1)N',
'IUPAC_Name': 'aniline',
'InChI': 'InChI=1S/C6H7N/c7-6-4-2-1-3-5-6/h1-5H,7H2',
'InChIKey': 'PAYRUJLWNCNPSJ-UHFFFAOYSA-N',
'Isomeric_SMILES': 'C1=CC=C(C=C1)N',
'Pubchem_CID': '6115',
'Substance_CASRN': '62-53-3',
'pKa': '4.6 (at 25 °C; aniline conjugate acid)',
'reference': 'PERRIN,DD (1972)',
'source': 'Pubchem'}
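Note that the 'pKa' field comes back as free text, not a number. To use it in a calculation, the numeric part has to be pulled out first. The helper below is a small sketch under a stated assumption: `parse_pka` is a hypothetical function (not part of the wrapper), and it assumes the pKa is the first number in the string, which holds for the results shown here but may not hold for every PubChem record.

```python
import re
from typing import Optional


def parse_pka(text: str) -> Optional[float]:
    """Return the first number found in a free-text pKa field, or None.

    Assumption: the pKa value precedes any temperature or other annotation,
    as in '9.99 (at 25 °C)'.
    """
    match = re.search(r"[-+]?\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


if __name__ == "__main__":
    for raw in ("9.99 (at 25 °C)", "4.6 (at 25 °C; aniline conjugate acid)"):
        print(raw, "->", parse_pka(raw))
```

The annotation is still worth reading by eye: for aniline it says the value belongs to the conjugate acid, which matters when applying the rule of two.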
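Phenol is returned with a pKa value of 9.99 and aniline with a pKa value of 4.6. According to our rule, phenol would be about 99% ionized (as the phenolate) only at a pH of about 12 or above. With a subcutaneous range of pH 5–9, and even an intravenous range of pH 2–11, would phenol be suitable for injection? No. For aniline, note that the 4.6 refers to the conjugate acid, so the rule runs in the other direction: the protonated, ionized form dominates at a pH of about 2.6 or below, and at pH 6.6 or above it is almost entirely the neutral free base. A pH of 2.6 falls outside the subcutaneous range but inside the intravenous one, so aniline could be formulated in ionized form for intravenous use (it could be toxic for other reasons, though).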
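The same reasoning can be written out as a short check. This is a sketch, not part of the wrapper: the function names are made up for illustration, the tolerance ranges are the ones quoted at the top of this post, and aniline's 4.6 is treated as the pKa of its conjugate acid, as the PubChem record labels it, so its ionized (protonated) form is favored at low pH.

```python
# Tolerance ranges quoted earlier in this post (low pH, high pH).
ROUTES = {"intravenous": (2.0, 11.0), "subcutaneous": (5.0, 9.0)}


def ph_for_99_percent_ionized(pka: float, group: str) -> float:
    """pH threshold given by the rule of two.

    'acid': HA -> A-, ionized at or above pKa + 2.
    'base': BH+ -> B, ionized at or below pKa - 2 (pKa of the conjugate acid).
    """
    if group == "acid":
        return pka + 2.0
    if group == "base":
        return pka - 2.0
    raise ValueError("group must be 'acid' or 'base'")


def ionizable_within(pka: float, group: str, low: float, high: float) -> bool:
    """True if some pH inside [low, high] keeps the compound ~99% ionized."""
    threshold = ph_for_99_percent_ionized(pka, group)
    return high >= threshold if group == "acid" else low <= threshold


if __name__ == "__main__":
    compounds = (("phenol", 9.99, "acid"), ("aniline", 4.6, "base"))
    for name, pka, group in compounds:
        threshold = ph_for_99_percent_ionized(pka, group)
        ok = [r for r, (lo, hi) in ROUTES.items()
              if ionizable_within(pka, group, lo, hi)]
        print(f"{name}: threshold pH {threshold:.2f}, feasible routes: {ok or 'none'}")
```

This only checks whether a tolerated pH exists at which the compound is ionized. It says nothing about toxicity, intrinsic solubility, or stability at that pH.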
What if we methylate the aniline?
smiles_string = 'CN(C)C1=CC=CC=C1'
print(pka_lookup_pubchem(smiles_string, "smiles"))
The pKa that is returned:
{'Canonical_SMILES': 'CN(C)C1=CC=CC=C1',
'IUPAC_Name': 'N,N-dimethylaniline',
'InChI': 'InChI=1S/C8H11N/c1-9(2)8-6-4-3-5-7-8/h3-7H,1-2H3',
'InChIKey': 'JLTDJTHDQAWBAV-UHFFFAOYSA-N',
'Isomeric_SMILES': 'CN(C)C1=CC=CC=C1',
'Pubchem_CID': '949',
'Substance_CASRN': '121-69-7',
'pKa': '5.07 at 25 °C',
'reference': 'Haynes, W.M. (ed.). CRC Handbook of Chemistry and Physics. 94th '
'Edition. CRC Press LLC, Boca Raton: FL 2013-2014, p. 5-100',
'source': 'Pubchem'}
According to our rule, it is acceptable, much like aniline. However, the tertiary nitrogen has no N–H proton left to donate, which hurts its aqueous solubility. N,N-Dimethylaniline is therefore not an effective compound for injection.
This can quickly get more complex as we move forward, but it’s important to have a basic idea of solubility and of how pH affects a functional group.