Although automatic speech recognition (ASR) can be considered largely solved for high-resource languages such as English, performance on accented speech remains significantly worse than on standard varieties. The recent emergence of large pretrained ASR models has spurred numerous transfer learning and domain adaptation efforts, in which performant general-purpose ASR models are fine-tuned for specific domains, such as clinical or accented speech. However, African-accented clinical speech recognition remains largely unexplored. We propose a semantically aligned, domain-specific multitask learning framework with generative and discriminative variants, and demonstrate empirically that it enhances ASR, outperforming a single-task architecture by 2.5% (relative). We find that the generative multitask design improves generalization to unseen accents, while the discriminative multitask approach improves clinical ASR for both majority and minority accents.
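To make the multitask idea concrete, the following is a minimal sketch of a discriminative multitask ASR setup: a shared encoder feeds both a CTC transcription head and an auxiliary utterance-level classifier, and the two losses are combined with a weighting term. This is an illustrative assumption, not the paper's actual architecture; the auxiliary accent-classification task, the module names, and the weight `alpha` are all hypothetical.

```python
# Hypothetical sketch of a discriminative multitask ASR model, assuming a
# shared encoder, a CTC head for transcription, and an auxiliary
# classification head (here, accent ID). All names and hyperparameters
# are illustrative, not the paper's implementation.
import torch
import torch.nn as nn

class MultitaskASR(nn.Module):
    def __init__(self, feat_dim=80, hidden=256, vocab_size=32, n_accents=10):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        self.ctc_head = nn.Linear(hidden, vocab_size)   # transcription logits
        self.cls_head = nn.Linear(hidden, n_accents)    # auxiliary accent classifier

    def forward(self, feats):
        enc, _ = self.encoder(feats)                    # (B, T, H)
        log_probs = self.ctc_head(enc).log_softmax(-1)  # per-frame token log-probs
        accent_logits = self.cls_head(enc.mean(dim=1))  # utterance-level pooling
        return log_probs, accent_logits

def multitask_loss(log_probs, targets, in_lens, tgt_lens,
                   accent_logits, accents, alpha=0.3):
    # CTCLoss expects (T, B, C); alpha trades off ASR vs. auxiliary objective.
    ctc = nn.CTCLoss(blank=0)(log_probs.transpose(0, 1), targets, in_lens, tgt_lens)
    aux = nn.functional.cross_entropy(accent_logits, accents)
    return ctc + alpha * aux
```

A generative variant would replace the classification head with a sequence-generation objective (e.g., decoding an auxiliary text target) while keeping the same shared encoder and weighted-sum loss structure.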