How AI Learns from Ultrasound: A Step-by-Step Workflow
- Carlos Jimenez
- Apr 3
- 6 min read
As a clinician, I’ve spent years interpreting musculoskeletal ultrasound images at the bedside. While this skill is crucial, I often found myself wondering: “Can we take what we’re seeing and teach a computer to recognize it too?” That question led me into the world of Python and artificial intelligence, territory that felt unfamiliar at first but quickly revealed powerful possibilities.
Ultrasound is one of the most accessible and cost-effective imaging tools in medicine, yet its interpretation remains highly dependent on operator expertise. This creates variability and limits scalability. By leveraging AI, specifically computer vision models, we can begin to automate parts of this process, enhance consistency, and support clinical decision-making.
But beyond that, this journey matters because it shows what’s possible when clinicians start learning tools like Python. We don’t need to wait for tech companies or engineers to build everything for us. We can be part of the innovation process, bridging clinical insight with computational power.
This blog is a reflection of that journey. My goal is to demonstrate how even with a clinical background, it’s entirely possible to contribute to the AI space in a meaningful way.
The Inspiration: A Research-Backed Framework
This whole project started with a research paper I came across titled “Automatic Recognition of the Supraspinatus Tendinopathy from Ultrasound Images using Convolutional Neural Networks.” What caught my attention wasn’t just the accuracy the researchers achieved; it was how approachable the framework was. They broke down a clear pipeline: from collecting ultrasound images, to preprocessing, segmentation, and finally training a model to detect pathology. It felt like something I could actually try myself.
What really pulled me in was the fact that they used standard ultrasound images, just like the ones I see every day in clinic. They weren’t relying on expensive imaging modalities or massive datasets from hospitals with AI departments; they were working with the kind of data many of us already have access to.
I remember thinking, “If they can train a model to detect supraspinatus tendinopathy, why can’t I apply the same approach to something like the medial gastrocnemius?” That simple question became my starting point.
From there, I made it my mission to replicate the framework in the paper, but using a different muscle group and my own ultrasound image dataset. The goal was not only to see if I could do it, but to learn exactly what happens under the hood when we talk about AI in medical imaging.

Tools of the Trade: From Ultrasound to Python
One of the most rewarding aspects of this project has been using widely available, open-source tools to tackle a very real clinical challenge: segmenting muscle structures from ultrasound images. My workflow was written entirely in Python, and it allowed me to build, train, and fine-tune a deep learning model for detecting the medial gastrocnemius.
My Go-To Toolkit
Here’s a quick look at what powered the project:
• TensorFlow + Keras: For model development and training
• OpenCV: To preprocess grayscale ultrasound images (resizing, CLAHE contrast enhancement, RGB conversion)
• imgaug: For powerful, randomized medical image augmentations
• Matplotlib: For real-time visualization of segmentation results
• NumPy & glob: For data manipulation and file handling
Dataset Preparation
I worked with two datasets:
• A general set of grayscale ultrasound images with masks for initial training
• A cleaner, high-resolution PNG set for fine-tuning the model
Each image was paired with a corresponding binary segmentation mask identifying regions of interest in the medial gastrocnemius (GM). All images were resized to 128×128 pixels. Masks without any segmentation (i.e., black masks) were filtered out to maintain label quality.
To enhance the contrast of the ultrasound images, I used CLAHE (Contrast Limited Adaptive Histogram Equalization), which helps improve visibility of tissue boundaries—something we clinicians instinctively do when adjusting live ultrasound settings.
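To make this step concrete, here is a minimal sketch of what the preprocessing might look like with OpenCV and NumPy. The helper names, file handling, and CLAHE parameters are illustrative, not the exact code from my project:

```python
import cv2
import numpy as np

IMG_SIZE = 128  # target resolution used throughout the project
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # contrast enhancement

def load_pair(img_path, mask_path):
    """Load one grayscale image/mask pair, enhance contrast, and resize."""
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    img = clahe.apply(img)                                    # CLAHE on the raw grayscale image
    img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
    mask = cv2.resize(mask, (IMG_SIZE, IMG_SIZE), interpolation=cv2.INTER_NEAREST)
    img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)               # 3-channel input for the ImageNet encoder
    mask = (mask > 127).astype(np.float32)[..., None]         # binary mask, shape (H, W, 1)
    return img / 255.0, mask

def keep_pair(mask):
    """Filter out 'black' masks that contain no segmented region."""
    return mask.sum() > 0
```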
Data Augmentation: Making the Model More Robust
Training a model on a limited dataset always risks overfitting. To address this, I implemented a custom data generator that introduced augmentations such as:
• Horizontal and vertical flips
• Random rotations and scaling
• Elastic deformation (to mimic soft tissue shifts)
• Gaussian noise and blur
• Contrast and brightness changes
Each batch was augmented in a deterministic way to ensure the masks stayed perfectly aligned with the images, which is critical in any medical segmentation task; a minimal imgaug sketch is shown below.
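Here is roughly how that can look with imgaug. The specific augmenters mirror the list above, but the probabilities and parameter ranges are illustrative; the key point is that geometric transforms are applied to both the image and its mask, while intensity transforms touch only the image:

```python
import numpy as np
import imgaug.augmenters as iaa
from imgaug.augmentables.segmaps import SegmentationMapsOnImage

# Augmentation pipeline mirroring the list above (parameters are illustrative)
augmenter = iaa.Sequential([
    iaa.Fliplr(0.5),                                                     # horizontal flip
    iaa.Flipud(0.2),                                                     # vertical flip
    iaa.Affine(rotate=(-15, 15), scale=(0.9, 1.1)),                      # random rotation and scaling
    iaa.Sometimes(0.3, iaa.ElasticTransformation(alpha=20, sigma=5)),    # soft-tissue-like deformation
    iaa.Sometimes(0.3, iaa.AdditiveGaussianNoise(scale=(0, 0.02 * 255))),
    iaa.Sometimes(0.3, iaa.GaussianBlur(sigma=(0, 1.0))),
    iaa.LinearContrast((0.8, 1.2)),                                      # contrast jitter
    iaa.Multiply((0.8, 1.2)),                                            # brightness jitter
])

def augment_pair(image, mask):
    """Augment one uint8 image and its binary mask; imgaug applies geometric
    transforms to both, but intensity changes only to the image, so the mask
    stays aligned with the anatomy."""
    segmap = SegmentationMapsOnImage(mask.astype(np.uint8), shape=image.shape)
    image_aug, segmap_aug = augmenter(image=image, segmentation_maps=segmap)
    return image_aug, segmap_aug.get_arr()
```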
The Model: NASUNet
For the core architecture, I adapted the research paper’s pipeline and built a custom U-Net-style model using NASNetLarge as the encoder. This model was pre-trained on ImageNet, which provided a strong foundation of learned visual features.
The decoder was built using transpose convolutions, batch normalization, and skip connections from intermediate NASNet layers to preserve both fine and coarse anatomical features.
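To give a structural feel for this, here is a rough sketch of the decoder pattern on top of a NASNetLarge encoder. The skip connections, filter counts, and input size are placeholders; in particular, some Keras versions require NASNetLarge with ImageNet weights to use its default 331×331 input, so treat this as a sketch of the idea rather than my exact model:

```python
import tensorflow as tf
from tensorflow.keras import layers

def decoder_block(x, skip, filters):
    """Upsample by 2, fuse an encoder skip connection, and refine."""
    x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    if skip is not None:
        x = layers.Concatenate()([x, skip])  # recover fine spatial detail from the encoder
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_nasunet(input_shape=(128, 128, 3)):
    """U-Net-style segmentation model with a NASNetLarge encoder pre-trained on ImageNet."""
    encoder = tf.keras.applications.NASNetLarge(
        include_top=False, weights="imagenet", input_shape=input_shape)
    # Placeholder skips: in practice, pick intermediate NASNet activations whose
    # spatial sizes match each decoder stage, e.g. encoder.get_layer(<name>).output.
    skips = [None, None, None, None, None]
    x = encoder.output                       # bottleneck features at roughly 1/32 resolution
    for skip, filters in zip(skips, (512, 256, 128, 64, 32)):
        x = decoder_block(x, skip, filters)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # binary mask for the GM
    return tf.keras.Model(encoder.input, outputs, name="nasunet")
```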
🎯 Loss Functions:
• Dice + BCE loss: For the initial training phase, balancing overlap accuracy with pixel-wise learning
• Tversky loss: For fine-tuning, especially helpful for handling class imbalance—a common challenge in medical datasets
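Both losses are standard formulations, so here is roughly what they look like in Keras. The Tversky alpha/beta weights and the smoothing constant are common illustrative defaults, not necessarily the values I settled on:

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def dice_coef(y_true, y_pred, smooth=1.0):
    """Overlap between predicted and true masks (1 = perfect agreement)."""
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_bce_loss(y_true, y_pred):
    """Initial training loss: pixel-wise BCE plus (1 - Dice) for overlap."""
    bce = tf.keras.losses.binary_crossentropy(y_true, y_pred)
    return K.mean(bce) + (1.0 - dice_coef(y_true, y_pred))

def tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, smooth=1.0):
    """Fine-tuning loss: weights false negatives (alpha) and false positives (beta)
    differently, which helps when the foreground region is small."""
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    tp = K.sum(y_true_f * y_pred_f)
    fn = K.sum(y_true_f * (1.0 - y_pred_f))
    fp = K.sum((1.0 - y_true_f) * y_pred_f)
    tversky = (tp + smooth) / (tp + alpha * fn + beta * fp + smooth)
    return 1.0 - tversky
```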
Fine-Tuning on High-Quality PNGs
After training my base model, I fine-tuned it using a curated set of higher-quality PNG images. This phase involved the following steps (a code sketch follows the list):
• Loading the previously trained model
• Re-compiling it with a lower learning rate
• Switching to Tversky loss to improve performance on edge cases
• Logging training metrics to both .csv and TensorBoard
• Saving the best versions based on both loss and Dice coefficient
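Put together, the fine-tuning setup might look something like this. The file names, learning rate, and dataset variables (train_ds, val_ds) are placeholders, and the custom loss and metric functions are the ones sketched earlier:

```python
import tensorflow as tf

# Load the previously trained base model (file name is illustrative)
model = tf.keras.models.load_model(
    "gm_base_model.h5",
    custom_objects={"dice_bce_loss": dice_bce_loss, "dice_coef": dice_coef},
    compile=False)

# Re-compile with a lower learning rate and switch to Tversky loss
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tversky_loss,
    metrics=[dice_coef])

callbacks = [
    tf.keras.callbacks.CSVLogger("finetune_log.csv"),                 # metrics to .csv
    tf.keras.callbacks.TensorBoard(log_dir="logs/finetune"),          # metrics to TensorBoard
    tf.keras.callbacks.ModelCheckpoint("gm_best_loss.h5",
                                       monitor="val_loss", save_best_only=True),
    tf.keras.callbacks.ModelCheckpoint("gm_best_dice.h5",
                                       monitor="val_dice_coef", mode="max",
                                       save_best_only=True),
]

model.fit(train_ds, validation_data=val_ds, epochs=50, callbacks=callbacks)
```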
The result? A refined, clinically-informed model that improves with each iteration of data and feedback.
Visual Insights During Training
To make training more transparent, I built a custom visual callback that shows the model’s predictions at the end of each epoch: the original image, the true segmentation mask, and the predicted mask. This gave me quick insights into how well the model was learning—and where it still struggled.
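In Keras, that kind of per-epoch preview can be implemented as a small custom callback along these lines; the sample image and mask passed in are simply whichever validation pair you want to watch:

```python
import matplotlib.pyplot as plt
import tensorflow as tf

class SegmentationPreview(tf.keras.callbacks.Callback):
    """Plot the input image, true mask, and predicted mask after each epoch."""

    def __init__(self, sample_image, sample_mask):
        super().__init__()
        self.sample_image = sample_image   # shape (H, W, 3), already preprocessed
        self.sample_mask = sample_mask     # shape (H, W, 1), binary

    def on_epoch_end(self, epoch, logs=None):
        pred = self.model.predict(self.sample_image[None, ...], verbose=0)[0]
        titles = ["Ultrasound image", "True mask", f"Predicted mask (epoch {epoch + 1})"]
        panels = [self.sample_image, self.sample_mask.squeeze(), pred.squeeze() > 0.5]
        fig, axes = plt.subplots(1, 3, figsize=(12, 4))
        for ax, panel, title in zip(axes, panels, titles):
            ax.imshow(panel, cmap="gray")
            ax.set_title(title)
            ax.axis("off")
        plt.show()
```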
The Clinician’s Perspective: Why Learning Python Matters
After more than a decade in sports medicine, I’ve developed a deep appreciation for the subtleties of clinical decision-making. But diving into Python and AI has given me something new: a way to extend my clinical intuition into code.
Learning to code isn’t about replacing clinical judgment; it’s about enhancing it. It allows you to ask better questions, like:
• How consistent is this interpretation over time?
• Can we train a model to flag early signs we might miss?
• What does “normal” really look like when visualized across hundreds of patients?
Understanding the fundamentals of how models work (loss functions, segmentation logic, augmentation) has changed how I view both imaging and data. I no longer see AI as a black box. Instead, I see it as a partner that I can guide, shape, and improve based on clinical context.
To any clinician out there who’s even slightly curious: you don’t need to be a full-time developer to get started. All you need is a question, a dataset, and a willingness to learn.
What’s Next: Scaling, Collaboration & Open Science
This project was a personal experiment—but it’s just the beginning. I’m already planning to:
• Collect more diverse GM samples, including different pathologies
• Explore transfer learning from other musculoskeletal structures
• Improve the clinical interpretability of my models (e.g., saliency maps, overlays)
• Possibly integrate classification + segmentation pipelines for a more comprehensive workflow
I’m also open to collaborating with clinicians, researchers, and data scientists who are exploring similar questions. If you’re interested in medical imaging, sports medicine applications, or just want to explore this space with a fellow clinician-turned-coder—reach out!
🧭 A Note on Results: Progress Over Perfection
As you can see from the graph above, my model didn’t hit perfect performance, especially on the validation data. And that’s okay. This project was never about chasing a benchmark score. It was about learning the process: how to build, train, fine-tune, and evaluate a deep learning model using real ultrasound data.
Every line of code I wrote helped me better understand what’s going on under the hood of AI systems. More importantly, it helped me think more critically as a clinician. I now see the possibilities (and limitations) of AI in a much clearer way, and I know exactly where I want to go next.
So if you’re reading this and wondering whether it’s worth diving in, even if you’re not sure you’ll “get it right” on the first try: yes, it is. The learning is the real result.
Call to Action
If you’re a clinician, researcher, or student curious about how AI can actually work for you (not replace you), I’d love to connect. This project isn’t just about code; it’s about learning how to ask better clinical questions and build tools that support real-world decisions.
Let’s collaborate, share insights, or even troubleshoot together.
🧠 Whether you’re just getting started or deep into the ML space, there’s room to grow together.