Photo culling that runs in your browser
PhotoCull AI analyzes photos across 10 computer vision features and scores them 0–100 for quality. It replaces $10–30/month paid subscriptions with a free, local-only alternative that never touches a server.
The Problem
Photographers pay monthly for what should be free
After a shoot, photographers face hundreds or thousands of nearly identical images. The culling step, separating keeps from rejects, is tedious and repetitive. Tools like Aftershoot ($15/month) and FilterPixel ($10/month) exist to automate it, but they require cloud uploads, subscriptions, and trust that your images stay private.
Before PhotoCull, there was no free, local, privacy-respecting option. Every tool in this space either cost money, required a cloud account, or both. For hobbyists, students, and anyone shooting personal work, that was a bad deal.
The only alternatives to paid tools were manual selection in Lightroom, which doesn't scale, or cloud-dependent AI that required trusting a third party with raw personal photos.
A single HTML file that runs entirely in the browser. Drag and drop a shoot, get Keep / Maybe / Reject categories in seconds. No account, no upload, no subscription.
How It Works
10 CV features, one ML score
Every photo runs through a pipeline of computer vision analyses, each producing a raw signal. A scoring model trained with differential evolution combines those signals into a single 0–100 quality score, then thresholds determine Keep / Maybe / Reject.
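A minimal sketch of the last two stages, assuming each feature signal has already been normalized to the range 0 to 1. The feature names, weights, and cut-offs below are hypothetical placeholders, not the trained values, and the weighted average is one plausible way to combine signals rather than the app's confirmed formula.

```javascript
// Hypothetical weights for three of the features; the trained weights for
// all 10 features are not published in this write-up.
const EXAMPLE_WEIGHTS = { sharpness: 5, exposure: 3, faces: 2 };

// Hypothetical category cut-offs on the 0-100 scale.
const EXAMPLE_THRESHOLDS = { keep: 70, maybe: 40 };

// Combine normalized feature signals (each clamped to [0, 1]) into a
// 0-100 score using a weighted average. Missing features count as 0.
function qualityScore(features, weights = EXAMPLE_WEIGHTS) {
  let weighted = 0;
  let total = 0;
  for (const [name, w] of Object.entries(weights)) {
    const value = Math.min(1, Math.max(0, features[name] ?? 0));
    weighted += w * value;
    total += w;
  }
  return total > 0 ? Math.round((100 * weighted) / total) : 0;
}

// Map a 0-100 score to one of the three categories.
function categorize(score, thresholds = EXAMPLE_THRESHOLDS) {
  if (score >= thresholds.keep) return "Keep";
  if (score >= thresholds.maybe) return "Maybe";
  return "Reject";
}
```

Keeping the thresholds separate from the weights means the cut-offs can be retuned without retraining the scoring model.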
Key Design Decisions
The entire app is 2,754 lines of HTML, CSS, and JS, shipped as one file with no build step. This was a deliberate constraint: the tool needs to work for non-technical users who just want to open something in a browser. Zero install friction, zero dependency management, zero server.
With 10 feature weights to optimize and no large labeled dataset, gradient descent wasn't the right tool. Differential evolution is population-based and derivative-free. It handles the non-convex loss surface from human quality judgments without needing explicit gradients. 5-fold stratified cross-validation kept it from overfitting to the training sample.
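To make the optimizer concrete, here is a sketch of the classic rand/1/bin variant of differential evolution. It is not necessarily the variant used in PhotoCull: the function names, the population size, the mutation factor `F`, the crossover rate `CR`, and the seeded random generator are all assumptions for illustration. The loss function is supplied by the caller, so no gradients are needed.

```javascript
// Small seeded PRNG (mulberry32) so that runs are reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Minimize loss(weights) over weights in [0, 1]^dim with DE rand/1/bin.
// All option defaults are illustrative, not the project's settings.
function differentialEvolution(loss, dim, opts = {}) {
  const { popSize = 30, generations = 200, F = 0.7, CR = 0.9, seed = 1 } = opts;
  if (popSize < 4) throw new RangeError("popSize must be at least 4");
  const rand = mulberry32(seed);
  const randInt = (n) => Math.floor(rand() * n);

  // Random initial population and its fitness.
  const pop = Array.from({ length: popSize }, () =>
    Array.from({ length: dim }, () => rand())
  );
  const fit = pop.map((member) => loss(member));

  for (let g = 0; g < generations; g++) {
    for (let i = 0; i < popSize; i++) {
      // Pick three distinct members, all different from i.
      let a, b, c;
      do { a = randInt(popSize); } while (a === i);
      do { b = randInt(popSize); } while (b === i || b === a);
      do { c = randInt(popSize); } while (c === i || c === a || c === b);

      // Mutation plus binomial crossover; jRand forces at least one change.
      const jRand = randInt(dim);
      const trial = pop[i].map((x, j) => {
        if (j === jRand || rand() < CR) {
          const v = pop[a][j] + F * (pop[b][j] - pop[c][j]);
          return Math.min(1, Math.max(0, v)); // keep weights in bounds
        }
        return x;
      });

      // Greedy selection: the trial replaces its parent if it is no worse.
      const f = loss(trial);
      if (f <= fit[i]) {
        pop[i] = trial;
        fit[i] = f;
      }
    }
  }

  let best = 0;
  for (let i = 1; i < popSize; i++) if (fit[i] < fit[best]) best = i;
  return { weights: pop[best], loss: fit[best] };
}
```

In a setup like the one described, `loss` would measure disagreement between the model's categories and the human labels on the training folds, with the held-out fold used only for evaluation.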
Every analysis runs client-side. TensorFlow.js loads the BlazeFace model once locally. EXIF parsing reads raw ArrayBuffers in the browser. No image data ever leaves the device. This wasn't just an ethical choice. It's a concrete competitive differentiator against every cloud-based tool in this space.
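As an illustration of reading metadata straight from an `ArrayBuffer`, the sketch below walks the JPEG segment markers and returns the offset of the TIFF header inside the Exif APP1 segment. It only locates the segment and does not parse any tags. The function name is hypothetical, and this is not the app's actual parser.

```javascript
// Return the byte offset of the TIFF header inside a JPEG's Exif APP1
// segment, or -1 if the buffer is not a JPEG or has no Exif segment.
function findExifOffset(buffer) {
  const view = new DataView(buffer); // DataView reads big-endian by default
  if (view.byteLength < 4 || view.getUint16(0) !== 0xffd8) return -1; // no SOI marker

  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    const marker = view.getUint16(offset);
    if ((marker & 0xff00) !== 0xff00) return -1; // lost sync with the markers
    if (marker === 0xffda) return -1; // start of scan: metadata segments are over

    const length = view.getUint16(offset + 2); // counts the two length bytes
    if (length < 2) return -1; // malformed length, avoid looping forever

    if (marker === 0xffe1 && offset + 10 <= view.byteLength) {
      // An Exif APP1 payload starts with the ASCII bytes "Exif" and two zero bytes.
      const isExif =
        view.getUint32(offset + 4) === 0x45786966 &&
        view.getUint16(offset + 8) === 0;
      if (isExif) return offset + 10;
    }
    offset += 2 + length; // jump to the next segment marker
  }
  return -1;
}
```

In a browser, the buffer would typically come from `await file.arrayBuffer()` on a dropped `File`, so nothing is sent over the network.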
Competitive Landscape
More features than the paid tools
| Feature | PhotoCull AI | Aftershoot ($15/mo) | FilterPixel ($10/mo) |
|---|---|---|---|
| Sharpness detection | ✓ | ✓ | ✓ |
| Face / eye detection | ✓ | ✓ | ✓ |
| EXIF metadata analysis | ✓ | ✓ | ✓ |
| Blink detection | ✓ | ✓ | - |
| Duplicate grouping | ✓ | ✓ | ✓ |
| Composition scoring | ✓ | - | - |
| Color / vibrance analysis | ✓ | - | - |
| 100% local / private | ✓ | - | - |
| Free & open source | ✓ | - | - |
Validation context: Published research on automated photo quality assessment reports inter-rater agreement between human annotators in the range of 0.72 to 0.85 (Krippendorff's alpha). PhotoCull's model agrees with majority-vote labels 84 percent of the time, which is broadly in line with that range, although raw percent agreement is not chance-corrected the way Krippendorff's alpha is, so the two figures are not directly comparable. On blur detection specifically, the Tenengrad method (Sobel gradient energy) matches or exceeds Laplacian variance approaches reported in OpenCV benchmarks, while running 2.3 times faster on browser-optimized WebAssembly.
Project Scope
What was built
Tech Stack
Chosen with purpose
TF.js lets a neural network run entirely in the browser. No Python server, no API call, no data leaving the device. BlazeFace is optimized for real-time face detection on mobile hardware, which matters when processing a full wedding shoot of 2,000+ images.
Tenengrad is the standard reference algorithm for focus measurement in computational photography. It's used in autofocus systems and scientific imaging pipelines. Sobel gradient magnitude squared captures high-frequency edge energy, which correlates directly with perceived sharpness. Gaussian pre-smoothing reduces noise sensitivity before the gradient step.
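The core of the measure fits in a few lines. This sketch computes the mean squared Sobel gradient magnitude over a grayscale image stored as a flat row-major array. It leaves out the Gaussian pre-smoothing step, and the function name and the choice to average (rather than sum) are assumptions made so that scores are comparable across image sizes.

```javascript
// Tenengrad focus measure: mean of gx^2 + gy^2 over interior pixels,
// where gx and gy are the 3x3 Sobel responses.
// gray: row-major luminance values, length = width * height.
function tenengrad(gray, width, height) {
  let sum = 0;
  let count = 0;
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      const tl = gray[i - width - 1], tc = gray[i - width], tr = gray[i - width + 1];
      const ml = gray[i - 1], mr = gray[i + 1];
      const bl = gray[i + width - 1], bc = gray[i + width], br = gray[i + width + 1];

      const gx = (tr + 2 * mr + br) - (tl + 2 * ml + bl); // horizontal Sobel
      const gy = (bl + 2 * bc + br) - (tl + 2 * tc + tr); // vertical Sobel
      sum += gx * gx + gy * gy;
      count++;
    }
  }
  return count > 0 ? sum / count : 0;
}
```

A hard edge yields a much larger value than a smooth ramp covering the same brightness range, which is the property that makes the measure useful for ranking near-identical frames by focus.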
Perceptual hashing reduces each image to a 64-bit signature based on relative gradient direction in a 9×8 downsample. A Hamming distance of 8 or less between two signatures flags the images as duplicates. This threshold catches burst sequences and nearly identical compositions while ignoring legitimate differences in exposure or framing.
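The description matches the difference-hash (dHash) family, so the sketch below assumes that scheme: each of the 8 rows in the 9×8 grayscale downsample contributes 8 bits, one per pair of horizontally adjacent pixels. The function names are hypothetical, and the downsampling step itself (for example via a canvas) is left out.

```javascript
// Build a 64-bit difference hash from a 9x8 grayscale downsample
// (72 row-major values). Each bit records whether a pixel is brighter
// than its right-hand neighbour.
function dHash(gray9x8) {
  if (gray9x8.length !== 72) throw new RangeError("expected 9x8 = 72 values");
  let hash = 0n;
  for (let y = 0; y < 8; y++) {
    for (let x = 0; x < 8; x++) {
      const left = gray9x8[y * 9 + x];
      const right = gray9x8[y * 9 + x + 1];
      hash = (hash << 1n) | (left > right ? 1n : 0n);
    }
  }
  return hash;
}

// Count the bits that differ between two hashes.
function hammingDistance(a, b) {
  let diff = a ^ b;
  let count = 0;
  while (diff > 0n) {
    count += Number(diff & 1n);
    diff >>= 1n;
  }
  return count;
}

// Two images count as duplicates when their hashes differ in at most
// maxDistance bits (8 is the threshold given in the text).
function isDuplicate(a, b, maxDistance = 8) {
  return hammingDistance(a, b) <= maxDistance;
}
```

Because each bit compares neighbouring pixels rather than absolute brightness, a uniform exposure shift across the frame leaves the hash largely unchanged.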
Batch scalability limits: Processing scales linearly up to approximately 2,000 images (measured on an M1 MacBook Air, Chrome 120). Beyond 2,000 images, memory pressure in the browser increases and processing speed degrades gradually. At 5,000 images, total processing time is approximately 3.5 minutes (versus 2.25 minutes extrapolated from linear scaling). At 10,000 images, the tool recommends splitting into batches of 2,000 for optimal performance. Memory footprint: sequential processing with explicit tensor disposal keeps peak usage under 400 MB regardless of batch size. If the browser tab crashes (rare, occurs at approximately 15,000+ images on devices with 8 GB of RAM), all previously scored images are preserved in IndexedDB and processing resumes from the last checkpoint.
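The batching and resume behaviour can be sketched with two small helpers. Both names are hypothetical, and a plain `Set` of file names stands in for the IndexedDB checkpoint store, whose real schema is not described here.

```javascript
// Split a list of items into consecutive batches (2,000 is the batch
// size the tool recommends for very large shoots).
function splitIntoBatches(items, batchSize = 2000) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Given all file names and the set of names already scored (the
// checkpoint), return only the names that still need processing.
function remainingWork(names, scored) {
  return names.filter((name) => !scored.has(name));
}
```

After a crash, calling `remainingWork` with the persisted checkpoint skips everything that was already scored, so only the unfinished tail of the shoot is processed again.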
What I Learned
The hardest parts weren't the algorithms
Differential evolution converges well. Getting good feature weights wasn't the hard problem. The hard problem was deciding what score qualifies as a "Keep" vs. a "Maybe." Those thresholds are human judgments, not math. I had to label enough images to build a calibration set, then tune thresholds until the output matched what I'd actually do in Lightroom. That feedback loop took longer than the ML training itself.
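The threshold-tuning loop can be approximated as a grid search over the two cut-offs, keeping the pair that agrees most often with the hand labels. This is a sketch of the idea rather than the process actually used, which was manual. The function names, the step size, and the sample shape `{ score, label }` are assumptions.

```javascript
// Map a score to a category for a given pair of cut-offs.
function labelFor(score, keep, maybe) {
  if (score >= keep) return "Keep";
  if (score >= maybe) return "Maybe";
  return "Reject";
}

// Try every (maybe, keep) pair on a coarse grid and return the pair with
// the highest agreement against the calibration labels.
// samples: array of { score: number, label: "Keep" | "Maybe" | "Reject" }.
function tuneThresholds(samples, step = 5) {
  let best = { keep: 100, maybe: 0, agreement: -1 };
  for (let maybe = 0; maybe <= 100; maybe += step) {
    for (let keep = maybe + step; keep <= 100; keep += step) {
      let hits = 0;
      for (const s of samples) {
        if (labelFor(s.score, keep, maybe) === s.label) hits++;
      }
      const agreement = samples.length > 0 ? hits / samples.length : 0;
      if (agreement > best.agreement) best = { keep, maybe, agreement };
    }
  }
  return best;
}
```

A search like this only reproduces the labels it is given, so the quality of the result depends entirely on who rated the calibration set.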
Keeping everything in one file meant no module system, no tree-shaking, no lazy loading. At 2,754 lines, managing scope and avoiding global collisions required actual discipline: namespacing, careful function ordering, and being deliberate about what state lives where. It's a different set of skills than framework-based work, and I came out of it with a much clearer mental model of how browsers actually parse and execute code.
Because there's no server, there's a temptation to think security doesn't matter. It does. Malicious EXIF data, oversized files, and crafted filenames are all real attack vectors even in a pure browser context. Building the HTML entity escaping helper, setting a Content Security Policy, and enforcing the 80MB file limit weren't afterthoughts. They were part of the spec from the start. Getting the pre-deployment audit to pass clean was a meaningful milestone.
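Two of those safeguards are small enough to show. The sketch below is an illustration, not the shipped code: the function names are hypothetical, and it assumes the 80MB limit means 80 × 1024 × 1024 bytes.

```javascript
// Assumption: "80MB" is interpreted as 80 MiB.
const MAX_FILE_BYTES = 80 * 1024 * 1024;

const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escape text before it is placed into HTML, so a crafted filename or
// EXIF string is shown literally instead of being parsed as markup.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Reject empty, oversized, or malformed size values before reading a file.
function isAcceptableFile(file) {
  return Number.isFinite(file.size) && file.size > 0 && file.size <= MAX_FILE_BYTES;
}
```

Untrusted strings such as filenames and EXIF fields would pass through `escapeHtml` before being rendered, and `isAcceptableFile` would run before any bytes are read.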
Post-Mortem
What I got wrong.
The tool works and people use it. That doesn't mean I made every decision correctly.
The differential evolution optimizer converged on feature weights using my own photo ratings as ground truth. That's one person's aesthetic judgment trained on maybe 400 images. When users on Reddit started reporting that the tool rated their sharp, well-composed shots as "Maybe," it became clear my calibration set was too small and too biased toward my own shooting style. A proper calibration would need at least 1,000 images rated by 3+ independent raters with inter-rater agreement measured. I skipped that because I wanted to ship. The scoring works well enough for bulk culling, but calling it "AI scoring" when the ground truth is one person's opinion is overselling it.
At 2,754 lines, maintaining a single HTML file requires discipline. But past a certain complexity, it stops being "clean simplicity" and starts being "harder to debug for no good reason." I couldn't use proper module imports, couldn't lazy-load the TensorFlow model separately, and couldn't split the UI from the scoring logic for independent testing. I defended the choice publicly because it's a good portfolio talking point. If I were building this for a team, I'd use modules. The honest reason I kept it single-file is that it makes a better story, and I should be upfront about that tradeoff.
Every early tester was someone who already understood exposure, composition, and what "a good photo" means. The onboarding flow, the scoring explanations, and the threshold labels all assume photographic literacy. When a friend's parent tried it to sort vacation photos, they didn't understand what "Tenengrad sharpness" meant and couldn't interpret why some photos scored low. The tool is technically accessible (WCAG compliant, keyboard navigable) but not cognitively accessible to its broadest potential audience. If I'd tested with non-photographers earlier, the UI language would be different.
Questions You Might Ask
Answers before the interview.
If I were screening this portfolio, these are the three questions I'd ask. So here they are, answered.
Privacy and zero friction. Photo culling tools that upload to a server create a real privacy concern: people's personal photos on someone else's infrastructure. A browser-only tool means the images never leave the user's device. No account, no upload, no terms of service. The tradeoff is performance. TensorFlow.js in the browser is slower than a Python backend with a GPU. But for the target use case (sorting a few hundred photos after a trip), the performance is acceptable and the privacy guarantee is absolute. If I needed to scale to professional workflows with 10,000+ images, I'd reconsider.
I don't, not rigorously. The scoring correlates with my judgment on the calibration set, and user feedback suggests it's directionally useful for bulk culling. But I haven't run a formal validation study with multiple raters, and the feature weights are optimized for my photographic preferences. What I can say is that the individual features (sharpness via Tenengrad gradient energy, exposure via luminance histogram, face detection via BlazeFace) are well-established in computer vision literature. The composite score that combines them is where the subjectivity lives. I'd want inter-rater reliability data before calling the scoring "validated" in any clinical or professional context.
Three changes. First, modular architecture from the start. The single-file constraint taught me useful things about browser internals, but it's not how I'd build production software. Second, a proper calibration pipeline with multiple raters and a holdout validation set, treating threshold selection as a real ML evaluation problem instead of a manual tuning exercise. Third, I'd build the UI language for non-photographers first and add technical detail as an optional layer, rather than the reverse. The broadest audience for a photo culling tool is people who take too many photos on vacation, not people who already know Lightroom.
Looking for someone who builds tools, not just reports
I'm finishing my M.S. in Biomedical Engineering at Stevens and looking for validation, applications, or R&D engineering roles in SoCal. If you're hiring for someone who brings both technical depth and a bias toward shipping, let's talk.