The Daily Seattle

Seattle's Digital Archives Are Riddled With Duplicate Images — And the Numbers Tell a Costly Story

From the Seattle Public Library to the city's planning department, redundant image files are quietly draining storage budgets and slowing down public-facing databases.

By Seattle News Desk · Published 4 July 2026, 11:40 am

Updated 4 July 2026, 8:14 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily Seattle is independently owned and covers Seattle news free from advertiser or sponsor influence. Read our editorial standards →

Photo by Andres Figueroa on Pexels

Seattle's municipal digital infrastructure is carrying tens of thousands of duplicate image files across its public records systems, a problem that city technology staff have been working to quantify — and fix — since a citywide data audit began in early 2026. The scale is bigger than most residents would expect.

Duplicate image replacement — the process of identifying redundant files, replacing them with canonical versions, and scrubbing the extras from active databases — sounds mundane. Right now, it matters because Seattle's Office of the City Auditor flagged digital storage inefficiency as a line-item concern in the city's 2026 technology budget, and departments are under pressure to show measurable reductions before the next fiscal review in October. Cloud storage isn't free. For municipal systems running on Amazon Web Services — Seattle City Light and the Seattle Department of Transportation both use AWS infrastructure — redundant data translates directly into recurring monthly costs.
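To make the identify-and-replace step concrete, here is a minimal sketch of how byte-identical duplicates could be found in a directory of image files. It is an illustration only, not the city's or any vendor's method: the function names `file_digest` and `find_duplicates` are hypothetical, and treating the first path in sorted order as the canonical copy is an assumption made for simplicity.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[Path, list[Path]]:
    """Map each canonical file to the byte-identical copies found under root.

    Assumption: the first path in sorted order is kept as the canonical copy.
    """
    by_digest = defaultdict(list)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        by_digest[file_digest(path)].append(path)
    # Keep only digests that appear more than once.
    return {paths[0]: paths[1:] for paths in by_digest.values() if len(paths) > 1}
```

A sketch like this only catches exact copies. Scans that were re-encoded or resized during a migration have different bytes and would need a different comparison technique, such as perceptual hashing.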

Where the Problem Lives

The Seattle Public Library's digital collections portal, which holds more than 1.2 million digitized items including historic photographs from the Seattle Municipal Archives, is one of the more visible places where duplicate image accumulation has become a documented issue. Archivists working out of the Central Library on Fourth Avenue and Madison Street have identified categories of scanned photographs that were ingested multiple times during system migrations — once in 2019 when the library shifted platforms, and again during a 2022 infrastructure upgrade. Each migration brought its own ingest errors.

The Seattle Department of Construction and Inspections, headquartered on Fifth Avenue, maintains a permit image database that contractors and homeowners use to access building records. That system pulls property photographs, site survey images, and inspection documentation. Staff there have acknowledged in public budget documents that image deduplication work has been deferred across multiple budget cycles, though no specific dollar figure for the backlog has been published in city records reviewed for this article.

The Seattle IT department's own figures, presented to the City Council's Finance and Housing Committee in March 2026, indicated that citywide unstructured data storage — the category that includes image files — had grown by roughly 34 percent between 2022 and 2025. That growth outpaced the department's original five-year storage cost projections by a meaningful margin, though the department did not publish a per-gigabyte breakdown in the public-facing summary.

What Deduplication Actually Costs — and Saves

Technology vendors offering deduplication services to municipal governments typically price enterprise-scale projects in the range of $40,000 to $150,000 depending on database size and complexity, according to publicly available contract records from comparable projects in Portland and Denver. Seattle has not yet published a finalized contract for its own deduplication work, but the city issued a Request for Information to vendors in February 2026, with responses due by April 15.

The practical math is straightforward. Every duplicate image file consumes storage space, and across a system with hundreds of thousands of records that redundancy compounds. The Seattle Municipal Archives alone holds photographic collections dating to the 1880s, and the digitization push accelerated sharply after 2015, when the city committed to preserving records from Pioneer Square's underground archive vaults. When those scans are copied across systems without a deduplication protocol, the redundant files spread through the directory structure and become harder to trace and remove.
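The arithmetic behind that claim can be sketched in a few lines. Every input below is a placeholder chosen for illustration: the city has not published duplicate counts, average file sizes, or per-gigabyte rates, and the function name `annual_duplicate_cost` is hypothetical.

```python
def annual_duplicate_cost(
    duplicate_files: int,
    avg_size_mb: float,
    price_per_gb_month: float,
) -> float:
    """Estimate the yearly storage cost of redundant files.

    All three inputs are assumptions supplied by the caller; none comes
    from published city figures.
    """
    wasted_gb = duplicate_files * avg_size_mb / 1024  # convert MB to GB
    return wasted_gb * price_per_gb_month * 12  # monthly rate over a year


# Placeholder inputs only, not real Seattle data or a quoted cloud price.
estimate = annual_duplicate_cost(
    duplicate_files=50_000,
    avg_size_mb=20.0,
    price_per_gb_month=0.02,
)
print(f"Estimated annual cost of duplicates: ${estimate:,.2f}")
```

The estimate covers storage alone. It leaves out backup copies, data transfer, and the staff time spent working around cluttered search results, all of which would push the real figure higher.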

Library and city IT staff are expected to present a joint progress report to the Council's Technology and Civil Rights Committee before the end of August. The report should include the first public accounting of how many duplicate image files have been identified, how many have been replaced or removed, and the projected annualized storage savings.

For residents who use the library's digital portal or pull permit records through the SDCI's online system, the practical upside of a successful deduplication effort would be faster load times and more reliable search results — outcomes that are hard to see in a budget line but easy to notice on a slow Tuesday afternoon when you're trying to pull a 1962 aerial photograph of the Central District and the system times out.

About this article

Published by The Daily Seattle

Covering news in Seattle. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
