Seattle's Office of the City Clerk took a significant step this week in a years-long project to eliminate duplicate images clogging the city's digital archive systems, a problem that has quietly inflated storage costs and complicated public records requests across multiple departments. The effort, which city staff have been refining since a formal audit of the document management infrastructure began in 2024, reached a new phase on July 1, when technicians completed a first-pass deduplication sweep across more than 4.2 million scanned files held in the city's Laserfiche document management platform.
The timing matters. Seattle has faced steady pressure to improve the speed and accuracy of its public records responses under Washington State's Public Records Act, Chapter 42.56 RCW, which carries financial penalties for agencies that delay or improperly withhold documents. Duplicate images — sometimes three or four copies of the same scanned permit, council memo, or inspection report — slow keyword search functions and force staff to manually verify which version is the authoritative record before disclosure. That bottleneck has drawn complaints from journalists, land-use attorneys, and neighborhood advocates who regularly file records requests with the city.
What the Cleanup Actually Involves
The deduplication process is not a simple delete-and-move-on operation. City IT staff embedded with the Clerk's office must first flag suspected duplicates using hash-matching software, then route flagged files to subject-matter reviewers inside each originating department. That chain has pulled in staff from the Seattle Department of Construction and Inspections at 700 Fifth Avenue, the Seattle City Light records team in the Goat Hill garage complex on Second Avenue, and the Seattle Municipal Archives itself, which holds the permanent historical record for city government.
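For readers curious what hash-matching looks like in practice, the sketch below groups files whose contents produce the same digest and returns those groups for human review. It is an illustration only: the city has not published which software or hash function it uses, so SHA-256 and the function names `sha256_of` and `flag_duplicates` are assumptions made for this example. Note that an exact hash catches only byte-for-byte copies; near-identical re-scans of the same page would need a different technique, which this sketch does not attempt.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flag_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content digest (hypothetical helper).

    Returns only the groups with more than one file. Nothing is deleted;
    the groups are candidates for a reviewer to confirm or override.
    """
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `flag_duplicates(Path("scans"))` on a folder of scanned files would return a list of groups, each holding the paths that share identical contents. Keeping flagging separate from deletion mirrors the two-stage flow described above, in which software flags and people decide.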
Across those departments, reviewers are working through a backlog that grew substantially during the 2020–2022 period, when pandemic-era remote scanning protocols produced inconsistent file-naming conventions and led to frequent accidental re-uploads. The Municipal Archives has estimated internally — according to a process document posted to the city's open data portal earlier this year — that roughly 18 percent of files ingested during that period contain at least one near-identical duplicate.
For residents trying to track development projects in Capitol Hill, South Lake Union, or the Rainier Valley, the practical effect has been search results cluttered with redundant entries that can make it hard to find the most current version of, say, a Master Use Permit or a design review decision. The Office of the City Clerk has been working with the Laserfiche vendor since March 2026 to configure automated flagging rules that catch duplicates at the point of upload rather than after the fact.
What Comes Next for the System — and for Public Access
The July 1 sweep is not the end of the project. Department reviewers now have until August 29 to validate or override the automated flags before any files are permanently removed or archived to cold storage. That review window is deliberately conservative: city policy requires that no document be deleted without a human sign-off, a rule grounded in both the Public Records Act and Seattle's own records retention schedule, which classifies many permit and inspection documents as permanent or 50-year retention items.
Once the review window closes, the Clerk's office expects to present a report to the Seattle City Council's Finance and Housing Committee, likely in September, detailing how much storage was reclaimed, how many duplicates were resolved, and what the project cost in staff hours. Storage costs for city digital systems have risen alongside broader cloud-pricing trends; the city's IT department reported in its 2025 budget documentation that enterprise storage expenses increased approximately 12 percent year-over-year.
For anyone who regularly files public records requests with Seattle, whether through the city's NextRequest portal or directly with individual departments, the practical advice is straightforward: requests filed after September 1 are likely to draw on cleaner search results and get faster responses, provided the deduplication review stays on schedule. If you have a pending request that involves high-volume scanned documents from the 2020–2022 period, it may be worth following up with the relevant department in August to ask whether duplicate-related delays are affecting your file.