Text analytics reaches new territory

Focusing on the unique comments

When federal, state, or local governments propose changes in rules and regulations, an opportunity for public comment is frequently required. At the U.S. Fish and Wildlife Service (USFWS), adding or removing animals from the endangered species list is one example. Advocates on one side or the other often flood the agency with emails and form letters. When the agency proposed removing some populations of wolves from the list, it received hundreds of thousands of comments, and possibly more than a million, on various proposals. Processing the content would be impossible without text analytics.

For the past decade, USFWS has been using DiscoverText text analysis software to sort through many of these larger batches of public comments. “The most important step is deduplication,” said Seth Willey, deputy assistant regional director for ecological services for the southwest region at USFWS. “In one case, we received about 30,000 comments from the same organization, but only three were unique. Most were form letters.” DiscoverText identifies the exact duplicates and also distinguishes the unique comments from near duplicates. “After we did the analysis, we only had to read three from this particular organization,” continued Willey. “This is a huge savings in time and taxpayer money.”
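To make the deduplication step concrete, here is a minimal Python sketch of one common way to separate exact duplicates, near duplicates, and unique comments. It is not DiscoverText's actual method, which the article does not describe. The function names, the three-word shingle size, and the 0.6 similarity threshold are all assumptions chosen for illustration.

```python
import hashlib
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def shingles(text, size=3):
    """Return the set of overlapping word n-grams in a normalized text."""
    words = text.split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(comments, threshold=0.6):
    """Return the indices of comments a reviewer would still need to read.

    Exact duplicates are dropped by hashing the normalized text. Near
    duplicates are dropped when their shingle sets overlap with an
    already-kept comment by at least `threshold` (an assumed cutoff).
    """
    seen_hashes = set()
    kept = []  # list of (index, shingle set) for comments kept so far
    for index, comment in enumerate(comments):
        norm = normalize(comment)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate after normalization
        seen_hashes.add(digest)
        current = shingles(norm)
        if any(jaccard(current, other) >= threshold for _, other in kept):
            continue  # near duplicate of a comment already kept
        kept.append((index, current))
    return [index for index, _ in kept]


# Hypothetical sample comments, invented for this example.
comments = [
    "Please keep wolves on the endangered species list.",
    "Please keep wolves on the endangered species list.",
    "PLEASE keep wolves on the endangered species list!!",
    "Please keep wolves on the endangered species list. Sincerely, Jane",
    "Population surveys in my county show pack numbers declining since 2015.",
]
unique_indices = deduplicate(comments)
```

With these sample comments, only the first form letter and the substantive survey comment remain for human review. Comparing every comment against every kept comment is fine for a sketch, but a batch of hundreds of thousands of comments would call for an indexing technique such as MinHash to avoid the pairwise cost.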

The number of comments received does not factor into decision making. “Our statutory mandate is to make decisions based on the best scientific and commercial data available,” noted Willey. “We are looking for valid insights into reasons why a species should or should not remain on the list.” With the help of software, the final number of comments to be reviewed can be small enough that human coding can be used to categorize the data. DiscoverText has allowed the agency to meet its mandate for review within the allotted timeframes with just a small staff.

At the University of California, Los Angeles, Karen Umemoto, director of the Asian American Studies Center and Helen Morgan Chu chair, is using DiscoverText to analyze hate speech in the Twittersphere. “We took a 2-week sample of tweets relating to Asian-Americans and the perceived role of Asians in the COVID-19 virus,” said Umemoto. “We analyzed personal experiences with harassment, people’s responses to negative comments, informational tweets about the source of the virus, and other related content.”

After deduplication removed the redundant tweets, researchers coded the remaining tweets into various categories and analyzed them. “We were able to get a deeper understanding of the dynamics around this issue,” commented Umemoto. “As a result, we were quickly able to produce a preliminary policy brief that will help determine how to address this ongoing issue.”

DiscoverText was the only tool that was suitable for the research, according to Umemoto. “It is easy to use, very powerful, and off-the-shelf-ready. You don’t need to be a ‘big data’ specialist or computer science expert to use it.” The center has not yet used the full power of the software, “such as intercoder reliability, sampling, and using AI for analysis, but it has already met our needs for this stage of our work,” said Umemoto.
