NPR Optimizes Audio Files For Search Engine Purposes



WebProNews
Staff Writer
2004-05-27

National Public Radio, in an effort to make its audio content crawlable, has begun transcribing audio news streams into text files. These efforts appear to be paying off for NPR: the transcriptions have begun appearing on Google and Yahoo News, complete with links to the source audio files.

In an extensive article by News.com, NPR online director Maria Thomas stated, "our site is primarily full of rich audio, and we want people to find it when it's relevant. The big search engines' technologies don't have the ability to get inside the audio or video. With the little bit of text we have on NPR, it's not always good enough to find our content, and reference the page."

The shortcomings of search engines are what led NPR to consider audio transcription. Current search engine technology is geared towards finding textual, keyword-related content. As a result, major search engines cannot crawl multimedia content unless a textual representation is available.

Yahoo-owned AltaVista is a search engine that offers audio and video searches. However, it too crawls the text associated with these multimedia files rather than the files themselves. According to CNet, a handful of companies are attempting to create software that actually extracts portions of audio and video files in order to determine relevance.

Since NPR began transcribing its audio content, the site has seen what are being referred to as "record spikes" in visitors, although Thomas did not release specific traffic figures.

To accomplish the transcription, NPR is using StreamSage, speech recognition software that was introduced last year. StreamSage also uses a contextual analyzer that parses the language into themes. It then generates a text file, similar to a table of contents, that can be spidered and indexed by search engines.
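
As a rough illustration of what such a table-of-contents file might look like, here is a minimal Python sketch. It is not StreamSage's actual format or method, which the article does not describe. The `Segment` structure, the `build_index` function, the layout of the output, and the example.org URL are all hypothetical, chosen only to show how themed, timestamped segments could be written out as plain text that a crawler can read.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One themed stretch of a transcript (hypothetical structure)."""
    start_seconds: int
    theme: str
    keywords: list[str]


def format_timestamp(seconds: int) -> str:
    """Render an offset in seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def build_index(title: str, audio_url: str, segments: list[Segment]) -> str:
    """Build a plain-text, table-of-contents style index for one audio file.

    The layout is an assumption made for illustration: a title, a link back
    to the source audio, then one entry per theme in playback order.
    """
    lines = [title, f"Audio: {audio_url}", ""]
    for seg in sorted(segments, key=lambda s: s.start_seconds):
        lines.append(f"{format_timestamp(seg.start_seconds)}  {seg.theme}")
        lines.append(f"    Keywords: {', '.join(seg.keywords)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up segments; a real system would derive these from the audio.
    demo = [
        Segment(95, "Election coverage", ["ballot", "turnout"]),
        Segment(0, "Headlines", ["news"]),
    ]
    print(build_index("Morning broadcast", "http://example.org/audio/show.mp3", demo))
```

Because the result is ordinary text that links back to the audio, a search engine can index the themes and keywords even though it cannot read the audio file itself.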

For accuracy, NPR then replaces StreamSage's transcriptions, which can be inaccurate and garbled, with human-produced versions.

The obvious goal would be the ability to search these audio files directly, without having to wait for the text version. In News.com's article, Jay Webster, chief technology officer of interactive agency Fathom Online, said, "where it gets cool is if you could search on any keyword and find it within audio and that audio would come up in search results. But I don't think we're there yet."



