'Aurora' code circulated for years on English sites

Tuesday, January 26, 2010

An error-checking algorithm found in software used to attack Google and other large companies circulated for years on English-speaking websites, casting doubt on claims it provided strong evidence that the malware was written by someone inside the People's Republic of China.

The smoking gun said to tie Chinese-speaking programmers to the Hydraq trojan that penetrated Google's defenses was a cyclic redundancy check routine that used a table of only 16 constants. Security researcher Joe Stewart said the algorithm "seems to be virtually unknown outside of China," a finding he used to conclude that the code behind the attacks dubbed Aurora "originated with someone who is comfortable reading simplified Chinese."

"In my opinion, the use of this unique CRC implementation in Hydraq is evidence that someone from within the PRC authored the Aurora codebase," Stewart wrote.

In fact, the implementation is common among English-speaking programmers of microcontrollers and other devices where memory is limited. In 2007, hardware designer Michael Karas discussed an almost identical algorithm online. Undated source code published elsewhere also bears a striking resemblance.

"Digging this a little deeper though, the algorithm is a variation of calculating CRC using a nibble (4 bits) instead of a byte," programmer and Reg reader Steve L. wrote in an email. "This is widely used in single-chip computers in the embedded world, as it seems. I'd hardly call this a new algorithm, or [an] obscure one, either."
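For readers unfamiliar with the technique Steve L. describes, here is a minimal sketch of a nibble-at-a-time CRC-16 that needs a lookup table of only 16 constants rather than the usual 256. It is an illustration, not the code recovered from Hydraq: the CCITT polynomial 0x1021, the zero starting value, and the names build_nibble_table and crc16_nibble are assumptions made for this example.

```python
# Illustrative nibble-wise CRC-16. The polynomial and initial value are
# assumptions for this sketch, not details taken from the Hydraq sample.
POLY = 0x1021  # assumed: CCITT polynomial x^16 + x^12 + x^5 + 1


def build_nibble_table(poly=POLY):
    """Precompute the CRC contribution of each possible 4-bit value."""
    table = []
    for nibble in range(16):
        reg = nibble << 12  # place the nibble in the top 4 bits
        for _ in range(4):  # shift it out one bit at a time
            if reg & 0x8000:
                reg = ((reg << 1) ^ poly) & 0xFFFF
            else:
                reg = (reg << 1) & 0xFFFF
        table.append(reg)
    return table


CRC_TABLE = build_nibble_table()  # 16 entries instead of 256


def crc16_nibble(data, crc=0x0000):
    """Compute a CRC-16 over bytes, consuming 4 bits per table lookup."""
    for byte in data:
        # High nibble first, then low nibble (MSB-first bit order).
        for nibble in (byte >> 4, byte & 0x0F):
            index = (crc >> 12) ^ nibble
            crc = ((crc << 4) & 0xFFFF) ^ CRC_TABLE[index]
    return crc


if __name__ == "__main__":
    print(hex(crc16_nibble(b"123456789")))
```

The trade-off is two lookups per byte instead of one, in exchange for a table that occupies 32 bytes rather than 512, which is why the approach appeals on memory-starved microcontrollers. With the parameters assumed here, the routine should reproduce the standard CRC-16/XMODEM check value of 0x31C3 for the ASCII string "123456789".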

Two weeks ago, Google said it was the victim of highly sophisticated attacks originating from China that targeted intellectual property and the Gmail accounts of human rights advocates. The company said similar attacks hit 20 other companies in the internet, finance, technology, media and chemical industries. Independent security researchers quickly raised the number of compromised companies to 34.

But Google provided no evidence that China was even indirectly involved in the attacks targeting its source code. During a conference call last week with Wall Street analysts, CEO Eric Schmidt said only that the world's most populous nation was "probably" behind the attacks.

One of the few other reported links between China and the attacks is that they were launched from at least six internet addresses located in Taiwan. James Mulvenon, the director of the Center for Intelligence Research and Analysis at Defense Group, told The Wall Street Journal that this is a common strategy used by Chinese hackers to mask their origin. But it could just as easily be the strategy of those trying to make the attacks appear to have originated in China.

The claim that the CRC was lifted from a paper published exclusively in simplified Chinese seemed like the hard evidence that was missing from the open-and-shut case. In an email to The Register, Stewart acknowledged the CRC algorithm on 8052.com was the same one he found in Hydraq, but downplayed the significance.

"The guy on that site says he has used the algorithm, didn't say he wrote it," Stewart explained. "I've seen dates on some of the Chinese postings of the code dating back to 2002."

Maybe. But if the 16-constant CRC routine is this widely known, it seems plausible that attackers from any number of countries could have appropriated it. And that means Google and others claiming a China connection have yet to make their case.

The lack of evidence is important. Google's accusations have already had a dramatic effect on US-China relations. If proof beyond a reasonable doubt is good enough in courts of law, shouldn't it be good enough for relations between two of the world's most powerful countries?

1 comment

Anonymous said...

Here's another example of nibble CRC16 from 2003 on IDG listserv:


February 3, 2010 at 1:54 PM
