What the race for AI means for data science worldwide.

Although it may still be recovering from the effects of the government shutdown, data science has received a lot of positive attention from the United States Government. Two major recent milestones include the OPEN Government Data Act, which passed in January as part of the Foundations for Evidence-Based Policymaking Act, and the American AI Initiative, which was signed as an executive order on February 11th.

So what's going on, and what does all this mean for data science?

The first thing to consider is why, and more specifically for whom, the US administration has put these recent measures in place. Although it's not mentioned in either document, any political correspondent who has been following these topics could easily explain that they are intended to stake a claim against China.

China has stated its intention to become the world leader in data science and AI by 2030. And with far more government access to data sets (a benefit of China being a surveillance state) and an estimated $15 billion invested in machine learning, it seems to be well on its way. In contrast, the US has only $1.1 billion budgeted annually for machine learning.

So rather than compete with the Chinese government directly, the US appears to have taken the approach of convincing the rest of the world to follow its lead, and not China’s. It particularly wants to direct this message to the top data science companies and researchers in the world (especially Google) in order to keep their interest in American projects.

So then what do these measures do?

On the surface, both the OPEN Government Data Act and the American AI Initiative strongly encourage government agencies to amp up their data science efforts. The former is somewhat self-explanatory in name: it requires agencies to publish more machine-readable, publicly available data, and to make more use of this data to improve decision making. It imposes a few minimal standards for this, and also establishes the position of Chief Data Officer at federal agencies. The latter is similar in that it orders government agencies to re-evaluate their existing time and budgets and designate more of them towards AI use and development, also for the purpose of better decision making.

Critics are quick to point out that the American AI Initiative does not actually allocate more resources for its intended purpose, nor does either measure directly impose incentives or penalties. This is not much of a surprise given the general trend of cuts to science funding under the Trump administration. As a result, many observers are skeptical that government agencies will actually follow through with what these measures ‘require’.

However, this is where it becomes important to remember the overall strategy of the current US administration. Both documents include copious statements of the values and standards that the US wants to uphold when it comes to data, machine learning, and artificial intelligence. These may be the key points on which the US can hold its own against China, whose government receives a hefty share of international criticism for its use of surveillance and censorship. (Again, this has been a major sticking point for companies like Google.)

These are some of the major priorities brought forth in both measures: Make federal resources, especially data and algorithms, available to all data scientists and researchers; Prepare the workforce for technology changes like AI and optimization; Work internationally towards AI goals while maintaining American values; and finally, Create regulatory standards, to protect security and civil liberties in the use of data science.

So there you have it. Both countries are undeniably powerhouses for data science. China may have the numbers in their favor, but the US would like the world to know that they have American spirit.

Finally, what does this mean for everyone who's not working for the US or China?

In short, the phrase “a rising tide lifts all boats” seems to fit here. While the US and China compete for data science dominance at the government level, everyone else can stand atop this growing body of innovations and build their own.

What data scientists can get excited about in the short term is the release of a lot of new data from US federal sources, or the re-release of such data in machine-readable formats. The emphasis is on the public part, meaning that anyone, not just US federal employees or even citizens, can use this data. To briefly explain for those less experienced in the realm of machine learning and AI, having as much data to work with as possible helps scientists train and test programs that make more accurate predictions.
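For readers who want to see that last point in action, here is a minimal Python sketch. It uses made-up numbers rather than any real federal data set: the relationship y = 2x + 1, the noise level, and all function names are assumptions chosen purely for illustration. It fits a straight line to small and large training samples and compares how well each fit predicts held-out test data.

```python
import random


def make_data(n, rng):
    """Draw n noisy points from the assumed relationship y = 2x + 1."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares fit of a line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, xs, ys):
    """Average squared gap between the line's predictions and the true values."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def average_test_error(train_size, trials=30, seed=0):
    """Train on `train_size` points, score on held-out data, average over trials."""
    rng = random.Random(seed)
    test_xs, test_ys = make_data(500, rng)  # held-out data, never used for fitting
    total = 0.0
    for _ in range(trials):
        train_xs, train_ys = make_data(train_size, rng)
        total += mean_squared_error(fit_line(train_xs, train_ys), test_xs, test_ys)
    return total / trials


if __name__ == "__main__":
    for size in (5, 50, 1000):
        print(size, "training points -> average test error",
              round(average_test_error(size), 3))
```

In this toy setup, the average test error for the largest training sample should settle close to the noise level built into the data, while the smallest sample leaves noticeably more error. Real data sets are messier, but the same train-then-test pattern applies.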

Much of what made the government shutdown a dark period for data scientists now suggests the possibility of a golden age in the near future.

US-AI vs Chin-AI