Dave Gershgorn | Quartz | March 1, 2017

Microsoft’s AI is Learning to Write Code by Itself, Not Steal It

What if instead of searching through menus within programs like Microsoft Excel, our computers could understand the problem we’re trying to solve and write the software to solve it? It’s a hyper-futuristic idea, but one that has recently seen progress from Microsoft Research and the University of Cambridge.

In a November 2016 paper, which gained attention after being accepted into one of the year’s largest artificial intelligence conferences, Microsoft and Cambridge built an algorithm capable of writing code that would solve simple math problems. The algorithm, named DeepCoder, also augments its ability by searching through potential combinations of code that could solve a problem. (It’s a bit complicated; we’ll break it down later.)

However, this doesn’t mean it steals code, or copies and pastes it from existing software, or searches the internet for solutions, as some reports have claimed.

“We’re targeting the people who can’t or don’t want to code, but can specify what their problem is,” says Marc Brockschmidt of Microsoft Research, a co-author of the paper, likening the work to Excel formulas that could take simple commands to solve for answers without being given the mathematical equation.

The system can be broken down into two distinct parts: the code-writing algorithm, and the mechanism to search through potential code.

Automated Code

The code-writing algorithm isn’t simple, but here’s a simplified way to think about it.

A math problem has inputs and outputs—or the numbers you have and the number you need to calculate. The researchers took simple problems that had been solved with very basic bits of code, and showed the algorithm the inputs, outputs, and the code used to solve them.

Think of it like a tower of blocks. Researchers showed the algorithm the building blocks, and then the finished picture of what the tower should eventually look like. But the algorithm needs to know how to line up each block’s edges and make them stand on top of each other. While that might be easy for humans to figure out, it’s a tough problem for machines, which have no idea about gravity and why large blocks should support smaller ones.

But if machines are shown how blocks stack and fit together to eventually resemble the completed tower, by seeing hundreds or thousands of towers built, then they’ll be able to build towers with similarly shaped blocks. In terms of DeepCoder, the blocks are little bits of code.

When asked to solve a new problem, instead of reusing code that worked before, the algorithm predicts which bits of code would likely be needed, and in what order, based on similar problems it has seen.

Entire programming languages would be far too complex for these algorithms, so the Microsoft and Cambridge team made a smaller language, called a domain-specific language, like a toddler’s set of blocks. The system was able to solve simple online programming challenges, each requiring about three to six lines of code. The problems were descriptions of mathematical scenarios, plus inputs.
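To make the idea of a small, block-like language concrete, here is a minimal sketch in Python. The block names (`sort`, `double`, and so on) are hypothetical and chosen only for illustration; they are not the operations in the researchers' actual language. A "program" here is simply a short list of blocks applied one after another.

```python
from typing import Callable, Dict, List

# Hypothetical toy language: every block turns a list of integers
# into another list of integers. These names are illustrative only.
Block = Callable[[List[int]], List[int]]

BLOCKS: Dict[str, Block] = {
    "sort": lambda xs: sorted(xs),
    "reverse": lambda xs: xs[::-1],
    "double": lambda xs: [2 * x for x in xs],
    "keep_positive": lambda xs: [x for x in xs if x > 0],
    "drop_first": lambda xs: xs[1:],
}


def run_program(program: List[str], xs: List[int]) -> List[int]:
    """Apply each named block in order, feeding each result into the next."""
    for name in program:
        xs = BLOCKS[name](xs)
    return xs


# A three-line "program": drop the negatives, sort, then double.
example = run_program(["keep_positive", "sort", "double"], [3, -1, 2])
print(example)  # [4, 6]
```

Because every block takes and returns the same kind of value, any sequence of blocks is a valid program, which is what makes a language this small easy for a machine to explore.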

A World of Possibilities

To augment its ability to write code, DeepCoder can also search through potential code for a solution that works. It’s not scanning popular code sources like Stack Overflow or GitHub for a solution, although the authors would love for that to happen in the future.

“We look at all possible programs you could write with this language under a certain length,” Brockschmidt said. “Computers are very good at searching through these things.”
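Looking at "all possible programs under a certain length" can be sketched as a brute-force loop. The example below is an illustration under stated assumptions, not the team's implementation: the five blocks are hypothetical stand-ins, and a program counts as correct when it reproduces every given input-output example.

```python
from itertools import product

# Hypothetical toy blocks (illustrative names, not the real language).
BLOCKS = {
    "sort": lambda xs: sorted(xs),
    "reverse": lambda xs: xs[::-1],
    "double": lambda xs: [2 * x for x in xs],
    "keep_positive": lambda xs: [x for x in xs if x > 0],
    "drop_first": lambda xs: xs[1:],
}


def run_program(program, xs):
    """Apply each named block in order."""
    for name in program:
        xs = BLOCKS[name](xs)
    return xs


def enumerate_programs(max_length):
    """Yield every sequence of blocks, shortest programs first."""
    for length in range(1, max_length + 1):
        yield from product(BLOCKS, repeat=length)


def brute_force_search(examples, max_length):
    """Return the first program that reproduces every example, or None."""
    for program in enumerate_programs(max_length):
        if all(run_program(program, inp) == out for inp, out in examples):
            return program
    return None


# Input-output pairs describing the problem; no code is given.
examples = [([3, -1, 2], [4, 6]), ([5, 1, -7], [2, 10])]
found = brute_force_search(examples, max_length=3)
print(found)
```

With five blocks and a limit of three steps there are only 155 candidates, but the count grows exponentially with program length, which is why the order in which candidates are tried matters so much.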

He likens the task to trying to build a sentence about a fox jumping over a dog with only a few words and no knowledge of language.

You might start out with “Fox fox fox fox fox fox fox fox fox,” then “Fox fox fox fox fox fox fox fox dog,” and then on and on, until the proper sentence emerges. Another way to think about this is the “infinite monkey theorem,” according to Salesforce AI researcher Stephen Merity, in which an infinite number of monkeys typing for an infinite amount of time would produce Shakespeare. It’s the same thing here, except the monkeys’ keyboards type blocks of code instead of letters.

But the ability to search through all the variations of programs and find the correct one is the team’s real contribution: The algorithm predicts which bits of code are most likely to be used to solve a problem, and then looks at those solutions first. If the algorithm finds one, the problem is seen as solved and it learns a little more about what proper code looks like.
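One simple way to "look at the likely solutions first" is to rank the blocks by a predicted score and search with only the top-ranked blocks, adding the next block only when that search fails. The sketch below assumes this strategy for illustration; the blocks are hypothetical, and the `scores` dictionary is hard-coded to stand in for what a trained model would predict from the examples.

```python
from itertools import product

# Hypothetical toy blocks (illustrative names, not the real language).
BLOCKS = {
    "sort": lambda xs: sorted(xs),
    "reverse": lambda xs: xs[::-1],
    "double": lambda xs: [2 * x for x in xs],
    "keep_positive": lambda xs: [x for x in xs if x > 0],
    "drop_first": lambda xs: xs[1:],
}


def run_program(program, xs):
    """Apply each named block in order."""
    for name in program:
        xs = BLOCKS[name](xs)
    return xs


def fits(program, examples):
    """True if the program reproduces every input-output example."""
    return all(run_program(program, inp) == out for inp, out in examples)


def guided_search(examples, scores, max_length):
    """Search with the most likely blocks first, widening the set on failure.

    Returns (program, number_of_candidates_tried).
    """
    ranked = sorted(BLOCKS, key=lambda name: scores[name], reverse=True)
    tried = 0
    for k in range(1, len(ranked) + 1):
        active, newest = ranked[:k], ranked[k - 1]
        for length in range(1, max_length + 1):
            for program in product(active, repeat=length):
                # Programs without the newest block were tested in an earlier round.
                if newest not in program:
                    continue
                tried += 1
                if fits(program, examples):
                    return program, tried
    return None, tried


examples = [([3, -1, 2], [4, 6]), ([5, 1, -7], [2, 10])]
# Assumed scores: a stand-in for a learned model's predictions.
scores = {"double": 0.9, "keep_positive": 0.8, "sort": 0.7,
          "reverse": 0.1, "drop_first": 0.05}
program, tried = guided_search(examples, scores, max_length=3)
print(program, tried)
```

When the scores are good, a solution turns up after testing far fewer than the 155 candidates an unguided search over these five blocks could require. When the scores are poor, the search still finds a solution eventually, only more slowly.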

Searching for code sounds like what humans do when they go online to find a few lines of code that solve their problem, but it’s really just the closest term to describe the algorithm’s process of generating and sorting code. If a human were to do what the AI does, it would be like typing every combination of code they can think of, and then copying and pasting the code they just wrote into a new file and seeing if it works.

In the future, the Microsoft and Cambridge team says they want this system to understand the nuances of complete coding languages, and be able to recognize good code online.

