Microsoft believes its AI can accurately detect security bugs
System can distinguish between security and non-security software bugs 99 percent of the time
Microsoft has launched a new system that it says can correctly distinguish between security and non-security software bugs 99 percent of the time.
To build the process and machine learning model, Microsoft used a data set of 13 million work items and bugs from 47,000 of its developers, stored across Azure DevOps and GitHub repositories.
The system is also able to accurately identify critical, high-priority security bugs on average 97 percent of the time.
In the coming months, the company plans to open source the methodology on GitHub along with example models and other resources so that the system can be used to help support human experts.
While Microsoft was developing its model, security experts approved the training data and the statistical sampling used to give them a manageable amount of data to review. This data was then encoded into representations called feature vectors, and researchers at Microsoft designed the system using a two-step process.
The model first learned to classify security and non-security bugs and then it learned to apply security labels (critical, important or low-impact) to those bugs.
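As a rough illustration of that two-step flow, the sketch below chains two classifiers: one decides whether a bug is a security bug, and only then does a second one assign a severity label. The function names (`triage`, `looks_like_security`, `rate_severity`) and the keyword rules are hypothetical stand-ins for trained models, not Microsoft's implementation.

```python
def triage(title, is_security, rate_severity):
    """Step 1: security or not. Step 2: severity, for security bugs only."""
    if not is_security(title):
        return ("non-security", None)
    return ("security", rate_severity(title))


# Hypothetical placeholder rules; a real system would use trained models here.
def looks_like_security(title):
    return any(word in title.lower() for word in ("overflow", "injection"))


def rate_severity(title):
    return "critical" if "remote" in title.lower() else "important"


result_a = triage("Remote buffer overflow in parser", looks_like_security, rate_severity)
result_b = triage("Typo in settings page", looks_like_security, rate_severity)
print(result_a)
print(result_b)
```

Keeping the two steps separate means the severity model only ever sees items that the first step has already flagged as security bugs.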
Identifying security bugs
To make its bug predictions, Microsoft's model leverages two techniques. The first is an information retrieval approach called term frequency-inverse document frequency (TF-IDF), which counts how often a word appears in a document and then weighs that against how common the word is across the whole collection of titles, so that words appearing in nearly every title carry little weight. According to Microsoft, its bug titles are usually quite short and contain around 10 words.
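To make the idea concrete, here is a minimal sketch of one common TF-IDF variant, written in plain Python. The sample titles and the `tf_idf` helper are invented for illustration, and the exact weighting formula Microsoft uses is not described in the article, so treat this as an assumption.

```python
import math
from collections import Counter


def tf_idf(titles):
    """Return one {word: weight} dict per title."""
    docs = [title.lower().split() for title in titles]
    n_docs = len(docs)
    # Document frequency: the number of titles that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Term frequency times inverse document frequency.
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


# Invented example titles, not taken from Microsoft's data set.
titles = [
    "buffer overflow in image parser",
    "typo in settings page",
    "crash in image viewer on startup",
]
vectors = tf_idf(titles)
print(vectors[0])
```

In this toy collection the word "in" appears in every title, so its weight drops to zero, while "overflow" appears in only one title and scores highest there.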
The second technique the software giant uses is a logistic regression model, which uses a logistic function to estimate the probability that an item belongs to a given class, in this case whether a bug is a security bug.
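A logistic regression classifier can be sketched in a few lines of plain Python. The example below is an illustration under stated assumptions: the two-element feature vectors, the labels, the learning rate and the epoch count are all made up, and nothing here reflects how Microsoft actually trains its model.

```python
import math


def sigmoid(z):
    """The logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that the features belong to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, labels, lr=0.5, epochs=500):
    """Fit weights with plain stochastic gradient descent on log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = predict_proba(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical toy features: [weight of "overflow", weight of "typo"].
samples = [[0.9, 0.0], [0.7, 0.0], [0.0, 0.8], [0.0, 0.6]]
labels = [1, 1, 0, 0]  # 1 = security bug, 0 = non-security bug

weights, bias = train(samples, labels)
p_sec = predict_proba(weights, bias, [0.8, 0.0])
p_non = predict_proba(weights, bias, [0.0, 0.7])
print(p_sec, p_non)
```

The output is a probability rather than a hard label, so a threshold (0.5 is a common default) decides which class a bug is assigned to.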
In a blog post announcing the new system, Microsoft explained how it used machine learning models and security experts to better identify security bugs, saying:
“Every day, software developers stare down a long list of features and bugs that need to be addressed. Security professionals try to help by using automated tools to prioritize security bugs, but too often, engineers waste time on false positives or miss a critical security vulnerability that has been misclassified. To tackle this problem data science and security teams came together to explore how machine learning could help. We discovered that by pairing machine learning models with security experts, we can significantly improve the identification and classification of security bugs.”
Microsoft's new bug-detecting system is already deployed in production internally. It is continually retrained with data approved by the company's security experts, who monitor how many bugs are generated during software development.
Via VentureBeat
After working with the TechRadar Pro team for the last several years, Anthony is now the security and networking editor at Tom’s Guide where he covers everything from data breaches and ransomware gangs to the best way to cover your whole home or business with Wi-Fi. When not writing, you can find him tinkering with PCs and game consoles, managing cables and upgrading his smart home.