The Security Challenges and Defense of Hidden Data

Security professionals continue to improve their organizations’ defenses, pushing malicious actors (MAs) to find ways to circumvent controls. One way is to hide their activity from antivirus and filtering solutions. Using steganography, insiders can steal large amounts of information, even under strong security oversight. Steganography also allows undetected malware downloads and hides the command and control traffic that follows a successful compromise. Steganography defense requires identifying potential exploit opportunities and implementing controls to prevent and hinder its use.

I conducted all image and audio steganography testing with the free SilentEye application. I used Spam Mimic for word- and line-shift testing.

What is Steganography

Steganography is the process of hiding information in some medium. The medium is known as the carrier. Figure 1 depicts how this works at its most basic level. The carrier is anything the sender can use to hide the message. The sender follows a process to embed the message in the carrier in a way that hides its existence.

In some cases, the message is encrypted before it is hidden. Because the carrier does not look like it is carrying a message, the sender can send it over any unsecured channel. Once the recipient receives the carrier, she reverses the embedding process to extract the message.

The first recorded use of steganography was circa 440 B.C. (Siper, Farley, & Lombardo, 2005). It involved shaving the head of a slave (the carrier). The message was tattooed on the slave’s head, and the sender waited until the hair grew back enough to cover it. The slave then traveled to the recipient, who shaved his head to read the message.

The Security Threat

We have come far from shaved heads. With the use of digital carriers and the internet, MAs can secretly exploit and control systems. The following are examples from the McAfee Labs Threats Report June 2017.

Sundown. The exploit code is hidden in an entirely white PNG file.
Cerber. The victim opens a Word document infected with a macro that downloads a JPEG file containing exploit code.
Vawtrak. The exploit settings are stored in favicon (.ICO) files. The settings are used to download the exploit payload.
Zbot. The configuration information is stored in a JPEG file.
Lurk. URLs used to install exploit payloads are stored in a BMP image.
Stegaloader. A PNG file containing the exploit payload is downloaded from a legitimate website.

In addition to exploit delivery, insider threats and MAs can use innocent-looking media to transfer sensitive information from the internal network to external resources. As I show later, it is almost impossible for filtering and other data loss prevention efforts to detect unauthorized data exfiltration.

Finally, MAs hide command and control content in network protocols. This, and the use of other media to hide any stolen information, circumvents most or all efforts to detect a successful exploit.

Steganography Methods

For this article, I focus on two general applications of steganography: data theft and exploit delivery/management. The following sections provide some of the approaches to achieving malicious objectives using steganography.

Data Theft

Carriers used for hiding data fall into four subcategories: text, metadata in Office files, images, and audio/video.

Text. Hiding information within existing, innocuous text is easy. There are several ways to do this, but I focus on two: Feature coding and spacing/shifting. Feature coding involves the manipulation of text features (Roy, 2011). For example, the end parts of some characters are elongated or shortened. Other methods adjust vertical, horizontal, or diagonal character lines.

Another way to hide data in a text file is by using spaces, tabs, and carriage returns. Figure 2 shows how this works. The text on the left contains hidden information. It looks normal. However, when you load it into Word and make formatting marks visible, you see the results on the right. Periods are spaces and arrows are tabs.

Neither of these options is suitable for hiding copious amounts of information. For example, spacing and shifting about 100 records of the type shown in Figure 3 (from the test data set I used for this article) resulted in 27 blank pages added to the end of a one-page input sample.
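The spacing approach can be illustrated with a minimal sketch. It assumes a simple scheme that is not taken from Spam Mimic or any other tool: each cover line carries one byte as eight trailing whitespace characters, where a space is 0 and a tab is 1. The function names are hypothetical. When the secret is longer than the cover text, the sketch appends whitespace-only lines, which is one way a small input can grow into pages of apparently blank output.

```python
def embed(cover: str, secret: bytes) -> str:
    """Hide one byte per line as 8 trailing whitespace characters (space=0, tab=1)."""
    lines = cover.split("\n")
    out = []
    for i, byte in enumerate(secret):
        # Reuse a cover line if one is left; otherwise add a whitespace-only line.
        visible = lines[i].rstrip(" \t") if i < len(lines) else ""
        bits = format(byte, "08b")
        out.append(visible + "".join("\t" if b == "1" else " " for b in bits))
    # Remaining cover lines carry no data; strip them so they cannot be misread.
    out.extend(line.rstrip(" \t") for line in lines[len(secret):])
    return "\n".join(out)


def extract(stego: str) -> bytes:
    """Read bytes back until a line without exactly 8 trailing whitespace characters."""
    data = bytearray()
    for line in stego.split("\n"):
        tail = line[len(line.rstrip(" \t")):]
        if len(tail) != 8:
            break
        data.append(int("".join("1" if c == "\t" else "0" for c in tail), 2))
    return bytes(data)
```

With a three-line cover and a five-byte secret, the result has five lines, two of which contain nothing but spaces and tabs.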

Metadata in Office Files. All Office files have metadata that describes the file. One of the commonly used metadata sets is shown in Figure 4. You can access these fields by going to the Summary tab in document Properties. The critical thing to understand is the ability to hide large amounts of data in the Comments field.

I used an acceptable use policy template (DOCX) to hide 5,000 test records. The file size changed from 33 KB to 236 KB. Increasing the number of hidden records to 20,000 placed in Comments resulted in a file size of 843 KB. Neither of these larger file sizes is likely to cause alerts when passing out over the network perimeter. Any employee can use this method without installing any special software or having any special skills.
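A defender can check for this with very little code. The sketch below assumes that the Comments property of a DOCX file is stored as the `dc:description` element in `docProps/core.xml` inside the file's ZIP container. The function names and the 2,000-character limit are hypothetical and would need tuning to what is normal for your organization.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Dublin Core namespace used for the description (Comments) element.
DC_DESCRIPTION = "{http://purl.org/dc/elements/1.1/}description"


def comments_length(docx_bytes: bytes) -> int:
    """Return the number of characters stored in the Comments property."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return 0
        root = ET.fromstring(archive.read("docProps/core.xml"))
    node = root.find(DC_DESCRIPTION)
    return len(node.text or "") if node is not None else 0


def is_suspicious(docx_bytes: bytes, limit: int = 2000) -> bool:
    """Flag files whose Comments property is longer than the (assumed) limit."""
    return comments_length(docx_bytes) > limit
```

A check like this only covers one metadata field. Other properties and custom XML parts could hold data as well.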

Images. Digital image files of all types are among the most common carriers. MAs can hide large amounts of information in them using modern steganography tools. These tools can compress and encrypt the stolen information before mapping it to an image.

Figure 5 shows how mapping works. The color of each pixel in an image is composed of three values: red, green, and blue, as shown on the left. When we write the letter A to a set of these values, it can look like the bits on the right. As the number of pixels increases, the amount of information that can be stored (without significantly affecting the image quality) increases.
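One common way to do this kind of mapping is least-significant-bit (LSB) substitution, sketched below. The text does not confirm that SilentEye uses exactly this layout, so treat it as an illustration only. The function names, the most-significant-bit-first ordering, and the flat list of color values are assumptions. Real tools typically compress and encrypt first and spread the bits across the image.

```python
def embed_lsb(channels: list[int], message: bytes) -> list[int]:
    """Write each message bit into the lowest bit of one 8-bit color value."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(channels):
        raise ValueError("carrier too small for message")
    stego = list(channels)
    for idx, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[idx] = (stego[idx] & ~1) | bit
    return stego


def extract_lsb(channels: list[int], length: int) -> bytes:
    """Rebuild `length` bytes from the lowest bit of consecutive color values."""
    out = bytearray()
    for b in range(length):
        byte = 0
        for value in channels[b * 8:(b + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)
```

Three RGB pixels provide nine color values, enough for the eight bits of the letter A. Because only the lowest bit changes, no value moves by more than 1 out of 255, which is why the change is hard to see.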

Figure 6 shows what happens as we add information to an image. As you can see, a JPEG file quickly deteriorates. However, a BMP file can hold a large amount of information with little deterioration. One interesting thing of note is the reduction in the size of the JPEG image file as we add information. Also, the size of the BMP file only increases from 3.6 MB to 3.8 MB.

Audio/Video. It is also possible to place large amounts of information in audio and video files. As an example, I used a 60-second clip of the Pink Panther theme. The size when downloaded was 2.6 MB. I placed 1000 records from the test set in the audio. There was no noticeable difference in replay quality. However, when I added 13,000 records, the replay was mostly static. If no one listened to the audio, it would go unnoticed. The audio file remained 2.6 MB.

Attack Command and Control

I introduced this article with a list of malware that uses image steganography to hide exploit payloads and configurations. Here, I describe how MAs can hide command and control (C2) communication. One way to do this is to exploit network protocol characteristics across the layers of the OSI model.

Lubacz, Mazurczyk, and Szczypiorski (n.d.) provide a general model for how this is possible.

C1 – some functions of communication protocols are modified
C2 – the modification pertains to
C2a – functions of the protocols introduced to manage the intrinsic imperfection of communication channels (errors, delays, etc.)
C2b – functions of the protocols introduced to define the type of information exchange (query/response, file transfer, etc.) or to adapt the form of messages (fragmentation, segmentation, etc.)
C3 – the modifications are used by the communicating parties to make the observable effects of modifications challenging to discover but usable when knowing where or how to look

Examples of this model in practice include the following (Mazurczyk & Szczypiorski, 2014):

The MA divides an original IP packet into a predefined number of packets, where an even number might equal 1, and an odd number equals 0
The MA modulates the values placed into the Fragment Offset field in the IP datagram (see Figure 7), where an even number equals 1, and an odd number equals 0
The MA uses a legitimate fragment with hidden data within the payload
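The first example, signaling through the number of fragments, can be simulated in memory without sending any traffic. The sketch below assumes the convention given above (an even count represents 1, an odd count represents 0). The function names and the specific counts of 4 and 3 are hypothetical. It also ignores the real IP requirement that every fragment except the last carry a multiple of 8 bytes.

```python
def fragment_counts(bits, even_count=4, odd_count=3):
    """Choose how many fragments each packet is split into: even for 1, odd for 0."""
    return [even_count if bit == 1 else odd_count for bit in bits]


def split_payload(payload: bytes, n: int) -> list[bytes]:
    """Split a payload into exactly n non-empty pieces of near-equal size."""
    if not 0 < n <= len(payload):
        raise ValueError("fragment count must be between 1 and the payload length")
    size, extra = divmod(len(payload), n)
    pieces, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        pieces.append(payload[start:end])
        start = end
    return pieces


def decode_bits(observed_counts):
    """Recover the hidden bits from the fragment counts seen on the wire."""
    return [1 if count % 2 == 0 else 0 for count in observed_counts]
```

The payload itself is untouched, so content inspection sees nothing unusual. Only the pattern of fragment counts carries the message, which is what a monitor would need to observe to detect it.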

Figure 7, an IP datagram header, shows how the network layer of the OSI model is also used. The highlighted fields are used to exchange C2 and other information.

In addition to changing traffic characteristics, MAs can embed hidden information in HTTP, POP3, ICMP, and SMTP headers (Wendzel, Mazurczyk, Caviglione, & Meier, 2014). MAs can also use the NS, CNAME, and TXT records in DNS servers to hide information (Roy, 2011).

Steganography Defense

Detecting and managing malicious use of steganography already happening on a network is very difficult. According to Barwise (2018),

“If an adversary is to able to penetrate a network successfully and unsuspectingly install malware onto a system that uses digital steganography to hide its presence, then the network and all associated data contained therein should be considered entirely compromised (Theoretical Framework).”

This is a good description of how difficult it is to detect and respond to the use of hidden data techniques against your information resources. Antivirus and IPS are not likely to detect malicious content in images or audio. It is difficult to detect network-based steganography with monitoring solutions. Consequently, the best approach to steganography defense is the implementation of known ways to prevent infiltration of malware and unwanted utility software.

Prevention

The first step is to identify how steganographic tools and infected carriers can find their way onto your network. The next step is to block them. In addition to implementing antivirus, IPS, and firewalls according to current best practices, take the following steps:

Remove local admin access from all day-to-day accounts
Only allow installation of whitelisted applications
Strictly enforce least privilege and need-to-know
Segment the network, allow only application servers to access database servers, and strictly manage traffic entering and leaving each segment by using explicit allows
Ensure all applications that access database servers have strong input validation
Prohibit or strictly manage script and macro execution
Consider blocking or alerting on suspicious movement of certain file types, including stripping them from all email messages: image files, audio files, video files, and larger-than-normal Office files (normal for your organization)
Only download and install applications or other media from the internet that include a valid hash value you can check
Block general use of USB storage
Train users not to download images, songs, video, and other media from the internet, especially from social networking sites
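The hash check in the list above takes only a few lines. This is a minimal sketch that assumes the publisher provides a SHA-256 value as a hexadecimal string. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the published value, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), published_hex.strip().lower())
```

A matching hash shows that the file is the one the publisher released. It does not show that the published file is free of hidden content, so this control complements the others rather than replacing them.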

How an organization approaches these controls depends on its unique operating environment and management’s willingness to deal with the potential employee frustration. It is all about risk and management’s appetite for risk.

Deep Secure developed a novel approach to prevention. Their content threat removal tools assume all content is compromised. Original content is not delivered to the recipient. Instead, obvious business/functional information is stripped and placed into a new document/file. This reconstructed document/file is delivered, and the original is dropped.

Monitoring and Detection

As always, assume malicious actors find ways to circumvent your prevention controls.

Monitor network behavior for anomalous packet traffic such as that described in the section, Attack Command and Control
Monitor user behavior for unusual access and large data transfers
Scan all computers, especially user devices, for steganography tools
Periodically use forensics tools to test all or a meaningful sample of potential carriers found on the network to determine if they might contain hidden information
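As one small example of testing potential carriers, the sketch below flags text whose lines end in spaces or tabs, the pattern left by the spacing technique described under Data Theft. The function names and the 0.5 threshold are assumptions. Ordinary documents sometimes contain trailing whitespace, so a result like this is a reason to look closer, not proof of hidden data.

```python
def trailing_whitespace_ratio(text: str) -> float:
    """Return the fraction of lines that end in at least one space or tab."""
    lines = text.splitlines()
    if not lines:
        return 0.0
    flagged = sum(1 for line in lines if line != line.rstrip(" \t"))
    return flagged / len(lines)


def looks_like_whitespace_carrier(text: str, threshold: float = 0.5) -> bool:
    """Flag text when the share of lines with trailing whitespace reaches the threshold."""
    return trailing_whitespace_ratio(text) >= threshold
```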

Conclusion

Steganography is used by malicious actors to infect networks, steal sensitive information, and execute command and control tasks. Detection ranges from difficult to impossible. The best approach today is to identify potential threats and vulnerabilities related to steganography and manage them with prevention controls. Monitoring controls will not find all instances that make it onto your network.

Much research is being done to come up with solutions to these challenges. Hopefully, solutions will eventually emerge to help identify steganography activity in real time. Until then, a thorough understanding of what we face, and how to manage what we can, is the best approach.

Works Cited

Barwise, I. (2018). Digital Steganography as an Advanced Malware Detection Evasion Technique. Retrieved May 2019, from Medium: https://medium.com/@z3roTrust/digital-steganography-as-an-advanced-malware-detection-evasion-technique-40d4eeb19830

Lubacz, J., Mazurczyk, W., & Szczypiorski, K. (n.d.). Principles and Overview of Network Steganography. Warsaw, Poland: Institute of Telecommunications, Warsaw University of Technology.

Mazurczyk, W., & Szczypiorski, K. (2014, July). Steganography in Handling Oversized IP Packets. Retrieved May 2019, from Researchgate: https://www.researchgate.net/publication/45860059

Roy, S. (2011, January). A novel approach to format based text steganography. Retrieved May 2019, from Researchgate: https://www.researchgate.net/publication/220846511

Siper, A., Farley, R., & Lombardo, C. (2005). The Rise of Steganography. Proceedings of Student/Faculty Research Day, CSIS, Pace University.

Wendzel, S., Mazurczyk, W., Caviglione, L., & Meier, M. (2014). Hidden and Uncontrolled – On the Emergence of Network Steganographic Threats. ISSE 2014 Securing Electronic Business Processes, doi:10.1007/978-3-658-06708-3, pp. 123-133.