Tuesday, June 27, 2023

CVE-2023-35845: Anaconda3 creates numerous world-writable files on install

Background

Well over a year ago I installed Anaconda3 on my laptop for some work. When the installation completed I noticed that numerous files had been installed world-writable. This is obviously less than ideal, but it was my laptop so I fixed the issue and moved on. Fast-forward to September 2022 and I'm left cleaning up these issues after every upgrade or install at a customer site. This is where I became annoyed that the issue still hadn't been resolved, so I figured I had better report it.

After an unacceptably long delay while trying to get this issue resolved, I am publishing the details.

Technical Details

Installing any version of Anaconda or Miniconda on Linux results in numerous world-writable files. For example, below I took the latest version from the website and performed the batch (no prompt) install to $HOME/anaconda3:

[jeremy@test1 ~]$ bash Anaconda3-2023.03-1-Linux-x86_64.sh -b -p $HOME/anaconda3
PREFIX=/home/jeremy/anaconda3
Unpacking payload ...

Installing base environment...

Downloading and Extracting Packages

Downloading and Extracting Packages

Preparing transaction: done
Executing transaction: |

    Installed package of scikit-learn can be accelerated using scikit-learn-intelex.
    More details are available here: https://intel.github.io/scikit-learn-intelex

    For example:

        $ conda install scikit-learn-intelex
        $ python -m sklearnex my_application.py

done
installation finished.

With the installation finished I ran a quick check of the permissions for files and directories with the other writable bit set. As you can see from the two commands below they are all files. Previously I have found directories as well.

[jeremy@test1 ~]$ find /home/jeremy/anaconda3 -perm /0002 \( -type f -o -type d \) | wc
606 606 58235
[jeremy@test1 ~]$ find /home/jeremy/anaconda3 -perm /0002 -type f | wc
606 606 58235
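The same check can be sketched in Python for use in a post-install script. This is a minimal sketch, not part of the original report: the function name find_world_writable is hypothetical, and unlike find it does not test the top-level directory itself.

```python
import os
import stat


def find_world_writable(root):
    """Return sorted paths of regular files and directories under root
    that have the "other" write bit (0002) set."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so symlinks are not followed
            is_file_or_dir = stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode)
            if is_file_or_dir and st.st_mode & stat.S_IWOTH:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Assumed install location, matching the example above.
    for path in find_world_writable(os.path.expanduser("~/anaconda3")):
        print(path)
```

Symbolic links are skipped on purpose, since their own mode bits are not meaningful on Linux.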

To see what those files look like permission wise you can run a find command filtering for files that are world-writable with the "-perm /0002" option:

[jeremy@test1 ~]$ find /home/jeremy/anaconda3 -perm /0002 -type f -ls
3773199     92 -rwxrwxrwx   2 jeremy   jeremy      91648 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/distlib/w32.exe
3773231    108 -rwxrwxrwx   2 jeremy   jeremy     108032 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/distlib/t64.exe
3773242    168 -rwxrwxrwx   2 jeremy   jeremy     168448 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/distlib/w64-arm.exe
3773208     96 -rwxrwxrwx   2 jeremy   jeremy      97792 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/distlib/t32.exe
3773215    100 -rwxrwxrwx   2 jeremy   jeremy     101888 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/distlib/w64.exe
3773253    180 -rwxrwxrwx   2 jeremy   jeremy     182784 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/distlib/t64-arm.exe
3773149      4 -rw-rw-rw-   2 jeremy   jeremy        469 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/vendor.txt
3773176    280 -rw-rw-rw-   2 jeremy   jeremy     286370 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
3773161      4 -rw-rw-rw-   2 jeremy   jeremy        286 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/py.typed
3773169      4 -rw-rw-rw-   2 jeremy   jeremy       4072 Dec 9 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip-22.3.1-py3.10.egg-info/PKG-INFO
...
3713940      4 -rwxrwxrwx   1 jeremy   jeremy        740 Aug 31 2020 /home/jeremy/anaconda3/pkgs/libllvm10-10.0.1-hbcb73fb_5/info/recipe/parent/install_llvm.bat
3714015      4 -rwxrwxrwx   1 jeremy   jeremy       1387 Aug 31 2020 /home/jeremy/anaconda3/pkgs/libllvm10-10.0.1-hbcb73fb_5/info/recipe/parent/install_llvm.sh
3713958      4 -rwxrwxrwx   1 jeremy   jeremy        726 Sep 4 2020 /home/jeremy/anaconda3/pkgs/libllvm10-10.0.1-hbcb73fb_5/info/recipe/parent/xcode-select
3714009      4 -rwxrwxrwx   1 jeremy   jeremy       1387 Aug 31 2020 /home/jeremy/anaconda3/pkgs/libllvm10-10.0.1-hbcb73fb_5/info/recipe/install_llvm.sh
3657118      4 -rwxrwxrwx   1 jeremy   jeremy       3364 Jan 5 2021 /home/jeremy/anaconda3/pkgs/jupyterlab_widgets-1.0.0-pyhd3eb1b0_1/info/recipe/recipe_log.txt
3657112      4 -rwxrwxrwx   1 jeremy   jeremy        828 Jan 5 2021 /home/jeremy/anaconda3/pkgs/jupyterlab_widgets-1.0.0-pyhd3eb1b0_1/info/recipe/meta.yaml.template
3279942      4 -rwxrwxrwx   1 jeremy   jeremy       1923 Jun 13 2019 /home/jeremy/anaconda3/pkgs/backports.weakref-1.0.post1-py_1/info/recipe/recipe_log.txt
3279943      4 -rwxrwxrwx   1 jeremy   jeremy       1091 Jun 12 2019 /home/jeremy/anaconda3/pkgs/backports.weakref-1.0.post1-py_1/info/recipe/meta.yaml.template
3709519      8 -rwxrwxrwx   1 jeremy   jeremy       4402 May 4 2020 /home/jeremy/anaconda3/pkgs/atomicwrites-1.4.0-py_0/info/recipe/recipe_log.txt
3709528      4 -rwxrwxrwx   1 jeremy   jeremy        774 May 4 2020 /home/jeremy/anaconda3/pkgs/atomicwrites-1.4.0-py_0/info/recipe/meta.yaml.template

I'm not sure how frequently, if ever, these files are used, but one file stuck out to me when I reviewed the list: the cacert.pem file used by the Python certifi module bundled with their pip installation. Since certifi provides the trusted root certificate authorities, any manipulation of this file could allow untrusted content to pass through SSL connections via a man-in-the-middle attack. You can see below that two files are found, but since they share the same inode number we know they are hard links to a single file:

[jeremy@test1 ~]$ find /home/jeremy/anaconda3 -type f -perm /0002 -name cacert.pem -ls
1986492 280 -rw-rw-rw- 2 jeremy jeremy 286370 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
1986492 280 -rw-rw-rw- 2 jeremy jeremy 286370 Nov 5 2022 /home/jeremy/anaconda3/pkgs/pip-22.3.1-py310h06a4308_0/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
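The inode comparison above can also be done programmatically. The following is a minimal sketch (the function name group_hardlinks is hypothetical) that groups paths by device and inode number, so any group with more than one entry is a set of hard links to the same underlying file:

```python
import os
from collections import defaultdict


def group_hardlinks(paths):
    """Group paths that share a (device, inode) pair.

    Returns a list of sorted path lists; only groups with more than
    one path (that is, hard links to one file) are returned.
    """
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        groups[(st.st_dev, st.st_ino)].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

This matters for the permission problem because a change made through one path, including a truncation, is visible through every other path in the group.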

For illustration purposes I created and used a different account (test) to truncate the cacert.pem file and show that both paths now report 0 bytes.

[test@test1 ~]$ find /home/jeremy/anaconda3/ -type f -perm /0002 -name cacert.pem -ls
1986492    280 -rw-rw-rw-   2 jeremy   jeremy     286370 Nov 5 2022 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
1986492    280 -rw-rw-rw-   2 jeremy   jeremy     286370 Nov 5 2022 /home/jeremy/anaconda3/pkgs/pip-22.3.1-py310h06a4308_0/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
[test@test1 ~]$ truncate --size=0 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
[test@test1 ~]$ find /home/jeremy/anaconda3/ -type f -perm /0002 -name cacert.pem -ls
1986492      0 -rw-rw-rw-   2 jeremy   jeremy          0 Jun 5 17:28 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
1986492      0 -rw-rw-rw-   2 jeremy   jeremy          0 Jun 5 17:28 /home/jeremy/anaconda3/pkgs/pip-22.3.1-py310h06a4308_0/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem

Moving back to my normal user, if these files are used by the Anaconda3 packaged version of pip this should result in an SSL error:

[jeremy@test1 ~]$ /home/jeremy/anaconda3/bin/pip install tensorflow==
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(136, '[X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4123)'))': /simple/tensorflow/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(136, '[X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4123)'))': /simple/tensorflow/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(136, '[X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4123)'))': /simple/tensorflow/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(136, '[X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4123)'))': /simple/tensorflow/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(136, '[X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4123)'))': /simple/tensorflow/
Could not fetch URL https://pypi.org/simple/tensorflow/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/tensorflow/ (Caused by SSLError(SSLError(136, '[X509: NO_CERTIFICATE_OR_CRL_FOUND] no certificate or crl found (_ssl.c:4123)'))) - skipping
ERROR: Could not find a version that satisfies the requirement tensorflow== (from versions: none)
ERROR: No matching distribution found for tensorflow==

As you can see we do in fact get an error that no valid cert is found. This is due to the truncated root CA certs file. Clearly there are significant risks besides a Denial of Service (DoS) from Anaconda's creation of these world-writable files.
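A quick sanity check of a CA bundle could catch both the permission problem and the truncation shown above. This is only a sketch under simple assumptions: the function name check_ca_bundle is hypothetical, and it only looks for the PEM certificate marker rather than validating the certificates themselves.

```python
import os
import stat

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def check_ca_bundle(path):
    """Return a list of problems found with the CA bundle at path."""
    problems = []
    st = os.stat(path)
    if st.st_mode & stat.S_IWOTH:
        problems.append("world-writable")
    with open(path, "rb") as fh:
        data = fh.read()
    if PEM_MARKER not in data:
        # Covers the truncated (0 byte) case demonstrated above.
        problems.append("no certificates found")
    return problems
```

An empty list means neither problem was detected. It does not prove the bundle is trustworthy, since another user could have added a certificate rather than removed them.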

Disclosure Timeline:

Sep 22, 2022 - Reported world-writable vulnerability including list of files
Sep 22, 2022 - Confirmation from Anaconda
Sep 23, 2022 - Provided an easy way to reproduce to Anaconda
Sep 23, 2022 - Anaconda confirms reproducer and says "We are working on a fix for this issue and will notify you. once a fix is available"
Nov 16, 2022 - I request status update
Nov 16, 2022 - Anaconda responds "Our security team is taking a look at this issue this week and we can give you a better time frame estimation once this has been completed. "
Nov 23, 2022 - Anaconda updates "Our security group has reviewed this issue, and are prioritizing it for a future sprint cycle"
Jan 31, 2023 - With no response from Anaconda I test and the issue still exists. At this point I research whether a CVE has been filed and find a similar (though not the same) issue on Windows reported as CVE-2022-26526.
Feb 20, 2023 - Anaconda responds that CVE-2022-26526 is fixed but the issue I reported "is unfortunately not a straightforward fix".
Feb 28, 2023 - I respond that any proper fix should take into account the user's umask during installation.
Feb 28, 2023 - Anaconda acknowledges my umask email
Jun 1, 2023 - I inquire again for the status of a fix
Jun 1, 2023 - Anaconda erroneously believes "fixes for this implemented in Anaconda 2022.05 and Miniconda3 4.12.0 and beyond"
Jun 2, 2023 - I verify Linux is still vulnerable and email them to let them know
Jun 2, 2023 - I finally file for a CVE with MITRE as the CNA-LR
Jun 18, 2023 - MITRE reserves CVE-2023-35845 for this issue
Jun 22, 2023 - I provide one more email to Anaconda notifying them of the impending CVE and receive no response as of this writing on Jun 27.

Solutions

Given this information it's clear that there are risks, possibly significant ones, beyond a Denial of Service (DoS) from Anaconda's creation of these world-writable files. Luckily, upgrading the pip packaged with Anaconda appears to fix the world-writable cacert.pem file, closing the one path to exploitation illustrated here:

[jeremy@test1 ~]$ $HOME/anaconda3/bin/pip install --upgrade pip
Requirement already satisfied: pip in ./anaconda3/lib/python3.10/site-packages (22.3.1)
Collecting pip
Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
Attempting uninstall: pip
    Found existing installation: pip 22.3.1
    Uninstalling pip-22.3.1:
      Successfully uninstalled pip-22.3.1
Successfully installed pip-23.1.2

You can see that one world-writable cacert.pem file still exists, but this is not the one used by the Anaconda3 pip (a utility detailed below confirmed that for me):

[jeremy@test1 ~]$ find $HOME/anaconda3 -type f -perm /0002 -name cacert.pem -ls
3773176 280 -rw-rw-rw- 1 jeremy jeremy 286370 Nov 5 2022 /home/jeremy/anaconda3/pkgs/pip-22.3.1-py310h06a4308_0/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem

Regardless of the pip solution the best strategy for now appears to be to strip all "other" write permissions from your Anaconda3 installation directory any time you install or update software:

[jeremy@test1 ~]$ find $HOME/anaconda3 -type f -perm /0002 -exec chmod o-w {} \;
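For those who would rather script the cleanup, here is a rough Python equivalent of the find command above. It is a sketch only: the function name strip_other_write is hypothetical, and like the command above it touches regular files only.

```python
import os
import stat


def strip_other_write(root):
    """Remove the "other" write bit from regular files under root.

    Returns the list of paths whose mode was changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so symlinks are not followed
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_IWOTH:
                # Keep every other permission bit, clear only o+w.
                os.chmod(path, stat.S_IMODE(st.st_mode) & ~stat.S_IWOTH)
                changed.append(path)
    return changed
```

Because hard links share one inode, fixing one path also fixes the others, so a hard-linked file is reported only once.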

Detection of Similar Problems

Finally, if you're interested in a utility to capture information about what opens, executes, or changes permissions to create world-writable files, I've created an eBPF program to help identify that and stored it in my GitHub repo here:

https://github.com/jfilizetti/ebpf-security-tools

Here is an example capture of the cacert.pem described above:

[root@test1 jeremy]# ./world_writable_monitor.py -f regular,directory | grep cacert.pem
chmod 666 1000:1000 93877 conda.exe 0 -100 /home/jeremy/anaconda3/pkgs/pip-22.3.1-py310h06a4308_0/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem
open 666 1000:1000 93910 pip 0 -100 /home/jeremy/anaconda3/lib/python3.10/site-packages/pip/_vendor/certifi/cacert.pem

Monday, June 26, 2023

Modern NVIDIA GPU Transcoding Comparison

There is nothing quite like new hardware. I'm always looking for ways to speed up my daily video processing so GPUs are always top of mind. Unfortunately I've never seen the kind of numbers I want from NVIDIA to know whether or not it's worth buying the latest and greatest of their data center class cards for encoding/decoding/transcoding/etc. When the marketing info comes out for a new card like the fairly new NVIDIA L4 you are left wondering how it stacks up against your particular challenge. I don't have 8 - L4s or a need to handle 1040 simultaneous 720p30 streams. What I do have is a dozen 4K streams that I need reduced to 1080p for longer term storage. I'd like to know how well a new L4 or even A2 card works for my purposes compared to my older cards like a GTX 1660 Ti or ancient GTX 1080.

Now that I've dumped the money into those cards I was surprised to see that the performance was not all that different compared to my oldest card which is a Pascal generation GTX 1080. At this point it was unclear if it was just the fact that the GTX 1080 had dual encoder chips or maybe because I was still running FFMPEG 4.3.2. I finally decided to pull and build the latest version of FFMPEG which was 5.1.2 at that time. Rerunning all tests it became clear to me that what I once thought was my "flagship" card for encoding was actually pretty outdated with the new software. All of my cards performed much better. The following graphic illustrates those numbers.

Note: For all the data below the FFMPEG command is roughly:

ffmpeg -vsync vfr -hwaccel_device 0 -i in.mp4 -vf "scale_npp=iw/2:ih/2" -vcodec hevc_nvenc -b:v 2M -r 15 out.mp4

It's probably worth a little explanation of the workload. The data is comprised of 496 video files that are 5 minutes in length. The breakdown is approximately:

100 - 4K 30 FPS HEVC videos
300 - 4K 15 FPS HEVC videos
100 - 5 MP 20 FPS H264 videos

Outside of the gains of the Turing, Ampere, and Ada Lovelace generation cards it's noteworthy that performance didn't really change for the Pascal generation GTX 1080 that I once considered my "flagship" card.

Getting the most with the new version of FFMPEG

With the L4 card having 2 encoders and 4 decoders I found it necessary to run more than one simultaneous transcode to get the most performance. It seemed that 2 would be optimal, but I found that when processing H264 video the utilization of the card would drop, as the nvenc/nvdec chips seem to be able to handle more, whereas HEVC would saturate them. You can see the stats during the processing of the previously mentioned dataset using a single transcode at a time. All utilization stats are from "nvidia-smi dmon -d 1".

The drops around 4000 seconds and 6000 seconds are the previously mentioned H264 videos. Looking at two simultaneous transcodes you can see the processing time is reduced, but not by half.

Finally I ran a test with 4 simultaneous transcodes and found the timeline dropped to about half of the original.
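For anyone wanting to repeat the parallel runs, a small driver along these lines would do it. This is a sketch and not the script used for the numbers in this post: the function names and the runner parameter are hypothetical, and the ffmpeg arguments are simply the "roughly" command noted earlier.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_ffmpeg_cmd(src, dst, device=0):
    """Build the ffmpeg argument list from the command noted earlier."""
    return [
        "ffmpeg", "-vsync", "vfr", "-hwaccel_device", str(device),
        "-i", src, "-vf", "scale_npp=iw/2:ih/2",
        "-vcodec", "hevc_nvenc", "-b:v", "2M", "-r", "15", dst,
    ]


def transcode_all(jobs, parallel=4, runner=subprocess.run):
    """Run one transcode per (src, dst) pair, at most `parallel` at a time.

    `runner` is injectable so the scheduling can be exercised without
    ffmpeg or a GPU present; results come back in job order.
    """
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = [pool.submit(runner, build_ffmpeg_cmd(src, dst))
                   for src, dst in jobs]
        return [future.result() for future in futures]
```

Threads are enough here because each worker just waits on an external ffmpeg process. Changing parallel between 1, 2, and 4 corresponds to the three runs described above.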

Given the above "optimal" processing on the L4 I produced some histograms of the FPS that were achieved for all of the prior mentioned cards as well as the various runs on the L4.

It's clear that with 2 and 4 parallel transcodes on the L4 there is more variability and the average FPS per transcode is lower, but the overall dataset processes faster.

Conclusion

1. With NVIDIA chipsets it is wise to keep upgrading, as FFMPEG and the NVIDIA Video Codec SDK are always making performance improvements.

2. The optimal number of simultaneous transcode processes depends on the codec, so some trial and error is necessary.

3. Most nvenc/nvdec chips process roughly around the same rate. I've always thought this would not be the case. In the future I may try to dig deeper on this.

External Resources

You can always consult the matrix they have to know how many encoder and decoder chips are available per card at: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new

Saturday, December 31, 2022

Deficiency Analysis of the Defense Information Systems Agency Security Technical Implementation Guide for Redhat Enterprise Linux 7

About two and a half years ago I took a class on technical writing to wrap up the requirements for my multi-decade bachelor's degree. For that technical writing class my idea was to detail some of the problems related to cyber security in the federal government. Since securing systems in the government space has been a significant portion of my job for most of the 21st century, I wanted to bring light to places where we can improve and create more effective cyber security. I wrote this paper on what I called deficiencies with the RHEL7 STIG. Since then I've tried my best to address many issues through official channels, but I've officially given up trying to get DISA to embrace technical merit. My hope is that putting this information here for others can start a wider discussion about real security rather than relying on the DISA STIG bureaucratic process.

A link to the PDF that is roughly what I originally wrote is available here. However a slightly trimmed version is available below.

Deficiency Analysis of the Defense Information Systems Agency Security Technical Implementation Guide for Redhat Enterprise Linux 7

Abstract

The Department of Defense spends countless hours on security configuration, assessment, and documentation. These security configurations are often derived from Security Technical Implementation Guides (STIGs) that are published for various hardware and software. One of the most time consuming and problematic STIGs is the one created by the Defense Information Systems Agency for the Redhat Enterprise Linux operating system. This STIG is vast because of the complex nature of the operating system. As a result, outdated information is often incorporated into the document. In addition, many changes recommended in this STIG have far reaching effects on performance and usability. In this report many deficiencies with the current DISA RHEL 7 STIG are investigated, and approximately 40% of its items are found to require additional consideration.

Introduction

Cyber security is the fastest growing focus area of information technology systems in the Department of Defense. The goal of security is to provide Confidentiality, Integrity, and Availability, also known in the security community as the CIA triad. To achieve these goals operating systems must continually be evaluated as new features are integrated to ensure that security is not compromised. In addition, security often creates hurdles for usability and performance and therefore most operating systems are not shipped in a “secure by default” configuration. To mitigate this problem the DoD creates Security Technical Implementation Guides in coordination with operating system vendors to provide a configuration they consider to be secure. In this report I evaluate the STIG for the Redhat Enterprise Linux (RHEL) 7 operating system to determine areas that have failed to properly account for usability, performance, Denial of Service (DoS), and other issues. This report is based on the latest STIG published by the Defense Information Systems Agency, Version 2 Release 7 dated April 24, 2020, referenced as [1]. This report is targeted at technical evaluators and decision makers to bring attention to the numerous issues created by implementing security guidance without a full understanding of the impacts. It is often joked that the only secure system is the one that isn’t plugged in. We must strive to make effective use of taxpayer dollars by squeezing all the performance we can out of hardware without hindering our operations and maintenance staff with burdensome requirements that affect their ability to do their jobs. In this assessment I will evaluate the following areas as they apply to the DISA STIG for RHEL 7 [1]:

Outdated Items
Performance Impacts
Potential DoS
Usability Impacts
Other Issues

There are several areas in this report where small excerpts of source code are provided from the RHEL/Centos 7.8 packages. These are included to provide assistance to technical staff who would like to validate the details in this report. Each inclusion of source code will have several surrounding lines to provide context, the line numbers, and the source file identified. In addition, every item detailed in this report lists the affected STIG ID numbers from [1] at the end of its section. This is to provide clarity and to assist any further review that may be necessary.

Outdated Items

Password Outdated Items

The current release of the RHEL 7 STIG [1] was published on April 24, 2020. Despite this recent release it is still out of date with the guidance from the National Institute of Standards and Technology (NIST) Special Publication 800-63 Digital Identity Guidelines [2]. Several criteria have been made obsolete by the latest recommendations and are detailed below. These guidelines apply to human passwords and not stored machine level passwords.

Password Expiration No Longer Recommended

According to [3] users tended to choose weaker passwords when they knew password changes were imminent. The password changes often used “common transformations such as increasing a number in the password” [3]. The result is that security is not increased by these changes. While time-based password expiration is no longer recommended, event-based expiration remains in place. Event-based password expirations are triggered by an event such as a breach of the password database.

Affected STIG IDs:

RHEL-07-010230, RHEL-07-010240, RHEL-07-010250, RHEL-07-010260

Password Composition Rules No Longer Recommended

The current requirements in [1] specify the following password composition rules:

At least one each of class (uppercase letters, lowercase letters, numbers, and symbols)
No more than 3 repeating characters
No more than 4 repeating characters in the same class

All these requirements have been removed as of [2]. The reasoning is again based on human factors. Users tended to use predictable methods of fulfilling these requirements which minimized any security gain. Another determined weakness of the old scheme is that these rules encouraged people to use the same password across multiple systems. This is particularly damaging for DoD users where accounts can exist on multiple networks. These multiple networks are very common in DoD environments where users may have an account on unclassified, secret, and top-secret systems.

Affected STIG IDs:

RHEL-07-010120, RHEL-07-010130, RHEL-07-010140, RHEL-07-010150, RHEL-07-010160, RHEL-07-010170, RHEL-07-010180, RHEL-07-010190, RHEL-07-010280

Additional Password Restrictions

The new mechanism of preventing weak passwords is to utilize a blacklist of common passwords. A specific list of potential candidates is included below from [2]:

Passwords obtained from previous breach corpuses
Dictionary words
Repetitive or sequential characters (e.g. ‘aaaaaa’, ‘1234abcd’).
Context-specific words, such as the name of the service, the username, and derivatives thereof

Password databases are included with the cracklib package that comes with the operating system. Any of these databases of blacklisted passwords can be added to the pam_pwquality module.
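To illustrate how the candidate criteria from [2] listed above could be checked, here is a minimal sketch. It is not how pam_pwquality or cracklib work internally: the function names, the run length of 4, and the sample blacklist are all assumptions made for illustration.

```python
def is_repetitive_or_sequential(password, run=4):
    """True if password contains `run` identical characters in a row
    (e.g. 'aaaa') or `run` ascending characters in a row (e.g. '1234')."""
    for i in range(len(password) - run + 1):
        chunk = password[i:i + run]
        if len(set(chunk)) == 1:
            return True
        if all(ord(chunk[j + 1]) - ord(chunk[j]) == 1 for j in range(run - 1)):
            return True
    return False


def rejection_reasons(password, username, service, blacklist):
    """Return the reasons a candidate password would be rejected."""
    reasons = []
    lowered = password.lower()
    if lowered in blacklist:
        reasons.append("blacklisted")
    if is_repetitive_or_sequential(lowered):
        reasons.append("repetitive or sequential")
    context_words = [w.lower() for w in (username, service) if w]
    if any(word in lowered for word in context_words):
        reasons.append("context-specific")
    return reasons
```

The blacklist argument stands in for breach corpuses and dictionary words, which in practice would be far too large for a simple in-memory set like the one assumed here.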

Password Based Key Derivation Functions (PBKDFs)

The updated recommendation is that memorized secret verifiers should use time and memory hard key derivation functions. The only NIST evaluated algorithm that supports this requirement is Balloon [3]. This particular rule only affects one item in [1] related to PBKDFs, which is for the grub2 boot loader. Unfortunately, there is no support in grub2 for the Balloon algorithm. Given the lack of a current technical implementation, developing a solution for this item should be prioritized.
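To show what a memory hard key derivation function looks like in use, here is a minimal sketch. Balloon is not available in the Python standard library, so the sketch swaps in scrypt, a different memory hard function, purely for illustration. The function names and cost parameters (n, r, p) are assumptions and not a recommendation from [1] or [2].

```python
import hashlib
import hmac
import os


def derive(password, salt, n=2**14, r=8, p=1):
    """Derive a 32 byte verifier with scrypt (illustrative stand-in for Balloon).

    With these parameters scrypt needs roughly 128 * n * r bytes,
    about 16 MiB, which is what makes large scale guessing expensive.
    """
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=n, r=r, p=p, dklen=32)


def verify(password, salt, expected):
    """Constant-time comparison of a candidate password against a verifier."""
    return hmac.compare_digest(derive(password, salt), expected)


if __name__ == "__main__":
    salt = os.urandom(16)
    stored = derive("correct horse battery staple", salt)
    print(verify("correct horse battery staple", salt, stored))
```

The point of the memory cost is that an attacker cannot cheaply run many guesses in parallel on specialized hardware, which a purely time hard function does not prevent.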

Mount Options for /dev/shm

The temporary file system used for shared-memory applications is set up by the system init process, systemd. Several STIG IDs require this file system to be mounted with the nodev and nosuid options. However, these options are already defined in the source code for systemd used by the operating system. The specific lines are shown below.

Note: Some white-space has been trimmed from the following source code for readability.

Source file: systemd-219/src/core/mount-setup.c
 76 static const MountPoint mount_table[] = {
 77 { "sysfs", "/sys", "sysfs", NULL, MS_NOSUID|MS_NOEXEC|MS_NODEV,
 78   NULL,   MNT_FATAL|MNT_IN_CONTAINER },
 79 { "proc", "/proc", "proc", NULL,  MS_NOSUID|MS_NOEXEC|MS_NODEV,
 80   NULL,          MNT_FATAL|MNT_IN_CONTAINER },
 81 { "devtmpfs","/dev", "devtmpfs", "mode=755", MS_NOSUID|MS_STRICTATIME,
 82   NULL,          MNT_FATAL|MNT_IN_CONTAINER },
 83 { "securityfs", "/sys/kernel/security", "securityfs", NULL, MS_NOSUID|MS_NOEXEC|MS_NODEV,
 84   NULL,          MNT_NONE                   },
 85 #ifdef HAVE_SMACK
 86 { "smackfs", "/sys/fs/smackfs", "smackfs", "smackfsdef=*", MS_NOSUID|MS_NOEXEC|MS_NODEV,
 87   mac_smack_use, MNT_FATAL                  },
 88 { "tmpfs", "/dev/shm", "tmpfs","mode=1777,smackfsroot=*", MS_NOSUID|MS_NODEV|MS_STRICTATIME,
 89   mac_smack_use, MNT_FATAL                  },
 90 #endif
 91 { "tmpfs", "/dev/shm", "tmpfs","mode=1777", MS_NOSUID|MS_NODEV|MS_STRICTATIME,

Affected STIG IDs:

RHEL-07-021022, RHEL-07-021023
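The effective options can be confirmed on a running system by reading /proc/mounts. The following is a minimal sketch with hypothetical function names; the sample line in the usage note is illustrative and was not captured from a real system.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set for mount_point from /proc/mounts formatted
    text, or None if it is not mounted. The last matching entry wins,
    since later mounts shadow earlier ones."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))
    return found


def missing_options(mounts_text, mount_point, required=("nodev", "nosuid")):
    """Return the required options that are absent, or None if not mounted."""
    options = mount_options(mounts_text, mount_point)
    if options is None:
        return None
    return sorted(set(required) - options)


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        print(missing_options(fh.read(), "/dev/shm"))
```

For an entry such as "tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0" the result is an empty list, meaning both required options are already in effect without any additional configuration.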

Disable Promiscuous Mode

The current recommendation to disable promiscuous mode is unnecessary as enabling it through the init scripts has been deprecated in RHEL 7 as detailed in [4].

Affected STIG ID: RHEL-07-040670

Performance Impacts

Auditing Performance

Auditing accounts for 71 of the 248 STIG items in [1]. Given the large footprint there should be some significant research into the impact of auditing on performance. However, there is minimal information documenting the impact; the most comprehensive sources appear to be [5] from 2015 and [6] from 2018. [5] discusses the overhead of auditing based on the frequency of audit events per second and is captured in Figure 1. [6] details the overhead for common programs and breaks down the performance by area of the audit system. Figure 2 details their findings.

Figure 1 Audit Performance Overhead Source: [5]

Figure 2 Audit Overhead by Program Source: [6]

It is clear from both figures that auditing can add unacceptable overhead. In Figure 2 the overhead added by the kernel is minimal but the Netlink socket used to flush data to the audit daemon, and the audit daemon overhead itself is significant for various programs. Especially affected are programs like Firefox and Apache that generate many audit records. Further analysis should be conducted to determine how auditing performance affects scalability of a single system.

Affected STIG IDs:

RHEL-07-030000, RHEL-07-030010, RHEL-07-030300, RHEL-07-030310, RHEL-07-030320, RHEL-07-030330, RHEL-07-030340, RHEL-07-030350, RHEL-07-030360, RHEL-07-030370, RHEL-07-030380, RHEL-07-030390, RHEL-07-030400, RHEL-07-030410, RHEL-07-030420, RHEL-07-030430, RHEL-07-030440, RHEL-07-030450, RHEL-07-030460, RHEL-07-030470, RHEL-07-030480, RHEL-07-030490, RHEL-07-030500, RHEL-07-030510, RHEL-07-030520, RHEL-07-030530, RHEL-07-030540, RHEL-07-030550, RHEL-07-030560, RHEL-07-030570, RHEL-07-030580, RHEL-07-030590, RHEL-07-030610, RHEL-07-030620, RHEL-07-030630, RHEL-07-030640, RHEL-07-030650, RHEL-07-030660, RHEL-07-030670, RHEL-07-030680, RHEL-07-030690, RHEL-07-030700, RHEL-07-030710, RHEL-07-030720, RHEL-07-030740, RHEL-07-030750, RHEL-07-030760, RHEL-07-030770, RHEL-07-030780, RHEL-07-030800, RHEL-07-030810, RHEL-07-030820, RHEL-07-030830, RHEL-07-030840, RHEL-07-030870, RHEL-07-030880, RHEL-07-030890, RHEL-07-030900, RHEL-07-030910, RHEL-07-030920, RHEL-07-030321, RHEL-07-030871, RHEL-07-030872, RHEL-07-030873, RHEL-07-030874, RHEL-07-030819, RHEL-07-030821, RHEL-07-030200, RHEL-07-030201, RHEL-07-030210, RHEL-07-030211

Firewall Performance

The firewall functionality of RHEL is implemented using iptables. There have been numerous performance issues identified in iptables and so careful attention must be paid to the number of rules used. However, [1] requires that the “system access control program must be configured to grant or deny system access to specific hosts and services”. As hardware continues to grow in capability it can support more and more services. As the number of services supported from a single machine grows, so will the number of rules needed to enforce these requirements. Figure 3 shows the impact on throughput as the number of rules increases while Figure 4 details the impact on throughput of supporting more ports. The purple line in these graphs represents iptables as used by RHEL 7.

Figure 3: iptables performance impact of rules Source: [8]

Figure 4 iptables performance impact with increasing ports Source [8]

Affected STIG ID:

RHEL-07-040100, RHEL-07-040520, RHEL-07-040810

Potential Denial of Service (DoS) Changes

There are several recommendations from [1] that open possible avenues of DoS.

Login DoS

[1] requires that accounts be locked out for a minimum of 15 minutes after three unsuccessful logon attempts within a 15-minute timeframe. Given the weakened security posture from using outdated password guidelines there may be some justification for this rule, but this setting opens user accounts to a simple DoS. Furthermore, this rule ultimately compromises the availability leg of the CIA triad. The PCI-DSS standard [7] requires a user ID to be locked out after no more than 6 attempts. The PCI-DSS standard seems a more reasonable balance between confidentiality and availability. Another rule from [1] requires the root (superuser) account to also be locked out for 15 minutes after three unsuccessful logon attempts within a 15-minute timeframe. This policy creates the possibility of a DoS to all system users with no ability to resolve it from within, since only a superuser can unlock accounts.

Rather than lock out the root account, the pam_securetty module can be used to limit root logins to a console reached through out-of-band access using a Baseboard Management Controller (BMC) or Integrated Lights Out (ILO). With this additional security measure, the root account could be exempted from account lockout while still preserving security.

Affected STIG ID:

RHEL-07-010320, RHEL-07-010330, RHEL-07-010430, RHEL-07-040670

System DoS

There are several requirements in [1] that require home directories, /var, and /tmp file systems to all exist on separate partitions. The justification for this requirement is that it could protect the system if the partition “became full or failed”. The failure of the /var or /tmp file systems would prevent many system level daemons from functioning properly. Failures in /home would only cause a user access DoS. However, these requirements fail to account for the impact on system performance as a disk is partitioned up. On Solid State Devices (SSDs) there is no significant impact, but when utilizing Hard Disk Drives (HDDs) the impact can be dramatic. HDDs only support approximately 100-150 Input/Output Operations Per Second (IOPS) where SSDs tend to support around 4000+ IOPS. When disks are partitioned many data transfers that could have been merged into a single I/O operation can no longer be merged due to their non-contiguous nature. The result is that a system disk can become saturated, causing multiple second delays in responsiveness. Therefore, this recommendation should be considered only when the partitions are on SSDs or separate HDDs.

Affected STIG ID:

RHEL-07-021310, RHEL-07-021320, RHEL-07-021340

Other Items

There are several rules from [1] that are inaccurate in their justification and therefore should be corrected or removed.

Removing System Accounts

[1] states that “If the accounts on the system do not match the provided documentation, or accounts that do not support an authorized system function are present, this is a finding”. The recommendation is to remove several system accounts. However, these accounts come suitably protected, with an unusable login shell and a locked password, so there is no risk of unauthorized use. In addition, removing them conflicts with vendor guidance in [9], which states: “It is recommended to keep UIDs/GIDs of the system service accounts as default. The UIDs/GIDs of some system service accounts are hard-coded in the application itself, changing the uids/gids may potentially break the functionality of an application.” These accounts exist for compliance with the Linux Standard Base [10].

Affected STIG ID:

RHEL-07-020270

Interactive Users Must Have Home Directory in /etc/passwd

The discussion in [1] states that “This could create a Denial of Service because the user would not be able to access their logon configuration files, and it may give them visibility to system files they normally would not be able to access”. The first part of the statement is true, but there is no risk of visibility into system files the user would not normally have, because Linux Discretionary Access Control (DAC) is still enforced. As the login source code below shows, if changing to the user’s home directory fails, the “/” directory is used instead:

util-linux-2.23.2/login-utils/login.c
        /* wait until here to change directory! */
        if (chdir(pwd->pw_dir) < 0) {
                warn(_("%s: change directory failed"), pwd->pw_dir);

                if (!getlogindefs_bool("DEFAULT_HOME", 1))
                        exit(0);
                if (chdir("/"))
                        exit(EXIT_FAILURE);
                pwd->pw_dir = "/";
                printf(_("Logging in with home = \"/\".\n"));
        }

Also relevant: the DEFAULT_HOME attribute can be set to “no” in the /etc/login.defs file to prevent login when the user’s home directory is inaccessible.
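The fallback behaviour can be mimicked in a few lines of Python. This is only a sketch of the logic in the quoted excerpt, not the login implementation itself: the function name enter_home and the default_home parameter are invented for illustration, and the real program exits rather than raising an exception. Note that falling back to “/” changes only the working directory; file permissions are checked exactly as before.

```python
import os
import tempfile


def enter_home(home_dir: str, default_home: bool = True) -> str:
    """Try to enter the home directory; fall back to "/" if allowed."""
    try:
        os.chdir(home_dir)
        return home_dir
    except OSError:
        if not default_home:
            # Stands in for DEFAULT_HOME "no": the login is refused.
            raise RuntimeError(f"{home_dir}: change directory failed, login refused")
        # Stands in for DEFAULT_HOME "yes": continue with "/" as home.
        os.chdir("/")
        return "/"


with tempfile.TemporaryDirectory() as real_home:
    missing_home = os.path.join(real_home, "missing")
    print(enter_home(real_home))     # the directory exists, so it is used
    print(enter_home(missing_home))  # the directory is missing, so "/" is used
```

Calling enter_home(missing_home, default_home=False) raises instead of falling back, which corresponds to refusing the login.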

Affected STIG ID:

RHEL-07-020620

Usability Items

Impact on Usability

Security must be a balance between the impact on performance and usability. Different groups have different numbers of personnel to be able to operate and maintain their systems and so a discussion is necessary about the STIG rules that impact efficient system administration. Efficient administration is the ability for a small number of administrators to operate and maintain a large number of systems. The following items detail various STIG rules and their impact on efficient administration.

Delay between failed login attempts

[1] requires that a 4-second delay be added to the Pluggable Authentication Modules (PAM) configuration for pam_faillock. This restriction, however, ignores the fact that the vendor-provided configuration already imposes a two-second delay on local password failures through the pam_unix module. The following source code excerpt from pam_unix shows the fail delay setting.

Linux-PAM-1.1.8/modules/pam_unix/support.c
int _unix_verify_password(pam_handle_t * pamh, const char *name
                          ,const char *p, unsigned int ctrl)
{
        struct passwd *pwd = NULL;
        char *salt = NULL;
        char *data_name;
        int retval;


        D(("called"));

#ifdef HAVE_PAM_FAIL_DELAY
        if (off(UNIX_NODELAY, ctrl)) {
                D(("setting delay"));
                (void) pam_fail_delay(pamh, 2000000);   /* 2 sec delay for on failure */

This setting, combined with the closing of idle connections, only creates a nuisance for users and administrators logging into the system and has a negligible impact on security. To quantify the additional security, consider the following two approximate equations, which use these values:

900 is the number of seconds an account is locked after 3 failures

3 failures before an account is locked

1000 attempts to guess the password

6 is the number of seconds with the pam_faildelay requirement

2 is the number of seconds removing the pam_faildelay requirement

time without fail delay = ((1000/3) * 900) + ((1000/3) * 2) = 300366 seconds or 3.476 days

time with fail delay = ((1000/3) * 900) + ((1000/3) * 6) = 301698 seconds or 3.49 days

In both equations 1000/3 is rounded down to 333 complete lockout cycles.
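As a quick check of the arithmetic, the two estimates can be reproduced with a short Python sketch. The function and parameter names are illustrative and do not come from [1]. The sketch assumes 1000/3 is rounded down to 333 whole lockout cycles, which is what the figures above imply.

```python
def seconds_to_try(attempts: int, failures_before_lock: int,
                   lock_seconds: int, delay_seconds: int) -> int:
    """Approximate time to make `attempts` guesses, following the equations above."""
    # Whole lockout cycles only: 1000 // 3 == 333.
    cycles = attempts // failures_before_lock
    return cycles * lock_seconds + cycles * delay_seconds


no_delay = seconds_to_try(1000, 3, 900, 2)    # vendor default 2-second delay
with_delay = seconds_to_try(1000, 3, 900, 6)  # 6-second delay under the STIG rule

print(no_delay, round(no_delay / 86400, 3))      # 300366 3.476
print(with_delay, round(with_delay / 86400, 3))  # 301698 3.492
print(with_delay - no_delay)                     # 1332
```

The extra delay adds 1332 seconds, about 22 minutes, to an attack that already takes roughly three and a half days.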

The addition of the failure delay is clearly insubstantial and therefore should be removed from the requirements.

Affected STIG ID:

RHEL-07-010430

Remote privileged user access without password prompts

There are a number of tools developed by High Performance Computing (HPC) administrators to administer a large number of systems efficiently. These tools are often developed out of the need to administer thousands of systems with a small administration team. Two of the most common are Parallel Distributed Shell (pdsh) and ClusterShell (clush). These tools work by opening secure shell connections to each server and running a command across hundreds or thousands of servers in just seconds. To make administrative changes efficiently, administrators must be able to log in as root (the superuser) or run commands with superuser privileges using su or sudo. [1] requires both that remote root logins be disabled and that su/sudo not be usable without a password. Preventing direct access to the root account is considered a good security practice: it prevents users from logging in directly as a user other than their own, which preserves non-repudiation. However, additional security restrictions can be added to restrict su/sudo to users of a specific group. These protections allow privilege escalation while maintaining non-repudiation and increasing security. To make these changes, the su command can be restricted to the wheel group by adding the pam_wheel.so line to /etc/pam.d/su.

#%PAM-1.0
auth		sufficient	pam_rootok.so
auth		required	pam_wheel.so use_uid group=wheel
auth		substack	system-auth
auth		include		postlogin
account		sufficient	pam_succeed_if.so uid = 0 use_uid quiet
account		include		system-auth
password	include		system-auth
session		include		system-auth
session		include		postlogin
session		optional	pam_xauth.so
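When rolling this change out across many systems, it can help to confirm that the restriction is actually active. The following Python sketch checks the text of a PAM su configuration for an uncommented pam_wheel.so auth line. The function name su_requires_wheel is hypothetical, and the check is deliberately simple: it matches only the "auth required pam_wheel.so" form shown above.

```python
def su_requires_wheel(pam_su_text: str) -> bool:
    """Return True if an active 'auth required pam_wheel.so' line is present."""
    for raw_line in pam_su_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (including the #%PAM-1.0 header).
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[:3] == ["auth", "required", "pam_wheel.so"]:
            return True
    return False


sample = """#%PAM-1.0
auth        sufficient  pam_rootok.so
auth        required    pam_wheel.so use_uid group=wheel
auth        substack    system-auth
"""
print(su_requires_wheel(sample))
```

On a real system the text would be read from /etc/pam.d/su. A commented-out pam_wheel.so line is treated as absent.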

Restricting sudo access to the wheel group can be done by adding the following to the /etc/sudoers file.

%wheel        ALL=(ALL)       NOPASSWD: ALL

Affected STIG ID:

RHEL-07-010340, RHEL-07-010350

Conclusion

Summary of findings

This report identifies many issues that should lead to changes in the DISA RHEL 7 STIG. Figure 5 shows the percentage of rules affected by those issues, and the percentage of acceptable rules, out of the 248 rules in [1]. Clearly, more consideration is needed to ensure that Confidentiality, Integrity, and Availability are evaluated in a balanced manner. Availability in particular does not appear to be adequately considered in many of the STIG rules identified here.

Figure 5 DISA RHEL 7 STIG Rules Breakdown

Recommendations

Based on the findings from source code reviews, updated guidance, and the various performance issues, the following actions are recommended for updating the DISA RHEL 7 STIG:

Update all outdated password rules
Develop a memory- and time-hard PBKDF implementation (Balloon) for the GRUB2 bootloader
Develop a tool to monitor performance impacts of compliant auditing
Develop a tool to monitor performance impacts of compliant firewall rules
Utilize provided information for DoS issues to fix STIG rules
Remove the password fail delay requirement
Allow documented exceptions for privilege escalation without a password for admin groups that use the security enhancements documented here

References

[1] Red Hat Enterprise Linux 7 Security Technical Implementation Guide, Version 2 Release 7, 2020.

[2] Digital Identity Guidelines, NIST SP 800-63, 2017.

[3] NIST SP 800-63 Digital Identity Guidelines-FAQ, pages.nist.gov. [Online]. Available: https://pages.nist.gov/800-63-FAQ/. [Accessed: 26-Apr-2020].

[4] “How do you set an interface to permanent promiscuous mode in RHEL 7?,” Red Hat Customer Portal, 18-Mar-2019. [Online]. Available: https://access.redhat.com/solutions/3525641. [Accessed: 09-May-2020].

[5] H. Chen, Y. Xiao, L. Zeng, “Auditing overhead, auditing adaptation, and benchmark evaluation in Linux,” Security and Communication Networks 2015, pp. 3523-3534, June 2015.

[6] S. Ma, J. Zhai, Y. Kwon, K. H. Lee, X. Zhang, G. Ciocarlie, A. Gehani, V. Yegneswaran, D. Xu, and S. Jha, “Kernel-supported cost-effective audit logging for causality tracking,” in Proc. 2018 USENIX Annual Technical Conference (ATC), Boston, MA, Jul. 2018.

[7] “Official PCI Security Standards Council Site - Verify PCI Compliance, Download Data Security and Credit Card Security Standards,” PCI Security Standards Council. [Online]. Available: https://www.pcisecuritystandards.org/document_library. [Accessed: 09-May-2020].

[8] P. Sutter, “Benchmarking nftables,” Red Hat Developer, 18-Oct-2018. [Online]. Available: https://developers.redhat.com/blog/2017/04/11/benchmarking-nftables/. [Accessed: 08-May-2020].

[9] “Is it safe to remove/change system user account on Red Hat Enterprise Linux ?,” Red Hat Customer Portal, 22-Oct-2019. [Online]. Available: https://access.redhat.com/solutions/31669. [Accessed: 09-May-2020].

[10] Users and Groups, Linux Standard Base Core Specification 5.0, 2015.