The File Integrity Operator is an OKD Operator that continually runs file integrity checks on the cluster nodes. It deploys a daemon set that initializes and runs privileged advanced intrusion detection environment (AIDE) containers on each node, providing a status object with a log of files that are modified during the initial run of the daemon set pods.
Currently, only Fedora CoreOS (FCOS) nodes are supported.
An instance of a FileIntegrity custom resource (CR) represents a set of continuous file integrity scans for one or more nodes.
Each FileIntegrity CR is backed by a daemon set running AIDE on the nodes matching the FileIntegrity CR specification.
Create the following example FileIntegrity CR named worker-fileintegrity.yaml to enable scans on worker nodes:
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: worker-fileintegrity
  namespace: openshift-file-integrity
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  config: {}
Apply the YAML file to the openshift-file-integrity namespace:
$ oc apply -f worker-fileintegrity.yaml -n openshift-file-integrity
Confirm the FileIntegrity object was created successfully by running the following command:
$ oc get fileintegrities -n openshift-file-integrity
NAME AGE
worker-fileintegrity 14s
The FileIntegrity custom resource (CR) reports its status through the .status.phase field.
To query the FileIntegrity CR status, run:
$ oc get fileintegrities/worker-fileintegrity -o jsonpath="{ .status.phase }"
Active
Pending
- The phase after the custom resource (CR) is created.
Active
- The phase when the backing daemon set is up and running.
Initializing
- The phase when the AIDE database is being reinitialized.
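Because a newly created CR passes through these phases before scans start, it can be convenient to wait for the Active phase in a script. The following is a minimal sketch, not part of the Operator: the function name wait_for_phase and its timeout and interval defaults are hypothetical, and it assumes the CR lives in the openshift-file-integrity namespace as in the example above.

```shell
# Poll the FileIntegrity phase until it matches the wanted value or a timeout expires.
# Usage: wait_for_phase NAME [WANTED_PHASE] [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# (hypothetical helper; defaults are illustrative assumptions)
wait_for_phase() {
  name="$1"
  wanted="${2:-Active}"
  timeout="${3:-300}"
  interval="${4:-10}"
  elapsed=0
  while :; do
    # Same jsonpath query as shown above, with the namespace made explicit.
    phase="$(oc get "fileintegrities/${name}" -n openshift-file-integrity \
      -o jsonpath='{ .status.phase }' 2>/dev/null)"
    if [ "$phase" = "$wanted" ]; then
      echo "$name is $phase"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out waiting for $name to become $wanted (last phase: ${phase:-unknown})" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

For example, wait_for_phase worker-fileintegrity Active 600 15 would poll every 15 seconds for up to 10 minutes.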
The scan results of the FileIntegrity CR are reported in another object called FileIntegrityNodeStatuses.
$ oc get fileintegritynodestatuses
NAME AGE
worker-fileintegrity-ip-10-0-130-192.ec2.internal 101s
worker-fileintegrity-ip-10-0-147-133.ec2.internal 109s
worker-fileintegrity-ip-10-0-165-160.ec2.internal 102s
It might take some time for the FileIntegrityNodeStatus object results to be available.
There is one result object per node. The nodeName attribute of each FileIntegrityNodeStatus object corresponds to the node being scanned. The status of the file integrity scan is represented in the results array, which holds scan conditions.
$ oc get fileintegritynodestatuses.fileintegrity.openshift.io -ojsonpath='{.items[*].results}' | jq
The fileintegritynodestatus object reports the latest status of an AIDE run and exposes the status as Failed, Succeeded, or Errored in a status field.
$ oc get fileintegritynodestatuses -w
NAME NODE STATUS
example-fileintegrity-ip-10-0-134-186.us-east-2.compute.internal ip-10-0-134-186.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-150-230.us-east-2.compute.internal ip-10-0-150-230.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-169-137.us-east-2.compute.internal ip-10-0-169-137.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-180-200.us-east-2.compute.internal ip-10-0-180-200.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-194-66.us-east-2.compute.internal ip-10-0-194-66.us-east-2.compute.internal Failed
example-fileintegrity-ip-10-0-222-188.us-east-2.compute.internal ip-10-0-222-188.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-134-186.us-east-2.compute.internal ip-10-0-134-186.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-222-188.us-east-2.compute.internal ip-10-0-222-188.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-194-66.us-east-2.compute.internal ip-10-0-194-66.us-east-2.compute.internal Failed
example-fileintegrity-ip-10-0-150-230.us-east-2.compute.internal ip-10-0-150-230.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-180-200.us-east-2.compute.internal ip-10-0-180-200.us-east-2.compute.internal Succeeded
These conditions are reported in the results array of the corresponding FileIntegrityNodeStatus CR status:
Succeeded
- The integrity check passed; the files and directories covered by the AIDE check have not been modified since the database was last initialized.
Failed
- The integrity check failed; some files or directories covered by the AIDE check have been modified since the database was last initialized.
Errored
- The AIDE scanner encountered an internal error.
[
  {
    "condition": "Succeeded",
    "lastProbeTime": "2020-09-15T12:45:57Z"
  }
]
[
  {
    "condition": "Succeeded",
    "lastProbeTime": "2020-09-15T12:46:03Z"
  }
]
[
  {
    "condition": "Succeeded",
    "lastProbeTime": "2020-09-15T12:45:48Z"
  }
]
In this case, all three scans succeeded and so far there are no other conditions.
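On clusters with many nodes, it can help to list only the nodes whose latest status is not Succeeded. The following sketch filters the tabular output of oc get fileintegritynodestatuses; it assumes the three-column NAME NODE STATUS layout shown earlier, and the function name unhealthy_nodes is hypothetical.

```shell
# Print "NODE STATUS" for every row whose STATUS column is not Succeeded.
# Reads the tabular output of `oc get fileintegritynodestatuses` on stdin.
# Assumes the column order NAME NODE STATUS; rows repeated by -w are deduplicated.
unhealthy_nodes() {
  awk '$1 != "NAME" && NF >= 3 && $3 != "Succeeded" { print $2, $3 }' | sort -u
}

# Example usage (requires cluster access):
#   oc get fileintegritynodestatuses | unhealthy_nodes
```

An empty result means that every listed node reported Succeeded.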
To simulate a failure condition, modify one of the files AIDE tracks. For example, modify /etc/resolv.conf on one of the worker nodes:
$ oc debug node/ip-10-0-130-192.ec2.internal
Creating debug namespace/openshift-debug-node-ldfbj ...
Starting pod/ip-10-0-130-192ec2internal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.130.192
If you don't see a command prompt, try pressing enter.
sh-4.2# echo "# integrity test" >> /host/etc/resolv.conf
sh-4.2# exit
Removing debug pod ...
Removing debug namespace/openshift-debug-node-ldfbj ...
After some time, the Failed condition is reported in the results array of the corresponding FileIntegrityNodeStatus object. The previous Succeeded condition is retained, which allows you to pinpoint the time the check failed.
$ oc get fileintegritynodestatuses.fileintegrity.openshift.io/worker-fileintegrity-ip-10-0-130-192.ec2.internal -ojsonpath='{.results}' | jq -r
Alternatively, to query the results for all nodes without specifying an object name, run:
$ oc get fileintegritynodestatuses.fileintegrity.openshift.io -ojsonpath='{.items[*].results}' | jq
[
  {
    "condition": "Succeeded",
    "lastProbeTime": "2020-09-15T12:54:14Z"
  },
  {
    "condition": "Failed",
    "filesChanged": 1,
    "lastProbeTime": "2020-09-15T12:57:20Z",
    "resultConfigMapName": "aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed",
    "resultConfigMapNamespace": "openshift-file-integrity"
  }
]
The Failed condition points to a config map that gives more details about what exactly failed and why:
$ oc describe cm aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed
Name: aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed
Namespace: openshift-file-integrity
Labels: file-integrity.openshift.io/node=ip-10-0-130-192.ec2.internal
file-integrity.openshift.io/owner=worker-fileintegrity
file-integrity.openshift.io/result-log=
Annotations: file-integrity.openshift.io/files-added: 0
file-integrity.openshift.io/files-changed: 1
file-integrity.openshift.io/files-removed: 0
Data
integritylog:
------
AIDE 0.15.1 found differences between database and filesystem!!
Start timestamp: 2020-09-15 12:58:15
Summary:
Total number of files: 31553
Added files: 0
Removed files: 0
Changed files: 1
---------------------------------------------------
Changed files:
---------------------------------------------------
changed: /hostroot/etc/resolv.conf
---------------------------------------------------
Detailed information about changes:
---------------------------------------------------
File: /hostroot/etc/resolv.conf
SHA512 : sTQYpB/AL7FeoGtu/1g7opv6C+KT1CBJ , qAeM+a8yTgHPnIHMaRlS+so61EN8VOpg
Events: <none>
Due to the config map data size limit, AIDE logs over 1 MB are added to the failure config map as a base64-encoded gzip archive. In this case, extract the integritylog value from the config map and pipe it to base64 --decode | gunzip. Compressed logs are indicated by the presence of a file-integrity.openshift.io/compressed annotation key in the config map.
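The decoding step can be wrapped in a small helper. This is a sketch under stated assumptions: the function name decode_integrity_log is hypothetical, and the usage comment assumes the compressed log is stored under the integritylog data key shown in the describe output above.

```shell
# Decode a base64-encoded gzip archive read from stdin and print the plain text.
# (hypothetical helper; it does not check for the compressed annotation itself)
decode_integrity_log() {
  base64 --decode | gunzip
}

# Example usage (requires cluster access; the config map name is the one
# reported in resultConfigMapName):
#   oc get cm aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed \
#     -n openshift-file-integrity -o jsonpath='{.data.integritylog}' \
#     | decode_integrity_log
```

Only use the helper when the compressed annotation is present; an uncompressed log is already readable and fails to decode.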
Transitions in the status of the FileIntegrity and FileIntegrityNodeStatus objects are logged by events. The creation time of the event reflects the latest transition, such as Initializing to Active, and not necessarily the latest scan result. However, the newest event always reflects the most recent status.
$ oc get events --field-selector reason=FileIntegrityStatus
LAST SEEN TYPE REASON OBJECT MESSAGE
97s Normal FileIntegrityStatus fileintegrity/example-fileintegrity Pending
67s Normal FileIntegrityStatus fileintegrity/example-fileintegrity Initializing
37s Normal FileIntegrityStatus fileintegrity/example-fileintegrity Active
When a node scan fails, an event is created with the added/changed/removed file counts and the config map information.
$ oc get events --field-selector reason=NodeIntegrityStatus
LAST SEEN TYPE REASON OBJECT MESSAGE
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-134-173.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-168-238.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-169-175.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-152-92.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-158-144.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-131-30.ec2.internal
87m Warning NodeIntegrityStatus fileintegrity/example-fileintegrity node ip-10-0-152-92.ec2.internal has changed! a:1,c:1,r:0 \ log:openshift-file-integrity/aide-ds-example-fileintegrity-ip-10-0-152-92.ec2.internal-failed
Changes to the number of added, changed, or removed files result in a new event, even if the status of the node has not transitioned.
$ oc get events --field-selector reason=NodeIntegrityStatus
LAST SEEN TYPE REASON OBJECT MESSAGE
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-134-173.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-168-238.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-169-175.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-152-92.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-158-144.ec2.internal
114m Normal NodeIntegrityStatus fileintegrity/example-fileintegrity no changes to node ip-10-0-131-30.ec2.internal
87m Warning NodeIntegrityStatus fileintegrity/example-fileintegrity node ip-10-0-152-92.ec2.internal has changed! a:1,c:1,r:0 \ log:openshift-file-integrity/aide-ds-example-fileintegrity-ip-10-0-152-92.ec2.internal-failed
40m Warning NodeIntegrityStatus fileintegrity/example-fileintegrity node ip-10-0-152-92.ec2.internal has changed! a:3,c:1,r:0 \ log:openshift-file-integrity/aide-ds-example-fileintegrity-ip-10-0-152-92.ec2.internal-failed
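To track how the counters evolve across events, the warning messages can be reduced to one line per event. The following sketch assumes that the a:, c:, and r: fields in the message are the added, changed, and removed file counts, and that the message format matches the sample output above; the function name parse_integrity_events is hypothetical.

```shell
# Extract the node name and the a/c/r counters from NodeIntegrityStatus
# warning messages. Reads event lines on stdin and prints
# "NODE ADDED CHANGED REMOVED" for each line that matches; other lines are skipped.
parse_integrity_events() {
  sed -n 's/.*node \([^ ]*\) has changed! a:\([0-9]*\),c:\([0-9]*\),r:\([0-9]*\).*/\1 \2 \3 \4/p'
}

# Example usage (requires cluster access):
#   oc get events --field-selector reason=NodeIntegrityStatus | parse_integrity_events
```

Lines that report no changes to a node produce no output, so the result lists only events for failed scans.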