Queries
You can query KubeHound data stored in the JanusGraph database by using the Gremlin query language.
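Before running the queries below, it can help to check what the graph actually contains. The following is a minimal sketch, assuming (as the queries on this page do) that vertices and edges carry a "class" property; it counts the elements of each class.

```groovy
// Count vertices per class (e.g. Container, Node, Identity).
// has("class") skips any element without the property (an assumption about the data).
g.V().has("class").groupCount().by("class")

// Count edges per class, i.e. per attack type
g.E().has("class").groupCount().by("class")
```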
Basic queries
// All single-hop paths from a container to a node
g.V().has("class","Container").outE().inV().has("class","Node").path()
// Count HostPath volumes by source path
g.V().has("class","Volume").has("type", "HostPath").groupCount().by("sourcePath")
// Count host mounts exploitable for read or write, grouped by source path
g.E().has("class", within("EXPLOIT_HOST_READ", "EXPLOIT_HOST_WRITE")).outV().groupCount().by("sourcePath")
// Leveraging the "EndpointExposureType" enum value to filter only on services
// c.f. https://github.com/DataDog/KubeHound/blob/main/pkg/kubehound/models/shared/constants.go
g.V().has("class","Endpoint").has("exposure", 3).groupCount().by("serviceEndpoint")
Basic attack paths
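// All attack paths from an endpoint to a node
g.V().has("class","Endpoint").repeat(out().simplePath()).until(has("class","Node")).path()
// Attack paths of at most 5 hops from a container to a node
g.V().has("class","Container").repeat(out().simplePath()).until(has("class","Node").or().loops().is(5)).has("class","Node").path()
// First 5 attack paths of at most 6 hops from an identity to a critical asset
g.V().has("class","Identity").repeat(out().simplePath()).until(has("critical", true).or().loops().is(6)).has("critical", true).path().limit(5)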
Attack paths from compromised assets
Containers
g.V().has("class","Container").has("name", "nsenter-pod").repeat(out().simplePath()).until(has("critical", true).or().loops().is(10)).has("critical", true).path()
g.V().has("class","Container").has("image", TextP.containing("malicious-image")).repeat(out().simplePath()).until(has("critical", true).or().loops().is(10)).has("critical", true).path()
Credentials
g.V().has("class","Identity").has("name", "compromised-sa").repeat(out().simplePath()).until(has("critical", true).or().loops().is(10)).has("critical", true).path()
Endpoints
g.V().has("class","Endpoint").repeat(out().simplePath()).until(has("critical", true).or().loops().is(6)).has("critical", true).path().limit(5)
g.V().has("class","Endpoint").has("portName", "jmx").repeat(out().simplePath()).until(has("critical", true).or().loops().is(6)).has("critical", true).path().limit(5)
Risk assessment
// Length (in vertices) of the shortest attack path from an exposed endpoint to a critical asset
g.V().has("class","Endpoint").has("exposure", gte(3)).repeat(out().simplePath()).until(has("critical", true).or().loops().is(7)).has("critical", true).path().count(local).min()
// Leveraging the "EndpointExposureType" enum value to filter only on services
// c.f. https://github.com/DataDog/KubeHound/blob/main/pkg/kubehound/models/shared/constants.go
// Base case
g.V().has("class","Endpoint").has("exposure", gte(3)).count()
// Has a critical path
g.V().has("class","Endpoint").has("exposure", gte(3)).where(repeat(out().simplePath()).until(has("critical", true).or().loops().is(10)).has("critical", true).limit(1)).count()
CVE impact assessment
You can also use KubeHound to determine whether workloads in your cluster may be affected by a specific vulnerability.
First, evaluate if a known vulnerable image is running in the cluster:
g.V().has("class","Container").has("image", TextP.containing("elasticsearch")).groupCount().by("image")
Then, check for exposed services that could be affected and have a path to a critical asset. This helps prioritize patching and remediation.
g.V().has("class","Container").has("image", "dockerhub.com/elasticsearch:7.1.4").where(inE("ENDPOINT_EXPLOIT").outV().has("exposure", gte(3))).where(repeat(out().simplePath()).until(has("critical", true).or().loops().is(10)).has("critical", true).limit(1))
Assessing the value of implementing new security controls
To measure the concrete impact of a security control, compare the key risk metrics above before and after the control change. To simulate the impact of introducing a control (e.g. to evaluate ROI), we can add conditions to our path queries. For example, to evaluate the impact of adding a Gatekeeper rule that denies the use of hostPID, we can use the following:
// Calculate the base case
g.V().has("class","Endpoint").has("exposure", gte(3)).repeat(out().simplePath()).until(has("critical", true).or().loops().is(6)).has("critical", true).path().count()
// Calculate the impact of preventing CE_NSENTER attack
g.V().has("class","Endpoint").has("exposure", gte(3)).repeat(outE().not(has("class","CE_NSENTER")).inV().simplePath()).emit().until(has("critical", true).or().loops().is(6)).has("critical", true).path().count()
// We count the number of instances of unique attack paths using
g.V().has("class","Container").repeat(outE().inV().simplePath()).emit()
.until(has("critical", true).or().loops().is(6)).has("critical", true)
.path().by(label).groupCount().order(local).by(select(values), desc)
/* Sample output:
{
"path[Container, IDENTITY_ASSUME, Identity, PERMISSION_DISCOVER, PermissionSet, TOKEN_LIST, Identity, PERMISSION_DISCOVER, PermissionSet, TOKEN_LIST, Identity, PERMISSION_DISCOVER, PermissionSet]" : 191,
"path[Container, CE_SYS_PTRACE, Node, VOLUME_EXPOSE, Volume, TOKEN_STEAL, Identity, PERMISSION_DISCOVER, PermissionSet, TOKEN_LIST, Identity, PERMISSION_DISCOVER, PermissionSet]" : 48,
"path[Container, IDENTITY_ASSUME, Identity, PERMISSION_DISCOVER, PermissionSet, TOKEN_BRUTEFORCE, Identity, PERMISSION_DISCOVER, PermissionSet, TOKEN_LIST, Identity, PERMISSION_DISCOVER, PermissionSet]" : 48,
...
}
*/
Threat modelling
g.V().has("class", within("Container", "Identity"))
.repeat(out().simplePath())
.until(has("name", "cluster-admin").or().loops().is(5))
.has("name", "cluster-admin").has("class","Role").path().as("p").by(label).dedup().select("p").path()
g.V().has("class", within("Container", "Identity"))
.repeat(out().simplePath())
.until(has("critical", true).or().loops().is(5))
.has("critical", true).path().as("p").by(label).dedup().select("p").path()
Tips for writing queries
To get started with Gremlin, have a look at the following tutorials:
- Gremlin basics by Daniel Kuppitz
- Gremlin advanced by Daniel Kuppitz
For large clusters, add a limit() step to every query whose graph output will be examined in the UI, so that the UI is not overloaded. For example, restrict the starting set to a sample of 5 containers before looking for the attack paths possible from them.
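A minimal sketch of such a sampled query, assuming the "critical" property and a bound of 6 hops as in the attack path queries above (both are illustrative choices, not prescribed values). Placing limit(5) directly after the container filter samples the starting vertices before any traversal happens.

```groovy
// Sample 5 containers first, then expand attack paths from that sample only.
// The 6-hop bound and the "critical" target are assumptions borrowed
// from the other queries on this page.
g.V().has("class","Container").limit(5)
  .repeat(out().simplePath())
  .until(has("critical", true).or().loops().is(6))
  .has("critical", true)
  .path()
```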
Additional tips:
- For queries to be displayed in the UI, try to limit the output to 1000 elements or fewer
- Enable large cluster optimizations via configuration file if queries are returning too slowly
- Try to filter the initial element of queries by namespace/service/app to avoid generating too many results, for instance
g.V().has("class","Container").has("namespace", "your-namespace")
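The tips above can be combined. The following is a sketch only: "your-namespace" is a placeholder, and the bound of 10 hops and the cap of 1000 returned paths are illustrative assumptions rather than recommended settings.

```groovy
// Start only from containers in one namespace ("your-namespace" is a placeholder)
g.V().has("class","Container").has("namespace", "your-namespace")
  .repeat(out().simplePath())
  .until(has("critical", true).or().loops().is(10))
  .has("critical", true)
  .path()
  .limit(1000)  // cap the number of paths returned to keep the UI responsive
```

Filtering before repeat() reduces the number of traversers that are expanded, while the trailing limit() only caps what is returned.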