Saturday, August 13, 2022

WHOIS, TLS, and Recon

While doing reconnaissance against web applications, I wanted to speed up the process of finding new attack surfaces that some subdomain tools might miss.

At first I was interested in quickly verifying particular records with WHOIS. So, I naïvely wrote some Rust to do faster WHOIS lookups. The toy model I drafted initially looked something like this:

use whois_rust::*;
use std::io::prelude::*;
use std::fs::File;
use std::io::BufReader;
use futures::stream::StreamExt;


// todo: add parser for file and key input
// use clap::Parser;
// use clap::Arg;
// bash time: 2.79s user 5.84s system 1% cpu 7:15.93 total
// rust time: 1.14s user 0.58s system 6% cpu 25.685 total


fn read_batch(path: &str) -> std::io::Result<Vec<String>> {
    let file = File::open(path)?;
    let read = BufReader::new(file);
    Ok(
        read.lines().filter_map(Result::ok).collect()
    )
}


#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let hay: Vec<String> = read_batch("batch.txt")?;
    let who = WhoIs::from_path_async("whois/servers.json").await?;
    let key = "SEEK LIMITED";
    // Fan the lookups out concurrently, at most 100 in flight at a time.
    futures::stream::iter(hay.into_iter().map(|h| {
        let res = who.lookup_async(WhoIsLookupOptions::from_string(&h).unwrap());
        async move {
            match res.await {
                Ok(record) if record.contains(key) => {
                    println!("Found {} in record for {}", key, h);
                }
                _ => {
                    // println!("Error on {}", h);
                }
            }
        }
    }))
    .buffer_unordered(100)
    .collect::<Vec<()>>()
    .await;
    Ok(())
}

Then after a few runs, I remembered something I'd almost forgotten: most WHOIS servers impose strict rate limits. This is unfortunate, because WHOIS is a great protocol for reconnaissance. One trick I've seen peers use is spinning up a fleet of machines with something like Axiom and spreading the lookups across them in the cloud, effectively sidestepping per-IP rate limits.

But after pausing to think about it for a moment, I figured an alternative approach might be better. Instead of trying to identify server ownership and do recon with WHOIS or dig records, why not look for attack surface we might have missed via TLS certificates? TLS certificates are very similar to WHOIS records in this regard: structured around the X.509 standard, they contain vast amounts of metadata about servers and organizations, as well as arrays of associated domain names that some subdomain tools might miss. While Transport Layer Security exists to secure connections, it's also useful from a reconnaissance standpoint.
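To see what that metadata looks like without touching a live host, we can mint a throwaway self-signed certificate carrying a few Subject Alternative Names and read them straight back out with openssl. The a.example/b.example names are invented for the demo, and the -addext/-ext options assume OpenSSL 1.1.1 or newer:

```shell
# Mint a throwaway self-signed cert with several SANs (hypothetical names;
# -addext requires OpenSSL 1.1.1+).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=demo.example/O=Demo Org" \
  -addext "subjectAltName=DNS:a.example,DNS:b.example,DNS:demo.example" \
  -keyout /tmp/demo.key -out /tmp/demo.crt 2>/dev/null

# Read the SAN extension back out; each DNS entry is a lead worth triaging.
openssl x509 -in /tmp/demo.crt -noout -ext subjectAltName
```

In the wild, those DNS entries are exactly the kind of associated names a subdomain brute-forcer can miss.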

So I was going to write a TLS scraper in Rust, but in the interest of saving time, I decided to glance around GitHub for pre-existing TLS utilities. That's when I noticed Project Discovery already has a tool called tlsx: a modest, speedy utility written in Go for parsing TLS certificates. A snippet relevant to our cause:

// a snippet from Project Discovery's tlsx tool written in golang

func (c *Client) convertCertificateToResponse(hostname string, cert *x509.Certificate) *clients.CertificateResponse {
	response := &clients.CertificateResponse{
		SubjectAN:  cert.DNSNames,
		Emails:     cert.EmailAddresses,
		NotBefore:  cert.NotBefore,
		NotAfter:   cert.NotAfter,
		Expired:    clients.IsExpired(cert.NotAfter),
		SelfSigned: clients.IsSelfSigned(cert.AuthorityKeyId, cert.SubjectKeyId),
		MisMatched: clients.IsMisMatchedCert(hostname, append(cert.DNSNames, cert.Subject.CommonName)),
		IssuerCN:   cert.Issuer.CommonName,
		IssuerOrg:  cert.Issuer.Organization,
		SubjectCN:  cert.Subject.CommonName,
		SubjectOrg: cert.Subject.Organization,
		FingerprintHash: clients.CertificateResponseFingerprintHash{
			MD5:    clients.MD5Fingerprint(cert.Raw),
			SHA1:   clients.SHA1Fingerprint(cert.Raw),
			SHA256: clients.SHA256Fingerprint(cert.Raw),
		},
		// ... remaining fields elided
	}
	return response
}

Perhaps in the near future I'll write a Rust version of this, as well as a WHOIS utility, and publish them both to GitHub. In any case, we can use TLS certificates to glean attack surface we might otherwise overlook, and quickly, since we're only pulling certificates rather than parsing entire HTTP responses. For example, we first pull down an initial list of subdomains, then pipe that list to tlsx. We'll use the -j flag to get JSON output and the -tc flag to grab the full certificate chain. Lastly, we'll pipe the result to jq for nicely formatted output.

$ cat subdomains.txt | tlsx -silent -tc -j | jq
{
  "timestamp": "2022-08-13T12:41:43.416781-04:00",
  "host": "attacksurface1.com",
  "ip": "51.112.140.109",
  "port": "443",
  "probe_status": true,
  "tls_version": "tls12",
  "cipher": "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
  "not_before": "2022-04-04T00:00:00Z",
  "not_after": "2023-04-24T23:59:59Z",
  "subject_dn": "CN=attacksurface1.com, O=Attack Surfaces\\, Inc., L=Dallas, ST=Texas, C=US",
  "subject_cn": "attacksurface1.com",
  "subject_org": [
    "Attack Surfaces, Inc."
  ],
  "subject_an": [
    "weird-yet-related-surface-we-may-have-missed.com",
    "attacksurface1.com",
    "attacksurface2.com",
    "attacksurface3.com"
  ],
  "issuer_dn": "CN=DigiCert TLS RSA SHA256 2020 CA1, O=DigiCert Inc, C=US",
  "issuer_cn": "DigiCert TLS RSA SHA256 2020 CA1",
  "issuer_org": [
    "DigiCert Inc"
  ],
  "fingerprint_hash": {
    "md5": "1b1ee212cafa166eb0bf7d6a38587304",
    "sha1": "e3bb3275b48c685a9c1e6c47fcce6d510c9d1881",
    "sha256": "5966a2b933bb1633714454442e86e93c72b093018ec6ca11f26f9d88b8793bcc"
  },
  "tls_connection": "ctls",
  "sni": "attacksurface1.com"
}

But the records we're really interested in are the values in the subject_an field: Subject Alternative Names, which let a host secure multiple domains with a single certificate. Here we might find domains relevant to our interests that didn't show up in our initial analysis. We just need jq, our favorite JSON parser, to grab the subject_an field, then clean entries that contain wildcard prefixes while preserving the rest of the name, and finally filter for unique values to avoid duplicates. A nice one-liner to pipe targets into:

tlsx -tc -j -silent | jq -r '.subject_an[]?' | awk '{gsub(/\*\./,""); print}' | sort -u
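As a sanity check of the filtering stage, here's the same jq/awk/sort pipeline run over a couple of canned tlsx-style JSON objects instead of live output (the hostnames are invented):

```shell
# Canned stand-in for live tlsx JSON output: pull subject_an values,
# strip wildcard prefixes, and de-duplicate.
printf '%s\n' \
  '{"subject_an":["*.internal.example.com","app.example.com"]}' \
  '{"subject_an":["app.example.com","legacy.example.net"]}' |
  jq -r '.subject_an[]?' |
  awk '{gsub(/\*\./,""); print}' |
  sort -u
```

This yields app.example.com, internal.example.com, and legacy.example.net: the wildcard prefix is stripped and the duplicate collapsed, leaving a clean target list.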
