Now that we have created a new telemetry, we can look at how to add new enrichments to it. In this exercise we will add a whois enrichment to the Squid telemetry we set up in the previous entry. Whois data is expensive, so we will not be providing it. Instead, I wrote a basic whois scraper (out of scope for this exercise) that produces whois data in CSV format as follows:
google.com, "Google Inc.", "US", "Dns Admin",874306800000
work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000
cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
news.com, "CBS Interactive Inc.", "US", "Domain Admin",833353200000
espn.com, "ESPN, Inc.", "US", "ESPN, Inc.",781268400000
pravda.com, "Internet Invest, Ltd. dba Imena.ua", "UA", "Whois privacy protection service",806583600000
hortonworks.com, "Hortonworks, Inc.", "US", "Domain Administrator",1303427404000
microsoft.com, "Microsoft Corporation", "US", "Domain Administrator",673156800000
yahoo.com, "Yahoo! Inc.", "US", "Domain Administrator",790416000000
cisco.com, "Cisco Technology Inc.", "US", "Info Sec",547988400000
rackspace.com, "Rackspace US, Inc.", "US", "Domain Admin",903092400000
The schema of this enrichment is domain|owner|registeredCountry|registrar|registeredTimestamp. The first thing we need to do is set up the enrichment source. To do this, we first need to create an enrichment ingest config as follows:
{
  "config" : {
    "columns" : {
      "domain" : 0,
      "owner" : 1,
      "home_country" : 2,
      "registrar" : 3,
      "domain_created_timestamp" : 4
    },
    "indicator_column" : "domain",
    "type" : "whois",
    "separator" : ","
  },
  "extractor" : "CSV"
}
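To make the column mapping concrete, here is a minimal Python sketch of how a config like this could be applied to one line of the CSV above. It is not the extractor the platform actually uses; the function names parse_whois_line and created_at are hypothetical, and two things are assumed for illustration: that the space after each comma should be ignored, and that the last column is a Unix timestamp in milliseconds.

```python
import csv
import io
from datetime import datetime, timezone

# Column positions copied from the "columns" section of the config.
COLUMNS = {
    "domain": 0,
    "owner": 1,
    "home_country": 2,
    "registrar": 3,
    "domain_created_timestamp": 4,
}
INDICATOR_COLUMN = "domain"  # the field used as the lookup key
SEPARATOR = ","


def parse_whois_line(line):
    """Return (indicator, record) for one CSV line (hypothetical helper)."""
    # skipinitialspace lets the quoted fields that follow ", " parse correctly,
    # so a comma inside quotes (e.g. "Capital One Services, Inc.") is kept.
    reader = csv.reader(io.StringIO(line), delimiter=SEPARATOR,
                        skipinitialspace=True)
    fields = next(reader)
    record = {name: fields[index] for name, index in COLUMNS.items()}
    return record[INDICATOR_COLUMN], record


def created_at(record):
    """Interpret the timestamp column as epoch milliseconds (assumption)."""
    millis = int(record["domain_created_timestamp"])
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


if __name__ == "__main__":
    line = 'google.com, "Google Inc.", "US", "Dns Admin",874306800000'
    indicator, record = parse_whois_line(line)
    print(indicator, record, created_at(record).date())
```

The indicator column is what an incoming domain would be matched against; the remaining named columns are the values attached to the message when a match is found.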