AWS Lambda for Flow Logs Processing | By Bo Bayles.
A previous post described how to use a pipeline with CloudWatch Logs, Kinesis, and Lambda to process virtual private cloud (VPC) flow logs. This post will describe a simple Lambda function for processing VPC flow logs.
Amazon Web Services (AWS) provides a good walkthrough for writing Lambda functions for processing data delivered by Amazon Kinesis. Building on that example, we can develop a function that can read VPC flow logs in the following few steps.
This function will take each IP address in the flow log stream, count how many other IP addresses it saw in each time period, and post the results to a web endpoint.
At CSWC, we use Python extensively, and since Lambda’s newest supported language is also Python, we’ll write this function in that language.
We’ll start the process by interpreting the data Kinesis has provided. Each event contains a list of records. Each record carries Base64-encoded data, which in turn holds a gzip-compressed JSON document.
We want to wind up with a Python dictionary, so we decode, decompress, and parse using functions from the Python standard library:
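from base64 import b64decode
from json import loads
from zlib import MAX_WBITS, decompress

records = event.get('Records', [])
for record in records:
    compressed_json = b64decode(record['kinesis']['data'])
    # 16 + MAX_WBITS tells zlib to expect a gzip header
    uncompressed_json = decompress(compressed_json, 16 + MAX_WBITS)
    input_data = loads(uncompressed_json)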
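The decoding step can be exercised locally, without a live Kinesis stream, by building a synthetic record and running it back through the same three calls. The following is a minimal sketch written for Python 3; `decode_kinesis_record` and `make_test_record` are hypothetical helper names, and the sample payload is made up for illustration.

```python
import base64
import gzip
import json
import zlib


def decode_kinesis_record(record):
    """Decode one Kinesis record into a Python dictionary."""
    compressed_json = base64.b64decode(record['kinesis']['data'])
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    uncompressed_json = zlib.decompress(compressed_json, 16 + zlib.MAX_WBITS)
    return json.loads(uncompressed_json)


def make_test_record(payload):
    """Build a synthetic record (hypothetical helper, for local testing only)."""
    compressed = gzip.compress(json.dumps(payload).encode('utf-8'))
    encoded = base64.b64encode(compressed).decode('ascii')
    return {'kinesis': {'data': encoded}}


# Made-up payload; a real one carries CloudWatch Logs metadata and log events.
sample_payload = {'logEvents': [{'message': 'example'}]}
event = {'Records': [make_test_record(sample_payload)]}

decoded = [decode_kinesis_record(r) for r in event.get('Records', [])]
```

If the round trip works, `decoded` holds one dictionary equal to `sample_payload`.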
After parsing the record’s data, we can separate and read the VPC flow log event details. We can parse them according to the AWS docs, or use the flowlogs-reader library.
For this function, we’ll parse them directly, pulling out the IP addresses and timestamp:
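One way to do this parsing is sketched below, under stated assumptions rather than as the original implementation. It assumes the default space-separated flow log record format (source address, destination address, and start time in the fourth, fifth, and eleventh fields), that the decoded CloudWatch Logs payload exposes its lines under `logEvents`, and that a connection counts for both endpoints. `collect_connections` and `BUCKET_SECONDS` are hypothetical names, and the sample records use documentation-range addresses.

```python
from collections import defaultdict

BUCKET_SECONDS = 600  # assumed bucket size: 10-minute intervals


def collect_connections(input_data, connection_data=None):
    """Group the peers seen by each IP address into time buckets."""
    if connection_data is None:
        # bucket_time -> IP address -> set of peer IP addresses
        connection_data = defaultdict(lambda: defaultdict(set))
    for log_event in input_data.get('logEvents', []):
        fields = log_event['message'].split()
        # Skip malformed lines and records with no addresses (such as NODATA).
        if len(fields) < 14 or fields[3] == '-' or fields[4] == '-':
            continue
        srcaddr, dstaddr, start = fields[3], fields[4], int(fields[10])
        # Round the start time down to the beginning of its bucket.
        bucket_time = start - (start % BUCKET_SECONDS)
        # Assumption: record the connection for both endpoints.
        connection_data[bucket_time][srcaddr].add(dstaddr)
        connection_data[bucket_time][dstaddr].add(srcaddr)
    return connection_data


# Made-up sample records for illustration.
sample = {'logEvents': [
    {'message': '2 123456789010 eni-abc123de 192.0.2.10 198.51.100.7 '
                '49152 443 6 10 840 1438700400 1438700460 ACCEPT OK'},
    {'message': '2 123456789010 eni-abc123de 192.0.2.10 203.0.113.5 '
                '49153 80 6 4 320 1438700500 1438700560 ACCEPT OK'},
    {'message': '2 123456789010 eni-abc123de - - - - - - - '
                '1438700400 1438700460 - NODATA'},
]}
connection_data = collect_connections(sample)
```

Passing the same `connection_data` back in on each call lets the function accumulate results across all records in an event.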
When all the data has been collected, we can post the results to our web application:
# Declared earlier:
# from json import dumps
# import requests
# ENDPOINT = 'https://example.com/connection_sets/{bucket_time}'
for bucket_time in sorted(connection_data):
    output_url = ENDPOINT.format(bucket_time=bucket_time)
    # Count the distinct peers seen by each IP address in this bucket
    output_data = {k: len(v) for k, v in connection_data[bucket_time].items()}
    requests.post(output_url, data=dumps(output_data))
Our web application will receive JSON data grouped by 10-minute intervals. A message might look like this:
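As an illustration of the shape only, the sketch below builds one such message from made-up data. The addresses are documentation-range placeholders, not real traffic, and the assumption is that each bucket maps an IP address to the set of peers it saw.

```python
import json

# Hypothetical contents of one 10-minute bucket: IP address -> peers seen.
bucket = {
    '192.0.2.10': {'198.51.100.7', '203.0.113.5'},
    '198.51.100.7': {'192.0.2.10'},
    '203.0.113.5': {'192.0.2.10'},
}

# Same reduction as the posting step: IP address -> number of distinct peers.
output_data = {k: len(v) for k, v in bucket.items()}
message = json.dumps(output_data, sort_keys=True, indent=2)
print(message)
```

The resulting JSON object has one key per IP address, with the count of distinct peers as its value.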
Cisco Stealthwatch Cloud | 230 S. Bemiston Ave, Suite 420 | Saint Louis, MO 63105 | swatchc-support@cisco.com | (314) 899-9284