Uncategorized – Cyber Security - Strategy and Innovation

August 28, 2024

Leadership | Technology | Spirit

Category: Uncategorized — Raffael Marty @ 7:41 am

Who knows, I might just pick up my blogging again at some point. For now, I posted a short leadership-related post on my Leadership | Technology | Spirit blog. Check it out.

November 27, 2022

*NIX Command Line Foo

Category: Uncategorized,UNIX Scripting — Raffael Marty @ 11:28 am

Well, not one of my normal blog posts, but I hope some of you geeks out there will find this useful anyways. I will definitely use this post as a reference frequently.

I have been using various flavors of UNIX and their command lines, from ksh to bash and zsh, for over 25 years, and there is always something new to learn that makes me faster at the jobs I am doing. One tool that I keep using (despite my growing command of Excel) is VIM coupled with UNIX command line tools. It saves me hours and hours of work all the time.

Well, here are some new things I learned and want to remember from the Art of Command Line GitHub repo:

CTRL-W on the command line deletes the last word
pgrep: search for processes, rather than doing the longer version with awk
lsof -iTCP -sTCP:LISTEN -P -n: list processes listening on TCP ports
Diff two JSON files: diff <(jq --sort-keys . < file1.json) <(jq --sort-keys . < file2.json) | colordiff | less -R
I totally forgot about csvkit (brew install csvkit):
- in2csv file1.xls > file1.csv
- csvstat data.csv
- csvsql --query "select name from data where age > 30" data.csv > old.csv
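
For cases where the JSON comparison needs to run inside a script rather than a shell pipeline, here is a rough Python sketch of the same idea. It is not from the repo mentioned above; it swaps jq and diff for the standard library's json and difflib modules, and the function names are made up for illustration.

```python
import difflib
import json


def normalized_lines(text):
    """Parse JSON text and re-serialize it with sorted keys, one item per line."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


def json_diff(left_text, right_text):
    """Return unified-diff lines between two JSON documents, ignoring key order."""
    return list(
        difflib.unified_diff(
            normalized_lines(left_text),
            normalized_lines(right_text),
            fromfile="file1.json",
            tofile="file2.json",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Same keys in a different order, with one changed value.
    a = '{"name": "alice", "age": 31}'
    b = '{"age": 32, "name": "alice"}'
    print("\n".join(json_diff(a, b)))
```

Sorting the keys before diffing is what keeps pure reordering from showing up as a change; only the differing value is reported.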

I just found some additional commands on OSX that I wish I had known earlier:

ditto copies one or more source files or directories to a destination directory. If the destination directory does not exist it will be created before the first source is copied. If the destination directory already exists then the source directories are merged with the previous contents of the destination.
pbcopy copies data from the command line into the clipboard
qlmanage opens a Quick Look preview from the command line

This is a great repo as well for great OSX commands.

April 14, 2015

Rockstars Use a Good Text Editor – I Use VIM

Category: Uncategorized — Raffael Marty @ 9:13 am

Those of you who know me most likely know that I am quite the VIM fan. At any time, there is at least one VIM window open on my computer. I just like the speed of editing and the flexibility it offers. I even use VI bindings in my UNIX shells (set -o vi). And yes, I did write my book in VIM.

Anyways, here is a command from my .vimrc file that I use a lot:

command F set guifont=Monaco:h13

Basically, if I type “:F”, it makes my font larger. I know, not earth shattering, but really useful.

Here are a couple of aesthetic settings I like to make my VIM look nice:

set background=dark
colorscheme solarized
set guioptions-=m

This is my complete .vimrc file.

February 16, 2015

Big Data Lake – Leveraging Big Data Technologies To Build a Common Data Repository For Security

Category: Uncategorized — Raffael Marty @ 11:50 am

Information security has been dealing with terabytes of data for over a decade; almost two. Companies of all sizes are realizing the benefit of having more data available to not only conduct forensic investigations, but also pro-actively find anomalies and stop adversaries before they cause any harm.

UPDATE: Download the paper here

I am finalizing a paper on the topic of the security big data lake. I should have the full paper available soon. As a teaser, here are the first two sections:

What Is a Data Lake?

The term data lake comes from the big data community and is starting to appear more often in the security field. A data lake (or a data hub) is a central location where all security data is collected and stored. Sounds like log management or security information and event management (SIEM)? Sure. Very similar. In line with the Hadoop big data movement, one of the objectives is to run the data lake on commodity hardware and storage that is cheaper than special purpose storage arrays, SANs, etc. Furthermore, the lake should be accessible by third-party tools, processes, workflows, and teams across the organization that need the data. Log management tools, by contrast, do not make it easy to access the data through standard interfaces (APIs). They also do not provide a way to run arbitrary analytics code against the data.

Just because we mentioned SIEM and data lakes in the same sentence above does not mean that a data lake is a replacement for a SIEM. The concept of a data lake merely covers the storage and maybe some of the processing of data. SIEMs are so much more.

Why Implement a Data Lake?

Security data is often found stored in multiple copies across a company. Every security product collects and stores its own copy of the data. For example, tools working with network traffic (e.g., IDS/IPS, DLP, forensic tools) monitor, process, and store their own copies of the traffic. Behavioral monitoring, network anomaly detection, user scoring, correlation engines, etc. all need a copy of the data to function. Every security solution is more or less collecting and storing the same data over and over again, resulting in multiple data copies.

The data lake tries to get rid of this duplication by collecting the data once and making it available to all the tools and products that need it. This is much easier said than done. The core of this document discusses the issues and approaches around the lake.

To summarize, the four goals of the data lake are:

One way (process) to collect all data
Process, clean, enrich the data in one location
Store data once
Have a standard interface to access the data
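
To make the four goals a little more concrete, here is a toy Python sketch. It is only an illustration under loud assumptions: the DataLake class, its method names, the in-memory list used as storage, and the "10." prefix check used as enrichment are all hypothetical and stand in for whatever collection pipeline, enrichment logic, and storage a real lake would use.

```python
from dataclasses import dataclass, field


@dataclass
class DataLake:
    """Toy in-memory stand-in for a data lake; every name here is hypothetical."""

    records: list = field(default_factory=list)

    def collect(self, source, raw_event):
        # Goal 1: one collection path shared by every source.
        event = self._enrich(dict(raw_event, source=source))
        # Goal 3: the event is stored exactly once.
        self.records.append(event)
        return event

    @staticmethod
    def _enrich(event):
        # Goal 2: process, clean, and enrich in one location.
        # Cleaning: lowercase the keys and drop empty values.
        cleaned = {k.lower(): v for k, v in event.items() if v is not None}
        # Enrichment: a crude placeholder rule, not a real internal-network test.
        cleaned["is_internal"] = str(cleaned.get("src_ip", "")).startswith("10.")
        return cleaned

    def query(self, **filters):
        # Goal 4: one standard interface for every consumer.
        return [
            r for r in self.records
            if all(r.get(k) == v for k, v in filters.items())
        ]


if __name__ == "__main__":
    lake = DataLake()
    lake.collect("ids", {"SRC_IP": "10.0.0.5", "action": "alert", "note": None})
    lake.collect("firewall", {"src_ip": "8.8.8.8", "action": "block"})
    print(lake.query(action="block"))
```

Every consumer calls query() against the same stored records instead of keeping its own copy, which is the duplication the lake is meant to remove.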

One of the main challenges with this approach is how to make all the security products leverage the data lake instead of collecting and processing their own data. Mostly this means that products have to be rebuilt by the vendors to do so.

Have you implemented something like this? Email me or put a comment on the blog. I’d love to hear your experience. And stay tuned for the full paper!

January 19, 2014

A New and Updated Field Dictionary for Logging Standards

Category: Uncategorized — Raffael Marty @ 2:51 pm

If you have been following event interchange formats or logging standards, you know of CEF and CEE. The problem is that we lost funding for CEE, which doesn’t mean that CEE is dead! In fact, I updated the field dictionary to accommodate some more use-cases and data sources. The one currently published by CEE is horrible. Don’t use it. Use my new version!

Whether you are using CEE or any other logging standard for your message formatting, you will need a naming schema; a set of field names. In CEE we call that a field dictionary.

The problem with the currently published field dictionary of CEE is that it’s inconsistent, has duplicate field names, and is missing a bunch of field names that you commonly need. I updated and cleaned up the dictionary (see below or download it here.) Please email me with any feedback / updates / additions! This is by no means complete, but it’s a good next iteration to keep improving on! If you know and use CEF, you can use this new dictionary with it. The problem with CEF is that it has to use ArcSight’s very limited field schema. And you have to overload a bunch of fields. So, try using this schema instead!

I was emailing with my friend Jose Nazario the other day and realized that we never really published anything decent on the event taxonomy either. That’s going to be my next task: to gather whatever I can find in notes and such and put together an updated version of the taxonomy with my latest thinking, which has evolved quite a bit in the last 12 years that I have been building event taxonomies (starting with the ArcSight categorization schema, Splunk’s Common Information Model, and then designing the CEE taxonomy). Stay tuned for that.

For reference purposes, here are some spin-offs from CEE which have field dictionaries as well:

Project Lumberjack which has some field names.
SyslogNG PatternDB has a bunch of patterns and they also have a Schema.

Here is the new field dictionary:

Object	Field	Type	Description
	action	STRING	Action taken
	bytes_received	NUMBER	Bytes received
	bytes_sent	NUMBER	Bytes sent
	category	STRING	Log source assigned category of message
	cmd	STRING	Command
	duration	NUMBER	Duration in seconds
	host	STRING	Hostname of the event source
	in_interface	STRING	Inbound interface
	ip_proto	NUMBER	IP protocol field value (17=UDP, …)
	msg	STRING	The event message
	msgid	STRING	The event message identifier
	out_interface	STRING	Outbound interface
	packets_received	NUMBER	Number of packets received
	packets_sent	NUMBER	Number of packets sent
	reason	STRING	Reason for action taken or activity observed
	rule_number	STRING	Number of rule – firewalls, for example
	subsys	STRING	Application subsystem responsible for generating the event
	tcp_flags	STRING	TCP flags
	tid	NUMBER	Numeric thread ID associated with the process generating the event
	time	DATETIME	Event Start Time
	time_logged	DATETIME	Time log record was logged
	time_received	DATETIME	Time log record was received
	vend	STRING	Vendor of the event source application
app	name	STRING	Name of the application that generated the event
app	session_id	STRING	Session identifier from application
app	vend	STRING	Application vendor
app	ver	STRING	Application version
dst	country	STRING	Country name of the destination
dst	host	STRING	Network destination hostname
dst	ipv4	IPv4	Network destination IPv4 address
dst	ipv6	IPv6	Network destination IPv6 address
dst	nat_ipv4	IPv4	NAT IPv4 address of destination
dst	nat_ipv6	IPv6	NAT IPv6 destination address
dst	nat_port	NUMBER	NAT port number for destination
dst	port	NUMBER	Network destination port
dst	zone	STRING	Zone name for destination – examples: Bldg1, Europe
file	line	NUMBER	File line number
file	md5	STRING	File MD5 Hash
file	mode	STRING	File mode flags
file	name	STRING	File name
file	path	STRING	File system path
file	perm	STRING	File permissions
file	size	NUMBER	File size in bytes
http	content_type	STRING	MIME content type within HTTP
http	method	STRING	HTTP method – GET | POST | HEAD | …
http	query_string	STRING	HTTP query string
http	request	STRING	HTTP request URL
http	request_protocol	STRING	HTTP protocol used
http	status	NUMBER	Return code in HTTP response
palo_alto	actionflags	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	config_version	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	cpadding	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	domain	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	log_type	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	padding	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	seqno	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	serial_number	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	threat_content_type	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	virtual_system	STRING	Palo Alto Networks Firewall Specific Field
proc	id	STRING	Process ID (pid)
proc	name	STRING	Process name
proc	tid	NUMBER	Thread identifier of the process
src	country	STRING	Country name of the source
src	host	STRING	Network source hostname
src	ipv4	IPv4	Network source IPv4 address
src	ipv6	IPv6	Network source IPv6 address
src	nat_ipv4	IPv4	NAT IPv4 address of source
src	nat_ipv6	IPv6	NAT IPv6 address of source
src	nat_port	NUMBER	NAT port number for source
src	port	NUMBER	Network source port
src	zone	STRING	Zone name for source – examples: Bldg1, Europe
syslog	fac	NUMBER	Syslog facility value
syslog	pri	NUMBER	Syslog priority value
syslog	pri	STRING	Event priority (ERROR|WARN|DEBUG|CRIT)
syslog	sev	NUMBER	Event severity
syslog	tag	STRING	Syslog Tag value
syslog	ver	NUMBER	Syslog Protocol version (0=legacy/RFC3164; 1=RFC5424)
user	auid	STRING	Source User login authentication ID (login id)
user	domain	STRING	User account domain (NT Domain)
user	eid	STRING	Source user effective ID (euid)
user	gid	STRING	Group ID (gid)
user	group	STRING	Group name
user	id	STRING	User account ID (uid)
user	name	STRING	User account name
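
As a rough idea of how a dictionary like this could be put to work, the Python sketch below checks an event against a small subset of the table. Several things here are assumptions made for illustration: keying events by (object, field) tuples, the validate and check_value helpers, and the way each dictionary type is mapped to a Python check. The dictionary itself does not prescribe any of this, and DATETIME and IPv6 are left out to keep the sketch short.

```python
import ipaddress

# A small subset of the dictionary above: (object, field) -> type.
# An empty object name stands for the rows with no Object value.
FIELD_TYPES = {
    ("", "action"): "STRING",
    ("", "bytes_sent"): "NUMBER",
    ("src", "ipv4"): "IPv4",
    ("dst", "ipv4"): "IPv4",
    ("dst", "port"): "NUMBER",
    ("user", "name"): "STRING",
}


def check_value(type_name, value):
    """Return True if value fits the given dictionary type (subset only)."""
    if type_name == "STRING":
        return isinstance(value, str)
    if type_name == "NUMBER":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "IPv4":
        try:
            ipaddress.IPv4Address(value)
            return True
        except ValueError:
            return False
    return False


def validate(event):
    """Return a list of problems for an event keyed by (object, field) tuples."""
    problems = []
    for key, value in event.items():
        type_name = FIELD_TYPES.get(key)
        if type_name is None:
            problems.append(f"unknown field: {key}")
        elif not check_value(type_name, value):
            problems.append(f"bad {type_name} value for {key}: {value!r}")
    return problems


if __name__ == "__main__":
    event = {
        ("", "action"): "blocked",
        ("src", "ipv4"): "10.0.0.1",
        ("dst", "port"): 443,
    }
    print(validate(event))
```

A producer could run a check like this before emitting a record, so that field names and types stay consistent across log sources.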

November 11, 2010

Applied Security Visualization – Book Video

Category: Uncategorized — Raffael Marty @ 3:57 pm

It’s been a while since I wrote “Applied Security Visualization”. Here is an older video that I just came across. A good overview of the book. Enjoy!

September 4, 2010

Logging Formats and Standards

Category: Uncategorized — Raffael Marty @ 11:24 am

I have discussed the topic of logging standards multiple times on this blog. Some recent developments in the logging space urged me to give an update and provide my opinion:

Yet another vendor just released a “standard” log format (note the quotes around standard). It’s called the Universal Collection Framework™ (UCF). This is how the vendor describes it:

UCF is the first WAN-aware, store-and-forward, encrypted, compressed IT data transport. It allows customers to gather IT data, increase resilience, reduce network chatter and encrypt from almost any device, anywhere, quickly and easily. UCF leverages a new transport and store protocol that LogLogic intends to open source in the near future.

Sounds a whole lot like syslog. (syslog-ng and rsyslog seem to support exactly this!) Okay, let’s just look at this description: WAN aware? What the heck is that supposed to mean? You mean it won’t work well on a LAN? Does that mean it knows the Internets? That’s just a strange description to start with. Oh, and it’s the first property mentioned! The rest of the description sounds like a transport protocol. Interesting. Why not stick with syslog, which is well known, has proven to work, and already has integration libraries built? I never understood why vendors implemented their own transport protocols. They are hard (very hard) to implement and even harder for producers and consumers to adapt to. Oh well.

When people talk about UCF, they keep bringing up ArcSight’s CEF. Well, I am largely responsible for that specification. But guess what? It’s not a transport protocol! It’s a syntax definition. It tells a log producer how to format their log file. Not how to transport it. Because there is always syslog, which a lot of machines already have installed and which is easy to use. (And in newer versions you get encryption, caching, etc.).

Now, my last point about standards. Why do vendors keep trying to come up with standards by themselves? It just doesn’t make any sense. Who is going to adopt it? At ArcSight, about 4 years ago, we came up with CEF because CEE didn’t move fast enough and we wanted something that our partners could easily use. An analyst wrote that ArcSight is planning to take CEF to the IETF. I hope they are not going to do that. I don’t have any control over that anymore, but that would be stupid. We would rather push CEE through the IETF. If you have a chance, compare the CEE syntax proposal with CEF. Notice something? Yes. It’s very similar. Again, I might have had something to do with that. Anyways. Vendors should not define logging standards!

On a good note: CEE is moving forward and just released the architecture overview for public commentary. Check them out!

June 28, 2010

All the Data That’s Fit to Visualize

Category: Log Analysis,Security Information Management,Uncategorized,Visualization — Raffael Marty @ 10:29 am

Last week I posted the introductory video for a talk that I gave at Source Boston in 2008. I just found the entire video of that talk. Enjoy:

Talk by Raffael Marty:

With the ever-growing amount of data collected in IT environments, we need new methods and tools to deal with it. Event and Log Analysis is becoming one of the main tools for analysts to investigate and comprehend the state of their networks, hosts, applications, and business processes. Recent developments, such as regulatory compliance and an increased focus on insider threat, have increased the demand for analytical tools to help in the process. Visualization is offering a new, more effective, and simpler approach to data analysis. To date, security visualization has mostly failed to deliver effective tools and methods. This presentation will show what the New York Times has to teach us about effective visualizations. Visualization for the masses and not visualization for the experts. Insider Threat, Governance, Risk, and Compliance (GRC), and Perimeter Threat all require effective visualization methods and they are right in front of us – in the newspaper.

December 1, 2009

Applied Security Visualization Book seen in Singapore

Category: Uncategorized — Raffael Marty @ 5:50 pm

A friend just sent me a couple of pictures he took in a bookstore in Singapore.

Have you seen the book Applied Security Visualization on the shelf at your local book store? If so, send me a picture and I will post it…

February 17, 2009

Security Visualization and Log Analysis Workshop – Sign up now!

Category: Uncategorized — Raffael Marty @ 10:32 pm

“Log Analysis and Security Visualization” is a two-day training class held on March 9th and 10th 2009 in Boston during the SOURCE Boston conference that addresses the data management and analysis challenges of today’s IT environments.
Students will leave this class with the knowledge to visualize and manage their own IT data. They will learn the basics of log analysis, learn about common data sources, get an overview of visualization techniques, and learn how to generate visual representations of IT data for a number of different use-cases from DoS and worm detection to compliance reporting. The training is filled with hands-on exercises utilizing DAVIX, the open-source data analysis and visualization platform.

Register today to secure your spot.
