A New and Updated Field Dictionary for Logging Standards

January 19, 2014

Category: Uncategorized — Raffael Marty @ 2:51 pm

If you have been following event interchange formats or logging standards, you know of CEF and CEE. The problem is that we lost funding for CEE, which doesn’t mean that CEE is dead! In fact, I updated the field dictionary to accommodate some more use-cases and data sources. The one currently published by CEE is horrible. Don’t use it. Use my new version!

Whether you are using CEE or any other logging standard for your message formatting, you will need a naming schema; a set of field names. In CEE we call that a field dictionary.
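
To make the idea of a naming schema concrete, here is a minimal sketch of mapping source-specific field names onto one shared set of names. The alias names (src_ip, source_ip, ip_source_address, dport, username) and the dotted "object.field" join are assumptions made for illustration; they are not defined by CEE or by the dictionary in this post.

```python
# Hypothetical alias table: keys are names a log source might emit,
# values are (object, field) pairs taken from the field dictionary.
ALIASES = {
    "src_ip": ("src", "ipv4"),
    "source_ip": ("src", "ipv4"),
    "ip_source_address": ("src", "ipv4"),
    "dport": ("dst", "port"),
    "username": ("user", "name"),
}


def normalize(event):
    """Rename known aliases to 'object.field' keys; pass unknown keys through."""
    normalized = {}
    for key, value in event.items():
        if key in ALIASES:
            obj, field = ALIASES[key]
            key = f"{obj}.{field}"  # the dotted join is an assumption
        normalized[key] = value
    return normalized


# Two sources that name the same thing differently end up identical.
firewall_a = {"src_ip": "10.0.0.1", "dport": 443}
firewall_b = {"source_ip": "10.0.0.1", "dport": 443}
print(normalize(firewall_a) == normalize(firewall_b))
```

If every producer already used the dictionary names, the alias table would be empty and this step would disappear, which is the point of agreeing on a schema.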

The problem with the currently published field dictionary of CEE is that it’s inconsistent, has duplicate field names, and is missing a bunch of field names that you commonly need. I updated and cleaned up the dictionary (see below or download it here). Please email me with any feedback, updates, or additions! This is by no means complete, but it’s a good next iteration to keep improving on! If you know and use CEF, you can use this new dictionary with it. The problem with CEF is that it has to use ArcSight’s very limited field schema, so you end up overloading a bunch of fields. So, try using this schema instead!
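
As a rough sketch of what using the dictionary names inside a CEF message could look like, the snippet below builds a CEF line whose extension keys come from the dictionary instead of ArcSight’s schema. The underscore join of object and field (src_ipv4, dst_port), the to_cef helper, and the vendor and product values are all assumptions for illustration. The escaping follows my reading of the CEF format: pipes and backslashes are escaped in the header, equals signs and backslashes in extension values.

```python
def escape_header(value):
    """Escape backslash and pipe for use in a CEF header field."""
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def escape_extension(value):
    """Escape backslash, equals sign, and newline for a CEF extension value."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("=", "\\=")
        .replace("\n", "\\n")
    )


def to_cef(vendor, product, version, signature_id, name, severity, fields):
    """Build one CEF line; `fields` maps extension keys to values."""
    parts = ("CEF:0", vendor, product, version, signature_id, name, severity)
    header = "|".join(escape_header(p) for p in parts)
    extension = " ".join(
        f"{key}={escape_extension(value)}" for key, value in fields.items()
    )
    return f"{header}|{extension}"


line = to_cef(
    "ExampleVendor", "ExampleFW", "1.0", "100", "connection blocked", 5,
    {
        "action": "block",
        "src_ipv4": "10.0.0.1",   # object "src", field "ipv4"
        "dst_port": 443,           # object "dst", field "port"
        "reason": "policy=deny",
    },
)
print(line)
```

Whether a given CEF consumer accepts extension keys outside its own schema depends on that consumer, so treat this as a formatting illustration only.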

I was emailing with my friend Jose Nazario the other day and realized that we never really published anything decent on the event taxonomy either. My next task is to gather whatever I can find in notes and such and put together an updated version of the taxonomy with my latest thinking, which has evolved quite a bit in the last 12 years that I have been building event taxonomies (starting with the ArcSight categorization schema, Splunk’s Common Information Model, and then designing the CEE taxonomy). Stay tuned for that.

For reference purposes, here are some spin-offs from CEE that have field dictionaries as well:

Project Lumberjack which has some field names.
SyslogNG PatternDB has a bunch of patterns and they also have a Schema.

Here is the new field dictionary:

Object	Field	Type	Description
	action	STRING	Action taken
	bytes_received	NUMBER	Bytes received
	bytes_sent	NUMBER	Bytes sent
	category	STRING	Log source assigned category of message
	cmd	STRING	Command
	duration	NUMBER	Duration in seconds
	host	STRING	Hostname of the event source
	in_interface	STRING	Inbound interface
	ip_proto	NUMBER	IP protocol field value (17=UDP, …)
	msg	STRING	The event message
	msgid	STRING	The event message identifier
	out_interface	STRING	Outbound interface
	packets_received	NUMBER	Number of packets received
	packets_sent	NUMBER	Number of packets sent
	reason	STRING	Reason for action taken or activity observed
	rule_number	STRING	Number of rule – firewalls, for example
	subsys	STRING	Application subsystem responsible for generating the event
	tcp_flags	STRING	TCP flags
	tid	NUMBER	Numeric thread ID associated with the process generating the event
	time	DATETIME	Event Start Time
	time_logged	DATETIME	Time log record was logged
	time_received	DATETIME	Time log record was received
	vend	STRING	Vendor of the event source application
app	name	STRING	Name of the application that generated the event
app	session_id	STRING	Session identifier from application
app	vend	STRING	Application vendor
app	ver	STRING	Application version
dst	country	STRING	Country name of the destination
dst	host	STRING	Network destination hostname
dst	ipv4	IPv4	Network destination IPv4 address
dst	ipv6	IPv6	Network destination IPv6 address
dst	nat_ipv4	IPv4	NAT IPv4 address of destination
dst	nat_ipv6	IPv6	NAT IPv6 destination address
dst	nat_port	NUMBER	NAT port number for destination
dst	port	NUMBER	Network destination port
dst	zone	STRING	Zone name for destination – examples: Bldg1, Europe
file	line	NUMBER	File line number
file	md5	STRING	File MD5 Hash
file	mode	STRING	File mode flags
file	name	STRING	File name
file	path	STRING	File system path
file	perm	STRING	File permissions
file	size	NUMBER	File size in bytes
http	content_type	STRING	MIME content type within HTTP
http	method	STRING	HTTP method – GET | POST | HEAD | …
http	query_string	STRING	HTTP query string
http	request	STRING	HTTP request URL
http	request_protocol	STRING	HTTP protocol used
http	status	NUMBER	Return code in HTTP response
palo_alto	actionflags	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	config_version	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	cpadding	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	domain	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	log_type	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	padding	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	seqno	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	serial_number	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	threat_content_type	STRING	Palo Alto Networks Firewall Specific Field
palo_alto	virtual_system	STRING	Palo Alto Networks Firewall Specific Field
proc	id	STRING	Process ID (pid)
proc	name	STRING	Process name
proc	tid	NUMBER	Thread identifier of the process
src	country	STRING	Country name of the source
src	host	STRING	Network source hostname
src	ipv4	IPv4	Network source IPv4 address
src	ipv6	IPv6	Network source IPv6 address
src	nat_ipv4	IPv4	NAT IPv4 address of source
src	nat_ipv6	IPv6	NAT IPv6 address of source
src	nat_port	NUMBER	NAT port number for source
src	port	NUMBER	Network source port
src	zone	STRING	Zone name for source – examples: Bldg1, Europe
syslog	fac	NUMBER	Syslog facility value
syslog	pri	NUMBER	Syslog priority value
syslog	pri	STRING	Event priority (ERROR|WARN|DEBUG|CRIT)
syslog	sev	NUMBER	Event severity
syslog	tag	STRING	Syslog Tag value
syslog	ver	NUMBER	Syslog Protocol version (0=legacy/RFC3164; 1=RFC5424)
user	auid	STRING	Source User login authentication ID (login id)
user	domain	STRING	User account domain (NT Domain)
user	eid	STRING	Source user effective ID (euid)
user	gid	STRING	Group ID (gid)
user	group	STRING	Group name
user	id	STRING	User account ID (uid)
user	name	STRING	User account name
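
One way to put the table to work is to load it as tab-separated text and check events against it. The sketch below uses only a few rows copied from the table above. The dotted "object.field" key form, the load_dictionary and validate helpers, and the type checks are assumptions for illustration, not part of CEE.

```python
import csv
import io
import ipaddress

# A few rows from the table (tab-separated; empty Object means top-level).
DICTIONARY_TSV = (
    "Object\tField\tType\tDescription\n"
    "\tbytes_sent\tNUMBER\tBytes sent\n"
    "\tmsg\tSTRING\tThe event message\n"
    "dst\tport\tNUMBER\tNetwork destination port\n"
    "src\tipv4\tIPv4\tNetwork source IPv4 address\n"
)


def load_dictionary(text):
    """Return {key: type}, where key is 'object.field' or just 'field'."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    types = {}
    for row in reader:
        key = f"{row['Object']}.{row['Field']}" if row["Object"] else row["Field"]
        types[key] = row["Type"]
    return types


def is_ipv4(value):
    """True if value is a string holding a valid IPv4 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


# Assumed checks per dictionary type; types without a check are not verified.
CHECKS = {
    "STRING": lambda v: isinstance(v, str),
    "NUMBER": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "IPv4": is_ipv4,
}


def validate(event, types):
    """Return a list of problems: unknown keys and values of the wrong type."""
    problems = []
    for key, value in event.items():
        if key not in types:
            problems.append(f"{key}: not in dictionary")
        elif types[key] in CHECKS and not CHECKS[types[key]](value):
            problems.append(f"{key}: expected {types[key]}")
    return problems


types = load_dictionary(DICTIONARY_TSV)
event = {
    "msg": "blocked",
    "src.ipv4": "10.0.0.1",
    "dst.port": "443",          # a string, but the dictionary says NUMBER
    "source_ip": "10.0.0.1",    # not a dictionary name
}
print(validate(event, types))
```

Note that keying on "object.field" means the two syslog pri rows in the full table would collide, with the later row overwriting the earlier one.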


3 Comments »

Instead of trying to fit many use cases into a complex dictionary: Why not define a really small set of required fields like a source, a message/description and a timestamp format and then letting people add anything they want as totally free fields?

What is the big benefit of a big dictionary?

Comment by Lennart Koopmann — January 20, 2014 @ 1:03 pm
The purpose of the dictionary is parsing and field semantics. If you let people choose their own names, you don’t know what the meaning of a field is. And interoperability: if I choose “src_ip”, you choose “source_ip”, and someone else chooses “ip_source_address”, then we first have to normalize and make sure the semantics of all the fields are the same, etc.

In short: Interoperability!

Comment by Raffael Marty — January 20, 2014 @ 1:09 pm
+1 for interoperability, but is that actually required out there? (I really don’t know.) Are there tools that are so bound to static field names?

Requiring field names from a dictionary sounds like either missing configuration options or a too static user interface.

Of course there is nothing wrong with having something like CEE in place – I am just afraid of making static UIs even more static by making them [protocol/dictionary]-compliant.

Comment by Lennart Koopmann — January 20, 2014 @ 1:15 pm
