January 19, 2014

A New and Updated Field Dictionary for Logging Standards

Category: Uncategorized — Raffael Marty @ 2:51 pm

If you have been interested and been following event interchange formats or logging standards, you know of CEF and CEE. Problem is that we lost funding for CEE, which doesn’t mean that CEE is dead! In fact, I updated the field dictionary to accommodate some more use-cases and data sources. The one currently published by CEE is horrible. Don’t use it. Use my new version!

Whether you are using CEE or any other logging standard for your message formatting, you will need a naming schema; a set of field names. In CEE we call that a field dictionary.

The problem with the currently published field dictionary of CEE is that it’s inconsistent, has duplicate field names, and is missing a bunch of field names that you commonly need. I updated and cleaned up the dictionary (see below or download it here.) Please email me with any feedback / updates / additions! This is by no means complete, but it’s a good next iteration to keep improving on! If you know and use CEF, you can use this new dictionary with it. The problem with CEF is that it has to use ArcSight’s very limited field schema. And you have to overload a bunch of fields. So, try using this schema instead!

I was emailing with my friend Jose Nazario the other day and realized that we never really published anything decent on the event taxonomy either. That’s going to be my next task to gather whatever I can find in notes and such to put together an updated version of the taxonomy with my latest thinking; which has emerged quite a bit in the last 12 years that I have been building event taxonomies (starting with the ArcSight categorization schema, Splunk’s Common Information Model, and then designing the CEE taxonomy). Stay tuned for that.

For reference purposes. Here are some spin-offs from CEE which have field dictionaries as well:

Here is the new field dictionary:

Object Field Type Description
action STRING Action taken
bytes_received NUMBER Bytes received
bytes_sent NUMBER Bytes sent
category STRING Log source assigned category of message
cmd STRING Command
duration NUMBER Duration in seconds
host STRING Hostname of the event source
in_interface STRING Inbound interface
ip_proto NUMBER IP protocol field value (8=UDP, …)
msg STRING The event message
msgid STRING The event message identifier
out_interface STRING Outbound interface
packets_received NUMBER Number of packets received
packets_sent NUMBER Number of packets sent
reason STRING Reason for action taken or activity observed
rule_number STRING Number of rule – firewalls, for example
subsys STRING Application subsystem responsible for generating the event
tcp_flags STRING TCP flags
tid NUMBER Numeric thread ID associated with the process generating the event
time DATETIME Event Start Time
time_logged DATETIME Time log record was logged
time_received DATETIME Time log record was received
vend STRING Vendor of the event source application
app name STRING Name of the application that generated the event
app session_id STRING Session identifier from application
app vend STRING Application vendor
app ver STRING Application version
dst country STRING Country name of the destination
dst host STRING Network destination hostname
dst ipv4 IPv4 Network destination IPv4 address
dst ipv6 IPv6 Network destination IPv6 address
dst nat_ipv4 IPv4 NAT IPv4 address of destination
dst nat_ipv6 IPv6 NAT IPv6 destination address
dst nat_port NUMBER NAT port number for destination
dst port NUMBER Network destination port
dst zone STRING Zone name for destination – examples: Bldg1, Europe
file line NUMBER File line number
file md5 STRING File MD5 Hash
file mode STRING File mode flags
file name STRING File name
file path STRING File system path
file perm STRING File permissions
file size NUMBER File size in bytes
http content_type STRING MIME content type within HTTP
http method STRING HTTP method – GET | POST | HEAD | …
http query_string STRING HTTP query string
http request STRING HTTP request URL
http request_protocol STRING HTTP protocol used
http status NUMBER Return code in HTTP response
palo_alto actionflags STRING Palo Alto Networks Firewall Specific Field
palo_alto config_version STRING Palo Alto Networks Firewall Specific Field
palo_alto cpadding STRING Palo Alto Networks Firewall Specific Field
palo_alto domain STRING Palo Alto Networks Firewall Specific Field
palo_alto log_type STRING Palo Alto Networks Firewall Specific Field
palo_alto padding STRING Palo Alto Networks Firewall Specific Field
palo_alto seqno STRING Palo Alto Networks Firewall Specific Field
palo_alto serial_number STRING Palo Alto Networks Firewall Specific Field
palo_alto threat_content_type STRING Palo Alto Networks Firewall Specific Field
palo_alto virtual_system STRING Palo Alto Networks Firewall Specific Field
proc id STRING Process ID (pid)
proc name STRING Process name
proc tid NUMBER Thread identifier of the process
src country STRING Country name of the source
src host STRING Network source hostname
src ipv4 IPv4 Network source IPv4 address
src ipv6 IPv6 Network source IPv6 address
src nat_ipv4 IPv4 NAT IPv4 address of source
src nat_ipv6 IPv6 NAT IPVv6 address
src nat_port NUMBER NAT port number for source
src port NUMBER Network source port
src zone STRING Zone name for source – examples: Bldg1, Europe
syslog fac NUMBER Syslog facility value
syslog pri NUMBER Syslog priority value
syslog pri STRING Event priority (ERROR|WARN|DEBUG|CRIT)
syslog sev NUMBER Event severity
syslog tag STRING Syslog Tag value
syslog ver NUMBER Syslog Protocol version (0=legacy/RFC3164; 1=RFC5424)
user auid STRING Source User login authentication ID (login id)
user domain STRING User account domain (NT Domain)
user eid STRING Source user effective ID (euid)
user gid STRING Group ID (gid)
user group STRING Group name
user id STRING User account ID (uid)
user name STRING User account name


  1. Instead of trying to fit many use cases into a complex dictionary: Why not define a really small set of required fields like a source, a message/description and a timestamp format and then letting people add anything they want as totally free fields?

    What is the big benefit of a big dictionary?

    Comment by Lennart Koopmann — January 20, 2014 @ 1:03 pm

  2. The purpose of the dictionary is parsing and field semantics. If you let them chose things, you don’t know what the meaning of a fields is. And interoperability. If I chose “src_ip” and you chose “source_ip” and someone else chooses “ip_source_address”. Then we have to first normalize and make sure the semantics of all the fields is the same, etc.

    In short: Interoperability!

    Comment by Raffael Marty — January 20, 2014 @ 1:09 pm

  3. +1 for interoperability , but is that actually required out there? (I really don’t know) Are there tools that are so bound to static field names?

    Requiring field names from a dictionary sounds like either missing configuration options or a too static user interface.

    Of course there is nothing wrong with having something like CEE in place – I am just afraid of making static UIs even more static by making them [protocol/dictionary]-compliant.

    Comment by Lennart Koopmann — January 20, 2014 @ 1:15 pm

RSS feed for comments on this post. | TrackBack URI

Leave a comment

XHTML ( You can use these tags): <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> .