Yara is a powerful engine used mostly to identify malware.
Below is a short description of the engine by its developers, taken from their GitHub:
YARA is a tool aimed at (but not limited to) helping malware researchers to identify and classify malware samples. With YARA you can create descriptions of malware families (or whatever you want to describe) based on textual or binary patterns. Each description, a.k.a rule, consists of a set of strings and a boolean expression which determine its logic.
In the Screener, Yara is used to analyze *.eml files.
If the received report is a *.msg file, it is automatically converted into *.eml.
The most interesting part of the page is the Rule Constructor. It is divided into Conditional Blocks. Each block has:
|Field||Condition||Value or Field|
|Delivered To||Equals To||Delivered To|
|Subject||Not Equals To||Subject|
|Actual Sender Email|| ||Actual Sender Email|
|Visible Sender Email|| ||Visible Sender Email|
|Recipient Email|| ||Recipient Email|
|Origin Server Hostname|| ||Any custom string or numeric value|
If multiple expressions are set in the same block, a "Rule Operator" option appears.
Depending on the selected operator, the Conditional Block assigns its spam score to an incident either when any of its conditions match or only when all of them match.
In this example, the rule fires only if both Conditional Blocks are triggered, since the rule is set to "All".
Conditional Block #1 is triggered if any of its conditions is met: the subject equals "Ipad", or the email is not from "email@example.com".
Conditional Block #2 is triggered if its only condition is true: the email is addressed to "firstname.lastname@example.org".
As a result, the constructor generates the following Yara rule:
( eml.subject == "Ipad" or eml.from != "email@example.com") and ( eml.to == "firstname.lastname@example.org")
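The way the two Conditional Blocks combine can be sketched in Python. This is only an illustrative model of the any/all logic described above, not the Screener's implementation; the `ConditionalBlock` class, the `rule_matches` helper, and the plain-dict email representation are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Email = Dict[str, str]                 # hypothetical stand-in for a parsed *.eml
Condition = Callable[[Email], bool]


def combine(operator: str, results: Iterable[bool]) -> bool:
    """Apply the "any" / "all" operator to a sequence of booleans."""
    return any(results) if operator == "any" else all(results)


@dataclass
class ConditionalBlock:
    conditions: List[Condition]
    operator: str = "all"              # operator between conditions in the block

    def triggered(self, email: Email) -> bool:
        return combine(self.operator, (cond(email) for cond in self.conditions))


def rule_matches(blocks: List[ConditionalBlock], email: Email,
                 operator: str = "all") -> bool:
    """Operator between blocks; "all" mirrors the example above."""
    return combine(operator, (block.triggered(email) for block in blocks))


# Block #1: subject equals "Ipad" OR sender is not email@example.com
block_1 = ConditionalBlock(
    conditions=[
        lambda e: e["subject"] == "Ipad",
        lambda e: e["from"] != "email@example.com",
    ],
    operator="any",
)
# Block #2: recipient equals firstname.lastname@example.org
block_2 = ConditionalBlock(
    conditions=[lambda e: e["to"] == "firstname.lastname@example.org"],
)

sample = {
    "subject": "Ipad",
    "from": "email@example.com",
    "to": "firstname.lastname@example.org",
}
print(rule_matches([block_1, block_2], sample))  # True
```

The sample email matches because the subject condition alone satisfies Block #1, and Block #2 matches on the recipient.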
Let us add a Yara rule that would be pretty useful.
In this example, we will create a dictionary of trusted domain names, then we will use the dictionary to give a negative spam score to an incident.
Set the rule to Enabled, give it a name, and enter the Yara rule below.
for any i in (0..fr.lines["MyCompany.dict"].lines_count) : (fr.lines["MyCompany.dict"].data[i] == eml.get_domain_from_email(eml.actual_sender_email))
This rule loops over the line indexes of the dictionary, from 0 up to the number of lines it contains (lines_count), and matches if any line equals the domain of the actual sender's email address (eml.get_domain_from_email(eml.actual_sender_email)).
Here, MyCompany.dict is the name of the dictionary.
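For readers less familiar with Yara's `for any` syntax, the same check can be sketched in Python. This is a rough model under stated assumptions: the dictionary is represented as a list of lines, the comparison is an exact string match, and `get_domain_from_email` and `sender_is_trusted` are hypothetical helpers that only approximate what the eml module does.

```python
from email.utils import parseaddr
from typing import List


def get_domain_from_email(address: str) -> str:
    """Approximation of eml.get_domain_from_email: text after the last '@'."""
    _, addr = parseaddr(address)       # also handles 'Name <user@host>' forms
    return addr.rsplit("@", 1)[-1] if "@" in addr else ""


def sender_is_trusted(dictionary_lines: List[str], actual_sender_email: str) -> bool:
    """True if any dictionary line equals the sender's domain."""
    sender_domain = get_domain_from_email(actual_sender_email)
    return any(
        dictionary_lines[i] == sender_domain
        for i in range(len(dictionary_lines))
    )


my_company_dict = ["example.com", "example.org"]   # hypothetical dictionary content
print(sender_is_trusted(my_company_dict, "email@example.com"))    # True
print(sender_is_trusted(my_company_dict, "someone@example.net"))  # False
```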
The important trick is to set the score to a negative value, for example -20.
With this configuration, all emails from domains listed in the dictionary are considered safe, and the negative score compensates for false positives. The category is up to you; it is assigned to the incident in the overview section.
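The effect of a negative score can be illustrated with a few lines of Python. This sketch assumes that the scores of all triggered rules are simply added together, which the text implies but does not state explicitly; the rule names and the value 15 are made up for the example.

```python
from typing import Dict


def total_spam_score(triggered_rule_scores: Dict[str, int]) -> int:
    """Assumption: an incident's score is the sum of all triggered rule scores."""
    return sum(triggered_rule_scores.values())


# Hypothetical rules: one false positive plus the trusted-domain rule from above.
scores = {"suspicious_link": 15, "trusted_sender_domain": -20}
print(total_spam_score(scores))  # -5
```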
It can be seen that the category is assigned according to the internal rules, including the "safe" one.
The Yara rule configured in step 2 gave the incident a score of -20.
From now on, it is enough to add a domain name to the dictionary we created for it to be automatically considered safe.
We use the "eml" module for the Yara engine. It is designed to process eml files. An example Yara rule that uses it:
eml.get_domain_from_email(eml.headers["message-id"]) != eml.get_domain_from_email(eml.headers["from"])
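The rule above flags emails whose Message-ID domain differs from the From domain. A comparable check can be sketched with Python's standard `email` package; this is an approximation for illustration only, and the `domain_of` helper and the sample message are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(header_value: str) -> str:
    """Extract the domain part of an address-like header value."""
    _, addr = parseaddr(header_value or "")
    return addr.rsplit("@", 1)[-1].lower() if "@" in addr else ""


RAW_EML = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.org\n"
    "Message-ID: <abc123@mailer.example.net>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

msg = message_from_string(RAW_EML)
# Header lookup is case-insensitive, like eml.headers["message-id"] in the rule.
mismatch = domain_of(msg["message-id"]) != domain_of(msg["from"])
print(mismatch)  # True: mailer.example.net != example.com
```

A mismatch is not proof of spoofing on its own, since legitimate mail sent through third-party services can also produce differing domains.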
You can build your own complex Yara rules with the module's fields and functions listed in the table below:
|Field||Type||Description|
|delivered_to||string||Extracts information from the header "Delivered-To"|
|message_id||string||Extracts information from the header "Message-ID"|
|subject||string||Extracts information from the header "Subject"|
|date||string||Extracts information from the header "Date"|
|x_mailer||string||Extracts information from the header "X-Mailer"|
|from||string||Extracts information from the header "From"|
|to||string||Extracts information from the header "To"|
|cc||string||Extracts information from the header "CC"|
|bcc||string||Extracts information from the header "BCC"|
|actual_sender_email||string||Extracts information from the email regarding the True Sender|
|visible_sender_email||string||Extracts information about the Visible Sender Email|
|recipient_email||string||Extracts information regarding the recipient email address|
|origin_server_hostname||string||Extracts information regarding server hostname|
|headers||string_dictionary||Example: eml.headers["headername"] == "custom text"|
|number_of_domains||integer||Number of domain names in the email|
|domains||string_array||The domain names in the email|
|number_of_ip_addresses||integer||Number of IP addresses in the email|
|ip_addresses||string_array||The IP addresses in the email|
|number_of_http_links||integer||Number of HTTP links in the email|
|http_links||struct (link, text)||Example: eml.http_links.text == "domain" and eml.http_links.link == "https://domain.com". The two members can be used together or separately|
|Function||Return type||Example||Description|
|capture||string||Example: eml.capture("hello (world)", "hey hello world yeah") == "world", Example 2: eml.capture("hello (world)") == "world"||Captures a particular piece of text from an email|
|get_domain_from_link||string||Example: eml.get_domain_from_link("https://google.com/test") == "google.com" and eml.get_domain_from_link("http://amazon.com") == "amazon.com"||Parses the domain name from a link inside the email|
|get_domain_from_email||string||Example: eml.get_domain_from_email("email@example.com") == "example.com" and eml.get_domain_from_email("firstname.lastname@example.org") == "example.org"||Parses the domain name from an email address|
|get_ip||string||Example: eml.get_ip("localhost") == "127.0.0.1" and eml.get_ip("google.com") == "220.127.116.11"||Returns one IP address from the DNS resolution result|
|domain_registration_days||integer||Example: eml.domain_registration_days("google.com") != 0||Returns the number of days that have passed since the domain registration date|
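To make the behavior of two of these functions concrete, here is a minimal Python sketch of what `capture` and `get_domain_from_link` appear to do, judging only by the examples in the table. The implementations are assumptions built on the standard `re` and `urllib.parse` modules and may differ from the eml module in edge cases.

```python
import re
from urllib.parse import urlparse


def capture(pattern: str, text: str) -> str:
    """Return the first capture group of the first match, or "" if none."""
    match = re.search(pattern, text)
    if match is None or not match.groups():
        return ""
    return match.group(1) or ""


def get_domain_from_link(link: str) -> str:
    """Return the host part of a URL, or "" if it cannot be determined."""
    return urlparse(link).hostname or ""


print(capture("hello (world)", "hey hello world yeah"))    # world
print(get_domain_from_link("https://google.com/test"))     # google.com
print(get_domain_from_link("http://amazon.com"))           # amazon.com
```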
We also use the fr (file-read) module, which is designed to process dictionaries in the Screener.
|files_count||integer||The number of active dictionaries|
|files||string array||Names of the active dictionaries|
|lines||struct||Example: fr.lines["Dictionary_name.dict"].lines_count == 10 and fr.lines["Dictionary_name_2.dict"].lines_count == 11. It is also possible to look for a particular line by its index (starting from 0): fr.lines["Dictionary_name.dict"].data[0] == "value_on_1st_string" and fr.lines["Dictionary_name.dict"].data[8] == "value_on_a_string_9" and fr.lines["Dictionary_name.dict"].data[3] == "value_on_a_string_4"|
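The shape of the data exposed by the fr module can be modeled in Python as follows. This is a hypothetical sketch that mirrors the field names from the table (`files_count`, `files`, `lines`, `lines_count`, `data`); the `DictionaryFile` class, the `FileRead` class, and the sample content are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DictionaryFile:
    data: List[str]                    # one entry per line of the dictionary

    @property
    def lines_count(self) -> int:
        return len(self.data)


@dataclass
class FileRead:
    lines: Dict[str, DictionaryFile]   # keyed by dictionary name

    @property
    def files(self) -> List[str]:
        return list(self.lines)

    @property
    def files_count(self) -> int:
        return len(self.lines)


def load(raw: Dict[str, str]) -> FileRead:
    """Build the model from {dictionary name: file content}."""
    return FileRead({name: DictionaryFile(text.splitlines())
                     for name, text in raw.items()})


fr = load({"MyCompany.dict": "example.com\nexample.org"})
print(fr.files_count)                            # 1
print(fr.lines["MyCompany.dict"].lines_count)    # 2
print(fr.lines["MyCompany.dict"].data[0])        # example.com
```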