[New Rules] Azure OpenAI #3701
interval = "10m"
language = "esql"
license = "Elastic License v2"
name = "Azure OpenAI Insecure Output Handling Detection"
Suggested change:
- name = "Azure OpenAI Insecure Output Handling Detection"
+ name = "Azure OpenAI Insecure Output Handling"
interval = "10m"
language = "esql"
license = "Elastic License v2"
name = "Potential Azure OpenAI Model Theft Detection"
Suggested change:
- name = "Potential Azure OpenAI Model Theft Detection"
+ name = "Potential Azure OpenAI Model Theft"
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count(), avg_request_size = avg(azure.open_ai.properties.request_length) by azure.resource.id
| where count > 1000 OR avg_request_size > 5000
I prefer lowercase. It's purely stylistic at this point, but at a minimum we should be consistent where possible.
Suggested change:
- | where count > 1000 OR avg_request_size > 5000
+ | where count > 1000 or avg_request_size > 5000
Suggestion: add a comment explaining whether the size is in KB or bytes. Also use >= so the rule triggers on exactly 1000 and 5000 as well.
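To make the boundary point concrete, here is a minimal Python sketch of the threshold check. It is not part of the rule; the function name `exceeds` and the `inclusive` flag are hypothetical, and the unit of the size threshold is left unstated because the thread has not settled it.

```python
def exceeds(count, avg_request_size,
            count_threshold=1000, size_threshold=5000, inclusive=True):
    """Return True when either aggregate crosses its threshold.

    size_threshold uses whatever unit request_length is reported in;
    that unit is an open question in this review, so none is assumed here.
    """
    if inclusive:
        # ">=" also fires when a value lands exactly on the threshold.
        return count >= count_threshold or avg_request_size >= size_threshold
    # ">" stays silent at exactly 1000 requests or an average of exactly 5000.
    return count > count_threshold or avg_request_size > size_threshold


print(exceeds(1000, 0, inclusive=False))
print(exceeds(1000, 0, inclusive=True))
```

With the strict comparison, a resource sitting at exactly 1000 requests is not flagged; the inclusive form flags it.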
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ListKey" and azure.open_ai.category == "Audit"
| stats count = count(), max_data_transferred = max(azure.open_ai.properties.response_length) by azure.open_ai.properties.model_name
| where count > 100 OR max_data_transferred > 1000000
Suggested change:
- | where count > 100 OR max_data_transferred > 1000000
+ | where count > 100 or max_data_transferred > 1000000
If 100 and 1000000 are the maximum thresholds, then maybe use >= instead of >. Also add a comment to explain the data size unit.
query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count(), avg_request_size = avg(azure.open_ai.properties.request_length) by azure.resource.id
Is azure.resource.id specific/unique to the user or source of the API calls? It would be ideal to aggregate by a field that can be used for attribution and further investigation during triage.
++ also helps with FP from collisions/RC
1001 users making 1 request each in an hour may be normal
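The false-positive scenario above can be illustrated with a small Python sketch. It only simulates the stats ... by grouping on synthetic events; the keys `resource_id` and `user_id` are hypothetical stand-ins, since the thread has not yet identified which integration field identifies the caller.

```python
from collections import Counter

THRESHOLD = 1000  # mirrors the "count > 1000" condition in the query


def flagged_groups(events, keys, threshold=THRESHOLD):
    """Count events per group (like `stats count = count() by <keys>`)
    and keep only the groups whose count exceeds the threshold."""
    counts = Counter(tuple(event[k] for k in keys) for event in events)
    return {group: n for group, n in counts.items() if n > threshold}


# Synthetic data: 1001 distinct users, one request each, same resource.
events = [{"resource_id": "res-1", "user_id": f"user-{i}"} for i in range(1001)]

by_resource = flagged_groups(events, ["resource_id"])
by_user = flagged_groups(events, ["resource_id", "user_id"])

print(by_resource)
print(by_user)
```

Grouping by resource alone flags `res-1` even though no single caller did anything unusual; adding a caller-level key yields no flagged groups for the same traffic.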
query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.properties.response_length == 0 and azure.open_ai.result_signature == "200" and azure.open_ai.operation_name == "ChatCompletions_Create"
| stats count = count() by azure.resource.id, azure.open_ai.operation_name
Is there a field that can be used in the by aggregation to attribute it to a specific user.id or equivalent?
Also, are there ECS-compatible alternatives for any of these? Can you share some data or docs for these events?
https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/stream-monitoring-data-event-hubs
"""
severity = "low"
If this behavior is rare, then maybe bump up the severity.
query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ListKey" and azure.open_ai.category == "Audit"
| stats count = count(), max_data_transferred = max(azure.open_ai.properties.response_length) by azure.open_ai.properties.model_name
The by clause should ideally include a field that the user can use to investigate further (e.g. the unique user ID or source of the API calls), to triage FPs and add exclusions.
[rule]
author = ["Elastic"]
description = """
Detects patterns indicative of Denial of Service attacks on ML models, focusing on unusually high volume and frequency
Suggested change:
- Detects patterns indicative of Denial of Service attacks on ML models, focusing on unusually high volume and frequency
+ Detects patterns indicative of Denial-of-Service (DoS) attacks on machine learning (ML) models, focusing on unusually high volume and frequency
"Domain: LLM",
"Data Source: Azure OpenAI",
"Data Source: Azure Event Hubs",
"Use Case: Insecure Output Handling"
Any specific MITRE ATLAS tag here?
[rule]
author = ["Elastic"]
description = """
Monitors for suspicious activities that may indicate theft or unauthorized duplication of ML models, such as
Suggested change:
- Monitors for suspicious activities that may indicate theft or unauthorized duplication of ML models, such as
+ Monitors for suspicious activities that may indicate theft or unauthorized duplication of machine learning (ML) models, such as
Converting to draft until the integration includes the advanced logging fields.
query = '''
from logs-azure_openai.logs-*
| where azure.open_ai.operation_name == "ChatCompletions_Create"
Note to self:
In the latest mappings of the integration, the operation is matched with azure.open_ai.properties.operation_id == "ChatCompletions_Create"; azure.open_ai.operation_name is mapped to "Microsoft.ApiManagement/GatewayLogs".
azure.open_ai.properties.request_length is not mapped, but azure.open_ai.properties.response_length is present in the document mapping.
Summary
Here is round 2 of our detection engineering within the LLM and AI ecosystem, featuring Azure OpenAI. elastic/integrations#9706 just merged, so we can start developing detection rules. Note: This experimental integration does not yet include the Advanced Logging feature, which provides fields like prompt and completion, nor does it yet include our proposed gen_ai.* fields. They are expected to be added later.
Details
Here are three ES|QL rules highlighting the available dataset.