Introduce archive.tira.io to prepare for usage in pyterrier-artifacts #671

Open
wants to merge 162 commits into
base: main

Conversation

mam10eks
Member

@mam10eks mam10eks commented Dec 5, 2024

No description provided.

        return JsonResponse(ret[0], safe=False)
    else:
        return HttpResponseNotFound(
            json.dumps({"status": 1, "message": f"Could not find a software '{software}' by user '{user_id}'."})

Check warning

Code scanning / CodeQL

Reflected server-side cross-site scripting Medium

Cross-site scripting vulnerability due to a user-provided value.

Copilot Autofix AI about 1 month ago

To fix the problem, we need to escape the user_id and software parameters before including them in the JSON response message. This can be done using the html.escape() function from the standard library to ensure that any special characters are properly escaped, preventing XSS attacks.

The change goes in the software_details function in application/src/tira_app/endpoints/v1/_systems.py.
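The effect of the escaping can be checked in isolation. The following is a minimal sketch using only the standard library; the helper name `not_found_body` is hypothetical and only stands in for the message construction inside software_details.

```python
import html
import json


def not_found_body(software: str, user_id: str) -> str:
    """Build the JSON body of the 404 response with both values HTML-escaped.

    Hypothetical helper: it mirrors the f-string from the patch, not the
    actual structure of software_details.
    """
    message = (
        f"Could not find a software '{html.escape(software)}' "
        f"by user '{html.escape(user_id)}'."
    )
    return json.dumps({"status": 1, "message": message})


# Markup in either parameter ends up as inert entities in the response body.
body = not_found_body("<script>alert(1)</script>", "some-user")
print(body)
```

Another option, not part of the suggested patch, would be to return the error through Django's JsonResponse with status=404, as the success branch already does, so that the body is sent as application/json rather than as HTML.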

Suggested changeset 1
application/src/tira_app/endpoints/v1/_systems.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/application/src/tira_app/endpoints/v1/_systems.py b/application/src/tira_app/endpoints/v1/_systems.py
--- a/application/src/tira_app/endpoints/v1/_systems.py
+++ b/application/src/tira_app/endpoints/v1/_systems.py
@@ -1,2 +1,3 @@
 import json
+import html
 
@@ -51,3 +52,3 @@
         return HttpResponseNotFound(
-            json.dumps({"status": 1, "message": f"Could not find a software '{software}' by user '{user_id}'."})
+            json.dumps({"status": 1, "message": f"Could not find a software '{html.escape(software)}' by user '{html.escape(user_id)}'."})
         )
EOF


def download_mirrored_resource(url: str, name: str):
    response = requests.get(url)

Check failure

Code scanning / CodeQL

Full server-side request forgery Critical

The full URL of this request depends on a user-provided value.

Copilot Autofix AI 11 days ago

To fix the problem, we need to ensure that the URL provided by the user is properly validated and sanitized before being used in the requests.get call. One way to achieve this is to maintain a list of authorized URLs on the server and choose from that list based on the input provided. Alternatively, we can perform proper validation of the input to ensure it adheres to a specific format and domain.

In this case, we will implement an allowlist of domains and validate the URL against it before making the request, so that only URLs from trusted sources are fetched.
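How strict the domain check is matters here. A substring test such as `domain in netloc` would also accept a host like zenodo.org.example.com, so the sketch below compares the parsed hostname exactly and requires https. It assumes zenodo.org is the only trusted host; the constant `ALLOWED_HOSTS` is a name introduced for illustration.

```python
from urllib.parse import urlparse

# Assumption: zenodo.org is the only host that mirrored resources come from.
ALLOWED_HOSTS = frozenset({"zenodo.org"})


def is_valid_url(url: str) -> bool:
    """Return True only for https URLs whose hostname is exactly allowlisted."""
    try:
        parsed = urlparse(url)
        hostname = parsed.hostname  # lowercased, without port or userinfo
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. broken IPv6 literals.
        return False
    return parsed.scheme == "https" and hostname in ALLOWED_HOSTS


for candidate in (
    "https://zenodo.org/records/1/files/data.zip",
    "https://zenodo.org.example.com/data.zip",
    "https://example.com/zenodo.org",
):
    print(candidate, is_valid_url(candidate))
```

Exact matching rejects subdomains such as www.zenodo.org; if those are needed they would have to be listed explicitly. Because requests.get follows redirects by default, a caller may also want to pass allow_redirects=False or re-validate the final URL.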

Suggested changeset 1
application/src/tira_app/endpoints/v1/_datasets.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/application/src/tira_app/endpoints/v1/_datasets.py b/application/src/tira_app/endpoints/v1/_datasets.py
--- a/application/src/tira_app/endpoints/v1/_datasets.py
+++ b/application/src/tira_app/endpoints/v1/_datasets.py
@@ -5,2 +5,3 @@
 import markdown
+from urllib.parse import urlparse
 import requests
@@ -101,3 +102,11 @@
 
+def is_valid_url(url: str) -> bool:
+    allowed_domains = ["zenodo.org"]
+    parsed_url = urlparse(url)
+    return any(domain in parsed_url.netloc for domain in allowed_domains)
+
 def download_mirrored_resource(url: str, name: str):
+    if not is_valid_url(url):
+        raise ValueError(f"Invalid URL: {url}")
+
     response = requests.get(url)
EOF
Labels
enhancement, priority, python
Development

Successfully merging this pull request may close these issues.

Add EntryPoints compatible with the (currently in alpha development) PyTerrier Artifacts API
3 participants